# Week 10 Discussion

## Infographic

* [Racial Discrimination in Auto Insurance Prices][propublica]

[propublica]: https://www.propublica.org/article/minority-neighborhoods-higher-car-insurance-premiums-methodology

## Links

* [Learn X in Y Minutes, X = JavaScript][js-intro] -- a brief intro to JavaScript
* [MDN JavaScript Guide][js-guide] -- a detailed guide to JavaScript
* [MDN Learning Materials][web-intro] -- more information about web development
* [UC Berkeley Library's GeoData][geodata]

Please fill out TA evals!

[js-intro]: https://learnxinyminutes.com/docs/javascript/
[js-guide]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide
[web-intro]: https://developer.mozilla.org/en-US/docs/Learn
[geodata]: https://geodata.lib.berkeley.edu/

## Web Visualization

Web browsers are ubiquitous and support interactivity (via JavaScript), so the web is an excellent platform for visualizations.

Popular JavaScript libraries used for web visualizations:

<table><tr>
  <th>Library</th><th>Based On</th><th>Python Support</th><th>Description</th>
</tr><tr>
  <td><a href="https://d3js.org/">D3.js</a></td><td>-</td><td><a href="http://mpld3.github.io/">mpld3</a></td>
  <td>
    Short for Data-Driven Documents, D3 allows you to bind data to HTML tags.
    In other words, you can use data to control the structure and style of a
    web page.
  </td>
</tr><tr>
  <td><a href="https://vega.github.io/vega/">Vega</a></td><td>D3.js</td><td>-</td>
  <td>
    A visualization grammar (the same idea as ggplot) built on top of D3. You
    write a description of what you want in JSON, and Vega produces a D3
    visualization.
  </td>
</tr><tr>
  <td><a href="https://vega.github.io/vega-lite/">Vega Lite</a></td><td>Vega</td><td><a href="https://altair-viz.github.io/">altair</a></td>
  <td>
    A visualization grammar for <em>common statistical graphics</em> built on top of
    Vega. You write a JSON description which is translated to Vega and then D3.
  </td>
</tr><tr>
  <td><a href="https://plot.ly/javascript/">plotly.js</a></td><td>D3.js</td><td><a href="https://plot.ly/python/">plotly</a></td>
  <td>
    A visualization library that supports the Python, R, Julia, and MATLAB
    plotly packages. Although this is an open-source library, development
    is controlled by Plotly (a private company).
  </td>
</tr><tr>
  <td><a href="http://bokeh.pydata.org/en/latest/docs/dev_guide/bokehjs.html">BokehJS</a></td><td>-</td><td><a href="http://bokeh.pydata.org/">bokeh</a></td>
  <td>
    A visualization library designed to be used from other (non-JavaScript)
    languages. You write Python, R, or Scala code to produce visualizations.
  </td>
</tr><tr>
  <td><a href="http://leafletjs.com/">Leaflet</a></td><td>-</td><td><a href="https://github.com/python-visualization/folium">folium</a></td>
  <td>
    An interactive maps library that can display GeoJSON data.
  </td>
</tr></table>

Also worth mentioning is the [pygal](http://www.pygal.org/en/stable/) package, which produces SVG plots that can be viewed in a web browser but do not require any JavaScript library.
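
To make the "JSON description" idea behind Vega and Vega Lite more concrete, here is a minimal sketch of a Vega Lite scatter plot specification, built as a Python dictionary and serialized with the standard `json` module. The field names and data values are made up for illustration, and the schema URL assumes Vega Lite version 5.

```python
import json

# A Vega Lite specification is plain JSON: it names the data, the mark
# type, and how data fields map to visual channels.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "description": "Scatter plot of made-up values.",
    "data": {
        "values": [
            {"score": 3.6, "popularity": 45},
            {"score": 2.9, "popularity": 12},
            {"score": 1.7, "popularity": 80},
        ]
    },
    "mark": "point",
    "encoding": {
        "x": {"field": "score", "type": "quantitative"},
        "y": {"field": "popularity", "type": "quantitative"},
    },
}

# Serialize the specification; the resulting text is what a Vega Lite
# renderer would consume.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

Packages such as altair generate specifications of this kind for you, so you rarely need to write them by hand.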

## Static Visualizations

In [None]:
import pandas as pd

dogs = pd.read_feather("data/dogs.feather")
dogs.head()

To display Bokeh plots in a Jupyter notebook, you must first call the setup function `output_notebook()`. You don't have to do this if you're going to save your plots to HTML instead.

In [None]:
import bokeh.io # conda install bokeh

bokeh.io.output_notebook()

Now we can make a plot. The `bokeh.plotting` submodule provides `figure()` and glyph methods such as `scatter()` for building common plots. You can also use the classes in the `bokeh.models` submodule to fine-tune plots.

Bokeh's plotting functions work with data frames in [tidy](http://vita.had.co.nz/papers/tidy-data.pdf) form.
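
As a small illustration of what tidy form means, the sketch below reshapes a made-up "wide" table (one column per year) into tidy form (one row per observation) with `DataFrame.melt`. The column names and values are invented for the example and do not come from the dogs dataset.

```python
import pandas as pd

# Hypothetical wide table: one row per breed, one column per year.
wide = pd.DataFrame({
    "breed": ["Beagle", "Pug"],
    "2015": [5, 32],
    "2016": [6, 31],
})

# melt() turns the year columns into rows, so that each row holds a
# single (breed, year, rank) observation.
tidy = wide.melt(id_vars = "breed", var_name = "year", value_name = "popularity_rank")
print(tidy)
```

In the tidy version, every variable has its own column, so a column name can be passed directly to a plotting function.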

In [None]:
from bokeh.plotting import figure, show

#colormap = {'setosa': 'red', 'versicolor': 'green', 'virginica': 'blue'}
#colors = [colormap[x] for x in flowers['species']]

p = figure(title = "Dogs", width = 300, height = 300)
p.xaxis.axis_label = "Datadog Score"
p.yaxis.axis_label = "Popularity"

p.scatter("datadog", "popularity", source = dogs, fill_alpha = 0.2)

show(p)

# Optional: call this before show(p) to also save the plot to a standalone HTML file.
#bokeh.io.output_file("MY_PLOT.html")

## Maps

In [None]:
import folium

# Make a map.
m = folium.Map(location = [45.5236, -122.6750])

# Optional: set up a Figure to control the size of the map.
fig = folium.Figure(width = 600, height = 200)
fig.add_child(m)

# Optional: save the map to a standalone HTML file.
# fig.save("MY_MAP.html")

The dataset of recent restaurant inspections in Yolo County is available [here](http://anson.ucdavis.edu/~nulle/yolo_food.feather).

In [None]:
food = pd.read_feather("data/yolo_food.feather")
food.head()

In [None]:
food.shape

In [None]:
food = food[food.lat.notna() & food.lng.notna()]

In [None]:
m = folium.Map(location = [38.54, -121.74], zoom_start = 11)

cols = ["FacilityName", "lat", "lng"]
for name, lat, lng in food[cols].itertuples(index = False):
    popup = folium.Popup(name, parse_html = True)
    folium.Marker([float(lat), float(lng)], popup = popup).add_to(m)

fig = folium.Figure(width = 800, height = 400)
fig.add_child(m)

Folium can also display boundaries stored in GeoJSON files. See the README for more info.

You can convert shapefiles to GeoJSON with geopandas.


In [None]:
m = folium.Map(location = [37.76, -122.44], zoom_start = 12)
# Newer folium releases replace Map.choropleth() with the Choropleth class.
folium.Choropleth("shapefiles/sf_neighborhoods.geojson", fill_opacity = 0.2, fill_color = "green").add_to(m)

fig = folium.Figure(width = 800, height = 400)
fig.add_child(m)

## Interactive Visualizations

In order to make a visualization interactive, you need to run some code when the user clicks on a widget. The code can run _client-side_ on the user's machine, or _server-side_ on your server.

For client-side interactivity:

* Your code must be written in JavaScript.
* You can host your visualization on any web server. No special setup is needed.
* Your visualization will use the user's CPU and memory.

For server-side interactivity:

* Your code can be written in any language the server supports. This may require special setup.
* Your visualization will use the server's CPU and memory.
* You can update the data in real-time.
* You can save data submitted by the user.

Shiny is a server-side framework for R. There are lots of server-side frameworks for Python. Two of the most popular are [Django][django] and [Flask][flask].

[django]: https://www.djangoproject.com/
[flask]: http://flask.pocoo.org/

### Client-side

Client-side interactivity is cheaper to get started with because you can use a free web server (like GitHub Pages).

Let's make the dogs plot interactive so that the user can select which variables get plotted. Unfortunately, Bokeh charts don't work with interactivity, so we have to build the plot with simpler functions. We'll lose the color-coding, although you could still add that with a bit more work.

In [None]:
dogs.head()

In [None]:
import bokeh.layouts
from bokeh.models import ColumnDataSource, CustomJS, widgets
from bokeh.plotting import figure, show

original = ColumnDataSource(dogs)

source = ColumnDataSource({"x": dogs["datadog"], "y": dogs["popularity"]})
plt = figure(title = "Dogs", tools = [])
plt.xaxis.axis_label = "datadog"
plt.yaxis.axis_label = "popularity"

plt.scatter("x", "y", source = source, fill_alpha = 0.2)

In [None]:
# Callback for x selector box.
callback_x = CustomJS(args = {"original": original, "source": source, "axis": plt.xaxis[0]}, code = """
    // This is the JavaScript code that will run when the x selector box is changed.
    
    // You can use the alert() function to "print" values.
    //alert(cb_obj.value);
    
    axis.axis_label = cb_obj.value;
    source.data['x'] = original.data[cb_obj.value];
    source.change.emit();
""")

# Callback for y selector box.
callback_y = CustomJS(args = {"original": original, "source": source, "axis": plt.yaxis[0]}, code = """
    // This is the JavaScript code that will run when the y selector box is changed.
    
    axis.axis_label = cb_obj.value;
    source.data['y'] = original.data[cb_obj.value];
    source.change.emit();
""")

# Set up selector boxes.
numeric_cols = ["datadog", "popularity", "lifetime_cost", "longevity"]
sel_x = widgets.Select(title = "x-axis", options = numeric_cols, value = "datadog")
sel_y = widgets.Select(title = "y-axis", options = numeric_cols, value = "popularity")

sel_x.js_on_change("value", callback_x)
sel_y.js_on_change("value", callback_y)

# Position the selector boxes to the right of the plot.
layout = bokeh.layouts.column(sel_x, sel_y)
layout = bokeh.layouts.row(plt, layout)

show(layout)

### Server-side

Server-side interactivity is a lot more flexible. Flask is a simple framework with great documentation, so it's easy to get started with.

The core of a Flask website (or "app") is a script with functions that return the text that should be displayed on each page.

See `hello_app.py` for an example Flask website.
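
`hello_app.py` is not reproduced here. As a rough idea of what such a script looks like, here is a minimal sketch with two pages; the routes and page text are made up for illustration and are not taken from `hello_app.py`.

```python
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    # The returned string is sent to the browser as the page body.
    return "Hello, world!"

@app.route("/about")
def about():
    # Returned text can contain HTML.
    return "<h1>About</h1><p>This page is served by Flask.</p>"

if __name__ == "__main__":
    # Start Flask's built-in development server.
    app.run()
```

Each function decorated with `@app.route` handles one URL, so adding a page means adding a function.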

#### Example: Query Slack

As an example, let's make a Flask website that displays recent messages from the class's Slack.

First you need to [get a Slack API token][slack-apps]. Make sure it has the `channels:read` and `channels:history` permissions.

Then you can use the `slackclient` package to query the Slack API.

[slack-apps]: https://api.slack.com/apps

In [None]:
from slackclient import SlackClient

with open("flask/slack_token") as f:
    slack_token = f.readline().strip()

sc = SlackClient(slack_token)

We'll display messages from the `#flask` channel.

Slack tracks channels by ID, not name, so we need to get the channel ID.

Use `conversations.list` (which replaces the retired `channels.list` method) to get a list of public channels:

In [None]:
channels = sc.api_call("conversations.list")
channels = channels["channels"]

In [None]:
chan_id = next(x["id"] for x in channels if x["name"] == "flask")
chan_id

Now let's get the history of the channel:

In [None]:
# conversations.history replaces the retired channels.history method.
history = sc.api_call("conversations.history", channel = chan_id)

In [None]:
messages = pd.DataFrame(history["messages"])
messages

These steps are turned into a Flask website in `slack_app.py`.
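
`slack_app.py` is not shown here. As a rough sketch of the final step, a page function could turn the list of message dictionaries into HTML along these lines. `render_messages` is a hypothetical helper, and the sketch assumes each message stores its body under a `"text"` key.

```python
import html

def render_messages(messages):
    """Return an HTML list of message texts, escaping any markup."""
    items = []
    for msg in messages:
        # Fall back to an empty string if a message has no "text" key.
        text = html.escape(msg.get("text", ""))
        items.append(f"<li>{text}</li>")
    return "<ul>\n" + "\n".join(items) + "\n</ul>"

# Made-up messages standing in for history["messages"].
example = [{"text": "Hello <b>class</b>"}, {"text": "Office hours at 3"}]
print(render_messages(example))
```

Escaping the text matters because messages come from users and may contain characters that the browser would otherwise interpret as HTML.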