# Discussion 10

## Web Visualization

Web browsers are ubiquitous and support interactivity through JavaScript. This means the web is an excellent platform for visualizations! The Mozilla Developer Network is a good source for [learning more about web development][web-intro].

When making web visualizations, it helps to know a little bit of JavaScript. Here's a [brief intro][js-intro] and a [more detailed guide][js-guide]. 

[js-intro]: https://learnxinyminutes.com/docs/javascript/
[js-guide]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide
[web-intro]: https://developer.mozilla.org/en-US/docs/Learn

Here are some popular JavaScript libraries for web visualization, along with the Python packages that wrap them:

<table><tr>
  <th>Library</th><th>Based On</th><th>Python Support</th><th>Description</th>
</tr><tr>
  <td><a href="https://d3js.org/">D3.js</a></td><td>-</td><td><a href="http://mpld3.github.io/">mpld3</a></td>
  <td>
    Short for Data-Driven Documents, D3 allows you to bind data to HTML tags.
    In other words, you can use data to control the structure and style of a
    web page.
  </td>
</tr><tr>
  <td><a href="https://vega.github.io/vega/">Vega</a></td><td>D3.js</td><td><del>vincent</del></td>
  <td>
    A visualization grammar (the same idea as ggplot) built on top of D3. You
    write a description of what you want in JSON, and Vega produces a D3
    visualization.
  </td>
</tr><tr>
  <td><a href="https://vega.github.io/vega-lite/">Vega Lite</a></td><td>Vega</td><td><a href="https://altair-viz.github.io/">altair</a></td>
  <td>
    A visualization grammar for <em>common statistical graphics</em> built on top of
    Vega. You write a JSON description which is translated to Vega and then D3.
  </td>
</tr><tr>
  <td><a href="https://plot.ly/javascript/">plotly.js</a></td><td>D3.js</td><td><a href="https://plot.ly/python/">plotly</a></td>
  <td>
    A visualization library that supports the Python, R, Julia, and MATLAB
    plotly packages. Although this is an open-source library, development
    is controlled by Plotly (a private company).
  </td>
</tr><tr>
  <td><a href="http://bokeh.pydata.org/en/latest/docs/dev_guide/bokehjs.html">BokehJS</a></td><td>-</td><td><a href="http://bokeh.pydata.org/">bokeh</a></td>
  <td>
    A visualization library designed to be used from other (non-JavaScript)
    languages. You write Python, R, or Scala code to produce visualizations.
  </td>
</tr><tr>
  <td><a href="http://leafletjs.com/">Leaflet</a></td><td>-</td><td><a href="https://github.com/python-visualization/folium">folium</a></td>
  <td>
    An interactive maps library that can display GeoJSON data.
  </td>
</tr></table>

Also worth mentioning is the [pygal](http://www.pygal.org/en/stable/) package, which produces SVG plots that can be viewed in a web browser and does not rely on any JavaScript library.
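
To make the "JSON description" idea from the Vega and Vega Lite rows of the table concrete, here's a minimal sketch of a Vega Lite scatterplot spec, written as a Python dictionary and serialized with the standard `json` module. The three data rows are made up for illustration, and the `$schema` URL assumes Vega Lite version 5, so check the Vega Lite documentation for the version you actually use.

```python
import json

# A Vega Lite spec is plain JSON: some data, a mark type, and encodings that
# map columns to visual channels. The rows below are invented sample values.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",  # assumed version
    "data": {
        "values": [
            {"carat": 0.23, "price": 326},
            {"carat": 0.31, "price": 335},
            {"carat": 1.01, "price": 4500},
        ]
    },
    "mark": "point",
    "encoding": {
        "x": {"field": "carat", "type": "quantitative"},
        "y": {"field": "price", "type": "quantitative"},
    },
}

# Serialize the dictionary to the JSON text that Vega Lite would consume.
spec_json = json.dumps(spec, indent = 2)
print(spec_json)
```

Packages like altair build specs of this shape for you, so you rarely need to write one by hand.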

## Static Visualizations

Let's use Bokeh to make a scatterplot of the diamonds data.

In [None]:
import pandas as pd

diamonds = pd.read_csv("diamonds.csv")
diamonds.head()

To display Bokeh plots in a Jupyter notebook, you must first call the setup function `output_notebook()`. You don't have to do this if you're going to save your plots to HTML instead.

In [None]:
import bokeh.io

bokeh.io.output_notebook()

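Now we can make a plot. The `bokeh.charts` submodule has functions to create common statistical plots. You can also use the classes in the `bokeh.models` submodule to fine-tune plots. These examples target Bokeh 0.12.4, where `bokeh.charts` ships with Bokeh; later releases moved it to the separate `bkcharts` package.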

Bokeh's plotting functions work with data frames in [tidy](http://vita.had.co.nz/papers/tidy-data.pdf) form.

In [None]:
import bokeh.charts

plt = bokeh.charts.Scatter(diamonds, x = "carat", y = "price", color = "cut",
    webgl = True, tools = "wheel_zoom,pan", active_scroll = "wheel_zoom"
)
bokeh.io.show(plt)

# Optional: save the plot to a standalone HTML file instead.
# bokeh.io.save(plt, "MY_PLOT.html")

## Maps

In [None]:
import folium

# Make a map.
m = folium.Map(location = [45.5236, -122.6750])

# Optional: set up a Figure to control the size of the map.
fig = folium.Figure(width = 800, height = 400)
fig.add_child(m)

# Optional: save the map to a standalone HTML file.
# fig.save("MY_MAP.html")

The Bay Area Rapid Transit (BART) system publishes [data about where its stations are located](http://www.bart.gov/schedules/developers/geo). The data is in KML format, which is an XML format for geospatial data. We can extract the information directly or find a suitable KML reader for Python.

In [None]:
import lxml.etree as lx

# Extract the names and coordinates from the KML file.
xml = lx.parse("bart.kml")
# KML elements live in an XML namespace, so queries need a prefix mapping.
ns = {"d": "http://www.opengis.net/kml/2.2"}
places = xml.findall(".//d:Placemark", ns)
places = [(p.find("./d:name", ns).text, p.find(".//d:coordinates", ns).text) for p in places]

# Convert to a dataframe, then split the longitude and latitude.
places = pd.DataFrame(places, columns = ["name", "lonlat"])
places.lonlat = places.lonlat.str.split(",", n = 2)
places["lon"] = places.lonlat.str.get(0).astype(float)
places["lat"] = places.lonlat.str.get(1)
# Latitude is sometimes malformed, with a space and an extra coordinate.
places.lat = places.lat.str.split(" ", n = 1).str.get(0).astype(float)
places.drop("lonlat", axis = 1, inplace = True)

places.head()
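
If you want to try the namespace handling without downloading the BART file, here's a minimal sketch that runs the same kind of query on a tiny inline KML document. It uses the standard library's `xml.etree.ElementTree` rather than lxml, and the station name and coordinates are made up.

```python
import xml.etree.ElementTree as ET

# A made-up KML document with one placemark. KML coordinates are written
# as "longitude,latitude,altitude".
KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Example Station</name>
      <Point><coordinates>-122.25,37.75,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>
"""

# Map a prefix of our choosing ("d") to the KML namespace.
ns = {"d": "http://www.opengis.net/kml/2.2"}
root = ET.fromstring(KML)

stations = []
for p in root.findall(".//d:Placemark", ns):
    name = p.find("./d:name", ns).text
    # Split into longitude, latitude, and whatever else follows (altitude).
    lon, lat, *rest = p.find(".//d:coordinates", ns).text.strip().split(",")
    stations.append((name, float(lon), float(lat)))

print(stations)
```

Without the `ns` mapping, a query for `Placemark` finds nothing, because every element in the document belongs to the KML namespace.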

A GeoDataFrame would also be appropriate. Now we can plot the points on a map.

In [None]:
m = folium.Map(location = [37.8, -122.3], zoom_start = 11)

for name, lon, lat in places.itertuples(index = False):
    folium.Marker([lat, lon], popup = name).add_to(m)

fig = folium.Figure(width = 800, height = 400)
fig.add_child(m)

Folium can also display boundaries stored in GeoJSON files. See the [folium README](https://github.com/python-visualization/folium) for more info.

You can use GeoPandas to convert shapefiles to GeoJSON files.
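
For reference, here's a minimal sketch of what a GeoJSON boundary looks like, built as a Python dictionary with the standard `json` module. The rectangle and the region name are invented, so treat this as an illustration of the file layout rather than real data.

```python
import json

# GeoJSON positions are [longitude, latitude], the reverse of the
# [latitude, longitude] order used for folium.Marker above.
# The rectangle below is an invented boundary, not real data.
ring = [
    [-122.5, 37.7],
    [-122.2, 37.7],
    [-122.2, 37.9],
    [-122.5, 37.9],
    [-122.5, 37.7],  # a polygon ring ends where it started
]

boundary = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"name": "Example region"},
            "geometry": {"type": "Polygon", "coordinates": [ring]},
        }
    ],
}

# This text is what would be stored in a .geojson file.
geojson_text = json.dumps(boundary)
print(geojson_text)
```

The coordinate order is the detail most likely to trip you up: a boundary that shows up in the wrong hemisphere usually has latitude and longitude swapped.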

Let's display the distribution of the walrus using [data from the International Union for Conservation of Nature](http://www.iucnredlist.org/technical-documents/spatial-data).

![walrus](https://iucnredlist-photos.s3.amazonaws.com/medium/227009881.jpg?AWSAccessKeyId=AKIAJIJQNN2N2SMHLZJA&Expires=1519478369&Signature=SkDxW9zJSVR6zxU7spntiuEbE%2Bo%3D)

In [None]:
import json

m = folium.Map()
# Draw the boundaries stored in the GeoJSON file.
with open("walrus.geojson") as f:
    folium.GeoJson(json.load(f)).add_to(m)

fig = folium.Figure(width = 800, height = 400)
fig.add_child(m)

## Interactive Visualizations

In order to make a visualization interactive, you need to run some code when the user clicks on a widget. The code can run _client-side_ on the user's machine, or _server-side_ on your server.

For client-side interactivity:

* Your code must be written in JavaScript.
* You can host your visualization on any web server. No special setup is needed.
* Your visualization will use the user's CPU and memory.

For server-side interactivity:

* Your code can be written in any language the server supports. This may require special setup.
* Your visualization will use the server's CPU and memory.
* You can update the data in real-time.
* You can save data submitted by the user.
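
As a rough illustration of the server-side points, here's a minimal sketch of a web app that keeps state in the server's memory. It uses the standard library's `wsgiref` rather than Flask or Django, and the names `app`, `visits`, and `fake_request` are hypothetical.

```python
from wsgiref.util import setup_testing_defaults

# Server-side state: this list lives in the server's memory, not the browser's.
visits = []

def app(environ, start_response):
    """Record each requested path and report how many requests were seen."""
    visits.append(environ.get("PATH_INFO", "/"))
    body = "Requests seen so far: {}".format(len(visits)).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]

def fake_request(path):
    """Call the app directly, the way a WSGI server would, without a network."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)
    status_seen = []
    chunks = app(environ, lambda status, headers: status_seen.append(status))
    return status_seen[0], b"".join(chunks).decode("utf-8")
```

Each call to `fake_request("/some/page")` returns the status line and a body whose count goes up by one, because the count is stored on the server. For local experiments, `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` should serve the same function over HTTP.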

Shiny is a server-side framework for R. There are lots of server-side frameworks for Python. Two of the most popular are [Django][django] and [Flask][flask].

[django]: https://www.djangoproject.com/
[flask]: http://flask.pocoo.org/

### Client-side

Client-side interactivity is cheaper to get started with because you can use a free web server (like GitHub Pages).

Let's make the diamonds plot interactive so that the user can select which variables get plotted. Unfortunately, the high-level `bokeh.charts` functions don't work with this kind of interactivity, so we have to build the plot with the simpler `bokeh.plotting` functions. We'll lose the color-coding, although you could still add that with a bit more work.

In [None]:
diamonds.head()

In [None]:
import bokeh.layouts
import bokeh.models
import bokeh.plotting

original = bokeh.models.ColumnDataSource(diamonds)

source = bokeh.models.ColumnDataSource({"x": diamonds.carat, "y": diamonds.price})
plt = bokeh.plotting.figure(tools = [], webgl = True)
plt.circle("x", "y", source = source)

# Set up selector boxes.
numeric_cols = ["carat", "depth", "table", "price", "x", "y", "z"]
sel_x = bokeh.models.widgets.Select(title = "x-axis", options = numeric_cols, value = "carat")
sel_y = bokeh.models.widgets.Select(title = "y-axis", options = numeric_cols, value = "price")

# Callback for x selector box.
callback_x = bokeh.models.CustomJS(args = {"original": original, "source": source}, code = """
    // This is the JavaScript code that will run when the x selector box is changed.
    
    // You can use the alert() function to "print" values.
    alert(cb_obj.value);
    
    source.data['x'] = original.data[cb_obj.value];
    source.trigger('change');
""")

sel_x.js_on_change("value", callback_x)

# Callback for y selector box.
callback_y = bokeh.models.CustomJS(args = {"original": original, "source": source}, code = """
    // This is the JavaScript code that will run when the y selector box is changed.
    
    source.data['y'] = original.data[cb_obj.value];
    source.trigger('change');
""")

sel_y.js_on_change("value", callback_y)

# Position the selector boxes to the right of the plot.
layout = bokeh.layouts.column(sel_x, sel_y)
layout = bokeh.layouts.row(plt, layout)

bokeh.io.show(layout)

### Server-side

Server-side interactivity is a lot more flexible. Flask is a simple framework with great documentation, so it's easy to get started with.

A demo Flask website is available at: <https://github.com/nick-ulle/flask-demo>

The core of a Flask website (or "app") is a script with functions that return the text that should be displayed on each page.

```python
# gh_barplot.py
import flask
from flask import Flask

import gh_events

# Set up a Flask app.
app = Flask(__name__)

# This function returns the "/" page.
@app.route("/")
def index():
    events = gh_events.fetch()
    events = gh_events.parse_events(events)
    script, div = gh_events.bar_plot_types(events)
    # Substitute values into the `index.html` template file.
    return flask.render_template("index.html", script = script, div = div)

# This function returns the "/hello<n>" pages, such as "/hello1" or "/hello3".
@app.route("/hello<int:n>")
def hello(n):
    if n == 1:
        return "Hello, world!"
    else:
        return "Hello, all {} worlds!".format(n)
```

This website also uses another script, `gh_events.py`, to fetch data from GitHub's API. The `gh_events.py` script is a regular Python script and doesn't contain any Flask code.

```python
# gh_events.py
import bokeh.charts
import bokeh.embed
import pandas as pd
import requests

# Fetch events from the GitHub API.
def fetch():
    response = requests.get("https://api.github.com/events")
    response.raise_for_status()

    return response.json()

# Parse the event data into a data frame.
def parse_events(events):
    data = (
        (evt['type'], evt['actor']['login'], evt['repo']['name'])
        for evt in events
    )

    return pd.DataFrame.from_records(data, columns = ["Type", "User", "Repo"])

# Make a Bokeh bar plot of the event types.
def bar_plot_types(events):
    plot = bokeh.charts.Bar(events, "Type")

    return bokeh.embed.components(plot)
```
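
To see the shape of the data without calling the GitHub API or installing Bokeh, here's a minimal sketch that runs the same extraction as `parse_events()` on three invented event records and tallies the event types with the standard library. The counts are presumably what the bar heights show, assuming `Bar` falls back to counting rows when no values column is given.

```python
from collections import Counter

# Invented records with the same nesting that parse_events() reads.
sample_events = [
    {"type": "PushEvent", "actor": {"login": "alice"}, "repo": {"name": "alice/demo"}},
    {"type": "WatchEvent", "actor": {"login": "bob"}, "repo": {"name": "alice/demo"}},
    {"type": "PushEvent", "actor": {"login": "carol"}, "repo": {"name": "carol/site"}},
]

# Same tuple layout as parse_events(): (Type, User, Repo).
rows = [
    (evt["type"], evt["actor"]["login"], evt["repo"]["name"])
    for evt in sample_events
]

# Count the rows for each event type.
type_counts = Counter(row[0] for row in rows)
print(type_counts.most_common())
```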

The website's homepage is based on the template file `index.html`. This file uses [Jinja](http://jinja.pocoo.org/) syntax to indicate where substitutions should be made.

```html
<!-- index.html -->
<html>
<head>
  <!-- Bokeh CSS & JavaScript Files -->
  <link
      href="http://cdn.pydata.org/bokeh/release/bokeh-0.12.4.min.css"
      rel="stylesheet" type="text/css">
  <link
      href="http://cdn.pydata.org/bokeh/release/bokeh-widgets-0.12.4.min.css"
      rel="stylesheet" type="text/css">
  
  <script src="http://cdn.pydata.org/bokeh/release/bokeh-0.12.4.min.js">
  </script>
  <script src="http://cdn.pydata.org/bokeh/release/bokeh-widgets-0.12.4.min.js">
  </script>
  <!-- End of Bokeh Files -->
  {{script|safe}}
</head>

<body>
  This is the display.
  {{div|safe}}
</body>
</html>
```
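
The `|safe` filter matters because Flask turns on Jinja's HTML escaping for `.html` templates, so a plain `{{div}}` would show the markup as text instead of inserting it. Here's a minimal sketch of the difference. It uses the standard library's `html.escape` as a stand-in for Jinja's escaping (it is not Jinja itself), and the `div` string is invented.

```python
import html

# A fragment like the `div` that bokeh.embed.components() returns
# (this particular string is invented for illustration).
div = '<div class="bk-root" id="plot-1"></div>'

# Roughly what escaping does to {{div}}: the markup becomes visible text.
escaped = html.escape(div)

# Roughly what {{div|safe}} does: the markup is inserted unchanged.
page = "<body>\n  This is the display.\n  {}\n</body>".format(div)

print(escaped)
print(page)
```

Only mark values as safe when your own code produced them. User-submitted text should stay escaped.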