# Interactive visualization

In this notebook we'll look at a simple example of interactive visualization, using the
[Bokeh](https://bokeh.pydata.org) library for Python.

This notebook accompanies the [DataTree course](https://datatree.org.uk), [module 5, Visualization](https://datatree.org.uk/course/view.php?id=5).

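First we'll set up our environment by importing NumPy and the parts of Bokeh we need. Calling `output_notebook()` tells Bokeh to display plots inline in the notebook: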

In [1]:
import numpy as np

from bokeh.io import show, output_notebook
from bokeh.plotting import figure
output_notebook()

## Example data

We are using the same data ([HadCRUT4](https://www.metoffice.gov.uk/hadobs/hadcrut4/)) as in the "Timeseries Visualization" example, also in this repository. Please see that notebook for more details about the data. In summary, the file contains a monthly timeseries of Northern Hemisphere temperature anomalies, from 1850 to (nearly) the present day, together with lower and upper uncertainty bounds for each value.

In [2]:
# Some standard imports
import pandas as pd

# Read the data
df = pd.read_csv('HadCRUT.4.6.0.0.monthly_nh.txt',
                 sep = r'\s+',                                  # Columns are delimited by whitespace
                 index_col = 0,                                 # First column is our "index", i.e. the unique row label
                 usecols = [0,1,10,11],                         # We only need some of the columns
                 names = ['date', 'tanom', 'lbound', 'ubound']) # We add our own column names

# Dates are in the first column, in the format year/month. Converting them
# after reading avoids read_csv options that newer pandas releases have deprecated
df.index = pd.to_datetime(df.index, format='%Y/%m')
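To see what this parsing step produces without needing the data file, here is a minimal sketch that applies the same column selection to a few in-memory rows. The numbers are synthetic placeholders, not real HadCRUT4 values, and `sample` and `sample_df` are names invented for this illustration. It assumes the same layout as the real file: a year/month label followed by 11 whitespace-separated numeric columns. The dates are converted with `pd.to_datetime` after reading, which does not depend on any of `read_csv`'s date-parsing options.

```python
import io
import pandas as pd

# Synthetic placeholder rows (NOT real HadCRUT4 values): a year/month label
# followed by 11 whitespace-separated numeric columns
sample = """\
1850/01 -0.70 -0.80 -0.60 -0.90 -0.50 -1.00 -0.40 -0.85 -0.55 -1.10 -0.30
1850/02 -0.20 -0.30 -0.10 -0.40  0.00 -0.50  0.10 -0.35 -0.05 -0.60  0.20
1850/03 -0.50 -0.60 -0.40 -0.70 -0.30 -0.80 -0.20 -0.65 -0.35 -0.90 -0.10
"""

sample_df = pd.read_csv(io.StringIO(sample),
                        sep=r'\s+',            # split on runs of whitespace
                        index_col=0,           # first column becomes the row label
                        usecols=[0, 1, 10, 11],
                        names=['date', 'tanom', 'lbound', 'ubound'])

# Turn the 'YYYY/MM' labels into proper timestamps
sample_df.index = pd.to_datetime(sample_df.index, format='%Y/%m')

print(sample_df)
print(sample_df.index.dtype)
```

The resulting frame has a datetime index and three columns, which is the shape the plotting code relies on.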

## An interactive plot

Bokeh is a very powerful library, and we're only using part of its capabilities here. We're going to recreate almost the same timeseries plot as we saw in the "Visualizing timeseries" part of the course, but this time the plot will be interactive. We'll be able to pan and zoom using the mouse, which helps us to examine the data more closely. This is very useful with such a dense timeseries.

We're also going to include uncertainty information in the visualization. This is hard to see when we are "zoomed out", but as we zoom in we will see this appear. Interactivity has helped to solve the problem of interpreting a data-rich plot without having to make any a priori decisions on what is interesting.

In [3]:
# We use "xwheel_zoom" instead of "wheel_zoom" in this case because the plot
# is more usable if the mouse wheel only zooms the x axis
p = figure(x_axis_type="datetime", title="Northern Hemisphere monthly temperature anomaly",
           tools='xwheel_zoom,pan,box_zoom,reset,save', active_scroll='xwheel_zoom')
p.grid.grid_line_alpha = 0.4 # Use faint grid lines
p.xaxis.axis_label = 'Date'
p.yaxis.axis_label = 'Temperature anomaly (degrees Celsius)'
# Plot size in pixels (older Bokeh releases used plot_height/plot_width)
p.height = 400
p.width = 800

# df.index is the list of date/time values, df['tanom'] holds the values of temperature
p.line(df.index, df['tanom'])

# Draw the region between the lower and upper uncertainty bounds as a
# "patch". Essentially we are creating a complex polygon and drawing it
# onto the plot. This uses code from
# https://github.com/bokeh/bokeh/blob/master/examples/plotting/file/bollinger.py.
# band_x and band_y are the vertices of the polygon that we will draw
band_x = np.append(df.index, df.index[::-1])
band_y = np.append(df['lbound'], df['ubound'][::-1])
p.patch(band_x, band_y, color='#7570B3', line_alpha=0, fill_alpha=0.4)

show(p)

You can navigate this visualization as follows:
 - Use the mouse wheel (or scroll up/down on a trackpad) to zoom the x axis
 - Click and drag to pan around
 - Click the magnifying glass icon, then click-drag a box to zoom into a particular area
 - Click the circular arrows to reset to the original view
 - Click the "save" icon to save an image of the current visualization
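
The uncertainty band in the plot is drawn as a single polygon, and the order of its vertices matters. The following minimal sketch shows the construction on toy values; `x`, `lower` and `upper` are hypothetical stand-ins for `df.index`, `df['lbound']` and `df['ubound']`, chosen only to make the ordering easy to read.

```python
import numpy as np

# Hypothetical toy values standing in for df.index, df['lbound'] and df['ubound']
x = np.array([0, 1, 2, 3])
lower = np.array([-1.0, -0.5, -0.8, -0.2])
upper = np.array([1.0, 0.9, 0.4, 0.6])

# Walk left to right along the lower bound, then right to left along the
# upper bound, so the vertices trace the outline of the band exactly once
band_x = np.append(x, x[::-1])
band_y = np.append(lower, upper[::-1])

# Print the vertices in drawing order
for vx, vy in zip(band_x, band_y):
    print(vx, vy)
```

Reversing the second half is what keeps the outline from crossing itself: without `[::-1]` the polygon would jump from the last lower-bound point back to the first upper-bound point.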