
# Data visualisation with Bokeh

Bokeh is a Python interactive visualisation library designed for presentation on modern web browsers. The library’s main aims are:
* To provide elegant, concise construction of basic exploratory and advanced custom graphics. 
* To provide high-performance interactivity over very large or streaming data sets.

In [1]:
# standard imports
from bokeh.io import output_notebook, show
from bokeh.plotting import figure

import numpy as np

To display your Bokeh graphs in your notebook, you need to run `output_notebook()`. You only need to call this once; all subsequent calls to `show()` will then display inline in the notebook.

In [2]:
output_notebook()

Let's first try a simple line graph to compare with matplotlib and seaborn. This isn't a fair comparison, as Bokeh's strengths lie in more complex, interactive graphs.

In [3]:
# make your figure and set the size
# (Bokeh 3 uses width/height; plot_width/plot_height were removed)
p = figure(width=400, height=400)

# add a line renderer for each set of data
p.line([0,1,2,3,4,5,6,7,8], [0,1,2,3,4,4,3,2,1], color='red')
p.line([0,1,2,3,4,5,6,7,8], [0,4,3,2,1,1,2,3,4], color='blue')
p.line([0,1,2,3,4,5,6,7,8], [0,2,2,1,3,2,2,1,0], color='green')

show(p) # show the results

You can see some tool options to the right of the plot for panning, zooming and saving the figure.
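
As a rough sketch of the same kind of line graph with generated data: the `numpy` import above can supply the x and y values, and each line can be given a `legend_label`. The sine and cosine curves, the title and the `click_policy` setting are illustrative assumptions, not part of the original example.

```python
import numpy as np
from bokeh.plotting import figure

# 50 evenly spaced x values between 0 and 2*pi (illustrative data)
x = np.linspace(0, 2 * np.pi, 50)

p = figure(width=400, height=400, title="sin and cos")

# one line renderer per curve; legend_label adds an entry to the legend
p.line(x, np.sin(x), color="red", legend_label="sin(x)")
p.line(x, np.cos(x), color="blue", legend_label="cos(x)")

# clicking a legend entry hides or shows the corresponding line
p.legend.click_policy = "hide"
```

Calling `show(p)` afterwards renders the figure in the notebook, with the same toolbar as before plus a clickable legend.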

Now let's see a more complex chart with some interactivity using some built-in sample data.

In [4]:
# Plot a complex chart with interactive hover in a few lines of code

from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure
from bokeh.sampledata.autompg import autompg_clean as df
from bokeh.transform import factor_cmap

df.cyl = df.cyl.astype(str)
df.yr = df.yr.astype(str)

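# group the cars by number of cylinders, then by manufacturer
group = df.groupby(by=['cyl', 'mfr'])
# passing the groupby summarises each group, giving columns such as
# 'mpg_mean' and a combined index column named 'cyl_mfr'
source = ColumnDataSource(group)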

# Bokeh 3 uses width/height; plot_width/plot_height were removed
p = figure(width=800, height=300, title="Mean MPG by # Cylinders and Manufacturer",
           x_range=group, toolbar_location=None, tools="")

p.xgrid.grid_line_color = None
p.xaxis.axis_label = "Manufacturer grouped by # Cylinders"
p.xaxis.major_label_orientation = 1.2

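# colour each bar by the first level (cyl) of the nested ('cyl', 'mfr') factor;
# end=1 restricts the mapping to that first level
index_cmap = factor_cmap('cyl_mfr', palette=['#2b83ba', '#abdda4', '#ffffbf', '#fdae61', '#d7191c'],
                         factors=sorted(df.cyl.unique()), end=1)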

p.vbar(x='cyl_mfr', top='mpg_mean', width=1, source=source,
       line_color="white", fill_color=index_cmap, 
       hover_line_color="darkgrey", hover_fill_color=index_cmap)

p.add_tools(HoverTool(tooltips=[("MPG", "@mpg_mean"), ("Cyl, Mfr", "@cyl_mfr")]))

show(p)

Notice how hovering over the bars gives a popup displaying the data labels. 
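
The tooltip strings use `@column_name` to look up a column in the plot's `ColumnDataSource`. A minimal sketch of that mechanism, using a tiny made-up table (the car names and MPG values are invented purely for illustration):

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# hypothetical data: three cars and their fuel economy
source = ColumnDataSource(data=dict(
    car=["A", "B", "C"],
    mpg=[18.0, 25.5, 31.2],
))

# a list of strings as x_range makes the x axis categorical
p = figure(x_range=source.data["car"], width=400, height=300, tools="")

# 'car' and 'mpg' refer to columns in the source
p.vbar(x="car", top="mpg", width=0.8, source=source)

# each tooltip is a (label, value) pair; "@mpg" is replaced by the
# value of the 'mpg' column for the bar under the cursor
p.add_tools(HoverTool(tooltips=[("Car", "@car"), ("MPG", "@mpg")]))
```

Calling `show(p)` and hovering over a bar should then display that bar's `car` and `mpg` values.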

## Marker types (glyphs, renderers)

There are a lot of different types of markers (or glyphs) available in Bokeh. These set the shape and style of the marks on your graph. See a full list and lots of examples in the [Bokeh User Guide](https://docs.bokeh.org/en/latest/docs/user_guide/plotting.html#userguide-plotting). 

<img height="60" src="https://camo.githubusercontent.com/cc903cc1484e509d6ad0da73d196137e626eea38/68747470733a2f2f7261772e67697468756275736572636f6e74656e742e636f6d2f73757a616e6e656c6974746c652f6361363832692f6d61737465722f7265736f75726365732f666967757265732f7472792e706e67"/>For now, can you edit the code below to change the points to green squares with purple borders? Try out a few of the different glyph options.

In [5]:
# create a new plot with default tools, using figure
# (Bokeh 3 uses width/height; plot_width/plot_height were removed)
p = figure(width=400, height=400)

p.circle([1, 2, 3, 4, 5], [6, 7, 2, 4, 5], size=15, line_color="navy", fill_color="orange", fill_alpha=0.5)

show(p) # show the results
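
Besides glyph-specific methods such as `circle`, the marker shape can also be chosen by name through `scatter` and its `marker` argument. The sketch below assumes a triangle marker and arbitrary colours, so it illustrates the mechanism rather than solving the exercise above.

```python
from bokeh.plotting import figure

p = figure(width=400, height=400)

# scatter draws one marker per point; the marker name selects the shape
# (triangle and the colours here are arbitrary choices for illustration)
r = p.scatter([1, 2, 3, 4, 5], [6, 7, 2, 4, 5],
              marker="triangle", size=15,
              line_color="firebrick", fill_color="gold", fill_alpha=0.5)
```

As before, `show(p)` displays the result; swapping the `marker` string is a quick way to try other shapes.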