## Bokeh

Bokeh is a Python library for building highly customisable, interactive data visualisations. Because it is so flexible, it can be a little complex to use.

The documentation can be found here: https://docs.bokeh.org/en/latest/

In [None]:
# Allow import of libraries from parent directory
import sys
sys.path.append("..")

In [None]:
from dasi_library import *

In [None]:
# Import the libraries we need for Bokeh
from bokeh.io import output_notebook, show, output_file
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource, HoverTool, CategoricalColorMapper
from bokeh.transform import transform

In [None]:
# Tell Bokeh to display charts inline in the Jupyter notebook
output_notebook()

In [None]:
# Import pandas
#import pandas as pd

In [None]:
# Load the data
df = readCsv("../../datasets/World University Rankings/world_university_rankings_clean.csv")

In [None]:
# Remind ourselves what the data is
df.head()

In [None]:
# Get 2 new dataframes - one with 2016 only and one with 2015 and 2016
df2016 = selectRows(df, df.year==2016)
df201516 = selectRows(df, (df.year==2016) | (df.year==2015))

## The basic scatter plot

In [None]:
# Bokeh needs us to wrap the dataframe to create a source
source = ColumnDataSource(df2016)

# Make a basic plot
p = figure(plot_width=800, plot_height=400)
p.circle('research', 'total_score', source=source)
show(p)

## Extending the scatter plot
What if we want to plot both the 2015 and 2016 data and use a different colour for each year? The code below shows how to do that. It also adds a title, axis labels and a hover tool.

In [None]:
# First convert the year to a string, because CategoricalColorMapper expects its factors to be strings
df201516["year"] = df201516.year.astype(str)

In [None]:
# Bokeh needs us to wrap the dataframe to create a source
source = ColumnDataSource(df201516)

# Create a mapping from the years to a set of colours
factors = list(df201516.year.unique().astype(str))
colors = ["red", "green", "blue", "black", "orange", "brown", "grey", "purple", "yellow", "white", "pink", "peru"]
mapper = CategoricalColorMapper(factors=factors, palette=colors)

# Create a hover tool
hover = HoverTool(
    tooltips=[("", "@university_name"), ("Research", "@research"),
              ("Total Score", "@total_score"), ("Income", "@income")],
    mode="hline",
)

# Make the plot
p = figure(title="University Scores and Research", plot_width=800, plot_height=400, tools=[hover,"crosshair"])
p.circle('research', 'total_score', source=source, color=transform("year", mapper))

# Set the axes labels
p.xaxis.axis_label = 'Research'
p.yaxis.axis_label = 'Total Score'

show(p)
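
It can help to picture what the colour mapping above does. The sketch below is plain Python, not Bokeh's own implementation: the function name `map_colours` and the sample values are made up for illustration, and the `"gray"` fallback is an assumption meant to mirror the mapper's `nan_color` setting. The idea is that each factor is paired with the palette colour in the same position, so a palette longer than the list of factors simply leaves its extra colours unused.

```python
# Plain-Python illustration of categorical colour mapping (not Bokeh's code).
def map_colours(values, factors, palette, fallback="gray"):
    """Return one colour per value, pairing factors[i] with palette[i]."""
    # zip stops at the shorter sequence, so spare palette colours are ignored
    lookup = dict(zip(factors, palette))
    # Values that are not listed as factors get the fallback colour
    return [lookup.get(value, fallback) for value in values]

factors = ["2015", "2016"]          # hypothetical factor order
palette = ["red", "green", "blue"]  # "blue" is never used here
years = ["2016", "2015", "2016", "2014"]

point_colours = map_colours(years, factors, palette)
print(point_colours)
```

In the notebook the order of `factors` comes from `df201516.year.unique()`, so which year ends up red and which green depends on the order in which the years first appear in the data.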

## Your turn>>
Use Bokeh to plot a scatter plot with your own data.

In [None]:
# Enter your code here

## Your turn>>
Explore other Bokeh charts.

In [None]:
# Enter your code here