## Bokeh Tutorial

In [2]:
# Import libraries required

%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import math

# Set some Pandas options
pd.set_option('display.notebook_repr_html', False)
pd.set_option('display.max_rows', 250)
pd.set_option('display.max_columns', 250)
pd.set_option('display.width', 200)

In [3]:
# Check the installed versions of Bokeh and Seaborn

from bokeh import __version__ as bokeh_version
print("Bokeh - %s" % bokeh_version)              
from seaborn import __version__ as seaborn_version
print("Seaborn - %s" % seaborn_version)   

Bokeh - 0.12.9
Seaborn - 0.8.0


In [4]:
# Import the required libraries for seaborn and bokeh

import seaborn as sns
from bokeh.io import show, output_notebook
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
from bokeh.palettes import Spectral5
from bokeh.transform import factor_cmap
output_notebook()

In [5]:
# Download the sample data from the bokeh library

#import bokeh.sampledata
#bokeh.sampledata.download() 

### Basic Plots
Bokeh provides a number of marker glyph methods that can be used to create simple scatter plots. Some of them are:<br>
1) asterisk()<br>
2) circle()<br>
3) circle_cross()<br>
4) circle_x()<br>
5) cross()<br>
6) diamond()<br>
7) square(), etc.

In [6]:
p = figure(plot_width=300, plot_height=300)      # create a new plot with default tools, using figure

# add a triangle renderer with a size, color, and alpha
p.triangle([1, 2, 3, 4], [36, 22, 55, 64], size=20, line_color="firebrick", fill_alpha=0.7)
show(p)             # show the results

In [7]:
p = figure(plot_width=300, plot_height=300, title="Line Plot")        # create a new plot (with a title) using figure

p.line([1, 2, 3, 4], [36, 22, 55, 64], line_width=3)                  # add a line renderer
show(p) # show the results

The annulus() method accepts inner_radius and outer_radius, which can be used to draw filled rings. <br>
The Label annotation allows you to easily attach single text labels to plots. The position and text to display are configured as x, y, and text. Label objects also have standard text, line (border_line) and fill (background_fill) properties.

In [9]:
from bokeh.models.annotations import Label

p = figure(plot_width=300, plot_height=300)
p.annulus(x=[1, 2, 3, 4, 5], y=[1, 2.5, 2.5, 4.5, 8], inner_radius=0.1, outer_radius=0.20, color="red", alpha=0.6)

label = Label(x=3, y=2.5, x_offset=10, text_font_size="10pt", text="Third Point", text_baseline="middle")
p.add_layout(label)

show(p)

### ColumnDataSource and LabelSet
The ColumnDataSource is a mapping of column names to sequences of values. The mapping is provided by passing a dictionary with string keys and lists as values. The values can also be NumPy arrays or Pandas Series. All the columns in a ColumnDataSource must always be the same length.<br>

The LabelSet annotation allows you to create many labels at once, for example to label an entire set of scatter markers. A LabelSet is similar to a Label, but it can also accept a ColumnDataSource as its source property, in which case x, y, and text may refer to columns in that data source.
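
The equal-length requirement can be checked before a source is built. The helper below is a hypothetical, plain-Python sketch (`check_equal_lengths` is not part of Bokeh); it only inspects the kind of dictionary you would pass as `data`.

```python
def check_equal_lengths(columns):
    """Return the shared column length, or raise ValueError if columns differ.

    `columns` is a dict mapping column names to sequences, i.e. the same
    kind of mapping you would hand to ColumnDataSource(data=...).
    This helper is illustrative and not part of the Bokeh API.
    """
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError("Column lengths differ: %r" % lengths)
    # An empty mapping has no columns, so report a length of 0
    return next(iter(lengths.values()), 0)


data = dict(
    average=[55, 74, 38],
    strike_rate=[165, 139, 155],
    names=['Sachin', 'Kohli', 'Dhawan'],
)
print(check_equal_lengths(data))
```

Running a check like this first gives a clear error message that names the offending columns, instead of a failure later when the plot is rendered.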

In [37]:
from bokeh.models import ColumnDataSource, LabelSet

source = ColumnDataSource(data=dict(
    average=[55, 74, 38, 41, 44, 26],
    strike_rate=[165, 139, 155, 141, 160, 174],
    names=['Sachin', 'Kohli', 'Dhawan', 'Rahane', 'Yuvraj', 'Dhoni']))

p = figure(x_range=(20, 100))
p.scatter(x='average', y='strike_rate', size=12, source=source)
p.xaxis.axis_label = 'Average'
p.yaxis.axis_label = 'Strike Rate'

labels = LabelSet(x='average', y='strike_rate', text='names', level='glyph',
                  x_offset=5, y_offset=5, source=source, render_mode='canvas')

p.add_layout(labels)

show(p)

In [47]:
# A simple legend is used as shown below:

x = np.linspace(0,3*np.pi,1000)
y = np.sin(x)
z = np.cos(x)

p = figure(height=300)

p.circle(x, y, legend="sin(x)")           # Simple Legend
p.line(x, z, legend="cos(x)", line_dash=[3, 3], line_color="red", line_width=2)

show(p)

### Bar Charts
A simple bar chart is created using the vbar method, which draws vertical bars. There is also a corresponding hbar method for horizontal bars. To create a plot with a categorical range, pass an ordered list of categorical values as the range. Here, we pass the list of countries as x_range, and they form the x-axis.<br>
The vbar method takes an x location for the center of the bar, a top and a bottom (which defaults to 0), and a width. When we use a categorical range, as we do here, each category implicitly has a width of 1, so setting width=0.8 makes the bars shrink away from each other.
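
To make the width arithmetic concrete, the sketch below computes the left and right edge of each bar on a hypothetical numeric axis where category number i is centered at i + 0.5. Both that coordinate convention and the `bar_edges` helper are assumptions made for illustration; this is not Bokeh's layout code.

```python
def bar_edges(categories, width=0.8):
    """Map each category to the (left, right) edges of its bar.

    Assumes category number i occupies the unit slot [i, i + 1], so its
    center sits at i + 0.5. Illustrative only; not a Bokeh function.
    """
    edges = {}
    for i, name in enumerate(categories):
        center = i + 0.5
        edges[name] = (center - width / 2, center + width / 2)
    return edges


edges = bar_edges(['USA', 'Japan', 'Germany'], width=0.8)
# The gap between neighbouring bars is 1 - width
gap = edges['Japan'][0] - edges['USA'][1]
print(edges['USA'], round(gap, 2))
```

With width=1 the gap would be zero and adjacent bars would touch, which is why a value slightly below 1 is commonly chosen.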

In [18]:
# Create a list of categorical values 
country = ['Great Britain', 'USA', 'Japan', 'Germany', 'China', 'Russia', 'South Korea', 'France']

# Set the x_range to the list of categories above
p = figure(x_range=country, plot_height=350, title="2016 Summer Olympics")

# Categorical values can also be used as coordinates
p.vbar(x=country, top=[67, 121, 41, 42, 70, 56, 21, 42], width=0.8)

# Set some properties to make the plot look better
p.xgrid.grid_line_color = None
p.y_range.start = 0

show(p)

We can also use vbar with a ColumnDataSource. Here, we put the data in a ColumnDataSource and use that to drive our plot.

In [36]:
from bokeh.models import ColumnDataSource
from bokeh.palettes import Spectral7

country = ['USA', 'Japan', 'Germany', 'China', 'Russia', 'South Korea', 'France']
medals = [ 121, 41, 42, 70, 56, 21, 42]

source = ColumnDataSource(data=dict(country=country, medals=medals, color=Spectral7))

p = figure(x_range=country, plot_height=350, y_range=(0, 150), title="Medal Counts")
p.vbar(x='country', top='medals', width=0.8, color='color', legend="country", source=source)

p.xgrid.grid_line_color = None
p.legend.orientation = "horizontal"
p.legend.location = "top_right"

show(p)

In [41]:
from bokeh.models import FactorRange

country = ['USA', 'Japan', 'Germany', 'China', 'Russia', 'South Korea', 'France']
years = ['2008', '2012', '2016']

data = {'country' : country,
        '2008'   : [111, 25, 41, 99, 59, 32, 42],
        '2012'   : [103, 38, 44, 88, 69, 30, 35],
        '2016'   : [121, 41, 42, 70, 56, 21, 42]}

# Creates [ ("USA", "2008"), ("USA", "2012"), ("USA", "2016"), ("Japan", "2008"), ... ]
# (the loop variable is named c so that it does not shadow the country list)
x = [ (c, year) for c in country for year in years ]
counts = sum(zip(data['2008'], data['2012'], data['2016']), ())        # like a hstack

source = ColumnDataSource(data=dict(x=x, counts=counts))

p = figure(x_range=FactorRange(*x), plot_height=250, title="Medal Counts by Year")

p.vbar(x='x', top='counts', width=0.9, source=source)

p.y_range.start = 0
p.x_range.range_padding = 0.1          # added some padding around the categorical range (at both ends of the axis)
p.xaxis.major_label_orientation = 1
p.xgrid.grid_line_color = None

show(p)
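
The interleaving step in the cell above is easier to follow on a small input. This sketch repeats the same comprehension and the `sum(zip(...), ())` idiom on just the first two countries from the data, with no Bokeh involved, and prints each nested factor next to its count.

```python
country = ['USA', 'Japan']
years = ['2008', '2012', '2016']
data = {'2008': [111, 25],
        '2012': [103, 38],
        '2016': [121, 41]}

# One (country, year) pair per bar, grouped by country
x = [(c, year) for c in country for year in years]

# zip regroups the per-year lists by country: (111, 103, 121), (25, 38, 41);
# sum(..., ()) then concatenates those tuples into one flat tuple
counts = sum(zip(data['2008'], data['2012'], data['2016']), ())

# Pair each factor with its count to confirm the two sequences line up
for factor, count in zip(x, counts):
    print(factor, count)
```

Because both sequences are ordered country first and year second, the i-th count always belongs to the i-th factor, which is what the ColumnDataSource relies on.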

In [44]:
from bokeh.transform import factor_cmap

p = figure(x_range=FactorRange(*x), plot_height=250, title="Medal Counts by Year")

p.vbar(x='x', top='counts', width=0.9, source=source, line_color="white",

       # use the palette to colormap based on the x[1:2] values (the year)
       fill_color=factor_cmap('x', palette=['firebrick', 'green', 'blue'], factors=years, start=1, end=2))

p.y_range.start = 0
p.x_range.range_padding = 0.1
p.xaxis.major_label_orientation = 1
p.xgrid.grid_line_color = None

show(p)
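
The `start=1, end=2` arguments tell `factor_cmap` to look only at the second level of each nested factor (the year) when choosing a color. The plain-Python sketch below mimics that lookup; `color_for` is a hypothetical helper, so treat it as a model of the idea rather than the library's implementation.

```python
def color_for(factor, palette, factors, start=0, end=None):
    """Pick a palette color for a nested factor such as ('USA', '2012').

    The slice factor[start:end] selects which levels are matched against
    `factors`. Hypothetical helper; returns None when nothing matches.
    """
    key = factor[start:end]
    # A one-element slice is compared as a plain string, e.g. '2012'
    if len(key) == 1:
        key = key[0]
    if key in factors:
        return palette[factors.index(key)]
    return None


years = ['2008', '2012', '2016']
palette = ['firebrick', 'green', 'blue']
print(color_for(('USA', '2012'), palette, years, start=1, end=2))
```

Under this model every 2012 bar receives the same color regardless of country, which matches the grouping visible in the chart.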

### References
https://github.com/bokeh/bokeh-notebooks