# Bokeh for interactive plotting in Python

Bokeh is a Python library for creating interactive visualisations, both in the browser and embedded in notebooks.

<a id='Contents'></a>
## Contents
<b>

- [1 Resources](#1)
- [2 Plots](#2)
- [3 Subplots](#3) 
- [4 Widgets](#4) 
- [5 End to End Dashboard](#5) 


</b>

In this notebook we look at how to use the bokeh package for plotting both in your Jupyter notebook and in your browser. Bokeh is a powerful plotting library that can be used to produce simple, impactful graphs, and it can also be used to create complete interactive dashboards similar to Power BI. Most people will find producing an end-to-end dashboard in Bokeh more challenging than creating one in Power BI; however, for those skilled in Python it is a great skill to have available.

---
<a id="1"></a>
# 1 Bokeh Resources
[Back to Contents](#Contents)

- This resource
- Bokeh example gallery: https://docs.bokeh.org/en/latest/docs/gallery.html#notebook-examples (get some inspiration for new plotting styles from this gallery)
- Tutorial guide: https://www.geeksforgeeks.org/python-bokeh-tutorial-interactive-data-visualization-with-bokeh/
- Bokeh for beginners: https://www.tutorialspoint.com/bokeh/index.htm (a great resource for those completely new to Bokeh that will take you from simple plots to complete dashboards)
- Bokeh user guide: https://docs.bokeh.org/en/latest/docs/user_guide.html
- Bokeh cheat sheet

Searching online for Bokeh resources will turn up a large amount of further material.

In [1]:
from IPython.display import IFrame
plot_fn = 'f0c1e06f-53ba-4f3b-aa9f-b196221f55a3.pdf'
IFrame(plot_fn, width=600, height=400)

---
<a id="2"></a>
# 2 Basic Plots
[Back to Contents](#Contents)
    
Bokeh may not come pre-installed in your environment. If this is the case, we can install it via pip from the Jupyter environment as shown below. The ! indicates that the cell is executed as a shell command (Windows or Linux) rather than as Python. The command requests a download and install of the bokeh package. Further information on pip can be found here: https://realpython.com/what-is-pip/#:~:text=The%20pip%20install%20command,the%20requirements%20that%20it%20needs.

In [2]:
!pip install bokeh --user
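
Some keyword arguments differ between Bokeh 2.x and 3.x, so it can help to confirm which version was installed before following any guide. A minimal check, assuming the install above succeeded:

```python
import bokeh

# Print the installed Bokeh version so you know which documentation applies
installed_version = bokeh.__version__
print("Bokeh version:", installed_version)
```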





### Basic plotting with Bokeh


In [3]:
import pandas as pd #  bring in pandas package
from bokeh.sampledata.penguins import data # bokeh provides some sample datasets that we can use to experiment with graphs
df = data # assign data to the df variable
df.head() # visualise the first 5 rows of the data


Unnamed: 0,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex
0,Adelie,Torgersen,39.1,18.7,181.0,3750.0,MALE
1,Adelie,Torgersen,39.5,17.4,186.0,3800.0,FEMALE
2,Adelie,Torgersen,40.3,18.0,195.0,3250.0,FEMALE
3,Adelie,Torgersen,,,,,
4,Adelie,Torgersen,36.7,19.3,193.0,3450.0,FEMALE


### Plotting inside the notebook or in a separate tab
Bokeh can plot in a new tab in your web browser or it can plot inline in your notebook. The way we switch between the two is as follows:

If you want to plot inline in your notebook, run the command *bokeh.io.output_notebook()*

In [4]:
import bokeh.io #  import bokeh.io 
bokeh.io.reset_output()
bokeh.io.output_notebook() #  run this line of code if you want bokeh to plot inline in your notebook

If you would like bokeh to perform your plot in a separate browser tab then run the command below

In [5]:

bokeh.io.reset_output()#  run this line of code if you want bokeh to plot in a separate window

Let's make a plot of our penguins data, both inline in the notebook and in a separate window.

First we will plot it in a separate window.

In [6]:
#### plotting in a separate window ######
bokeh.io.reset_output()
from bokeh.plotting import figure #  This function is needed to make our figure 
from bokeh.io import output_file, show # to save our figure and to show our figure

# Create a new figure
fig = figure(x_axis_label="bill_length_mm",  # https://docs.bokeh.org/en/2.4.2/docs/reference/plotting/figure.html gives the options for the kwargs when setting up your figure
             y_axis_label="flipper_length_mm") # we are setting the names of our axes at this point

fig.scatter("bill_length_mm", # x axis column name. We select the scatter method of the figure; Bokeh supports all major types of graphs
            "flipper_length_mm",   # y axis column name
            source = df, # set the data source to the dataframe holding your data
            fill_alpha = 0.4, # marker transparency
            size = 12) # marker size

# Call function to produce html file and display plot
output_file(filename="my_first_plot.html") # saves the file
show(fig) #  displays the file

In [7]:
#### plotting inline in the notebook ######
bokeh.io.reset_output() # note you need both reset_output() and output_notebook()
bokeh.io.output_notebook() # 
# Create a new figure

fig = figure(x_axis_label="bill_length_mm",  # https://docs.bokeh.org/en/2.4.2/docs/reference/plotting/figure.html gives the options for the kwargs when setting up your figure
             y_axis_label="flipper_length_mm") # we are setting the names of our axes at this point

fig.scatter("bill_length_mm", # x axis column name. We select the scatter method of the figure; Bokeh supports all major types of graphs
            "flipper_length_mm",   # y axis column name
            source = df, # set the data source to the dataframe holding your data
            fill_alpha = 0.4, # marker transparency
            size = 12) # marker size

# output_file is not called here, so the plot is only displayed inline
show(fig) # displays the plot in the notebook
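
When a DataFrame is passed as `source`, Bokeh wraps it in a `ColumnDataSource`. The sketch below does that conversion explicitly on a three-row table, using values copied from the first complete rows shown earlier, so you can see which column names the glyph methods can refer to. The names `sample_df`, `source` and `fig_cds` are chosen here for illustration.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# First three complete rows of the penguins table shown earlier
sample_df = pd.DataFrame({
    "bill_length_mm": [39.1, 39.5, 40.3],
    "flipper_length_mm": [181.0, 186.0, 195.0],
})

# Explicit conversion; each DataFrame column becomes a named column in the source
source = ColumnDataSource(sample_df)

fig_cds = figure(x_axis_label="bill_length_mm", y_axis_label="flipper_length_mm")
fig_cds.scatter("bill_length_mm", "flipper_length_mm", source=source, size=12)

# List the column names that glyph methods can reference as strings
print(source.column_names)
```

Pass `fig_cds` to `show()` to display it in the currently selected output mode.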

### Line graph 

In [8]:
# Line graphs are often used for time data. We will add a date column to the penguin data that could be the date the penguin was identified and measured
from datetime import date, timedelta # classes for working with dates
dates = pd.date_range(date(2022, 1, 1)-timedelta(days=len(df) -1), date(2022, 1, 1), freq='D') # one observation per day over the date range
df['observation_date'] = dates

df.head()

Unnamed: 0,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex,observation_date
0,Adelie,Torgersen,39.1,18.7,181.0,3750.0,MALE,2021-01-23
1,Adelie,Torgersen,39.5,17.4,186.0,3800.0,FEMALE,2021-01-24
2,Adelie,Torgersen,40.3,18.0,195.0,3250.0,FEMALE,2021-01-25
3,Adelie,Torgersen,,,,,,2021-01-26
4,Adelie,Torgersen,36.7,19.3,193.0,3450.0,FEMALE,2021-01-27


Bokeh supports a variety of colour themes for your graphs, which are available as built-in themes. We also need to bring in the function curdoc, which stands for current document. Below we set the current document theme to *dark minimal*. We then change the glyph method from fig.scatter to fig.line.

In [9]:
from bokeh.themes import built_in_themes
from bokeh.io import curdoc
bokeh.io.reset_output() # note you need both reset_output() and output_notebook()
bokeh.io.output_notebook() # 

curdoc().theme = 'dark_minimal' # set the current document to the theme dark minimal

fig = figure(x_axis_label="observation_date", # set up the figure 
             y_axis_label="body_mass_g",
            x_axis_type = 'datetime') # treat the x axis values as dates https://docs.bokeh.org/en/2.4.0/docs/reference/models/axes.html


fig.line(x="observation_date", 
         y ="body_mass_g", 
         source = df,
        line_alpha = 0.9,
        line_color = 'red') # https://docs.bokeh.org/en/latest/docs/user_guide/styling.html#specifying-colors

show(fig)



### Bar graph

The same structure is followed for creating a bar graph. Immediately below we are going to create a new dataframe to produce some data suitable for plotting with a bar graph

In [10]:
# count the penguins per species; rename_axis and reset_index(name=...) give the same column names across pandas versions
species_count = df["species"].value_counts().rename_axis("species_name").reset_index(name="count")
species_count.head()

Unnamed: 0,species_name,count
0,Adelie,152
1,Gentoo,124
2,Chinstrap,68


In [11]:
fig = figure(x_axis_label="species", # axis labels
             y_axis_label="species_count", # axis labels
             x_range = list(species_count["species_name"])) # set the range of categorical labels. Pass in a list of values

fig.vbar(x = "species_name", # x axis column name. See the dataframe above
         top = "count", # the bar height, i.e. the y axis of the bar graph
         source = species_count, # the dataframe source
         width = 0.5) # the bar width
show(fig)

### Pie chart

In [12]:
import numpy as np # bring in numpy, which we will use to get pi
from bokeh.transform import cumsum # used to calculate the cumulative angles when creating the pie chart
from bokeh.palettes import Category20c # colour palette used to map values to a colour scheme

species_count['angle'] = 2 * np.pi * species_count['count'] / species_count['count'].sum() # add a column that represents each count as a fraction of a circle, in radians
species_count['color'] = Category20c[len(species_count)] # assign one colour per species

species_count.head()

Unnamed: 0,species_name,count,angle,color
0,Adelie,152,2.776291,#3182bd
1,Gentoo,124,2.264869,#6baed6
2,Chinstrap,68,1.242025,#9ecae1


Notice how the colour column on the dataframe has been created from the *Category20c* palette.

In [13]:
curdoc().theme = 'night_sky'
fig = figure(height = 350,
             title = 'penguin species'
            )

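fig.wedge(x = 0, # x coordinate of the centre of the pie
          y = 1, # y coordinate of the centre of the pie
          radius = 0.4,
          start_angle=cumsum('angle', include_zero=True), # each wedge starts where the previous one ended
          end_angle=cumsum('angle'), # and ends once its own angle has been added
          legend_field = 'species_name',
          line_color = 'white',
          fill_color = 'color',
          source = species_count # wedge has no width argument, so none is passed
         )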


fig.xgrid.grid_line_color = None
fig.ygrid.grid_line_color = None
fig.xaxis.visible = False
fig.yaxis.visible = False 
fig.background_fill_color = None
fig.border_fill_color = None

show(fig)
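
To see what the angle column and the `cumsum` transform are doing, the same arithmetic can be worked through with pandas alone. This is a sketch of the calculation, not of Bokeh's internals; the counts are the ones shown in the species table above.

```python
import numpy as np
import pandas as pd

# Species counts as shown in the table above
counts = pd.Series([152, 124, 68], index=["Adelie", "Gentoo", "Chinstrap"])

# Each species' share of the full circle (2*pi radians)
angles = 2 * np.pi * counts / counts.sum()

# Cumulative sums give where each wedge ends; subtracting the wedge's own angle gives where it starts
end_angles = angles.cumsum()
start_angles = end_angles - angles

print(pd.DataFrame({"angle": angles, "start": start_angles, "end": end_angles}))
```

The first wedge starts at 0 and the last wedge ends at 2π, so the wedges tile the whole circle.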

### Bubble plots





In [14]:
# Create a normalised column for the body mass. This will be used for both the bubble size and colour
df["body_mass_g_norm"] = df["body_mass_g"] / df["body_mass_g"].max()


In [15]:
from bokeh.palettes import RdBu8
from bokeh.transform import linear_cmap

# Create mapper
# linear_cmap creates a linear colour map for a column (field_name) using the specified colour palette between the low and high values
mapper = linear_cmap(field_name="body_mass_g_norm", palette=RdBu8, low=df["body_mass_g_norm"].min(), high=df["body_mass_g_norm"].max()) # .min() and .max() skip missing values

fig = figure(x_axis_label="bill_length_mm", 
             y_axis_label="flipper_length_mm")

fig.circle("bill_length_mm", # x axis column name
           "flipper_length_mm",   # y axis column name
           radius = "body_mass_g_norm", # the column used for the bubble radius (in data units), so no size is passed
           source = df, # dataframe source
           fill_alpha = 0.4, # marker transparency
           color = mapper) # colour each bubble using the mapper

show(fig) #  displays the file
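
The idea behind a linear colour map can be sketched in plain Python: the range between `low` and `high` is split into as many equal bins as there are colours, and each value takes the colour of its bin. The function below is a rough stand-in written for illustration (`pick_colour` is a hypothetical name), and Bokeh's own mapper may treat edge cases differently.

```python
from bokeh.palettes import RdBu8


def pick_colour(value, low, high, palette=RdBu8):
    """Rough stand-in for a linear colour map: split [low, high] into equal bins."""
    if value <= low:
        return palette[0]  # clamp values at or below the range
    if value >= high:
        return palette[-1]  # clamp values at or above the range
    fraction = (value - low) / (high - low)  # position within the range, between 0 and 1
    return palette[int(fraction * len(palette))]  # bin index into the palette


# Example with a hypothetical normalised body mass range of 0.4 to 1.0
for v in (0.4, 0.55, 1.0):
    print(v, pick_colour(v, low=0.4, high=1.0))
```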

### Changing Style 
- Note that we changed the style using curdoc().theme. Here is a guide to the themes available: https://docs.bokeh.org/en/latest/docs/reference/themes.html
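
The theme names accepted by your installed version can also be listed directly from `built_in_themes`, which is imported in the line graph cell above. A short sketch:

```python
from bokeh.themes import built_in_themes

# built_in_themes maps each theme name to a Theme object
theme_names = sorted(built_in_themes)
print(theme_names)
```

Any of the printed names can be assigned to `curdoc().theme`.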

### Selecting the tools in the plot

Here is a guide to the tools available to put in your plot. The tools are the interactive features that you want to have available in your interactive plot:
https://www.tutorialspoint.com/bokeh/bokeh_plot_tools.htm

In [16]:
from bokeh.themes import built_in_themes
from bokeh.io import curdoc
curdoc().theme = 'dark_minimal'

tools = ['poly_select', 'wheel_zoom', 'reset', 'save', 'zoom_in', 'crosshair'] # select the tools you want in your plot


fig = figure(x_axis_label="observation_date",
             y_axis_label="body_mass_g",
            tools = tools,
            x_axis_type = "datetime")

fig.line(x="observation_date", 
         y ="body_mass_g", 
         source = df,
        line_alpha = 0.9,
        line_color = 'red'
        ) # https://docs.bokeh.org/en/latest/docs/user_guide/styling.html#specifying-colors

show(fig)


Notice the addition of the crosshair and zoom in tools, which were not part of the default toolbar in the previous plot.
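
Tools do not all have to be chosen when the figure is created. As a small sketch, a tool object can be attached afterwards with `add_tools`, and the figure's `tools` list can be inspected to confirm what it carries. The name `fig_tools` is chosen here for illustration.

```python
from bokeh.models import HoverTool
from bokeh.plotting import figure

# Tools can be passed as a list of names when the figure is created
fig_tools = figure(tools=["wheel_zoom", "reset", "crosshair"])

# Further tools can be attached afterwards as tool objects
fig_tools.add_tools(HoverTool())

# Inspect which tools the figure now carries
tool_names = [type(tool).__name__ for tool in fig_tools.tools]
print(tool_names)
```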

### Adding tooltips

- Tooltips display the values for a point when you hover over it. Let's go back to the basic scatter graph to see this in action.

In [17]:
curdoc().theme = 'contrast'
# tooltips takes a list of tuples where the first value of the tuple is the tooltip label and the second is the column name from the data source prefixed with @
TOOLTIPS = [("Species", "@species"), ("Island", "@island")]

fig = figure(x_axis_label="bill_length_mm", 
             y_axis_label="flipper_length_mm",
            tooltips = TOOLTIPS)

fig.scatter("bill_length_mm", # x axis column name
            "flipper_length_mm",   # y axis column name
            source = df, # dataframe source
           fill_alpha = 0.4, # marker transparency
           size =12) # marker size

show(fig)

Notice how, when you hover over a data point, we get the information about the species and the island that this penguin was found on.
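
Tooltip fields can also carry number formatting. The sketch below assumes the penguin column names used above: `{0.0}` after a field rounds it to one decimal place and `$index` shows the row number of the hovered point. The names `DETAILED_TOOLTIPS` and `fig_tips` are illustrative.

```python
from bokeh.models import HoverTool
from bokeh.plotting import figure

# "@column" reads a column from the data source; "{0.0}" formats the number
# to one decimal place; "$index" is the row number of the hovered point
DETAILED_TOOLTIPS = [
    ("Row", "$index"),
    ("Species", "@species"),
    ("Bill length", "@bill_length_mm{0.0} mm"),
    ("Flipper length", "@flipper_length_mm{0} mm"),
]

fig_tips = figure(x_axis_label="bill_length_mm",
                  y_axis_label="flipper_length_mm",
                  tooltips=DETAILED_TOOLTIPS)

# Passing tooltips= adds a HoverTool to the figure automatically
hover_tools = [tool for tool in fig_tips.tools if isinstance(tool, HoverTool)]
print(len(hover_tools))
```

Adding a glyph with `source = df` to `fig_tips` and calling `show()` displays the formatted values on hover.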

### Overlaying plots

Once a figure has been created with figure(), you can draw multiple glyphs on the same axes. This can be used to highlight different categories in your data.

In [18]:


fig = figure(x_axis_label="bill_length_mm", 
             y_axis_label="flipper_length_mm")

# the first glyph is a scatter plot using circle markers
fig.scatter("bill_length_mm", # x axis column name
            "flipper_length_mm",   # y axis column name
            source = df.query("species == 'Adelie'"), # dataframe source
            marker = "circle", # marker shape
            fill_alpha = 0.4, # marker transparency
            size = 12) # marker size

# the second glyph is a scatter plot using square markers
fig.scatter("bill_length_mm", # x axis column name
            "flipper_length_mm",   # y axis column name
            source = df.query("species == 'Chinstrap'"), # dataframe source
            marker = "square", # marker shape
            fill_alpha = 0.4, # marker transparency
            size = 12, # marker size
            color = 'red') # marker colour

# the third glyph is a scatter plot using triangle markers
fig.scatter("bill_length_mm", # x axis column name
            "flipper_length_mm",   # y axis column name
            source = df.query("species == 'Gentoo'"), # dataframe source
            marker = "triangle", # marker shape
            fill_alpha = 0.4, # marker transparency
            size = 12, # marker size
            color = 'green') # marker colour


show(fig)

Note that the same type of visualisation can also be produced with a single glyph type inside a loop, as seen below.

In [19]:
fig = figure(title = "Penguins")
fig.xaxis.axis_label = 'bill_length_mm'
fig.yaxis.axis_label = 'flipper_length_mm'

colormap = {'Adelie': 'red', 'Chinstrap': 'green', 'Gentoo': 'blue'} # This is a dictionary where the key represents the penguin species and the value represents the colour we want those species represented by
colors = [colormap[x] for x in df['species']]
print('colors length :   ', len(colors), 'Unique colors:   ' , list(set(colors))) # information about the list of colors we created

colors length :    344 Unique colors:    ['blue', 'red', 'green']


In [20]:
for specie in colormap.keys(): # takes each penguin species from the colormap dict
    df_spec = df[df['species']==specie] # temporary dataframe filtered to one species. Note you could use df.query instead
    fig.scatter(df_spec["bill_length_mm"], df_spec['flipper_length_mm'], # create the scatter plot (circle markers by default)
             color=colormap[specie], fill_alpha=0.2, size=10, legend_label=specie) # legend_label adds one legend entry per species
    # print(specie)

fig.legend.location = "top_left"
show(fig)
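
A third option avoids the loop altogether: let Bokeh map the species column to colours with `factor_cmap` and build the legend from the same column with `legend_field`. This is a sketch on a small made-up table (the three rows are illustrative values, not taken from the dataset), using the same colour choices as the dictionary above.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.transform import factor_cmap

# Small illustrative table; the measurements here are made up for the sketch
demo = pd.DataFrame({
    "species": ["Adelie", "Chinstrap", "Gentoo"],
    "bill_length_mm": [39.1, 48.8, 47.5],
    "flipper_length_mm": [181.0, 196.0, 217.0],
})
source = ColumnDataSource(demo)

# Map each species name to a colour, in the same order as the factors list
species_names = ["Adelie", "Chinstrap", "Gentoo"]
colour_by_species = factor_cmap("species",
                                palette=["red", "green", "blue"],
                                factors=species_names)

fig_single = figure(title="Penguins")
fig_single.scatter("bill_length_mm", "flipper_length_mm", source=source,
                   color=colour_by_species, fill_alpha=0.2, size=10,
                   legend_field="species")  # legend entries are built from the species column
fig_single.legend.location = "top_left"
```

With this approach the figure holds a single glyph renderer, whereas the loop adds one per species.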



<a id="3"></a>

# 3 Subplots
[Back to Contents](#Contents)

In matplotlib and seaborn we can easily create subplots. This is where we have more than one plot contained in a single figure. These can be plotted horizontally next to each other or vertically. Bokeh provides the same functionality through its layout functions.

### Horizontal Plotting

In [21]:
from bokeh.layouts import row #  import row function to align bokeh objects side by side 


In [23]:
fig_line = figure(x_axis_label="observation_date", # create line figure that we created before
             y_axis_label="body_mass_g",
            tools = tools,
            x_axis_type = "datetime")

fig_line.line(x="observation_date",  # create the line plot
         y ="body_mass_g", 
         source = df,
        line_alpha = 0.9,
        line_color = 'red'
        ) # https://docs.bokeh.org/en/latest/docs/user_guide/styling.html#specifying-colors


fig_scatter = figure(x_axis_label="bill_length_mm", # create the scatter plots that we have seen before
             y_axis_label="flipper_length_mm",
            tooltips = TOOLTIPS)

fig_scatter.scatter("bill_length_mm", # x axis column name
            "flipper_length_mm",   # y axis column name
            source = df, # dataframe source
           fill_alpha = 0.4, # marker transparency
           size =12) # marker size
# show(fig)
show(row(fig_scatter, fig_line)) #  when we display the graphs using show we use the row command that will apply the graphs side by side

### Vertical Plotting

In [24]:
from bokeh.layouts import column # this function allows us to plot figures on top of each other vertically

fig_line = figure(x_axis_label="observation_date", # create line plot figure
             y_axis_label="body_mass_g",
            tools = tools,
            x_axis_type = "datetime")

fig_line.line(x="observation_date", 
         y ="body_mass_g", 
         source = df,
        line_alpha = 0.9,
        line_color = 'red'
        ) # https://docs.bokeh.org/en/latest/docs/user_guide/styling.html#specifying-colors


fig_scatter = figure(x_axis_label="bill_length_mm",  #create scatter plot figure
             y_axis_label="flipper_length_mm",
            tooltips = TOOLTIPS)

fig_scatter.scatter("bill_length_mm", # x axis column name
            "flipper_length_mm",   # y axis column name
            source = df, # dataframe source
           fill_alpha = 0.4, # marker transparency
           size =12) # marker size
# show(fig)
show(column(fig_scatter, fig_line)) # stack the plots on top of each other

### Plot in a grid

Often we may be running through a series of plots in a for loop that we can look to plot in a grid 


In [22]:
species = list(df.species.unique())

Below we create three separate figures within a for loop for demonstration purposes. Each figure is stored as a global variable (*fig0, fig1, and fig2*), and the three figures are then passed as a list to gridplot. In the next section we adjust this code so that gridplot can easily be used for any number of plots.

In [23]:
from bokeh.layouts import gridplot # import the function that allows plots to be displayed in a grid
for i in range(len(species)): # one plot for each species
    df_s = df[df['species'] == species[i]] # create a temporary df that contains the data filtered by species
    globals()['fig' + str(i)] = figure(x_axis_label="bill_length_mm",  y_axis_label="flipper_length_mm",  tooltips = TOOLTIPS) # we are using globals()[<varname>] to create a new variable in a for loop
    
    globals()['fig' + str(i)].scatter("bill_length_mm", # x axis column name
                "flipper_length_mm",   # y axis column name
                source = df_s, # dataframe source
                fill_alpha = 0.4, # marker transparency
                size = 12, # marker size
                legend_label = species[i]) # legend entry for this species

show(gridplot([fig0, fig1, fig2] ,ncols = 2))

### Cycling through plots in for loop 

In the code below we iterate through many plots, add each plot to a list, and then pass that list to gridplot to produce the grid. This time we also add the functionality to link the axes, which produces a different result from the grid above. If you zoom or drag a single figure in the grid above, the other figures do not move because the axes are not linked. The code below contains sections that connect the axes together.

In [24]:
bokeh.io.reset_output() # ensuring that the plot displays inline in the notebook
bokeh.io.output_notebook()

plots = [] # create an empty list in which we are going to store the plots 

species = list(df.species.unique()) # get a list of unique species

shared_range_x = None # starts as None and will hold the x range that links our x axes together
shared_range_y = None # starts as None and will hold the y range that links our y axes together
for i in range(len(species)):
    df_s = df[df['species'] == species[i]] # filter the df by species
    fig = figure(x_axis_label="bill_length_mm",  # create the figure
             y_axis_label="flipper_length_mm",
            tooltips = TOOLTIPS)
    
    fig.scatter("bill_length_mm", # x axis column name
                "flipper_length_mm",   # y axis column name
                source = df_s, # dataframe source
               fill_alpha = 0.4, # marker transparency
               size =12, # marker size
               legend_label = species[i]) # legend entry for this species
    
    
    if shared_range_x is None:  # linking the x axes together. If shared_range_x is None, this is the first plot
        shared_range_x = fig.x_range # for the first plot, store its x_range in shared_range_x
    else: # for every later figure in the for loop, shared_range_x already holds the first figure's x_range
        fig.x_range = shared_range_x # link the new figure's x axis to the first figure's x axis
        
    if shared_range_y is None:  # linking y axis together with the same logic as the x axis 
        shared_range_y = fig.y_range
    else:
        fig.y_range = shared_range_y
                
    plots.append(fig)
    
    
show(gridplot(plots, ncols =2))

Notice above how all of the axes are linked together. If you zoom in on one of the plots, all of the plots zoom in. If you pan along one axis, the same axis pans on all of the plots.
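
The same linking can be set up when a figure is created, by handing it the range objects of an existing figure. A minimal sketch with two made-up datasets (the names `first`, `second` and `linked_grid` are illustrative):

```python
from bokeh.layouts import gridplot
from bokeh.plotting import figure

first = figure(title="first")
first.scatter([1, 2, 3], [4, 5, 6])

# Reusing the first figure's range objects links panning and zooming
second = figure(title="second", x_range=first.x_range, y_range=first.y_range)
second.scatter([1, 2, 3], [6, 5, 4])

linked_grid = gridplot([first, second], ncols=2)
```

Both figures hold the very same range objects, which is what keeps them in step; `show(linked_grid)` displays the result.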

---
<a id="4"></a>
# 4 Widgets
[Back to Contents](#Contents)

Widgets are interactive features that we can add to our plots in order to change the data and the plot's features.

### Changing the marker size

In [25]:
from bokeh.layouts import layout
from bokeh.models import Spinner
from bokeh.models import Div

bokeh.io.reset_output()
bokeh.io.output_notebook() 

labels = species
fig = figure(x_axis_label="bill_length_mm", y_axis_label="bill_depth_mm", title = 'Bill depth v length')
scatter = fig.scatter(x="bill_length_mm", y="bill_depth_mm", source=df) # scatter markers have a size property that the spinner can control

title = Div(text="Bill depth v length")

# Create spinner
spinner = Spinner(title="Glyph size", low=1, high=30, step=1, value=4, width=60)

# Set up the widget action
spinner.js_link("value", scatter.glyph, "size")
output_file(filename="Bill length V depth.html")

# Display the layout
show(layout([title], [spinner, fig]))

### Combining spinners, sliders, colour pickers and other widgets

We have selected three widgets as examples and show their uses here. Bokeh provides many more widgets, and further information on them can be found in the Bokeh documentation.

These include:
- Button
- CheckboxButtonGroup
- CheckboxGroup
- ColorPicker
- DataTable
- DatePicker
- DateRangeSlider
- Div
- Dropdown
- FileInput
- MultiChoice
- MultiSelect
- Paragraph
- PasswordInput
- PreText
- RadioButtonGroup
- RadioGroup
- RangeSlider
- Select
- Slider
- Spinner
- Tabs
- TextAreaInput
- TextInput
- Toggle

In [26]:
from bokeh.models import RangeSlider
from bokeh.models import ColorPicker

fig = figure(x_axis_label="bill_length_mm", y_axis_label="bill_depth_mm", title = 'Bill depth v length')
scatter = fig.scatter(x="bill_length_mm", y="bill_depth_mm", source=df) # scatter markers have size and fill_color properties that the widgets can control

title = Div(text="Bill depth v length")

# Create slider x
slider_x = RangeSlider(title="Bill length mm ", start=10, end=47, value=(10, 47), step=1)
#create slider y 
slider_y = RangeSlider(title="Bill Depth mm ", start=10, end=47, value=(10, 47), step=1, orientation =  "vertical")



spinner = Spinner(title="Marker size", low=1, high=30, step=1, value=4, width=60)
picker = ColorPicker(title="Marker Color")
  

# Set up the widget action
spinner.js_link("value", scatter.glyph, "size")

slider_x.js_link("value", fig.x_range, "start", attr_selector=0)
slider_x.js_link("value", fig.x_range, "end", attr_selector=1)

slider_y.js_link("value", fig.y_range, "start", attr_selector=0)
slider_y.js_link("value", fig.y_range, "end", attr_selector=1)

picker.js_link('color', scatter.glyph, 'fill_color')


output_file(filename="Bill length V depth.html")

# Display the layout
show(layout([spinner, picker], [slider_x], [  slider_y, fig]))
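
`js_link` covers the case where a widget value is copied straight onto another property. For anything more involved, a `CustomJS` callback can be attached with `js_on_change`. The sketch below reproduces the marker size link in that more general form; the data and the names `fig_js`, `points` and `size_slider` are made up for illustration.

```python
from bokeh.models import CustomJS, Slider
from bokeh.plotting import figure

fig_js = figure(title="Marker size")
points = fig_js.scatter([1, 2, 3], [4, 5, 6], size=10)

size_slider = Slider(title="Marker size", start=1, end=30, step=1, value=10)

# The JavaScript in `code` runs in the browser whenever the slider value changes;
# `glyph` and `slider` are the names given to the objects passed in via args
callback = CustomJS(args=dict(glyph=points.glyph, slider=size_slider),
                    code="glyph.size = slider.value")
size_slider.js_on_change("value", callback)
```

Because the callback is JavaScript, it keeps working in a saved HTML file with no Python process running.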

---
<a id="5"></a>
# 5 Project Dashboard - Using the Bokeh Server
[Back to Contents](#Contents)

In order to create entire dashboards we will want to make use of the Bokeh server: https://www.tutorialspoint.com/bokeh/bokeh_server.htm

Bokeh has a decoupled architecture: objects such as plots and glyphs are created in Python and converted to JSON, which is consumed by the BokehJS client library in the browser. If we choose not to use the Bokeh server, we have to define the behaviour of our widgets in JavaScript, which is a language we don't support. However, if you are skilled in JavaScript you can build interactive plots without the server.

The Bokeh server allows us to write all of our logic in Python: the server runs our Python callbacks and keeps the plots in the browser in sync, so we do not need to write any JavaScript ourselves. The code that is written in Python is used to create a Bokeh document. Every new connection from a browser results in the Bokeh server creating a new document for that session.


For this tutorial we are going to work through the same process that is given in the following link: 
https://coderzcolumn.com/tutorials/data-science/simple-dashboard-with-widgets-python-bokeh#google_vignette

The tutorial walks us through creating the dashboard in stages

- load in the data
- prepare the plots
- prepare the widgets
- write update function for the plots based on the widgets
- connect them together
- run the server
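
The stages listed above can be sketched as a very small server application. Everything in it is a made-up stand-in for illustration (the file name `main.py`, the two curves and the single drop-down), so it shows only the shape of an app rather than the dashboard built in this notebook.

```python
# Hypothetical file name: main.py (run with: bokeh serve --show main.py)
import numpy as np
from bokeh.io import curdoc
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, Select
from bokeh.plotting import figure

# 1. Load the data (made-up values for the sketch)
x = np.linspace(0, 10, 50)
curves = {"sine": np.sin(x), "cosine": np.cos(x)}

# 2. Prepare the plot
source = ColumnDataSource(data=dict(x=x, y=curves["sine"]))
plot = figure(height=300, title="sine")
plot.line("x", "y", source=source)

# 3. Prepare the widget
select = Select(title="Curve", options=list(curves), value="sine")


# 4. Update function: the server runs this Python code when the widget changes
def update(attr, old, new):
    source.data = dict(x=x, y=curves[new])
    plot.title.text = new


# 5. Connect the widget to the update function
select.on_change("value", update)

# 6. Hand the layout to the server's current document
curdoc().add_root(column(select, plot))
```

The update function receives the name of the changed property together with its old and new values, which is the signature `on_change` expects.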


In [27]:
from sklearn.datasets import load_iris # bring in the iris dataset from scikit-learn
import bokeh.sampledata # make the sampledata module available explicitly
bokeh.sampledata.download() # download the Bokeh sample data to the local machine
from bokeh.sampledata.stocks import GOOG as google # import the stock price data into the environment



In [28]:
# load the iris dataset into a dataframe
iris = load_iris()
iris_df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
iris_df["FlowerType"] = iris.target
iris_df.head()

Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),FlowerType
0,5.1,3.5,1.4,0.2,0
1,4.9,3.0,1.4,0.2,0
2,4.7,3.2,1.3,0.2,0
3,4.6,3.1,1.5,0.2,0
4,5.0,3.6,1.4,0.2,0


In [29]:
# Bring in the stock price of google as a function of time. 
from bokeh.sampledata.stocks import GOOG as google
 # bokeh.sampledata.download()
google_df = pd.DataFrame(google)
google_df["date"] = pd.to_datetime(google_df["date"])
google_df.head()

Unnamed: 0,date,open,high,low,close,volume,adj_close
0,2004-08-19,100.0,104.06,95.96,100.34,22351900,100.34
1,2004-08-20,101.01,109.08,100.5,108.31,11428600,108.31
2,2004-08-23,110.75,113.48,109.05,109.4,9137200,109.4
3,2004-08-24,111.24,111.6,103.57,104.87,7631300,104.87
4,2004-08-25,104.96,108.0,103.88,106.0,4598900,106.0


In [30]:
bokeh.io.reset_output()
bokeh.io.output_notebook() #  run this line of code if you want bokeh to plot inline in your notebook
line_chart = figure(width=1000, height=400, x_axis_type="datetime",
                    title="Google Stock Prices from 2005 - 2013") # create figure for line chart

line_chart.line(
        x="date", y="open",
        line_width=0.5, line_color="dodgerblue",
        legend_label = "open",
        source=google_df
        )

# Set some chart parameters
line_chart.xaxis.axis_label = 'Time'
line_chart.yaxis.axis_label = 'Price ($)'
line_chart.legend.location = "top_left"
show(line_chart)

In [31]:
scatter = figure(width=500, height=400,
                 title="Sepal Length vs Sepal Width Scatter Plot") # create a scatter plot that will contain the iris data of sepal length plotted against width

color_mapping = {0:"tomato", 1:"dodgerblue", 2:"lime"} # creating a colour mapping for the different species of iris

for cls in [0,1,2]: # loop over the 3 iris types, adding one set of markers for each
    scatter.circle(x=iris_df[iris_df["FlowerType"]==cls]["sepal length (cm)"], # filter by iris type and select sepal length
               y=iris_df[iris_df["FlowerType"]==cls]["sepal width (cm)"], # filter by iris type and select sepal width
               color=color_mapping[cls], # set the colour mapping
               size=10,
               alpha=0.8,
               legend_label=iris.target_names[cls])

scatter.xaxis.axis_label= "sepal length (cm)".upper()
scatter.yaxis.axis_label= "sepal width (cm)".upper()

show(scatter)

In [32]:
iris_avg_by_flower_type = iris_df.groupby(by="FlowerType").mean() # group the data by flower type using mean aggregation

bar_chart = figure(width=500, height=400, # set up a bar plot figure
                   title="Average Sepal Length (cm) per Flower Type")

bar_chart.vbar(x = [1,2,3], # create a bar plot
         width=0.9,
         top=iris_avg_by_flower_type["sepal length (cm)"],
         fill_color="blue", line_color="blue", alpha=0.9)

bar_chart.xaxis.axis_label="FlowerType"
bar_chart.yaxis.axis_label="Sepal Length"

bar_chart.xaxis.ticker = [1, 2, 3]
bar_chart.xaxis.major_label_overrides = {1: 'Setosa', 2: 'Versicolor', 3: 'Virginica'}

show(bar_chart)

In [33]:
from bokeh.layouts import row
layout = column(line_chart, row(scatter, bar_chart)) # set the layout of the dashboard figures. Use row and column in combination
show(layout) # display the dashboard

Select the widgets for the dashboard

In [34]:
from bokeh.models import CheckboxButtonGroup

checkbox_options = ['open','high','low','close'] # define the labels for each button

checkbox_grp = CheckboxButtonGroup(labels=checkbox_options, # labels = button labels,  
                                   active=[0],  # This determines the default button selection
                                   button_type="primary") # button type. options are default, primary, success, warning, danger, light

show(checkbox_grp)

In [35]:
from bokeh.models import Select # import the drop-down selection box from Bokeh

# Set two boxes that will contain options for the columns of the data frame that we wish to plot

drop_scat1 = Select(title="X-Axis-Dim", # box title
                    options=iris.feature_names, # list of options for the drop down box
                    value=iris.feature_names[0], # the default option that will display
                    width=200) # box width

drop_scat2 = Select(title="Y-Axis-Dim", # box title
                    options=iris.feature_names, # list of options for the drop down box
                    value=iris.feature_names[1], # the default ooption that will display
                    width=200)# box wdith 

show(row(drop_scat1, drop_scat2)) # format the boxes side by side

In [36]:
# Create another drop-down box to choose the dimension shown in the bar plot
drop_bar = Select(title="Dimension",
                  options=iris.feature_names,
                  value=iris.feature_names[0])

show(drop_bar)

In [37]:
# Redesign the dashboard with both the plots and the widgets in place
layout_with_widgets = column(
                            column(checkbox_grp, line_chart),
                            row(
                                column(row(drop_scat1, drop_scat2), scatter),
                                column(drop_bar, bar_chart)))


show(layout_with_widgets)

In [38]:
# To inspect individual parts of the dashboard, use the .children attribute of a layout.
# It is a list of the components the layout contains, so indexing it and passing the result to show() displays just that component.
# Below we break down each part of the dashboard
show(layout_with_widgets.children[0])

In [39]:
show(layout_with_widgets.children[1].children[0])

In [40]:
show(layout_with_widgets.children[1].children[0].children[0])
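
The nested indexing used above can be easier to follow on a stand-in layout. The following is a minimal sketch, assuming placeholder `Div` components in place of the real widgets and figures. The names `demo_layout`, `line_slot`, `scatter_slot` and `bar_slot` are illustrative and not part of the dashboard.

```python
from bokeh.layouts import column, row
from bokeh.models import Div

# Placeholder components standing in for the real widgets and figures
checkbox, line = Div(text="checkbox_grp"), Div(text="line_chart")
drop1, drop2 = Div(text="drop_scat1"), Div(text="drop_scat2")
scatter_ph = Div(text="scatter")
drop_bar_ph, bar_ph = Div(text="drop_bar"), Div(text="bar_chart")

# Same nesting as layout_with_widgets
demo_layout = column(
    column(checkbox, line),
    row(
        column(row(drop1, drop2), scatter_ph),
        column(drop_bar_ph, bar_ph)))

# Each .children list is indexed in the order the components were passed in
line_slot = demo_layout.children[0].children[1]
scatter_slot = demo_layout.children[1].children[0].children[1]
bar_slot = demo_layout.children[1].children[1].children[1]

print(line_slot.text, scatter_slot.text, bar_slot.text)
```

These three index paths are the positions that hold the line chart, the scatter plot and the bar chart in the real layout.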

### Creating Callbacks and widget registration

At this point we have designed the dashboard layout and its figures. We now need to decide what should happen when the widgets are adjusted. We do this using callbacks. Callbacks are functions that are called when a widget changes, and they determine how the dashboard is updated. A callback is registered using the *on_change()* method.

Below we create a callback function for each figure. These functions take the values from the widgets and update the plots accordingly.
Let's look first at the function that updates the line chart.

The first callback is called whenever the checkbox group changes. All callbacks have the same function signature, which takes the attribute name, the old value, and the new value as three parameters.
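
Before the full callback, here is a minimal sketch of the three-parameter signature on its own. It assumes a stand-alone `Select` widget that is not attached to a running server, in which case changing the property from Python should invoke the callback directly. The names `calls`, `report_change` and `demo_select` are illustrative only.

```python
from bokeh.models import Select

calls = []  # records every callback invocation so it can be inspected


def report_change(attrname, old, new):
    # attrname: name of the property that changed
    # old / new: the value before and after the change
    calls.append((attrname, old, new))


demo_select = Select(title="Dimension",
                     options=["sepal length (cm)", "sepal width (cm)"],
                     value="sepal length (cm)")

# Register the callback for changes to the "value" property
demo_select.on_change("value", report_change)

# Changing the property triggers the callback with (attrname, old, new)
demo_select.value = "sepal width (cm)"
print(calls)
```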


In [41]:
def update_line_chart(attrname, old, new):
    '''
        Code to update Line Chart as Per Check Box Selection
    '''
    line_chart = figure(plot_width=1000, plot_height=400, x_axis_type="datetime", # copy the graph formatting from when you designed the graph
                        title="Google Stock Prices from 2005 - 2013")

    price_color_map = {"open":"dodgerblue", "close":"tomato", "low":"lime", "high":"orange"}

    for option in checkbox_grp.active: # active holds the integer indices of the selected buttons; more than one can be selected
        line_chart.line(
                x="date", y=checkbox_options[option], # look up the column name for this button index
                line_width=0.5, line_color=price_color_map[checkbox_options[option]], # pick the line colour for this series
                legend_label=checkbox_options[option], # label the line in the legend
                source=google_df # uses the dataframe that was loaded in initially
            )

    line_chart.xaxis.axis_label = 'Time' # chart formatting
    line_chart.yaxis.axis_label = 'Price ($)' # chart formatting

    line_chart.legend.location = "top_left" # chart formatting

    layout_with_widgets.children[0].children[1] = line_chart # replace the old chart in the layout with the new one


checkbox_grp.on_change("active", update_line_chart) # register the callback so it runs when the selection changes
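
The loop in the callback relies on `active` being a list of button indices rather than labels. As a small plain-Python sketch of that lookup, with `selected_series` being a hypothetical helper that is not used in the dashboard itself:

```python
checkbox_options = ['open', 'high', 'low', 'close']
price_color_map = {"open": "dodgerblue", "close": "tomato", "low": "lime", "high": "orange"}


def selected_series(active):
    """Return (label, colour) pairs for a list of active button indices."""
    return [(checkbox_options[i], price_color_map[checkbox_options[i]])
            for i in active]


# Buttons 0 and 3 selected, i.e. 'open' and 'close'
print(selected_series([0, 3]))  # [('open', 'dodgerblue'), ('close', 'tomato')]
```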

In [42]:
def update_scatter(attrname, old, new):
    '''
        Code to update Scatter Chart as Per Dropdown Selections
    '''
    scatter = figure(plot_width=500, plot_height=400,
                     title="%s vs %s Scatter Plot"%(drop_scat1.value.upper(), drop_scat2.value.upper())) #  update the plot name with the values from the drop down boxes 

    for cls in [0,1,2]:
        scatter.circle(x=iris_df[iris_df["FlowerType"]==cls][drop_scat1.value], #  filter based on drop down box value
                   y=iris_df[iris_df["FlowerType"]==cls][drop_scat2.value], # filter based on drop down box value
                   color=color_mapping[cls], # apply colour mapping for each type
                   size=10,
                   alpha=0.8,
                   legend_label=iris.target_names[cls]) # update legend 

    scatter.xaxis.axis_label= drop_scat1.value.upper() # update axis labels
    scatter.yaxis.axis_label= drop_scat2.value.upper()

    layout_with_widgets.children[1].children[0].children[1] = scatter # update the chart in the correct position


drop_scat1.on_change("value", update_scatter) # update the chart when drop down 1 changes
drop_scat2.on_change("value", update_scatter) # update the chart when drop down 2 changes

In [43]:
def update_bar_chart(attrname, old, new):
    '''
        Code to Update Bar Chart as Per Dropdown Selections
    '''
    bar_chart = figure(plot_width=500, plot_height=400,
                       title="Average %s per Flower Type"%drop_bar.value.upper()) # update graph plot with drop down box value

    bar_chart.vbar(x = [1,2,3],
             width=0.9,
             top=iris_avg_by_flower_type[drop_bar.value],
             fill_color="tomato", line_color="tomato", alpha=0.6)

    bar_chart.xaxis.axis_label="FlowerType"
    bar_chart.yaxis.axis_label=drop_bar.value.upper()

    bar_chart.xaxis.ticker = [1, 2, 3]
    bar_chart.xaxis.major_label_overrides = {1: 'Setosa', 2: 'Versicolor', 3: 'Virginica'}

    layout_with_widgets.children[1].children[1].children[1] = bar_chart


drop_bar.on_change("value", update_bar_chart)
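
The callbacks in this notebook rebuild each figure and swap it into the layout. A possible alternative, which is not what this notebook does, is to keep one figure and replace the data in its `ColumnDataSource`. The sketch below assumes made-up numbers in `demo_avgs` (they are not the iris averages), and the names `demo_source`, `demo_bar`, `demo_drop` and `update_bar_in_place` are illustrative only.

```python
from bokeh.models import ColumnDataSource, Select
from bokeh.plotting import figure

# Illustrative values only (made up for this sketch, not computed from iris)
demo_avgs = {"dim_a": [1.0, 2.0, 3.0], "dim_b": [4.0, 5.0, 6.0]}

demo_source = ColumnDataSource(data=dict(x=[1, 2, 3], top=demo_avgs["dim_a"]))

demo_bar = figure(title="Average dim_a per group")
demo_bar.vbar(x="x", top="top", width=0.9, source=demo_source)

demo_drop = Select(title="Dimension", options=list(demo_avgs), value="dim_a")


def update_bar_in_place(attrname, old, new):
    # Swap the data behind the existing bars instead of building a new figure
    demo_source.data = dict(x=[1, 2, 3], top=demo_avgs[new])
    demo_bar.title.text = "Average %s per group" % new


demo_drop.on_change("value", update_bar_in_place)

# Simulate the user picking another dimension
demo_drop.value = "dim_b"
print(demo_bar.title.text)
```

With this approach the layout itself never changes, so no indexing into `.children` is needed inside the callback.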

We now have all the components required to launch our dashboard.
We need to bring all of these components together into a Python script.

Below I have collected all the components and placed them into one cell.

At the end we have to add the following line, which attaches the layout to the document that the Bokeh server displays.

*curdoc().add_root(layout_with_widgets)*

The final step is to host the dashboard. This can't be done inline in Jupyter; it requires the Bokeh server.
Therefore we copy the contents of the cell below into a Python file and run it either from inside Jupyter or from the command line, as explained below.

In [44]:
import pandas as pd

from bokeh.io import curdoc
from bokeh.plotting import figure
from bokeh.layouts import row, column

from bokeh.models import Select, CheckboxButtonGroup ### Widgets

### Dataset Imports
from bokeh.sampledata.stocks import GOOG as google
from sklearn.datasets import load_iris


checkbox_options = ['open','high','low','close']
color_mapping = {0:"tomato", 1:"dodgerblue", 2:"lime"}
price_color_map = {"open":"dodgerblue", "close":"tomato", "low":"lime", "high":"orange"}

#### IRIS Dataset Loading #####
iris = load_iris()

iris_df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
iris_df["FlowerType"] = iris.target


### Google Price Dataset Loading ##############
google_df = pd.DataFrame(google)
google_df["date"] = pd.to_datetime(google_df["date"])



### Line Chart of Google Prices Code Starts ###########

line_chart = figure(plot_width=1000, plot_height=400, x_axis_type="datetime",
                    title="Google Stock Prices from 2005 - 2013")

line_chart.line(
                x="date", y="open",
                line_width=0.5, line_color="dodgerblue",
                legend_label = "open",
                source=google_df
                )

line_chart.xaxis.axis_label = 'Time'
line_chart.yaxis.axis_label = 'Price ($)'

line_chart.legend.location = "top_left"

### Line Chart of Google Prices Code Ends ###########


### Scatter Chart Of IRIS Dimensions Code Starts ###########
scatter = figure(plot_width=500, plot_height=400,
                 title="Sepal Length vs Sepal Width Scatter Plot")

for cls in [0,1,2]:
    scatter.circle(x=iris_df[iris_df["FlowerType"]==cls]["sepal length (cm)"],
               y=iris_df[iris_df["FlowerType"]==cls]["sepal width (cm)"],
               color=color_mapping[cls],
               size=10,
               alpha=0.8,
               legend_label=iris.target_names[cls])

scatter.xaxis.axis_label= "sepal length (cm)".upper()
scatter.yaxis.axis_label= "sepal width (cm)".upper()

### Scatter Chart Of IRIS Dimensions Code Ends ###########


### Bar Chart Of IRIS Dimensions Code Starts ###########
iris_avg_by_flower_type = iris_df.groupby(by="FlowerType").mean()

bar_chart = figure(plot_width=500, plot_height=400,
                    title="Average Sepal Length (cm) per Flower Type")

bar_chart.vbar(x = [1,2,3],
                width=0.9,
                top=iris_avg_by_flower_type["sepal length (cm)"],
                fill_color="tomato", line_color="tomato", alpha=0.9)

bar_chart.xaxis.axis_label="FlowerType"
bar_chart.yaxis.axis_label="Sepal Length"

bar_chart.xaxis.ticker = [1, 2, 3]
bar_chart.xaxis.major_label_overrides = {1: 'Setosa', 2: 'Versicolor', 3: 'Virginica'}

### Bar Chart Of IRIS Dimensions Code Ends ###########


### Widgets Code Starts ################################
drop_scat1 = Select(title="X-Axis-Dim",
                    options=iris.feature_names,
                    value=iris.feature_names[0],
                    width=225)

drop_scat2 = Select(title="Y-Axis-Dim",
                    options=iris.feature_names,
                    value=iris.feature_names[1],
                    width=225)

checkbox_grp = CheckboxButtonGroup(labels=checkbox_options, active=[0], button_type="success")

drop_bar = Select(title="Dimension", options=iris.feature_names, value=iris.feature_names[0])

### Widgets Code Ends ################################


##### Code to Update Charts as Per Widget  State Starts #####################

def update_line_chart(attrname, old, new):
    '''
        Code to update Line Chart as Per Check Box Selection
    '''
    line_chart = figure(plot_width=1000, plot_height=400, x_axis_type="datetime",
                        title="Google Stock Prices from 2005 - 2013")

    for option in checkbox_grp.active:
        line_chart.line(
                x="date", y=checkbox_options[option],
                line_width=0.5, line_color=price_color_map[checkbox_options[option]],
                legend_label=checkbox_options[option],
                source=google_df
            )

    line_chart.xaxis.axis_label = 'Time'
    line_chart.yaxis.axis_label = 'Price ($)'

    line_chart.legend.location = "top_left"

    layout_with_widgets.children[0].children[1] = line_chart


def update_scatter(attrname, old, new):
    '''
        Code to update Scatter Chart as Per Dropdown Selections
    '''
    scatter = figure(plot_width=500, plot_height=400,
                     title="%s vs %s Scatter Plot"%(drop_scat1.value.upper(), drop_scat2.value.upper()))

    for cls in [0,1,2]:
        scatter.circle(x=iris_df[iris_df["FlowerType"]==cls][drop_scat1.value],
                   y=iris_df[iris_df["FlowerType"]==cls][drop_scat2.value],
                   color=color_mapping[cls],
                   size=10,
                   alpha=0.8,
                   legend_label=iris.target_names[cls])

    scatter.xaxis.axis_label= drop_scat1.value.upper()
    scatter.yaxis.axis_label= drop_scat2.value.upper()

    layout_with_widgets.children[1].children[0].children[1] = scatter


def update_bar_chart(attrname, old, new):
    '''
        Code to Update Bar Chart as Per Dropdown Selections
    '''
    bar_chart = figure(plot_width=500, plot_height=400,
                       title="Average %s Per Flower Type"%drop_bar.value.upper())

    bar_chart.vbar(x = [1,2,3],
             width=0.9,
             top=iris_avg_by_flower_type[drop_bar.value],
             fill_color="tomato", line_color="tomato", alpha=0.9)

    bar_chart.xaxis.axis_label="FlowerType"
    bar_chart.yaxis.axis_label=drop_bar.value.upper()

    bar_chart.xaxis.ticker = [1, 2, 3]
    bar_chart.xaxis.major_label_overrides = {1: 'Setosa', 2: 'Versicolor', 3: 'Virginica'}

    layout_with_widgets.children[1].children[1].children[1] = bar_chart

##### Code to Update Charts as Per Widget  State Ends #####################


#### Registering Widget Attribute Change with Methods Code Starts ############# 
checkbox_grp.on_change("active", update_line_chart)

drop_scat1.on_change("value", update_scatter)
drop_scat2.on_change("value", update_scatter)

drop_bar.on_change("value", update_bar_chart)

#### Registering Widget Attribute Change with Methods Code Ends #############

####### Widgets Layout #################
layout_with_widgets = column(
                            column(checkbox_grp, line_chart),
                            row(
                                column(row(drop_scat1, drop_scat2), scatter),
                                column(drop_bar, bar_chart)))


############ Creating Dashboard ################
curdoc().add_root(layout_with_widgets)
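
To see what the final line of the script does, here is a minimal sketch using a stand-alone `Document`, which is the type of object that `curdoc()` returns. It assumes a placeholder `Div` in place of the real layout, and `demo_doc` and `demo_root` are illustrative names.

```python
from bokeh.document import Document
from bokeh.models import Div

demo_doc = Document()  # curdoc() returns an object of this type
demo_root = Div(text="dashboard placeholder")

# add_root registers the component as a top-level element of the document
demo_doc.add_root(demo_root)

print(len(demo_doc.roots))
```

When the script is run with the Bokeh server, the roots of the current document are what gets rendered in the browser.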

Once you have saved the dashboard (in my case as *main_dash.py*), you need to run it. We can do this from inside our notebook as shown below. The dashboard will be served at the following address: *http://localhost:5006/main_dash*

In [None]:
!bokeh serve --show  main_dash.py
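
The address follows from the script name: by default the Bokeh server listens on port 5006 and serves each script under its file name without the extension. A small sketch of that rule, where `dashboard_url` is a hypothetical helper and not part of Bokeh:

```python
from pathlib import Path


def dashboard_url(script, port=5006):
    """Build the local address at which `bokeh serve <script>` serves the app by default."""
    return "http://localhost:%d/%s" % (port, Path(script).stem)


print(dashboard_url("main_dash.py"))  # http://localhost:5006/main_dash
```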

In [None]:
# A further example is in shares_dash.py
!bokeh serve --show  shares_dash.py

The easiest way to stop the server is to kill the server process, either from the command line (for example with Ctrl+C in the terminal where it is running) or from Windows Task Manager.

## Extra Bokeh Mapping function

Bokeh can also plot data on top of map tiles. See the mapping section of the user guide: https://docs.bokeh.org/en/latest/docs/user_guide/geo.html

In [None]:
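import numpy as np
from bokeh.plotting import figure, output_file, show
from bokeh.tile_providers import CARTODBPOSITRON, get_provider
from bokeh.models import ColumnDataSource

output_file("tile.html")

tile_provider = get_provider(CARTODBPOSITRON)

def to_web_mercator(lon, lat):
    """Convert longitude/latitude in degrees to web mercator coordinates in metres."""
    radius = 6378137.0  # radius of the sphere used by the web mercator projection
    x = radius * np.radians(lon)
    y = radius * np.log(np.tan(np.pi / 4 + np.radians(lat) / 2))
    return x, y

# the tile plot works in web mercator coordinates, so convert the points first
x, y = to_web_mercator(np.array([-97.70, -97.74, -97.78]),
                       np.array([30.29, 30.20, 30.29]))

# range bounds supplied in web mercator coordinates, padded around the points
pad = 20000
p = figure(x_range=(x.min() - pad, x.max() + pad), y_range=(y.min() - pad, y.max() + pad),
           x_axis_type="mercator", y_axis_type="mercator")
p.add_tile(tile_provider)

source = ColumnDataSource(data=dict(x=x, y=y))

p.circle(x="x", y="y", size=15, fill_color="blue", fill_alpha=0.8, source=source)


show(p)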