<a href="https://colab.research.google.com/github/rskarbez/colab_notebooks/blob/main/CSE2DV_CSE5INV_Week_11_lab_worksheet.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# CSE2DV/CSE5INV - Week 11 lab notebook

In this week's lab, we're focusing our attention on making *interactive* visualizations. We won't be talking about the visual design of an individual vis; visual design is more about how to convey a particular message or meaning, while interactive design is more about how to enable exploratory data analysis. We don't necessarily know what the data means yet, nor does the user; we're just trying to give them the tools they need to find out.

We're going to be using the `bokeh` visualization library to create our interactive visualizations: http://bokeh.org/. Let's jump right in and see `bokeh` in (inter)action:

In [None]:
# Standard imports 

# bokeh.io, as the name suggests, handles system input/output in bokeh.
# 
# We're using output_notebook because we're running in a (Colab) notebook.
# 
# show is the command to actually render a bokeh plot.
from bokeh.io import output_notebook, show
output_notebook()

In [None]:
# Plot a complex chart with interactive hover in a few lines of code

# bokeh.models provides Bokeh model “building block” classes.
#
# "One of the central design principles of Bokeh is that, regardless of how the 
#  plot creation code is spelled in Python (or other languages), the result is 
#  an object graph that encompasses all the visual and data aspects of the 
#  scene. Furthermore, this scene graph is to be serialized, and it is this 
#  serialized graph that the client library BokehJS uses to render the plot. 
#  The low-level objects that comprise a Bokeh scene graph are called Models."
#
# The ColumnDataSource (CDS) is the core of most Bokeh plots. 
# It provides the data to the glyphs of your plot.
# Think of a ColumnDataSource as a collection of sequences of data that each 
#   have their own, unique column name.
# (Basically, a column is an attribute.)
from bokeh.models import ColumnDataSource

# By default, the hover tool displays informational tooltips whenever the cursor
#  is directly over a glyph. The data to show comes from the glyph’s data 
#  source, and what to display is configurable with the tooltips property that 
#  maps display names to columns in the data source, or to special known 
#  variables.
# (Basically, use HoverTool to create tooltips.)
from bokeh.models import HoverTool

# The bokeh.plotting API is Bokeh’s primary interface, and lets you focus on 
#  relating glyphs to data. It automatically assembles plots with default 
#  elements such as axes, grids, and tools for you.
# The figure function is at the core of the bokeh.plotting interface. 
# This function creates a Figure model that includes methods for adding 
#  different kinds of glyphs to a plot. This function also takes care of 
#  composing the various elements of your visualization, such as axes, grids, 
#  and default tools.
from bokeh.plotting import figure

# autompg_clean is the name of our pandas dataset. It contains data about cars 
#  and their fuel consumption (in miles per gallon).
from bokeh.sampledata.autompg import autompg_clean as df

# bokeh.transform contains helper functions for applying client-side 
#  computations such as transformations to data fields or ColumnDataSource 
#  expressions.
# The factor_cmap() function creates a dict that applies a client-side 
#  CategoricalColorMapper transformation to a ColumnDataSource column.
# (Basically, it assigns a different color to each category.)
from bokeh.transform import factor_cmap

# Indicate that the 'cyl' (number of cylinders) and 'yr' (model year) attributes 
#  should be treated as strings (text attributes).
df.cyl = df.cyl.astype(str)
df.yr = df.yr.astype(str)

# Hierarchically group the data in our data frame first by the 'cyl' attribute 
#  (number of cylinders), and then by the 'mfr' attribute (manufacturer)
group = df.groupby(by=['cyl', 'mfr'])
# Create a ColumnDataSource object from our grouped dataframe.
source = ColumnDataSource(group)

# Create a Figure object with the following parameters:
#  width=800       => figure is 800 pixels wide
#  height=300      => figure is 300 pixels high
#   (Bokeh 3.0 removed the older plot_width/plot_height names.)
#  title="Mean ..."=> Sets the title of the plot
#  x_range=group   => Customizes the x-axis to group data as described in group
#  toolbar_location=None => Hide the bokeh toolbar
#                           (By default, a toolbar would appear to the right of the plot.)
#  tools=""        => Don't include any tools
p = figure(width=800, height=300, title="Mean MPG by # Cylinders and Manufacturer",
           x_range=group, toolbar_location=None, tools="")

# Configure the x-axis in our Figure p
#  xgrid.grid_line_color         => color of the grid lines in the x direction
#  xaxis.axis_label              => label of the x-axis
#  xaxis.major_label_orientation => angle of the label text measured in radians
#   (This is what makes the manufacturer labels appear tilted.)
p.xgrid.grid_line_color = None
p.xaxis.axis_label = "Manufacturer grouped by # Cylinders"
p.xaxis.major_label_orientation = 1.2

# Create a color map with a different color for each cylinder count
index_cmap = factor_cmap('cyl_mfr', palette=['#2b83ba', '#abdda4', '#ffffbf', '#fdae61', '#d7191c'], 
                         factors=sorted(df.cyl.unique()), end=1)
# In our Figure p, create a 'vbar' (vertical bar) chart
#  x='cyl_mfr'    => x position given by 'cyl_mfr' attribute
#  top='mpg_mean' => max height of each bar given by 'mpg_mean' attribute
#  width=1        => each bar has width 1
#  source=source  => Our (data) source is our ColumnDataSource source variable
#  line_color="white" => Each bar is outlined in white
#  fill_color=index_cmap => The color of each bar is given by our index_cmap
#  hover_line_color="darkgrey" => A bar's outline turns dark grey while hovered
#  hover_fill_color=index_cmap => Hovering over a bar doesn't change its color
#    (Rather, it uses the same fill color we were already using: index_cmap)
p.vbar(x='cyl_mfr', top='mpg_mean', width=1, source=source,
       line_color="white", fill_color=index_cmap, 
       hover_line_color="darkgrey", hover_fill_color=index_cmap)

# Add tools to our Figure p
# Specifically, add a HoverTool which provides tooltip information
p.add_tools(HoverTool(tooltips=[("MPG", "@mpg_mean"), ("Cyl, Mfr", "@cyl_mfr")]))

# Put it all together and render the Figure
show(p)
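
The `'cyl_mfr'` and `'mpg_mean'` column names used in the cell above are never defined explicitly. When a `ColumnDataSource` is built from a pandas `GroupBy` object, Bokeh summarizes each group and names the resulting columns by joining the original column name with the statistic. The plain-Python sketch below imitates that naming scheme on a tiny, made-up dataset. `grouped_columns` is a hypothetical helper (not part of Bokeh or pandas), and it only covers the mean and count.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical miniature stand-in for autompg_clean
# (values invented for illustration only).
rows = [
    {"cyl": "4", "mfr": "toyota", "mpg": 30.0},
    {"cyl": "4", "mfr": "toyota", "mpg": 34.0},
    {"cyl": "4", "mfr": "ford", "mpg": 26.0},
    {"cyl": "8", "mfr": "ford", "mpg": 14.0},
    {"cyl": "8", "mfr": "ford", "mpg": 16.0},
]


def grouped_columns(rows, by, value):
    """Group rows by the `by` keys and summarize `value` per group.

    Returns a dict of equal-length lists, i.e. the same shape of data
    that a ColumnDataSource holds (one named column per list).
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[name] for name in by)
        groups[key].append(row[value])

    keys = sorted(groups)
    return {
        # Group keys are joined with "_" to name the index column.
        "_".join(by): keys,
        # Each statistic becomes "<original column>_<statistic>".
        f"{value}_mean": [mean(groups[k]) for k in keys],
        f"{value}_count": [len(groups[k]) for k in keys],
    }


columns = grouped_columns(rows, by=["cyl", "mfr"], value="mpg")
for name, values in columns.items():
    print(name, values)
```

Each entry in the `cyl_mfr` column is a `(cyl, mfr)` pair, which is why the x-axis of the chart shows manufacturers nested inside cylinder counts.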

OK, so you might notice that the code we needed to create this figure was just a *bit* more complicated than what we needed for our `seaborn` plots a couple of weeks ago. That's true, but we're doing a lot more with this vis. `bokeh` is a much more powerful and configurable visualization library, and unlike `seaborn`, it isn't built on top of `matplotlib`.
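
One detail in the code above that is easy to miss is the `end=1` argument to `factor_cmap`. Each value in the `cyl_mfr` column is a `(cyl, mfr)` pair, but we only want the color to depend on the cylinder count, so the pair is sliced down to its first element before the color lookup. The sketch below mimics that lookup in plain Python. `color_for` is a hypothetical helper written for illustration (the real mapping runs client-side in BokehJS), and the `FACTORS` list assumes the cylinder counts present in the dataset are 3, 4, 5, 6, and 8.

```python
PALETTE = ['#2b83ba', '#abdda4', '#ffffbf', '#fdae61', '#d7191c']
# Assumed result of sorted(df.cyl.unique()) once 'cyl' is a string column.
FACTORS = ['3', '4', '5', '6', '8']


def color_for(value, palette, factors, start=0, end=None, nan_color="gray"):
    """Pick a color for a (possibly multi-level) categorical value.

    The value is sliced with [start:end] first, so end=1 keeps only the
    first level of a (cyl, mfr) pair.
    """
    key = tuple(value[start:end])
    if len(key) == 1:
        key = key[0]  # a single remaining level is compared as a plain string
    lookup = dict(zip(factors, palette))
    return lookup.get(key, nan_color)


# Both 4-cylinder bars get the same color, regardless of manufacturer.
print(color_for(("4", "ford"), PALETTE, FACTORS, end=1))
print(color_for(("4", "toyota"), PALETTE, FACTORS, end=1))
# Without end=1, the full pair is not in FACTORS, so the fallback is used.
print(color_for(("4", "ford"), PALETTE, FACTORS))
```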

Let's look at another example that focuses even more on interactivity. The source code for this example is available at: https://github.com/bokeh/bokeh/blob/branch-3.0/examples/server/app/sliders.py

(More example `bokeh` apps are available here: https://demo.bokeh.org/. The source code for each of these apps can be found in this GitHub directory: https://github.com/bokeh/bokeh/blob/branch-3.0/examples/server/app/. Many links to these examples elsewhere on the internet are broken, so use the ones given here.)
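
The idea behind the sliders example is that each slider change triggers a callback, which recomputes the curve from the current slider values and hands the new data to the plot. The sketch below shows only that recomputation step in plain Python, with no `bokeh` involved. The function name `sine_wave`, its parameter names, and the default of 200 points are assumptions made for illustration; check the linked source for the demo's actual code.

```python
import math


def sine_wave(amplitude=1.0, offset=0.0, phase=0.0, freq=1.0, n=200):
    """Recompute the curve for one set of (assumed) slider values.

    Returns a dict of equal-length lists, the same shape of data
    that a ColumnDataSource expects.
    """
    # n evenly spaced x values from 0 to 4*pi, inclusive.
    xs = [4 * math.pi * i / (n - 1) for i in range(n)]
    ys = [amplitude * math.sin(freq * x + phase) + offset for x in xs]
    return {"x": xs, "y": ys}


# Simulate the user dragging the "amplitude" slider to 2 and "offset" to 1.
data = sine_wave(amplitude=2.0, offset=1.0)
print(len(data["x"]), min(data["y"]), max(data["y"]))
```

In a Bokeh server app, a function like this would typically be called from a callback registered with a slider's `on_change("value", ...)`, and the returned dictionary assigned to the plot's `source.data` so the line redraws.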

In [None]:
# Create and deploy interactive data applications

from IPython.display import IFrame
IFrame('https://demo.bokeh.org/sliders', width=900, height=500)

`bokeh` has very good tutorials and documentation. So as not to duplicate effort, I will point you toward some tutorials and examples which you can dive into as deeply as you would like. (I suggest starting from the beginning, and trying to get as close to LINKING AND INTERACTIONS as you can, given the topic of this week's lecture.)

*   [Basic plotting](https://notebooks.gesis.org/binder/jupyter/user/bokeh-bokeh-notebooks-ueuzscsp/notebooks/tutorial/01%20-%20Basic%20Plotting.ipynb)
*   [Styling and theming](https://notebooks.gesis.org/binder/jupyter/user/bokeh-bokeh-notebooks-ueuzscsp/notebooks/tutorial/02%20-%20Styling%20and%20Theming.ipynb)
*   [Data sources and transformations](https://notebooks.gesis.org/binder/jupyter/user/bokeh-bokeh-notebooks-ueuzscsp/notebooks/tutorial/03%20-%20Data%20Sources%20and%20Transformations.ipynb)
*   [Annotations](https://notebooks.gesis.org/binder/jupyter/user/bokeh-bokeh-notebooks-ueuzscsp/notebooks/tutorial/04%20-%20Adding%20Annotations.ipynb)
*   [Layouts](https://notebooks.gesis.org/binder/jupyter/user/bokeh-bokeh-notebooks-ueuzscsp/notebooks/tutorial/05%20-%20Presentation%20Layouts.ipynb)
*   [LINKING AND INTERACTIONS](https://notebooks.gesis.org/binder/jupyter/user/bokeh-bokeh-notebooks-ueuzscsp/notebooks/tutorial/06%20-%20Linking%20and%20Interactions.ipynb)
*   [Bar and categorical data plots](https://notebooks.gesis.org/binder/jupyter/user/bokeh-bokeh-notebooks-ueuzscsp/notebooks/tutorial/07%20-%20Bar%20and%20Categorical%20Data%20Plots.ipynb)
*   [Graph and network plots in bokeh](https://notebooks.gesis.org/binder/jupyter/user/bokeh-bokeh-notebooks-ueuzscsp/notebooks/tutorial/08%20-%20Graph%20and%20Network%20Plots.ipynb)
*   [Geospatial plots in bokeh](https://notebooks.gesis.org/binder/jupyter/user/bokeh-bokeh-notebooks-ueuzscsp/notebooks/tutorial/09%20-%20Geographic%20Plots.ipynb)
*   https://developers.refinitiv.com/en/article-catalog/article/bokeh--an-interactive-data-visualization-library-in-codebook (This is not from `bokeh` directly, but from Refinitiv.com, which is apparently a financial data and infrastructure provider. Unsurprisingly, the focus is heavily on financial data, but the examples are very thorough.)
*   https://thedatafrog.com/en/articles/interactive-visualization-bokeh-jupyter/ (Also not from `bokeh`, but a decent example of how to handle interactivity in a post on The Data Frog.)



That is it for this lab, and for new lab content this semester. We've gone from Tableau to Python with Bokeh, with a couple of stops in between.

Next week's labs (Week 12) will be devoted to revision, exam prep, and questions.

Thanks for your hard work throughout the semester. See you one last time next week!