# Introduction to Data Visualization with Bokeh <img src="https://static.bokeh.org/logos/logotype.svg" width=150px height=auto align=right>

---

## Hi! I'm Pavithra :)

* Bokeh core
* Wikimedian
* Student

I go by @pavithraes on Twitter and *almost* everywhere else on the internet.

## Get started

mybinder: https://tinyurl.com/BokehWorkshopBeginner

Open the above link in your web browser and navigate to `workshop/workshop.ipynb`.

## Table of Contents

* [Data visualization](#Data-visualization)
* [Bokeh](#Bokeh)
* [So, what can bokeh do?](#So,-what-can-bokeh-do?)
* [Installation and setup](#Installation-and-setup)
* [First bokeh plot!](#First-bokeh-plot!)
* [Bar plot](#Bar-plot)
* [Axes and grids](#Axes-and-grids)
* [Data sources](#Data-sources)
* [Tools](#Tools)
* [Widgets](#Widgets)
* [Export and save](#Export-and-save)
* [Resources](#Resources)

---

## Data visualization

**Visualization** is the visual representation of some information: infographics, charts, maps, etc.

**Data Visualization** is the display of data designed to enable analysis, exploration, and discovery.

There are many data visualization tools: ggplot, matplotlib, plotly, tableau, D3, etc.

---

## Bokeh

<img src="../assets/bokeh-plots.png" width=700>

Interactive data visualization library for modern web browsers.

- Standalone HTML, server apps, dashboards
- Versatile graphics
- Large, dynamic or streaming data
- Integration with R, Scala, etc.
- Just Python, no JavaScript needed!

---

## So, what can bokeh do?

I'm glad you asked!

In [58]:
from IPython.display import IFrame
IFrame('https://demo.bokeh.org/sliders', width=900, height=500)

In [59]:
IFrame('https://demo.bokeh.org/movies', width=900, height=800)

In [60]:
IFrame('https://demo.bokeh.org/gapminder', width=1000, height=700)

---

## Installation and setup

In [61]:
from IPython import __version__ as ipython_version
from pandas import __version__ as pandas_version
from numpy import __version__ as numpy_version
from bokeh import __version__ as bokeh_version
print("IPython - %s" % ipython_version)
print("Pandas - %s" % pandas_version)
print("Numpy - %s" % numpy_version)
print("Bokeh - %s" % bokeh_version)

IPython - 7.16.1
Pandas - 1.1.1
Numpy - 1.19.1
Bokeh - 2.2.0


---

## First bokeh plot!

In [65]:
# import statement
from bokeh.plotting import figure, output_notebook, show

# prepare some data
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

# output to jupyter notebook
output_notebook()

# create a new plot canvas with a title and axis labels
p = figure(title="Scatter plot", x_axis_label='x', y_axis_label='y')

# add a circle glyph renderer
p.circle(x, y, size=20)

# show the results
show(p)

## Interfaces

Bokeh provides two interface levels:

**bokeh.models:** A low-level interface that provides the most flexibility to application developers

**bokeh.plotting:** A higher-level interface centered around composing visual glyphs

We will be using the `bokeh.plotting` interface for this workshop.
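To make the difference between the two levels concrete, here is a minimal sketch of a scatter plot assembled by hand from `bokeh.models` objects. The data values and variable names are illustrative, and the use of the `Scatter` glyph model assumes it is available in your Bokeh version. With `bokeh.plotting`, all of this wiring is done for you by `figure()` and a single glyph method call.

```python
from bokeh.models import (ColumnDataSource, DataRange1d, Grid, LinearAxis,
                          Plot, Scatter)

# illustrative data, wrapped in a data source by hand
source = ColumnDataSource(data={"x": [1, 2, 3, 4, 5], "y": [6, 7, 2, 4, 5]})

# an empty plot: no axes, grids, or tools are added automatically
plot = Plot(x_range=DataRange1d(), y_range=DataRange1d(), toolbar_location=None)

# attach a glyph to the data source; this creates a glyph renderer
glyph = Scatter(x="x", y="y", size=20, marker="circle")
renderer = plot.add_glyph(source, glyph)

# axes and grids have to be created and placed explicitly
xaxis = LinearAxis(axis_label="x")
yaxis = LinearAxis(axis_label="y")
plot.add_layout(xaxis, "below")
plot.add_layout(yaxis, "left")
plot.add_layout(Grid(dimension=0, ticker=xaxis.ticker))
plot.add_layout(Grid(dimension=1, ticker=yaxis.ticker))

# show(plot) would display it, just like a figure from bokeh.plotting
```

The extra lines are the price of flexibility: every axis, grid, and range is an object you can swap or configure yourself.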

## Exercise 1

Tweak the above code to:

a. Use diamond glyphs instead of circle glyphs

b. Change the color of all the points to "gold"

c. Make all points use different colors

d. Add a line glyph in the same figure

*Hint: https://docs.bokeh.org/en/latest/docs/user_guide/plotting.html*

---

## Bar plot

In [66]:
# prepare some categorical data
fruits = ['Apples', 'Pears', 'Nectarines', 'Plums', 'Grapes', 'Strawberries']
counts = [5, 3, 4, 2, 4, 6]

# create a new plot canvas with a title
p = figure(x_range=fruits, plot_height=250, title="Fruit Counts")

# add vertical bar glyphs
p.vbar(x=fruits, top=counts, width=0.9)

# change starting point to zero
# Bonus: Comment out the following line and run this code to see what happens
p.y_range.start = 0

# show the results
show(p)

## Exercise 2

Modify the above graph to display a horizontal bar graph. 

*Hint: Use the glyph method `hbar()`*

---


## Axes and grids

There are three main visual properties:

* **line properties:** line color, width, etc.
* **fill properties:** fill color, alpha, etc.
* **text properties:** font styles, colors, etc.

Let's explore with an example.

In [67]:
# bokeh.sampledata.download()

from bokeh.sampledata.glucose import data

data.head()

| datetime | isig | glucose |
| --- | --- | --- |
| 2010-03-24 09:51:00 | 22.59 | 258 |
| 2010-03-24 09:56:00 | 22.52 | 260 |
| 2010-03-24 10:01:00 | 22.23 | 258 |
| 2010-03-24 10:06:00 | 21.56 | 254 |
| 2010-03-24 10:11:00 | 20.79 | 246 |


### Datetime axis

In [68]:
# reduce data to one week
week = data.loc['2010-10-01':'2010-10-08']

# set x_axis_type to "datetime"
p = figure(x_axis_type="datetime", title="Glucose Range", plot_height=350, plot_width=800)

# set x and y axis labels
p.xaxis.axis_label = 'Time'
p.yaxis.axis_label = 'Value'

p.line(week.index, week.glucose)

show(p)

In [69]:
# use the formatter property to change the date format
p.xaxis.formatter.days = '%d/%m/%Y'

show(p)

In [70]:
# change some things about the x-axis
p.xaxis.axis_line_width = 3
p.xaxis.axis_line_color = "red"

# change some things about the y-axis
p.yaxis.major_label_text_color = "orange"
p.yaxis.major_label_orientation = "vertical"

# change things on all axes
p.axis.minor_tick_in = -3
p.axis.minor_tick_out = 6

show(p)

### Grids

In [71]:
# change some things about the x-grid
p.xgrid.grid_line_color = None

# change some things about the y-grid
p.ygrid.grid_line_alpha = 0.5
p.ygrid.grid_line_dash = [6, 4]

show(p)

### Bands

In [72]:
# change some things about the x-grid
p.xgrid.grid_line_color = None

# change some things about the y-grid
p.ygrid.band_fill_alpha = 0.1
p.ygrid.band_fill_color = "navy"

show(p)

## Exercise 3

Modify the above plot:

* Make the background clear, no grids or bands
* Change x-axis tick label format to "dd/mm"
* Add x and y axis labels to complete the plot

---

## Data sources

In [73]:
# some imports for upcoming examples

import pandas as pd
import numpy as np

As we have seen, bokeh can work well with Python lists, NumPy arrays, etc.

The **`ColumnDataSource`** is the central data source object used throughout Bokeh: a mapping of column names (strings) to sequences of values. In fact, at lower levels, the inputs mentioned above are converted to a Bokeh `ColumnDataSource`.
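A quick way to see this mapping is to inspect a source directly. The sketch below uses made-up column names and values; `data` and `column_names` are the attributes that expose the mapping.

```python
from bokeh.models import ColumnDataSource

# hypothetical columns; columns are expected to share the same length
source = ColumnDataSource(data={"x": [1, 2, 3], "y": [4, 5, 6]})

print(source.column_names)      # names of the columns
print(list(source.data["y"]))   # values stored under the "y" column

# a column can be added later, keeping the length consistent
source.data["label"] = ["a", "b", "c"]
print(sorted(source.data.keys()))
```

Glyph methods then refer to columns by name, which is why the plots in this section pass strings such as `'x'` and `'y'` together with `source=source`.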

In [74]:
# import ColumnDataSource
from bokeh.models import ColumnDataSource

# create a ColumnDataSource mapping for a python dictionary
source = ColumnDataSource(data = {
    'x' : [1, 2, 3, 4, 5],
    'y' : [3, 7, 8, 5, 1],
})

p = figure(plot_width=400, plot_height=400, title="Scatter plot")
p.circle('x', 'y', size=10, source=source)

show(p)

In [75]:
from bokeh.sampledata.iris import flowers as df

# create a ColumnDataSource mapping for a pandas dataframe
source = ColumnDataSource(df)

p = figure(plot_width=400, plot_height=400, title="Iris dataset", x_axis_label='Petal length', y_axis_label='Petal width')
p.circle('petal_length', 'petal_width', source=source)

show(p)

## Exercise 4

* Import the `autompg` dataset from sampledata
* Use `head()` to inspect the dataset and create a scatterplot

---

## Tools

The toolbar that appears on the right of our plots by default holds a set of "tools". These interactive tools can be used to report information, to change plot parameters such as zoom level or range extents, or to add, edit, or delete glyphs. Learn more [here](https://docs.bokeh.org/en/latest/docs/user_guide/tools.html).

In [76]:
from bokeh.sampledata.iris import flowers as df

source = ColumnDataSource(df)

p = figure(plot_width=400, plot_height=400, title="Iris dataset", x_axis_label='Petal length', y_axis_label='Petal width',
            tools="wheel_zoom, reset", # specify the exact tools to be displayed
            toolbar_location="above", # toolbar location can be "above", "below", "left", "right", or None
          )
p.circle('petal_length', 'petal_width', source=source)

show(p)

### Tap selection tool

In [77]:
p = figure(plot_width=400, plot_height=400, tools="tap", title="Select a circle")

renderer = p.circle([1, 2, 3, 4, 5], [2, 5, 8, 2, 7], size=50,

                    # set visual properties for selected glyphs
                    selection_color="firebrick",

                    # set visual properties for non-selected glyphs
                    nonselection_fill_alpha=0.2,
                    nonselection_fill_color="grey",
                    nonselection_line_color="firebrick",
                    nonselection_line_alpha=1.0)

show(p)

### Hover tool

In [78]:
from bokeh.models import HoverTool

source = ColumnDataSource(
        data=dict(
            x=[1, 2, 3, 4, 5],
            y=[2, 5, 8, 2, 7],
            desc=['A', 'b', 'C', 'd', 'E'],
        )
    )

hover = HoverTool(
        tooltips=[
            ("index", "$index"),
            ("(x,y)", "($x, $y)"),
            ("desc", "@desc"),
        ]
    )

p = figure(plot_width=400, plot_height=400, tools=[hover], title="Mouse over the dots")

p.circle('x', 'y', size=10, source=source)

show(p)

### Box select and lasso select tools

In [79]:
from bokeh.layouts import gridplot

x = list(range(-20, 21))
y0, y1 = [abs(xx) for xx in x], [xx**2 for xx in x]

# create a ColumnDataSource for the plots to share
source = ColumnDataSource(data=dict(x=x, y0=y0, y1=y1))

TOOLS = "box_select,lasso_select,help"

# create a new plot and add a renderer
left = figure(tools=TOOLS, width=300, height=300)
left.circle('x', 'y0', source=source)

# create another new plot and add a renderer
right = figure(tools=TOOLS, width=300, height=300)
right.circle('x', 'y1', source=source)

p = gridplot([[left, right]])

show(p)

---

## Widgets

Widgets are interactive controls that can be added to Bokeh applications to provide a front end user interface to a visualization. Learn more [here](https://docs.bokeh.org/en/latest/docs/user_guide/interaction/widgets.html).

### ColorPicker widget

In [80]:
from bokeh.io import show
from bokeh.layouts import column
from bokeh.models import ColorPicker
from bokeh.plotting import figure

plot = figure(x_range=(0, 1), y_range=(0, 1), plot_width=350, plot_height=350)
line = plot.line(x=(0,1), y=(0,1), color="black", line_width=4)

picker = ColorPicker(title="Line Color")
picker.js_link('color', line.glyph, 'line_color')

show(column(plot, picker))

### Slider widget

In [82]:
from bokeh.layouts import column
from bokeh.models import Slider
from bokeh.plotting import figure, show

plot = figure(plot_width=400, plot_height=400)
r = plot.circle([1,2,3,4,5,], [3,2,5,6,4], radius=0.2, alpha=0.5)

slider = Slider(start=0.1, end=2, step=0.01, value=0.2, title="Circle radius")
slider.js_link('value', r.glyph, 'radius')

show(column(plot, slider))

## Exercise 5

Modify the above code such that the slider value links to the *opacity* of the rendered circle glyph.

---

## Export and save

So far, we have used `output_notebook()` to display the plots in our Jupyter notebook.

Bokeh also allows you to export directly to an HTML file, save as PNG, SVG, etc.

In [None]:
from bokeh.plotting import output_file

x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

# output to file
output_file('scatter.html')

p = figure(title="Scatter plot", x_axis_label='x', y_axis_label='y')

p.circle(x, y, size=10)

show(p) # a new file named scatter.html should automatically open in your browser window

Alongside `show()`, Bokeh has a `save()` function that writes the plot to a file directly, without displaying it.
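As a minimal sketch of that workflow, the example below writes a plot to an HTML file without opening a browser. The file name `scatter_saved.html`, the temporary directory, and the choice of CDN resources are assumptions made for illustration.

```python
import os
import tempfile

from bokeh.plotting import figure, save
from bokeh.resources import CDN

p = figure(title="Scatter plot", x_axis_label='x', y_axis_label='y')
p.scatter([1, 2, 3, 4, 5], [6, 7, 2, 4, 5], size=10)

# illustrative output location; any writable path works
path = os.path.join(tempfile.mkdtemp(), "scatter_saved.html")

# write a standalone HTML file; nothing is displayed
saved_path = save(p, filename=path, resources=CDN, title="Scatter plot")
print(saved_path)
```

Passing `filename` explicitly keeps the example independent of any earlier `output_file()` call in the notebook.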

---

## Resources

* [Bokeh website](https://bokeh.org)

* [Bokeh documentation](https://docs.bokeh.org)

* [Official bokeh tutorial](https://mybinder.org/v2/gh/bokeh/bokeh-notebooks/master?filepath=tutorial%2F00%20-%20Introduction%20and%20Setup.ipynb)

* [Community support discourse](https://discourse.bokeh.org)

* [Bokeh code repository](https://github.com/bokeh/bokeh)
