Course 2.2: Introduction to Bokeh
================

### Links to resources:
- https://docs.bokeh.org/en/latest/docs/first_steps.html


# 1. General Overview

Bokeh is a Python library for interactive data visualization. Its plots come with built-in tools for the user, such as pan, zoom, and save.

Python code written with Bokeh generates HTML and JavaScript, so the resulting charts can be viewed in a web browser on any platform.

# 2. First Program
Importing the required functions:


In [1]:
from bokeh.plotting import figure, show

Plotting a chart with points at coordinates x, y:

In [2]:
# Constructing the data
x = [1, 2, 3, 4, 5, 6]
y = [12, 10, 12, 11, 14, 13]

# Creating the figure
p = figure()
p.circle(x, y)

# Output
show(p)

Various possible plots include:
- Points of different shapes: circle, triangle, cross, diamond, square...
- Lines: line
- Bars: hbar, vbar
- Surfaces: patch, patches
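
As a small sketch of one glyph from the list above, horizontal bars can be drawn with `hbar`. The data values and the variable name `bars` are made up for illustration. The call to `show(p)` is left as a comment so that the snippet only builds the figure.

```python
from bokeh.plotting import figure

# Illustrative data (made up for this sketch)
positions = [1, 2, 3, 4]
values = [12, 10, 14, 11]

p = figure(title="Horizontal bars")
# One bar per y position, extending from 0 to the value given in `right`
bars = p.hbar(y=positions, right=values, height=0.5, color="orange", alpha=0.6)

# show(p) would display the chart, as in the other examples
```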

Here's an example of a chart overlaying multiple plots: yellow bars, red triangle points, a blue line, and a green surface.

In [29]:
from bokeh.plotting import figure, show

# Constructing the data
x = [1, 2, 3, 4, 5, 6]
y = [12, 10, 12, 11, 14, 13]
patchx = [[3, 4, 5, 4]]
patchy = [[5, 7, 8, 3]]

# Creating the figure
p = figure()
p.vbar(x, top=y, color='yellow', width=0.5, alpha=0.5)
p.triangle(x, y, size=15, color='red')
p.line(x, y, color='blue')
p.patches(patchx, patchy, fill_color='green', alpha=0.2)

# Output
show(p)

## Output

A notable feature of Bokeh is that it produces its output as an HTML file.
You can specify the output file name:


In [4]:
# from bokeh.io import output_file
# output_file("course_bokeh.html")
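
The snippet below is a minimal sketch of writing a chart to a named HTML file without opening a browser. It uses `save` from `bokeh.io` instead of `show`. The temporary directory, the page title, and the variable names `html_path` and `saved_path` are assumptions made for this example.

```python
import os
import tempfile

from bokeh.io import output_file, save
from bokeh.plotting import figure

p = figure(title="Saved chart")
p.line([1, 2, 3], [4, 6, 5])

# Hypothetical location: a file inside a fresh temporary directory
html_path = os.path.join(tempfile.mkdtemp(), "course_bokeh.html")

# Declare the target file, then write it
output_file(html_path, title="Course Bokeh")
saved_path = save(p)  # returns the path of the file that was written
```

Opening the resulting file in a browser should display the chart with its interactive tools.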


For this course, however, we display the output directly inside the Jupyter Notebook.

In [30]:
from bokeh.io import output_notebook

output_notebook()

# 3. Input Data

Bokeh is designed to work with different types of inputs:
- Lists of values (as in the above example)
- Pandas DataFrames
- ColumnDataSource

Let's illustrate these different configurations.

## 3.a Lists of Values
Consider the following data:

In [6]:
from bokeh.plotting import figure, show, ColumnDataSource
import pandas as pd
from bokeh.palettes import Set1

# Constructing the data
x = [1, 2, 3, 4, 5, 6]
y = [5, 6, 7, 8, 9, 10]



You can plot a figure using these lists of coordinates directly.


In [7]:

p = figure()
p.square(x, y, size=10, color='red')

# Output
show(p)

## 3.b DataFrame
Now, let's create a richer DataFrame that associates additional values (here, a color and a size) with each pair of x and y coordinates.

In [8]:
df = pd.DataFrame(columns=["x", "y", "color", "size"])
df["x"] = x
df["y"] = y
df["color"] = Set1[6]  # Using the first 6 colors defined in Bokeh's Set1
df["size"] = [5, 12, 18, 32, 32, 25]
print(df)

   x   y    color  size
0  1   5  #e41a1c     5
1  2   6  #377eb8    12
2  3   7  #4daf4a    18
3  4   8  #984ea3    32
4  5   9  #ff7f00    32
5  6  10  #ffff33    25


You can use the x and y columns of the DataFrame to create the following chart.

In [9]:

# Creating the figure
p = figure()
p.cross(df["x"], df["y"], size=10, color='blue')

# Output
show(p)


## 3.c ColumnDataSource

A ColumnDataSource is a specific structure provided by Bokeh. You can think of it as a tabular structure, similar to a DataFrame. The special feature is that when creating a chart with Bokeh, you can pass the entire ColumnDataSource, allowing the use of values other than x and y in the graphical representation. You can create a ColumnDataSource directly from a Pandas DataFrame.

Let's illustrate this with the following example. We will create a ColumnDataSource directly from the previous Pandas DataFrame.

In [10]:
data = ColumnDataSource(df)
print(data.column_names)

['index', 'x', 'y', 'color', 'size']


When creating the chart, we will indicate that the data source is our ColumnDataSource. This way, we can use values from different columns: x and y columns for point coordinates, color column content for point color, and size values for point size.

In [11]:
# Creating the figure
p = figure()
p.triangle(x='x', y='y', source=data, color='color', size='size')

# Output
show(p)
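
As a variation on the example above, a ColumnDataSource can also be built from a plain dictionary of columns, without going through Pandas. This is only a sketch: the values are made up, and it uses the generic `scatter` method with a `marker` argument rather than the marker-specific `triangle` method.

```python
from bokeh.plotting import figure, ColumnDataSource

# Illustrative values (made up for this sketch)
source = ColumnDataSource(data={
    "x": [1, 2, 3],
    "y": [4, 6, 5],
    "color": ["red", "green", "blue"],
    "size": [10, 20, 30],
})
print(source.column_names)

p = figure()
# Column names are passed as strings and looked up in `source`
r = p.scatter(x="x", y="y", color="color", size="size",
              marker="triangle", source=source)

# show(p) would display the chart
```

Unlike a source built from a DataFrame, this one has no `index` column, since a dictionary carries no index.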

## 3.d Details on Input Data

Depending on the type of input data, you may need to specify the axis type so that Bokeh uses an appropriate scale. For example, to plot time data, set the x-axis type when creating the figure:

In [12]:
f = figure(title="My Figure", x_axis_type='datetime')
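
To make this concrete, here is a short sketch that plots values against Python `datetime` objects on a datetime axis. The dates and temperatures are made up for illustration.

```python
from datetime import datetime, timedelta

from bokeh.plotting import figure

# Six consecutive days (made-up dates and temperatures)
start = datetime(2024, 1, 1)
dates = [start + timedelta(days=i) for i in range(6)]
temperatures = [15, 16, 15, 15, 17, 18]

# The x axis formats its tick labels as dates
p = figure(title="Temperature over time", x_axis_type="datetime")
p.line(dates, temperatures)

# show(p) would display the chart
```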



# 4. Labels and Legends

In the following example, we show how to:
- Add a chart title. This title is associated with the figure.
- Add titles to the axes; these titles are also linked to the figure.
- Add legends to each series; these legends are associated with each plot. It is possible to specify the legend's position.

You can change the formatting of these elements by setting properties on the corresponding objects (title, axes, legend).

In [13]:
from bokeh.plotting import figure, show, ColumnDataSource
import pandas as pd

# Constructing the data
x = [1, 2, 3, 4, 5, 6]
y1 = [15, 16, 15, 15, 17, 18]
y2 = [14, 13, 17, 18, 16, 12]

# Creating the figure with figure title and axis titles
p = figure(title="Temperature Readings", x_axis_label="Day", y_axis_label="Temperature under shelter")
# Plotting data with legend labels
p.line(x, y1, legend_label="Rennes", color='blue')
p.line(x, y2, legend_label="Paris", color='red')
# Legend positioning
p.legend.location = "top_left"

# Formatting data
p.title.text_color = 'green'
p.title.background_fill_color = "purple"
p.xaxis.axis_label_text_color = "orange"
p.yaxis.axis_label_text_font_size = "24pt"
# Output
show(p)
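
The legend exposes other properties besides its location. The sketch below reuses the same temperature data and sets a legend title and a click policy, so that clicking a legend entry hides the matching line. Which properties are worth setting is a matter of taste; the values chosen here are only an example.

```python
from bokeh.plotting import figure

x = [1, 2, 3, 4, 5, 6]
y1 = [15, 16, 15, 15, 17, 18]
y2 = [14, 13, 17, 18, 16, 12]

p = figure(title="Temperature Readings", x_axis_label="Day")
p.line(x, y1, legend_label="Rennes", color="blue")
p.line(x, y2, legend_label="Paris", color="red")

# Legend properties are set after the plots that create the legend entries
p.legend.location = "bottom_right"
p.legend.title = "City"
p.legend.click_policy = "hide"  # clicking an entry hides its line

# show(p) would display the chart
```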


# 5. Layout of Charts

## 5.a Tabs

Bokeh can organize a web page into multiple tabs: a Tabs object containing one TabPanel per tab.

In the example below, we create two figures, one for Rennes and one for Paris, which are placed in tabs. Note that, in the end, the show() instruction is called on the tabs.

In [14]:
from bokeh.plotting import figure, show
from bokeh.models import Tabs, TabPanel


# Constructing the data
x = [1, 2, 3, 4, 5, 6]
y1 = [15, 16, 15, 15, 17, 18]
y2 = [14, 13, 17, 18, 16, 12]

# Creating the figures
p1 = figure(title="Temperature in Rennes")
p1.diamond(x, y1, color='blue', size=15)

p2 = figure(title="Temperature in Paris")
p2.diamond(x, y2, color='red', size=15)

# Preparing the tabs
tab1 = TabPanel(child=p1, title="Rennes")
tab2 = TabPanel(child=p2, title="Paris")
tabs = Tabs(tabs=[tab1, tab2])

# Output
show(tabs)

## 5.b Row, Column Layout

Each tab can contain charts arranged in rows and columns. It is also possible to share axis ranges between charts so that their values are easier to compare.

In the following example, temperature graphs for Rennes and Paris are arranged on the same row (Row).

In [15]:
from bokeh.plotting import figure, show
from bokeh.layouts import row


# Constructing the data
x = [1, 2, 3, 4, 5, 6]
y1 = [15, 16, 15, 15, 17, 18]
y2 = [14, 13, 17, 18, 16, 12]

# Creating the figures
p1 = figure(title="Temperature in Rennes")
p1.diamond(x, y1, color='blue', size=15)

p2 = figure(title="Temperature in Paris")
p2.diamond(x, y2, color='red', size=15)

# Horizontal arrangement
row_layout = row([p1, p2])
# Share the y range so that both charts use the same vertical scale
p1.y_range = p2.y_range
# Output
show(row_layout)
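
A vertical arrangement works the same way with `column`. The sketch below stacks the two charts and links their x ranges by passing the first figure's range when creating the second one, which is an alternative to assigning the range afterwards. It uses the generic `scatter` method with `marker="diamond"`.

```python
from bokeh.layouts import column
from bokeh.plotting import figure

x = [1, 2, 3, 4, 5, 6]
y1 = [15, 16, 15, 15, 17, 18]
y2 = [14, 13, 17, 18, 16, 12]

p1 = figure(title="Temperature in Rennes")
p1.scatter(x, y1, marker="diamond", color="blue", size=15)

# Reusing p1's x range links panning and zooming along x in both charts
p2 = figure(title="Temperature in Paris", x_range=p1.x_range)
p2.scatter(x, y2, marker="diamond", color="red", size=15)

# Vertical arrangement
column_layout = column(p1, p2)

# show(column_layout) would display both charts, one above the other
```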


# 6. Hover Tool

Bokeh provides a tool that displays information when hovering over points: HoverTool. You configure it by specifying which values to display, then add it to the figure's tools.

For example, in the temperature readings, you can display the day and temperature when hovering over points.

In [16]:

from bokeh.plotting import figure, show
from bokeh.models import HoverTool


# Constructing the data
x = [1, 2, 3, 4, 5, 6]
y = [15, 16, 15, 15, 17, 18]

# Creating the figure
p = figure(title="Temperature in Rennes")
p.diamond(x, y, color='blue', size=15)

# Creating the tool
hover_tool = HoverTool(tooltips=[('Day', '@x'), ('Temperature', '@y')])
p.add_tools(hover_tool)

# Output
show(p)
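
When the data comes from a ColumnDataSource, the tooltip fields refer to column names, so any column can be displayed, not only the plotted coordinates. The sketch below assumes a hypothetical extra column named `sky` and formats the temperature with one decimal place. The column names and values are made up for illustration.

```python
from bokeh.models import HoverTool
from bokeh.plotting import figure, ColumnDataSource

source = ColumnDataSource(data={
    "day": [1, 2, 3, 4, 5, 6],
    "temp": [15, 16, 15, 15, 17, 18],
    # Hypothetical extra column, shown only in the tooltip
    "sky": ["sunny", "cloudy", "rain", "rain", "sunny", "sunny"],
})

p = figure(title="Temperature in Rennes")
p.scatter("day", "temp", source=source, marker="diamond", color="blue", size=15)

# '@name' refers to a column of the data source
hover_tool = HoverTool(tooltips=[
    ("Day", "@day"),
    ("Temperature", "@temp{0.0}"),  # one decimal place
    ("Sky", "@sky"),
])
p.add_tools(hover_tool)

# show(p) would display the chart
```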


In [20]:
import pandas as pd
df = pd.read_csv("AAPL_data.csv")
df.date = pd.to_datetime(df.date)
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1259 entries, 0 to 1258
Data columns (total 7 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   date    1259 non-null   datetime64[ns]
 1   open    1259 non-null   float64       
 2   high    1259 non-null   float64       
 3   low     1259 non-null   float64       
 4   close   1259 non-null   float64       
 5   volume  1259 non-null   int64         
 6   Name    1259 non-null   object        
dtypes: datetime64[ns](1), float64(4), int64(1), object(1)
memory usage: 69.0+ KB


In [40]:
from bokeh.plotting import figure, show, ColumnDataSource

# Creating the figure with figure title and axis titles
p = figure(title="Apple Stock Evaluation", x_axis_label="Date", y_axis_label="Value in Dollars", x_axis_type='datetime')

# Plotting data with legend labels
p.line(df.date, df.high, legend_label="High", color='red')
p.line(df.date, df.low, legend_label="Low", color='blue')

# Legend positioning
p.legend.location = "top_right"

# Formatting data
p.title.text_color = 'white'
p.title.background_fill_color = "lightgray"
p.title.border_line_color = "black"
# p.xaxis.axis_label_text_color = "orange"
p.yaxis.axis_label_text_font_size = "24pt"
# Output
show(p)

In [52]:
from bokeh.plotting import figure, show
from bokeh.models import Tabs, TabPanel, HoverTool, ColumnDataSource


# Creating the figures
source = ColumnDataSource(df)

p1 = figure(title="Closing Value vs Opening Value")
p1.circle('open', 'close', color='red', size=4, source=source)
hover_tool_p1 = HoverTool(tooltips=[('Date', '@date{%F}'), ('Opening Value', '@open'), ('Closing Value', '@close')], formatters={'@date': 'datetime'})
p1.add_tools(hover_tool_p1)

p2 = figure(title="High vs Low")
p2.circle('low', 'high', color='red', size=4, source=source)
hover_tool_p2 = HoverTool(tooltips=[('Date', '@date{%F}'), ('High', '@high'), ('Low', '@low')], formatters={'@date': 'datetime'})
p2.add_tools(hover_tool_p2)

# Preparing the tabs
tab1 = TabPanel(child=p1, title="Closing vs Opening")
tab2 = TabPanel(child=p2, title="High vs Low")
tabs = Tabs(tabs=[tab1, tab2])

# Output
show(tabs)