# Introduction to interactive visualisations with Bokeh

[Bokeh documentation](https://docs.bokeh.org/en/latest/index.html), organised into:
* First steps
* Tutorial
* User guide
* Reference
* Gallery

In [40]:
import os
from dotenv import load_dotenv

In [41]:
load_dotenv();

In [42]:
data_folder=os.getenv('DATA_FOLDER')
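
`os.getenv` returns `None` when `DATA_FOLDER` is not set, and a later `os.path.join` call would then fail with a `TypeError`. A minimal sketch of a guard, assuming a fallback folder named `data` (the helper name `resolve_data_folder` and the fallback value are illustrative and not part of the original notebook):

```python
import os


def resolve_data_folder(env=None, default="data"):
    """Return the DATA_FOLDER setting, or `default` if it is unset or empty."""
    env = os.environ if env is None else env
    folder = env.get("DATA_FOLDER")
    # `default` is a hypothetical fallback chosen for this sketch
    return folder if folder else default


data_folder = resolve_data_folder()
print(data_folder)
```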

## Inline Output in Jupyter Notebooks
Basic imports:

In [43]:
from bokeh.plotting import figure, show, output_notebook

Call `output_notebook()` instead of (or in addition to) the `output_file()` function to display Bokeh plots inline.

In [44]:
output_notebook() #hide_banner=True
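
For comparison, the file-based route can be sketched as follows. This is only an illustration: the temporary output path and the toy data are assumptions, and `save` is used instead of `show` so that no browser window is opened.

```python
import os
import tempfile

from bokeh.plotting import figure, output_file, save

# Hypothetical output location for this sketch
out_path = os.path.join(tempfile.mkdtemp(), "scatter.html")

# Register an HTML file as the output target
output_file(out_path, title="Standalone Bokeh plot")

p = figure(height=300, width=400)
p.scatter(x=[1, 2, 3], y=[4, 6, 5], size=10)

# Write the HTML file; save() returns the absolute path it wrote to
saved_path = save(p)
print(saved_path)
```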

## The Example: CO2 Emissions vs GDP

In [45]:
import pandas as pd

In [46]:
df_co2 = pd.read_csv(os.path.join(data_folder, 'co2_gdp', 'co2_gdp_country.csv'))
df_co2

Unnamed: 0,country,region,year,co2,gdp
0,Afghanistan,South Asia,1964,0.0863,1182.0
1,Afghanistan,South Asia,1965,0.1010,1182.0
2,Afghanistan,South Asia,1966,0.1080,1168.0
3,Afghanistan,South Asia,1967,0.1240,1173.0
4,Afghanistan,South Asia,1968,0.1160,1187.0
...,...,...,...,...,...
8197,Zimbabwe,Sub-Saharan Africa,2009,0.4060,1352.0
8198,Zimbabwe,Sub-Saharan Africa,2010,0.5520,1484.0
8199,Zimbabwe,Sub-Saharan Africa,2011,0.6650,1626.0
8200,Zimbabwe,Sub-Saharan Africa,2012,0.5300,1750.0


[Data Sources in the Bokeh documentation](https://docs.bokeh.org/en/latest/docs/user_guide/basic/data.html)

- Providing data with Python lists or numpy arrays
- ColumnDataSource
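
Both options can be sketched side by side. The three rows below are toy values chosen for illustration, not a sample of the real dataset:

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

gdp = [1182.0, 1168.0, 1352.0]
co2 = [0.0863, 0.1080, 0.4060]

# Option 1: pass plain Python lists directly to the glyph method
p_lists = figure(height=300, width=400)
p_lists.scatter(x=gdp, y=co2)

# Option 2: wrap the columns in a ColumnDataSource and refer to them by name
source = ColumnDataSource(data={"gdp": gdp, "co2": co2})
p_cds = figure(height=300, width=400)
p_cds.scatter(source=source, x="gdp", y="co2")

print(source.column_names)
```

With a `ColumnDataSource`, several glyphs, views, and tooltips can share the same named columns.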

In [None]:
from bokeh.models import ColumnDataSource, Slider
from bokeh.models import BooleanFilter, CDSView

In [None]:
# create a ColumnDataSource by passing the DataFrame:
ds_co2 = ColumnDataSource(data=df_co2)

# Define a BooleanFilter for rows where year == 1964
# (named year_filter to avoid shadowing Python's built-in filter)
booleans = (df_co2['year'] == 1964).tolist()
year_filter = BooleanFilter(booleans=booleans)

# Create a CDSView with the filter
view = CDSView(filter=year_filter)

## Basic Elements of a Bokeh Plot

Some terminology
- Model
- Glyph
- Annotations
- Widget
- Plot
- Layout
- Document

[Glossary in the Bokeh documentation](https://docs.bokeh.org/en/latest/docs/user_guide/intro.html)
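
As a rough orientation, several of these terms correspond to Python objects that can be inspected directly. The sketch below uses toy data and is not a complete mapping of the glossary:

```python
from bokeh.document import Document
from bokeh.models import Annotation, GlyphRenderer, Plot, Scatter
from bokeh.plotting import figure

p = figure(height=200, width=300, title="Toy plot")
renderer = p.scatter(x=[1, 2, 3], y=[4, 5, 6])

# Plot: a figure is a Plot model
print(isinstance(p, Plot))

# Glyph: scatter() returns a renderer that wraps the glyph
print(isinstance(renderer, GlyphRenderer), isinstance(renderer.glyph, Scatter))

# Annotation: the title is one kind of annotation
print(isinstance(p.title, Annotation))

# Document: the container that holds the models to be displayed
doc = Document()
doc.add_root(p)
print(p.document is doc)
```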

In [47]:
# Create figure
p = figure(height=400, width=800)

# add a scatterplot
p.scatter(source=ds_co2, x='gdp', y='co2',view=view)

# display the plot
show(p)

In [48]:
from bokeh.models import Title, LogScale, LogTicker, LogTickFormatter
from bokeh.models import NumeralTickFormatter

In [49]:
# Create figure
p = figure(
    height=400, width=800,
    tools=[]
)

# add a scatterplot
p.scatter(source=ds_co2, x='gdp', y='co2',
          view=view,
          size=10, fill_alpha=0.5
        )


# Set the title and formatting
p.title = Title(
    text='CO2 Emissions vs GDP in 1964',
    align='center',  # Center the title
    text_font_size='16pt'  # Set font size
)

# Axis labels...
p.xaxis.axis_label = "GDP (in USD per capita)"

# Changing to log scale
p.y_scale = LogScale()
# and formatting the tick labels
p.yaxis.ticker = LogTicker()
p.yaxis.formatter = LogTickFormatter()
p.yaxis.major_label_text_font_size = '14px'

p.yaxis.axis_label = "CO2 emissions (metric tons per capita)"
# ... and styling:
p.yaxis.axis_label_text_font_style = 'normal'
p.yaxis.axis_label_text_font_size = '18pt'

p.xaxis.formatter = NumeralTickFormatter(format='0 a')
p.xaxis.major_label_text_font_size = '14px'
p.xaxis.axis_label_text_font_size = '18pt'
p.xaxis.axis_label_text_font_style = 'normal'
 
# display the model
show(p)
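
An alternative to setting the scale, ticker, and formatter by hand is to request a log axis when the figure is created. A minimal sketch with toy data, using the `y_axis_type` argument of `figure`:

```python
from bokeh.models import LogAxis, LogScale
from bokeh.plotting import figure

# y_axis_type="log" sets up a log scale together with a matching axis
p_log = figure(height=400, width=800, y_axis_type="log")
p_log.scatter(x=[1000, 2000, 4000], y=[0.1, 1.0, 10.0], size=10)

print(type(p_log.y_scale).__name__)
print(type(p_log.yaxis[0]).__name__)
```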

In [50]:
from bokeh.models import CategoricalColorMapper
from bokeh.palettes import Category10

In [51]:
# Create figure
p = figure(
    height=400, width=800,
    tools=[]
)

# Get unique regions for categorical coloring
regions = df_co2['region'].unique().tolist()

# Create a color mapper
color_mapper = CategoricalColorMapper(
    factors=regions,
    palette=Category10[len(regions)]
)

# add a scatterplot
p.scatter(source=ds_co2, x='gdp', y='co2',
          view=view,
          size=10, fill_alpha=0.5,
          color={'field': 'region', 'transform': color_mapper},  # Color by region
          legend_field='region'  # Add legend using region field
        )

# Set the title and formatting
p.title = Title(
    text='CO2 Emissions vs GDP in 1964',
    align='center',  # Center the title
    text_font_size='16pt'  # Set font size
)

# Axis labels...
p.xaxis.axis_label = "GDP (in USD per capita)"

# Changing to log scale
p.y_scale = LogScale()
# and formatting the tick labels
p.yaxis.ticker = LogTicker()
p.yaxis.formatter = LogTickFormatter()
p.yaxis.major_label_text_font_size = '14px'

p.yaxis.axis_label = "CO2 emissions (metric tons per capita)"
# ... and styling:
p.yaxis.axis_label_text_font_style = 'normal'
p.yaxis.axis_label_text_font_size = '18pt'


p.xaxis.formatter = NumeralTickFormatter(format='0 a')
p.xaxis.major_label_text_font_size = '14px'
p.xaxis.axis_label_text_font_size = '18pt'
p.xaxis.axis_label_text_font_style = 'normal'

# Customise the legend
p.legend.title = "Region"
p.legend.title_text_font_style = "bold"
p.legend.location = "top_right"
p.legend.label_text_font_size = "10pt"
p.legend.background_fill_alpha = 0.5  # Semi-transparent background

# display the model
show(p)
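
The lookup `Category10[len(regions)]` assumes that the palette dictionary has an entry for exactly that number of regions. A minimal sketch of a more defensive variant slices the ten-colour palette instead (`region_color_mapper` is a hypothetical helper name, and the two regions are just sample values):

```python
from bokeh.models import CategoricalColorMapper
from bokeh.palettes import Category10


def region_color_mapper(regions):
    """Build a colour mapper for up to ten categories."""
    if not 1 <= len(regions) <= 10:
        raise ValueError("Category10 supports between 1 and 10 categories")
    # Take the first len(regions) colours of the ten-colour palette
    palette = list(Category10[10][:len(regions)])
    return CategoricalColorMapper(factors=list(regions), palette=palette)


mapper = region_color_mapper(["South Asia", "Sub-Saharan Africa"])
print(mapper.palette)
```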

## Interactive plot tools in Bokeh

* Zoom and pan tools
* Tooltips with the HoverTool
* Selections
* Reset, save etc.
* Widgets, such as switches, checkboxes, sliders etc.

[Bokeh plot tools](https://docs.bokeh.org/en/latest/docs/user_guide/interaction/tools.html)

In [52]:
from bokeh.models import BoxZoomTool, WheelZoomTool, HoverTool, ResetTool

In [53]:
# Create figure
p = figure(
    height=400, width=800,
    tools=[BoxZoomTool(), WheelZoomTool(), HoverTool(), ResetTool()]
)

# Get unique regions for categorical coloring
regions = df_co2['region'].unique().tolist()

# Create a color mapper
color_mapper = CategoricalColorMapper(
    factors=regions,
    palette=Category10[len(regions)]
)

# add a scatterplot
p.scatter(source=ds_co2, x='gdp', y='co2',
          view=view,
          size=10, fill_alpha=0.5,
          color={'field': 'region', 'transform': color_mapper},  # Color by region
          legend_field='region'  # Add legend using region field
        )

# Set the title and formatting
p.title = Title(
    text='CO2 Emissions vs GDP in 1964',
    align='center',  # Center the title
    text_font_size='16pt'  # Set font size
)

# Axis labels...
p.xaxis.axis_label = "GDP (in USD per capita)"

# Changing to log scale
p.y_scale = LogScale()
# and formatting the tick labels
p.yaxis.ticker = LogTicker()
p.yaxis.formatter = LogTickFormatter()
p.yaxis.major_label_text_font_size = '14px'

p.yaxis.axis_label = "CO2 emissions (metric tons per capita)"
# ... and styling:
p.yaxis.axis_label_text_font_style = 'normal'
p.yaxis.axis_label_text_font_size = '18pt'


p.xaxis.formatter = NumeralTickFormatter(format='0 a')
p.xaxis.major_label_text_font_size = '14px'
p.xaxis.axis_label_text_font_size = '18pt'
p.xaxis.axis_label_text_font_style = 'normal'

# Customise the legend
p.legend.title = "Region"
p.legend.title_text_font_style = "bold"
p.legend.location = "top_right"
p.legend.label_text_font_size = "10pt"
p.legend.background_fill_alpha = 0.5  # Semi-transparent background

# Customise the tooltips
hover = p.select_one(HoverTool)
hover.tooltips = [
    # ("index", "$index"),
    # ("(xcoord,ycoord)", "($x, $y)"),
    ("Country", "@country"),
    ("Region", "@region"),
    ("GDP", "@gdp"),
    ("CO2 Emissions", "@co2"),
]

# display the model
show(p)
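
Tooltip fields can also carry a number format in curly braces. The sketch below is an illustration with toy data; the specific formats (`0,0` for thousands separators, `0.00` for two decimals) are choices made for this example:

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

source = ColumnDataSource(data={
    "country": ["A", "B"],
    "gdp": [1182.0, 13520.0],
    "co2": [0.0863, 0.406],
})

hover_fmt = HoverTool(tooltips=[
    ("Country", "@country"),
    ("GDP", "@gdp{0,0}"),             # e.g. 13,520
    ("CO2 Emissions", "@co2{0.00}"),  # e.g. 0.41
])

p_hover = figure(height=300, width=400, tools=[hover_fmt])
p_hover.scatter(source=source, x="gdp", y="co2", size=10)
```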

## Layouts

In [None]:
from bokeh.layouts import row, column, layout, gridplot

In [54]:

def create_figure(year):
    # Create a figure
    p = figure(
        height=250, width=350,
        tools=[BoxZoomTool(), WheelZoomTool(), HoverTool(), ResetTool()]
    )

    # Define a BooleanFilter for rows matching the requested year
    # (named year_filter to avoid shadowing Python's built-in filter)
    booleans = (df_co2['year'] == year).tolist()
    year_filter = BooleanFilter(booleans=booleans)

    # Create a CDSView with the filter
    view = CDSView(filter=year_filter)

    # add a scatterplot
    p.scatter(source=ds_co2, x='gdp', y='co2',
            view=view,
            size=10, fill_alpha=0.5,
            color={'field': 'region', 'transform': color_mapper},  # Color by region
            # legend_field='region'  # Add legend using region field
            )

    # Set the title and formatting
    p.title = Title(
        text='{}'.format(year),
        align='center',  # Center the title
        text_font_size='14pt'  # Set font size
    )

    # Changing to log scale
    p.y_scale = LogScale()
    # and formatting the tick labels
    p.yaxis.ticker = LogTicker()
    p.yaxis.formatter = LogTickFormatter()
    p.yaxis.major_label_text_font_size = '8px'


    p.xaxis.formatter = NumeralTickFormatter(format='0 a')
    p.xaxis.major_label_text_font_size = '8px'

    # Customise the tooltips
    hover = p.select_one(HoverTool)
    hover.tooltips = [
        # ("index", "$index"),
        # ("(xcoord,ycoord)", "($x, $y)"),
        ("Country", "@country"),
        ("Region", "@region"),
        ("GDP", "@gdp"),
        ("CO2 Emissions", "@co2"),
    ]

    return p

def set_ylabel(p, label='CO2 emissions (metric tons per capita)'):
    p.yaxis.axis_label = label
    p.yaxis.axis_label_text_font_style = 'normal'
    p.yaxis.axis_label_text_font_size = '12pt'

def set_xlabel(p, label='GDP (in USD per capita)'):
    p.xaxis.axis_label = label
    p.xaxis.axis_label_text_font_style = 'normal'
    p.xaxis.axis_label_text_font_size = '12pt'

p1 = create_figure(1964)
p2 = create_figure(1974)
p3 = create_figure(1984)

set_ylabel(p1)
set_xlabel(p2)

# put the results in a row and show
show(row(p1, p2, p3))

In [55]:
p1 = create_figure(1964)
p2 = create_figure(1974)
p3 = create_figure(1984)

set_ylabel(p3)
set_ylabel(p1)
set_xlabel(p1)
set_xlabel(p2)

# show(column(row(p3), row(p1, p2)))
# Use gridplot for better alignment
grid = gridplot([[p3, None], [p1, p2]], toolbar_location="right")

show(grid)

## Widgets
Widgets interactively control elements of a Bokeh plot. For example, a slider can control the transparency of the points in the plot:

In [56]:
# Create figure
p = figure(
    height=400, width=800,
    tools=[BoxZoomTool(), WheelZoomTool(), HoverTool(), ResetTool()]
)

# Get unique regions for categorical coloring
regions = df_co2['region'].unique().tolist()

# Create a color mapper
color_mapper = CategoricalColorMapper(
    factors=regions,
    palette=Category10[len(regions)]
)

# add a scatterplot
sc = p.scatter(source=ds_co2, x='gdp', y='co2',
          view=view,
          size=10, fill_alpha=0.5,
          color={'field': 'region', 'transform': color_mapper},  # Color by region
          legend_field='region'  # Add legend using region field
        )

# Set the title and formatting
p.title = Title(
    text='CO2 Emissions vs GDP in 1964',
    align='center',  # Center the title
    text_font_size='16pt'  # Set font size
)

# Axis labels...
p.xaxis.axis_label = "GDP (in USD per capita)"

# Changing to log scale
p.y_scale = LogScale()
# and formatting the tick labels
p.yaxis.ticker = LogTicker()
p.yaxis.formatter = LogTickFormatter()
p.yaxis.major_label_text_font_size = '14px'

p.yaxis.axis_label = "CO2 emissions (metric tons per capita)"
# ... and styling:
p.yaxis.axis_label_text_font_style = 'normal'
p.yaxis.axis_label_text_font_size = '18pt'


p.xaxis.formatter = NumeralTickFormatter(format='0 a')
p.xaxis.major_label_text_font_size = '14px'
p.xaxis.axis_label_text_font_size = '18pt'
p.xaxis.axis_label_text_font_style = 'normal'

# Customise the legend
p.legend.title = "Region"
p.legend.title_text_font_style = "bold"
p.legend.location = "top_right"
p.legend.label_text_font_size = "10pt"
p.legend.background_fill_alpha = 0.5  # Semi-transparent background

# Customise the tooltips
hover = p.select_one(HoverTool)
hover.tooltips = [
    # ("index", "$index"),
    # ("(xcoord,ycoord)", "($x, $y)"),
    ("Country", "@country"),
    ("Region", "@region"),
    ("GDP", "@gdp"),
    ("CO2 Emissions", "@co2"),
]

# The slider can be configured with start and end values, a step size, an initial value, and a title: 
slider = Slider(start=0, end=1, value=1.0, step=.1, title="Transparency")

# link the value of the Slider to the fill_alpha of the scatter glyph
slider.js_link("value", sc.glyph, "fill_alpha")

l = layout([slider], [p])
show(l)
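
`js_link` covers the case where a widget value is copied straight into a property. When the browser-side logic needs more than that, a `CustomJS` callback can be attached with `js_on_change`. A minimal sketch with toy data that reproduces the transparency slider this way (the JavaScript snippet is an illustration, not the notebook's own code):

```python
from bokeh.layouts import column
from bokeh.models import CustomJS, Slider
from bokeh.plotting import figure

p_js = figure(height=300, width=400)
sc_js = p_js.scatter(x=[1, 2, 3], y=[1, 4, 9], size=20, fill_alpha=1.0)

alpha_slider = Slider(start=0, end=1, value=1.0, step=0.1, title="Transparency")

# The JavaScript runs in the browser; cb_obj is the slider that fired the event
callback = CustomJS(
    args=dict(glyph=sc_js.glyph),
    code="glyph.fill_alpha = cb_obj.value",
)
alpha_slider.js_on_change("value", callback)

js_layout = column(alpha_slider, p_js)
```

Because the callback is JavaScript, it keeps working in a standalone HTML export without a Python process behind it.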

Widgets that act on the underlying data with Python code require a backend callback, and therefore a running Bokeh server, which is awkward to set up in Jupyter notebooks. See the example in the separate Bokeh app.

Run `bokeh serve co2-gdp_app.py --show` to start the app in a separate process. The `--show` flag opens the app in a new browser tab.

The `--dev` flag enables hot reloading of the app when you change the code, which is useful during development.