# Interactive plotting in Python using Bokeh

## 1. Introduction
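* `bokeh.plotting.figure()` creates the figure that glyphs are drawn on
* `output_notebook()` sends the output to the notebook, `output_file()` sends it to a file
  * `bokeh.io` handles this output (notebook and file I/O)
* `show()` renders the plot in the notebook, `save()` writes the plot to a file
  * Call `output_notebook()` first so that `show()` renders inline in the notebook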
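
The file-based path mentioned above is not exercised elsewhere in this notebook, so here is a minimal sketch of it. The file name `line_plot.html` and the temporary directory are assumptions made for illustration; `output_file()` could be used to set the target once instead of passing `filename` to `save()`.

```python
import os
import tempfile

from bokeh.io import save
from bokeh.plotting import figure
from bokeh.resources import INLINE

# A tiny figure with one line glyph
fig = figure(title='Saved to a file')
fig.line(x=[0, 1, 2, 3], y=[1, 3, 2, 4])

# Hypothetical output location; any writable path works
out_path = os.path.join(tempfile.mkdtemp(), 'line_plot.html')

# Write a standalone HTML file with the BokehJS resources inlined
saved_path = save(fig, filename=out_path, resources=INLINE, title='Line plot')
```

Opening the resulting HTML file in a browser shows the same interactive plot that `show()` would render in the notebook.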

In [1]:
from bokeh.io import output_notebook, show
from bokeh.plotting import figure
from bokeh.resources import INLINE

output_notebook(resources=INLINE)

## 2. Loading dataset

* `bokeh.sampledata` provides many sample datasets, many of them as pandas DataFrames

In [2]:
from bokeh.sampledata.autompg import autompg_clean as df_autompg
df_autompg.head()

|   | mpg | cyl | displ | hp | weight | accel | yr | origin | name | mfr |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 18.0 | 8 | 307.0 | 130 | 3504 | 12.0 | 70 | North America | chevrolet chevelle malibu | chevrolet |
| 1 | 15.0 | 8 | 350.0 | 165 | 3693 | 11.5 | 70 | North America | buick skylark 320 | buick |
| 2 | 18.0 | 8 | 318.0 | 150 | 3436 | 11.0 | 70 | North America | plymouth satellite | plymouth |
| 3 | 16.0 | 8 | 304.0 | 150 | 3433 | 12.0 | 70 | North America | amc rebel sst | amc |
| 4 | 17.0 | 8 | 302.0 | 140 | 3449 | 10.5 | 70 | North America | ford torino | ford |
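
Because the dataset is a pandas DataFrame, it can be handed to Bokeh as a data source. The sketch below wraps a small DataFrame (the first three `displ` and `weight` values from the table above) in a `ColumnDataSource` and lists the resulting column names. It is only meant to illustrate that the DataFrame index becomes a column of its own, which for an unnamed index is called `index`.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# First three rows of the autompg table shown above
df = pd.DataFrame({
    'displ': [307.0, 350.0, 318.0],
    'weight': [3504, 3693, 3436],
})

# Wrapping a DataFrame copies each column and also adds the index as a column
source = ColumnDataSource(df)
column_names = sorted(source.data.keys())
```

This is why glyph methods that receive a DataFrame through `source=` can refer to `'index'` as a column name.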


In [3]:
import bokeh.sampledata
bokeh.sampledata.download() # one-time download of the sample data files

from bokeh.sampledata.stocks import GOOG as google
import pandas as pd

df_google = pd.DataFrame(google)
df_google['date'] = pd.to_datetime(df_google['date'])
df_google.head()

Creating /root/.bokeh directory
Creating /root/.bokeh/data directory
Using data directory: /root/.bokeh/data
Downloading: CGM.csv (1589982 bytes)
   1589982 [100.00%]
Downloading: US_Counties.zip (3171836 bytes)
   3171836 [100.00%]
Unpacking: US_Counties.csv
Downloading: us_cities.json (713565 bytes)
    713565 [100.00%]
Downloading: unemployment09.csv (253301 bytes)
    253301 [100.00%]
Downloading: AAPL.csv (166698 bytes)
    166698 [100.00%]
Downloading: FB.csv (9706 bytes)
      9706 [100.00%]
Downloading: GOOG.csv (113894 bytes)
    113894 [100.00%]
Downloading: IBM.csv (165625 bytes)
    165625 [100.00%]
Downloading: MSFT.csv (161614 bytes)
    161614 [100.00%]
Downloading: WPP2012_SA_DB03_POPULATION_QUINQUENNIAL.zip (4816256 bytes)
   4816256 [100.00%]
Unpacking: WPP2012_SA_DB03_POPULATION_QUINQUENNIAL.csv
Downloading: gapminder_fertility.csv (64346 bytes)
     64346 [100.00%]
Downloading: gapminder_population.csv (94509 bytes)
     94509 [100.00%]
Downloading: gapminder_life_e

|   | date | open | high | low | close | volume | adj_close |
|---|---|---|---|---|---|---|---|
| 0 | 2004-08-19 | 100.0 | 104.06 | 95.96 | 100.34 | 22351900 | 100.34 |
| 1 | 2004-08-20 | 101.01 | 109.08 | 100.5 | 108.31 | 11428600 | 108.31 |
| 2 | 2004-08-23 | 110.75 | 113.48 | 109.05 | 109.4 | 9137200 | 109.4 |
| 3 | 2004-08-24 | 111.24 | 111.6 | 103.57 | 104.87 | 7631300 | 104.87 |
| 4 | 2004-08-25 | 104.96 | 108.0 | 103.88 | 106.0 | 4598900 | 106.0 |


## 3. Scatter plots

* Plotting with Bokeh takes three steps:
  * Create a figure object with `figure()`
  * Add glyphs with `circle()`, `square()`, `cross()`, etc.
  * Render the plot with `show(fig_obj)`
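
The three steps can be sketched in a few lines. This version uses `scatter()` with a `marker` argument rather than the per-shape methods, an assumption made here because it behaves the same across Bokeh releases; the data points are placeholders.

```python
from bokeh.plotting import figure

# Step 1: create the figure object
fig = figure(title='Three-step sketch')

# Step 2: add a glyph (placeholder data points)
renderer = fig.scatter(x=[1, 2, 3, 4], y=[4, 2, 3, 1], marker='circle', size=10)

# Step 3: in a notebook, call show(fig) after output_notebook() to render it
```

Each glyph call returns a renderer and appends it to `fig.renderers`, so several glyphs can share one figure.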

In [4]:
fig = figure(
    plot_width=400,
    plot_height=400
)

fig.circle(
    x=df_autompg['displ'],
    y=df_autompg['weight']
)

show(fig)

In [5]:
fig = figure(
    plot_width=400,
    plot_height=400,
    title='Displacement vs Weight'
)

fig.circle(
    x=df_autompg['displ'],
    y=df_autompg['weight'],
    size=10,
    alpha=0.8,
    line_color='red',
    fill_color='skyblue'
)

fig.xaxis.axis_label='Displacement'
fig.yaxis.axis_label='Weight'

show(fig)

In [6]:
import numpy as np

square_sizes = np.random.randint(1, 25, size=(df_autompg.shape[0]))

fig = figure(
    plot_width=400,
    plot_height=400,
    title='Displacement vs Weight'
)

fig.square(
    x=df_autompg['displ'],
    y=df_autompg['weight'],
    size=square_sizes,
    alpha=0.5,
    line_color='orange',
    fill_color='orange'
)

fig.xaxis.axis_label='Displacement'
fig.yaxis.axis_label='Weight'

show(fig)

## 4. Line plots

In [7]:
fig = figure(
    plot_width=600,
    plot_height=300,
    title='Sample Line Plot'
)

fig.line(
    x=range(10),
    y=np.random.randint(1, 50, 10)
)

show(fig)

In [8]:
fig = figure(
    plot_width=600,
    plot_height=400,
    title='Sample Dashed Dotted Line Plot'
)

fig.line(
    x=range(10),
    y=np.random.randint(1, 50, 10),
    line_width=3,
    line_color='lime',
    line_dash='dotdash'
)

show(fig)

In [9]:
fig = figure(
    plot_width=700,
    plot_height=400,
    x_axis_type='datetime',
    title="Google Stock Prices from 2004 - 2013"
)

fig.line(
    x=df_google.date,
    y=df_google.close,
    line_width=3,
    line_color='tomato'
)

fig.xaxis.axis_label='Time'
fig.yaxis.axis_label='Price ($)'

fig.xgrid.grid_line_color=None
fig.ygrid.grid_line_alpha=1.0

show(fig)

In [10]:
fig = figure(
    plot_width=700,
    plot_height=400,
    title="Sample Step Chart"
)

fig.step(
    x=range(10),
    y=np.random.randint(1, 10, size=10),
    line_color='red',
    line_width=2
)

show(fig)

In [11]:
fig = figure(
    plot_width=700,
    plot_height=400,
    title='Sample Multi Line Chart'
)

fig.multi_line(
    [list(range(5)), list(range(5, 10))], # line Xs
    [np.random.randint(1, 25, size=5), np.random.randint(25, 50, size=5)], # line Ys
    color=['firebrick', 'navy'],
    alpha=[0.6, 0.5],
    line_width=4
)

show(fig)

## 5. Bar charts

In [12]:
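# numeric_only=True skips the text columns (name, mfr); groups are sorted alphabetically
autompg_avg_by_origin = df_autompg.groupby(by='origin').mean(numeric_only=True)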

fig = figure(
    plot_width=300,
    plot_height=300,
    title='Average mpg per region'
)

fig.vbar(
    x=[1, 2, 3],
    width=0.5,
    top=autompg_avg_by_origin.mpg, # bar height
    fill_color='firebrick',
    line_color='blue',
    alpha=0.8
)

fig.xaxis.axis_label='Region'
fig.yaxis.axis_label='MPG'

fig.xaxis.ticker=[1, 2, 3]
fig.xaxis.major_label_overrides = { # relabel the ticks; groupby sorts the regions alphabetically
    1: 'Asia',
    2: 'Europe',
    3: 'North America'
}

show(fig)

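
Relabeling numeric ticks works, but it relies on the labels being listed in the same order as the grouped rows. An alternative sketch, assuming a categorical axis is acceptable, passes the region names as `x_range` so that each bar is placed by name. The averages below are placeholder values, not the real group means.

```python
import pandas as pd
from bokeh.plotting import figure

# Placeholder averages for illustration only (not computed from autompg)
avg = pd.DataFrame({
    'origin': ['Asia', 'Europe', 'North America'],
    'mpg': [30.0, 27.0, 20.0],
})

regions = avg['origin'].tolist()

# A list of strings as x_range makes the x-axis categorical
fig = figure(x_range=regions, title='Average mpg per region')

fig.vbar(
    x=regions,
    top=avg['mpg'].tolist(),
    width=0.5,
    fill_color='firebrick',
    line_color='blue',
    alpha=0.8
)

fig.y_range.start = 0  # let the bars start at the axis
```

With the real data, `avg` would be `autompg_avg_by_origin.reset_index()`, and no tick overrides are needed.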
In [13]:
fig = figure(
    plot_width=400,
    plot_height=300,
    title='Average mpg per region'
)

fig.hbar(
    y=[1, 2, 3],
    height=0.5,
    right=autompg_avg_by_origin.mpg, # length of each horizontal bar
    fill_color='skyblue',
    line_color='red'
)

fig.xaxis.axis_label='MPG'
fig.yaxis.axis_label='Region'

fig.yaxis.ticker=[1, 2, 3]
fig.yaxis.major_label_overrides = {
    1: 'Asia',
    2: 'Europe',
    3: 'North America'
}

show(fig)

In [14]:
from bokeh.models import Legend

fig = figure(
    plot_width=500,
    plot_height=400,
    title='Average mpg, accel per region'
)

v = fig.vbar_stack(
    ['mpg', 'accel'], # DataFrame columns to stack vertically
    x='index',
    width=0.6,
    color=('lime', 'tomato'),
    alpha=0.7,
    source=autompg_avg_by_origin.reset_index(), # pass the DataFrame as the data source
)

fig.xaxis.axis_label='Region'
fig.yaxis.axis_label='Average mpg/accel'

fig.xaxis.ticker=[0, 1, 2]
fig.xaxis.major_label_overrides = {
    0: 'Asia',
    1: 'Europe',
    2: 'North America'
}

legend = Legend( # build the legend explicitly
    items=[
        ('mpg', [v[0]]),
        ('accel', [v[1]]),
    ],
    location=(0, -30)
)

fig.add_layout(legend, 'right') # then attach the legend to the figure

show(fig)

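
Building a `Legend` by hand gives full control over placement. When the default in-plot legend is enough, a shorter sketch is to pass one `legend_label` per stacked column. The data below are placeholder values, and the categorical `x_range` is an assumption made to keep the example self-contained.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Placeholder averages for illustration only (not computed from autompg)
data = pd.DataFrame({
    'origin': ['Asia', 'Europe', 'North America'],
    'mpg': [30.0, 27.0, 20.0],
    'accel': [16.0, 16.5, 15.0],
})

stackers = ['mpg', 'accel']

fig = figure(
    x_range=data['origin'].tolist(),
    title='Average mpg, accel per region'
)

# One renderer per stacked column; list-valued arguments are split per column
renderers = fig.vbar_stack(
    stackers,
    x='origin',
    width=0.6,
    color=('lime', 'tomato'),
    alpha=0.7,
    source=ColumnDataSource(data),
    legend_label=stackers
)

fig.legend.location = 'top_right'
```

The trade-off is that this legend sits inside the plot area, whereas the explicit `Legend` above is placed to the right of it.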
In [15]:
fig = figure(
    plot_width=600,
    plot_height=400,
    title='Average mpg, accel per region'
)

h = fig.hbar_stack(
    ['mpg', 'accel'],
    y='index',
    height=0.6,
    color=('blue', 'green'),
    alpha=0.5,
    source=autompg_avg_by_origin.reset_index(),
)

fig.xaxis.axis_label='Average mpg/accel'
fig.yaxis.axis_label='Region'

fig.yaxis.ticker=[0, 1, 2]
fig.yaxis.major_label_overrides = {
    0: 'Asia',
    1: 'Europe',
    2: 'North America'
}

legend = Legend(
    items=[
        ('mpg', [h[0]]),
        ('accel', [h[1]])
    ],
    location=(0, -30)
)

fig.add_layout(legend, 'right')

show(fig)

## 6. Rectangles

In [16]:
fig = figure(
    plot_width=400,
    plot_height=400,
    title='Sample rectangle glyph chart'
)

fig.rect( # each point is drawn as a rectangle centered at (x, y)
    x=range(5),
    y=np.random.randint(1, 25, size=5),
    width=0.2,
    height=1,
    angle=np.pi / 4, # angles are in radians by default; this is 45 degrees
    color='lawngreen'
)

show(fig)

In [17]:
fig = figure(
    plot_width=400,
    plot_height=400,
    title='Sample quads glyph chart'
)

fig.quad( # specify the top, bottom, left and right coordinates of each box
    top=[1.5, 2.5, 3.5],
    bottom=[1, 2, 3],
    left=[1, 2, 3],
    right=[1.5, 2.5, 3.5],
    color='skyblue'
)

show(fig)

## 7. Areas

In [18]:
fig = figure(
    plot_width=400,
    plot_height=400,
    title='Sample area chart'
)

fig.varea( # fills the region between y1 and y2 along x
    x=[1, 2, 3],
    y1=autompg_avg_by_origin.accel,
    y2=autompg_avg_by_origin.mpg,
    fill_color='cyan',
    alpha=0.5
)

show(fig)

In [21]:
fig = figure(
    plot_width=400,
    plot_height=400,
    title='Sample area chart'
)

fig.harea( # fills the region between x1 and x2 along y (the horizontal counterpart of varea)
    y=[1, 2, 3],
    x1=autompg_avg_by_origin.accel,
    x2=autompg_avg_by_origin.mpg,
    fill_color='orangered',
    alpha=0.5
)

show(fig)

In [23]:
fig = figure(
    plot_width=400,
    plot_height=400,
    title='Sample stacked area chart'
)

fig.varea_stack( # fills one stacked area per column, each in its own color
    ['accel', 'mpg'],
    x='index',
    color=('deeppink', 'pink'),
    alpha=0.5,
    source=autompg_avg_by_origin.reset_index()
)

show(fig)

## 8. Patches

In [24]:
fig = figure(
    plot_width=400,
    plot_height=300,
    title='Sample polygon chart'
)

fig.patch( # fills the area enclosed by the polygon
    [1, 2, 3, 4],
    [6, 8, 8, 7],
    alpha=0.5,
    line_width=2,
    line_color='black'
)

show(fig)

In [29]:
fig = figure(
    plot_width=400,
    plot_height=300,
    title='Sample multiple polygon chart'
)

fig.patches( # fills several polygons at once, one per inner list
    [[1, 2, 2], [2, 1, 1]],
    [[2, 3, 4], [2, 4, 5]],
    color=['lavender', 'violet'],
    line_width=2,
    line_color='black'
)

show(fig)

In [30]:
fig = figure(
    plot_width=300,
    plot_height=300,
    title='Sample multiple polygon chart'
)

fig.multi_polygons( # one polygon: the first ring is the exterior, the remaining rings are holes
    xs=[[[  [0, 3, 3, 0], [1, 2, 2], [2, 1, 1] ]]],
    ys=[[[  [1, 1, 5, 5], [2, 3, 4], [2, 4, 5] ]]],
    color='red',
    alpha=0.6
)

show(fig)

## 9. Combining multiple charts

In [31]:
fig = figure(
    plot_width=600,
    plot_height=300,
    title='Sample merge plot'
)

y = np.random.randint(1, 50, 10)

fig.line(
    x=range(10),
    y=y,
    line_width=2
)

fig.circle(
    x=range(10),
    y=y,
    color='red',
    size=10
)

show(fig)

In [32]:
fig = figure(
    plot_width=400,
    plot_height=400,
    title='Displacement vs Weight color-encoded by Origin'
)

fig.circle(
    x=df_autompg[df_autompg['origin']=='North America']['displ'],
    y=df_autompg[df_autompg['origin']=='North America']['weight'],
    fill_color='tomato',
    size=12,
    alpha=0.7,
    legend_label='North America'
)

fig.diamond(
    x=df_autompg[df_autompg['origin']=='Asia']['displ'],
    y=df_autompg[df_autompg['origin']=='Asia']['weight'],
    fill_color='lawngreen',
    size=14,
    alpha=0.5,
    legend_label='Asia'
)

fig.square(
    x=df_autompg[df_autompg['origin']=='Europe']['displ'],
    y=df_autompg[df_autompg['origin']=='Europe']['weight'],
    fill_color='skyblue',
    size=10,
    alpha=0.5,
    legend_label='Europe'
)

fig.xaxis.axis_label='Displacement'
fig.yaxis.axis_label='Weight'

fig.legend.location='top_left'

show(fig)

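
The three glyph calls above differ only in the region filter, the marker and the color, so they can also be written as a loop. This is a sketch under a few assumptions: it uses `scatter()` with a `marker` argument instead of the per-shape methods, a single size for all markers, and a small placeholder DataFrame named `cars` in place of `df_autompg`.

```python
import pandas as pd
from bokeh.plotting import figure

# Placeholder rows for illustration only; use df_autompg in the notebook
cars = pd.DataFrame({
    'origin': ['North America', 'North America', 'Asia', 'Asia', 'Europe', 'Europe'],
    'displ': [307.0, 350.0, 97.0, 113.0, 107.0, 121.0],
    'weight': [3504, 3693, 2130, 2372, 2430, 2234],
})

# Marker and fill color per region
styles = {
    'North America': ('circle', 'tomato'),
    'Asia': ('diamond', 'lawngreen'),
    'Europe': ('square', 'skyblue'),
}

fig = figure(title='Displacement vs Weight color-encoded by Origin')

for origin, (marker, color) in styles.items():
    subset = cars[cars['origin'] == origin]
    fig.scatter(
        x=subset['displ'].tolist(),
        y=subset['weight'].tolist(),
        marker=marker,
        fill_color=color,
        size=12,
        alpha=0.6,
        legend_label=origin  # one legend entry per region
    )

fig.legend.location = 'top_left'
```

Adding a region then only requires a new entry in `styles`.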
In [34]:
fig = figure(
    plot_width=400,
    plot_height=400,
    title='MPG and accel bar chart'
)

fig.vbar(
    x=[1, 3, 5],
    width=0.8,
    top=autompg_avg_by_origin.mpg,
    fill_color='lime',
    line_color='lime',
    legend_label='MPG'
)

fig.vbar(
    x=[2, 4, 6],
    width=0.8,
    top=autompg_avg_by_origin.accel,
    fill_color='tomato',
    line_color='tomato',
    legend_label='accel'
)

fig.xaxis.axis_label='Origin'
fig.yaxis.axis_label='MPG/Acceleration'
fig.legend.location='top_right'

fig.xaxis.ticker=[1.5, 3.5, 5.5]
fig.xaxis.major_label_overrides={ # groupby sorts the regions alphabetically
    1.5: 'Asia',
    3.5: 'Europe',
    5.5: 'North America'
}

show(fig)

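
Placing the two bar series at hand-picked x positions works for three regions, but the offsets have to be recomputed whenever a group is added. A possible alternative, sketched here with placeholder averages, is a categorical axis combined with `bokeh.transform.dodge`, which shifts each series sideways within its category.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.transform import dodge

# Placeholder averages for illustration only (not computed from autompg)
avg = pd.DataFrame({
    'origin': ['Asia', 'Europe', 'North America'],
    'mpg': [30.0, 27.0, 20.0],
    'accel': [16.0, 16.5, 15.0],
})
source = ColumnDataSource(avg)

fig = figure(x_range=avg['origin'].tolist(), title='MPG and accel bar chart')

# Shift each series left or right of the category center
fig.vbar(
    x=dodge('origin', -0.2, range=fig.x_range),
    top='mpg',
    width=0.35,
    source=source,
    color='lime',
    legend_label='MPG'
)

fig.vbar(
    x=dodge('origin', 0.2, range=fig.x_range),
    top='accel',
    width=0.35,
    source=source,
    color='tomato',
    legend_label='accel'
)

fig.y_range.start = 0
fig.legend.location = 'top_right'
```

The tick labels then come straight from the `origin` column, so no `ticker` or `major_label_overrides` settings are required.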