# Bokeh - Introduction

Bokeh is a general-purpose visualisation library that is easy to use and focuses on interactivity. 
The main resource for learning Bokeh is the official website at https://bokeh.pydata.org. 

## Contents

* [Installation](#Installation)
* [Basic line plot](#Basic-line-plot)
* [Basic circle plot](#Basic-circle-plot)
* [Step plot](#Step-plot)
* [Multiple markers](#Multiple-markers)
* [Shapes](#Shapes)
* [Importing data](#Importing-data)
* [Customising axes](#Customising-axes)


## Installation

### Prerequisites:

* Anaconda Python 3.6 version (download from https://www.anaconda.com/download)
* Jupyter Notebook (comes with Anaconda)
* NumPy (also comes with Anaconda)

I also assume that you can find your way around Jupyter Notebook and that you know some basic NumPy concepts. I have written a [short tutorial on NumPy here](../numpy_tutorial/NumPy.ipynb). I will also use pandas when convenient (e.g. `read_csv`). 

To install Bokeh run the following on the command line. 
```
pip3 install bokeh --upgrade
```

In [1]:
import numpy as np

from bokeh.io import output_notebook, show
from bokeh.plotting import figure

output_notebook()

The command `output_notebook()` configures Bokeh so that figures are displayed inline in the notebook. 
If you want your figure to be saved as an HTML file instead, import and call `output_file()` in place of `output_notebook()`. For example, to save the figure as `output.html`, use
```
from bokeh.io import output_file
output_file('output.html')
```
It is also possible to run Bokeh from the command line, but we will not be discussing this method since it is much more convenient to use Jupyter notebook. [You can read more about it here](https://bokeh.pydata.org/en/latest/docs/user_guide/concepts.html#output-methods).

#### Aside: 

Bokeh is composed of two parts: the Bokeh Python library and BokehJS. We use the Python library to produce JSON data that represents a figure, which is later rendered into HTML by BokehJS. The Python library itself has two main components: `bokeh.plotting` and `bokeh.models`. The former is the main tool that we are going to use, while the latter is a low-level interface that you can use to further customise your visualisation.
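To make the JSON step a little more concrete, here is a minimal sketch that asks Bokeh for the JSON description of a figure. It assumes a Bokeh release that provides `bokeh.embed.json_item` (older releases may not have it); the names `demo_plot`, `item` and `payload` are mine and exist only for illustration.

```python
import json

from bokeh.embed import json_item
from bokeh.plotting import figure

# A tiny figure, used only to inspect what the Python library emits.
demo_plot = figure(title="JSON demo")
demo_plot.line([1, 2, 3], [1, 4, 9])

# json_item() returns a plain dict describing the figure
# (assumption: your Bokeh version ships bokeh.embed.json_item).
item = json_item(demo_plot)
print(sorted(item.keys()))

# The dict can be serialised to text, which is what a web page
# running BokehJS would receive.
payload = json.dumps(item)
print(len(payload))
```

You do not need this for the rest of the tutorial, since `show()` takes care of the whole pipeline.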

## Basic line plot

Main reference: https://bokeh.pydata.org/en/latest/docs/reference/plotting.html#bokeh.plotting.figure.Figure.line

To draw something using Bokeh, we must first create a figure. Once a figure is created, we are free to add as many plots as we like on it, even of different kinds. We start with the basic line plot:

In [2]:
x = np.arange(0.8,1.2,0.01)
def square(x):
    return x**2
def cube(x):
    return x**3

line_plot = figure(title="Basic line plot",plot_height=300, plot_width=500)
line_plot.line(x,x,line_color='green')
line_plot.line(x,square(x),line_color='cyan')
line_plot.line(x,cube(x),line_color='orange')
show(line_plot)

Bokeh functions accept keyword arguments (kwargs), as you have seen above. Here are the more commonly used ones. The value inside the square brackets is either a type (e.g. Int or Float) or a list of possible String values. 
```
line_alpha=[Float between 0.0 and 1.0]
line_color=[Color]
line_width=[Float]
line_cap=['butt','round','square']
line_dash=['solid','dashed','dotted','dotdash','dashdot']
```
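As a rough sketch of how these properties combine, the following draws two lines with different widths, transparency and dash styles. The figure and variable names are made up for this example, and I leave out the size arguments so that the sketch does not depend on a particular Bokeh version.

```python
import numpy as np
from bokeh.models import GlyphRenderer
from bokeh.plotting import figure

x = np.arange(0.8, 1.2, 0.01)

styled_plot = figure(title="Line properties")
# A thick, semi-transparent dashed line with rounded ends.
dashed = styled_plot.line(x, x**2, line_color='cyan', line_width=3,
                          line_alpha=0.6, line_dash='dashed', line_cap='round')
# A thin dotted line for comparison.
dotted = styled_plot.line(x, x**3, line_color='orange', line_width=1,
                          line_dash='dotted')
# In the notebook, display the result with show(styled_plot).
```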

See also [`multi_line()`](https://bokeh.pydata.org/en/latest/docs/reference/plotting.html#bokeh.plotting.figure.Figure.multi_line) for drawing multiple lines in a single call. This can be convenient, but drawing each line separately makes it easier to customise them individually (e.g. different interactivity). 
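For comparison, here is a sketch of the three lines from the basic line plot drawn with a single `multi_line()` call. The names `multi_plot` and `lines` are only illustrative.

```python
import numpy as np
from bokeh.models import GlyphRenderer
from bokeh.plotting import figure

x = np.arange(0.8, 1.2, 0.01)

multi_plot = figure(title="multi_line example")
# xs and ys are lists of sequences, one entry per line. A list-valued
# styling argument is matched to the lines in order; a scalar applies to all.
lines = multi_plot.multi_line(
    xs=[x, x, x],
    ys=[x, x**2, x**3],
    line_color=['green', 'cyan', 'orange'],
    line_width=2,
)
# In the notebook, display the result with show(multi_plot).
```

Note that the call returns one renderer for all three lines, which is why per-line customisation is harder this way.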

## Basic circle plot

Main reference: https://bokeh.pydata.org/en/latest/docs/reference/plotting.html#bokeh.plotting.figure.Figure.circle

In [3]:
circle_x = np.arange(10)
circle_y = np.random.rand(10)*20
circle_plot = figure(title='Basic circle plot',plot_height=300, plot_width=500)
circle_plot.circle(circle_x,circle_y,size=circle_y*circle_x,fill_color='orange',fill_alpha=0.3,line_color='orange')
show(circle_plot)

The main keyword arguments are
```
fill_color=[Color]
fill_alpha=[Float between 0.0 and 1.0]
radius=[Float]
size=[Float]
```
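The difference between `radius` and `size` is easy to miss: `radius` is measured in data units, while `size` is measured in screen units (pixels). The sketch below draws one row of each so you can compare them when zooming. I use `scatter()` for the `size` row, which draws circle markers by default; the names are made up for this example.

```python
from bokeh.models import GlyphRenderer
from bokeh.plotting import figure

units_plot = figure(title="radius vs size")
xs = [1, 2, 3]
# radius is in data units: these circles scale with the axes when zooming.
by_radius = units_plot.circle(xs, [1, 1, 1], radius=0.2,
                              fill_color='orange', fill_alpha=0.3,
                              line_color='orange')
# size is in screen units: these markers keep their size when zooming.
by_size = units_plot.scatter(xs, [2, 2, 2], size=20,
                             fill_color='purple', fill_alpha=0.3,
                             line_color='purple')
# In the notebook, display the result with show(units_plot) and try zooming.
```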

## Step plot

Main reference: https://bokeh.pydata.org/en/latest/docs/reference/plotting.html#bokeh.plotting.figure.Figure.step

In [4]:
step_x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
step_y = [10, 4, 6, 1, 5, 7, 1, 4, 6, 7]
step_plot = figure(title='Step plot', plot_height=300, plot_width=500)
step_plot.step(step_x,step_y,line_color='orange',line_width=2,mode='after',line_alpha=0.6)
show(step_plot)

The keyword arguments for line properties listed above apply to step plots as well. The other main keyword argument is
```
mode=['before','after','center']
```
denoting where the step level should be drawn in relation to the x and y coordinates.
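To see what each mode does, the following sketch draws the same small data set once per mode, with the data points marked. As I understand the reference, `'before'` places each step level before its x coordinate, `'after'` places it after, and `'center'` centres it on the x coordinate. The data and names here are invented for the example.

```python
from bokeh.plotting import figure

mode_x = [1, 2, 3, 4, 5]
mode_y = [2, 5, 3, 6, 4]

mode_plot = figure(title="Step modes")
# Mark the data points so the position of each step is visible.
mode_plot.scatter(mode_x, mode_y, size=8, color='black')

# Draw the same data once per mode, in a different colour each time.
steps = {}
for mode, colour in [('before', 'green'), ('center', 'orange'), ('after', 'purple')]:
    steps[mode] = mode_plot.step(mode_x, mode_y, mode=mode,
                                 line_color=colour, line_width=2)
# In the notebook, display the result with show(mode_plot).
```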

## Multiple markers

As mentioned above, it is possible to combine multiple plots in one figure, and there are quite a few shapes to choose from. The next figure displays some of the more interesting ones: [segment](https://bokeh.pydata.org/en/latest/docs/reference/models/glyphs/segment.html), [annulus](https://bokeh.pydata.org/en/latest/docs/reference/models/glyphs/annulus.html), and 
[hex](https://bokeh.pydata.org/en/latest/docs/reference/models/markers/hex.html) (included mostly because of [hexbin](https://bokeh.pydata.org/en/latest/docs/gallery/hexbin.html)).
For the segment plot, notice how we can pass an array of values for the keyword arguments. 


Apart from these, there are basic shapes such as 
[rectangle](https://bokeh.pydata.org/en/latest/docs/reference/models/glyphs/rect.html), 
[triangle](https://bokeh.pydata.org/en/latest/docs/reference/models/markers/triangle.html),
[square](https://bokeh.pydata.org/en/latest/docs/reference/models/markers/square.html),
[ellipse](https://bokeh.pydata.org/en/latest/docs/reference/models/glyphs/ellipse.html), and 
[diamond](https://bokeh.pydata.org/en/latest/docs/reference/models/markers/diamond.html). 
For other possible shapes, refer to the list in the [Bokeh plotting documentation](https://bokeh.pydata.org/en/latest/docs/reference/plotting.html#bokeh-plotting).

In [5]:
mult_plot = figure(title='Multiple markers', plot_height=400,plot_width=640)

# annulus(x, y, inner_radius, outer_radius): the radii are in data units
annulus_x = np.linspace(1, 20, 15)
annulus_y = np.random.rand(15) * 20
mult_plot.annulus(annulus_x,annulus_y,0.2,0.5,fill_color='orange',fill_alpha=0.2,line_color='orange')

# hex(x, y, size): here the marker size depends on the y value
hex_x = np.linspace(1, 20, 100)
hex_y = np.random.rand(100) * 20
mult_plot.hex(hex_x,hex_y,20-hex_y,fill_color='red',fill_alpha=0.1,line_color='red',line_alpha=0.1)

segment_x0 = [5,7.5,10]
segment_y0 = [5,10,15]
segment_x1 = [5,7.5,15]
segment_y1 = [10,15,15]
mult_plot.segment(segment_x0,segment_y0,
                  segment_x1,segment_y1,
                  line_width=[12,7,9],
                  line_color='lavender',
                  line_alpha=[0.6,0.7,0.9])

show(mult_plot)

## Shapes

There are two main methods for drawing larger shapes: [quad](https://bokeh.pydata.org/en/latest/docs/reference/models/glyphs/quad.html) and [patch](https://bokeh.pydata.org/en/latest/docs/reference/models/glyphs/patch.html#bokeh.models.glyphs.Patch). Quad is used for standard rectangular shapes, defined by their left, right, top and bottom edges, while patch is more general in that it can draw almost any shape.

In [6]:
quad_plot = figure(title='Quads', plot_height=400,plot_width=640)

quad_plot.quad(
        bottom=[1,3],
        top=[5,5],
        left=[2,5],
        right=[4,8],
      
        fill_color=['orange','blue'],
        fill_alpha=[0.3,0.4],
        line_color=['orange','blue'],
        line_alpha=[0.6,0.8])

show(quad_plot)

In [59]:
patch_plot = figure(title='Patch', plot_height=400,plot_width=640)

patch_plot.patch(
        [1,2,3,4,5,4,3,2,1],
        [3,4,4,6,5,3,1,1,2],
        fill_color='orange',
        fill_alpha=0.3,
        line_color='orange',
        line_alpha=0.6)

show(patch_plot)

## Importing data

For CSV files, I find it easier to use pandas rather than NumPy since `read_csv()` is very powerful. You can pass the columns of a pandas DataFrame directly as input for your plots, since each column is backed by a NumPy array. 

In [60]:
import pandas as pd
death_data = pd.read_csv('../data/deaths_homicide_suicide.csv')

death_plot = figure(title="Homicide and suicide rate, 1915-2004",plot_height=300, plot_width=900)
death_plot.line(death_data['Year'],death_data['Homicide'],line_color='orange',legend='Homicide')
death_plot.line(death_data['Year'],death_data['Suicide'],line_color='purple',legend='Suicide')
show(death_plot)
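If you do not have the CSV file at hand, the same pattern can be tried on an in-memory CSV. The numbers below are made-up placeholders, not real statistics; only the column names mirror the cell above. This sketch also assumes a Bokeh version that supports the `legend_label` keyword (1.4 or later, as far as I know), which newer releases use in place of `legend`.

```python
import io

import pandas as pd
from bokeh.plotting import figure

# Made-up placeholder values, NOT real statistics.
csv_text = """Year,Homicide,Suicide
2000,1.0,2.0
2001,1.5,2.5
2002,2.0,3.0
"""
sample = pd.read_csv(io.StringIO(csv_text))

sample_plot = figure(title="Placeholder data")
# Each DataFrame column is a pandas Series, which Bokeh accepts directly.
homicide_line = sample_plot.line(sample['Year'], sample['Homicide'],
                                 line_color='orange', legend_label='Homicide')
suicide_line = sample_plot.line(sample['Year'], sample['Suicide'],
                                line_color='purple', legend_label='Suicide')
# In the notebook, display the result with show(sample_plot).
```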

## Customising axes

In the previous examples we customised a plot using keyword arguments, i.e. upon creation. You can also modify a figure after it has been created by setting properties on it. Here I list the more common ones. You can see the list of available properties [here](https://bokeh.pydata.org/en/latest/docs/reference/models/axes.html).

Some customisations require importing from the `bokeh.models` library.

In [114]:
square_x = np.arange(10)
square_y = np.random.rand(10)*20
square_plot = figure(title='Square plot',plot_height=300, plot_width=500, x_range=(0,12), y_range=(0,20))
square_plot.square(square_x,square_y,size=6*square_x,fill_color='orange',fill_alpha=0.3,line_color='orange')
show(square_plot)

In [116]:
square_plot.yaxis.axis_label = 'Left Text'
square_plot.xaxis.axis_label = 'Bottom Text'

from bokeh.models import Range1d
square_plot.x_range = Range1d(0, 40)
square_plot.y_range = Range1d(0, 80)

square_plot.xaxis.axis_label_text_color = 'black'
square_plot.yaxis.axis_label_text_color = 'black'

square_plot.yaxis.major_label_text_color = 'red'

square_plot.yaxis.minor_tick_in = 0


show(square_plot)
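As one more illustration of reaching into `bokeh.models`, here is a sketch that fixes the x tick positions and changes the y tick label format. `NumeralTickFormatter` and the `ticker` property are not used elsewhere in this tutorial, so treat this as an optional extra; the figure name and data are invented for the example.

```python
from bokeh.models import NumeralTickFormatter
from bokeh.plotting import figure

axis_plot = figure(title="Axis formatting", x_range=(0, 12), y_range=(0, 20))
axis_plot.scatter([2, 4, 6, 8], [5, 10, 15, 12], marker='square', size=12,
                  fill_color='orange', fill_alpha=0.3, line_color='orange')

# Place x ticks only at the listed positions.
axis_plot.xaxis.ticker = [0, 4, 8, 12]
# Show y tick labels with one decimal place.
axis_plot.yaxis.formatter = NumeralTickFormatter(format='0.0')
# In the notebook, display the result with show(axis_plot).
```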