# Intro to Bokeh
Bokeh is an interactive plotting library that offers functionality similar to Plotly.
In this micro-course we will use Bokeh to make interactive plots. Many of the examples and instructions are taken directly from the [Bokeh quickstart notebook](https://nbviewer.jupyter.org/github/bokeh/bokeh-notebooks/blob/master/quickstart/quickstart.ipynb). 

### Importing Bokeh
A simple import statement brings in the bokeh library. To check which version of bokeh you are using, run the following. 

```python 
import bokeh
bokeh.__version__
```
To start plotting with bokeh you will need the following imports 

```python 
from bokeh.plotting import figure 
from bokeh.io import output_notebook, show
```

The first import gives you the function 'figure', which creates the figure object that plots are drawn on. The second gives you 'show', which displays a figure, and 'output_notebook'.
To display plots inside a jupyter notebook you also have to run  

```python 
output_notebook() 
```
Running that in one of the cells will produce the bokeh loaded symbol, which looks like this - 

![bokeh_loaded](images/bokeh_loaded_symbol.jpg)

In [13]:
# Import bokeh library
from bokeh.plotting import figure 
from bokeh.io import output_notebook, show
output_notebook() 

Once bokeh has loaded you can start plotting. 

We will start with some simple examples- 

1) Scatter plots and line plots <br>
2) plot titles and axes labels <br>
3) Plotting over another plot <br>
4) Hover over data points <br>

### Scatter and line plots 

For this we will use an example similar to one found in the bokeh documentation.

We will plot a cosine wave as a scatter plot and as a line plot, and later combine the two. 
You have to run the code in a cell to see how each plot looks. 

The inputs for the plot will be- 

```python 
import numpy as np 
x = np.linspace(-10, 10, 100)
y = np.cos(x)
```

Copy, paste and run this in a cell- 

```python 
# scatter plot 
p = figure(width=500, height=500)
p.circle(x, y, size=7, color="green")
show(p)
```
You should see a scatter plot like this- 

![scatter_plot](images/cos_plot.jpg)

You should see a grayed out panel to the right side of the image with various icons. Hover over them to see what each one does. You will have access to different types of zoom, panning, and the ability to save the image. The bokeh symbol at the top of the panel will take you to the main bokeh website. If you zoom in too far you can reset the view from this panel as well. 
 
Let us go over the code more carefully. The first line of the code creates a figure object

```python 
# scatter plot 
p = figure(width=500, height=500)
```

Here 'p' is a figure object whose height and width are both 500 pixels. 

In the second line we have
```python 
p.circle(x, y, size=7, color="green")
```

Here we tell the figure object to draw circle markers. The arguments of 'p.circle' are the x and y locations on the canvas where we want to draw the circles, the size of each circle (7 in this case) and the color of each circle (green).
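The marker arguments do not have to be single values. As a small sketch (the 'sizes' array is made up for illustration, and 'p.scatter' is an assumed stand-in for 'p.circle', since it draws circle markers by default), passing one size per point makes the markers grow along the curve:

```python
import numpy as np
from bokeh.plotting import figure

x = np.linspace(-10, 10, 100)
y = np.cos(x)

# Hypothetical per-point sizes: one marker size for each (x, y) pair
sizes = np.linspace(3, 15, len(x))

p = figure(width=500, height=500)
# scatter draws circle markers by default; size may be a scalar or an array
r = p.scatter(x, y, size=sizes, color="green")
```

Calling 'show(p)' afterwards displays the figure, as in the example above.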


Let us now plot the same example as a line plot. Can you guess how the code would differ? 

**Exercise** 

Copy the code below and replace the question mark with the name of the function you expect to draw a line plot. 

```python 
# line plot 
p = figure(width=500, height=500)
p.?(x, y, color="green")
show(p)
```

In [9]:
# Write your code below:


**Solution**

The solution is rather trivial. We just substitute "?" with "line".

Apart from the arguments that we have given to the line and scatter (circle) plots, we can pass other arguments to modify them. For example, we can set the line thickness of the plot with the argument **"line_width"**. We can also add **"alpha"** to make plots translucent.
```python
p = figure(width=500, height=500)
p.line(x, y, color="green", line_width=5, alpha=0.2)
show(p)
```

**Exercise** 

Use the same x and y to draw a scatter plot, set circle size to 5, color to red, and alpha to 0.4.

In [None]:
# Write your code below:
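
One possible solution, given as a sketch rather than the only answer. It uses 'p.scatter', which draws circle markers by default; that choice is an assumption, and 'p.circle' with the same arguments matches the earlier code in this tutorial.

```python
import numpy as np
from bokeh.plotting import figure

x = np.linspace(-10, 10, 100)
y = np.cos(x)

p = figure(width=500, height=500)
# size=5 sets the marker size, alpha=0.4 makes the markers translucent
r = p.scatter(x, y, size=5, color="red", alpha=0.4)
```

Finish with 'show(p)' to display the plot.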


### Plot titles and axes labels  
#### Plot title 
It's not enough just to plot data; it is also important to label the plot so that you can identify what it represents. In bokeh you can set a title for the plot and modify the axes. To set the title for a line plot, we can do the following- 

```python 
# line plot 
p = figure(width=500, height=500, title="line plot of cosine function")
p.line(x, y, color="green", line_width=5)
show(p)
```
the output should look like this- 

![title](images/line_plot_title.jpg)
 

**Exercise**: 

Add 
```python
p.title.align="center"
```
to the above code, then use: 
```python
show(p)
```
What happens to the title? 

In [None]:
# Write your code below:


**Solution**

You should see that the title aligns to the center of the plot. 
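
The title has other properties besides 'align'. The sketch below sets a few of them; the font size and color values are arbitrary choices for illustration.

```python
import numpy as np
from bokeh.plotting import figure

x = np.linspace(-10, 10, 100)
y = np.cos(x)

p = figure(width=500, height=500, title="line plot of cosine function")
p.line(x, y, color="green", line_width=5)

# p.title is a Title object; its properties can be set one by one
p.title.align = "center"
p.title.text_font_size = "16pt"
p.title.text_color = "navy"
```

As before, 'show(p)' renders the result.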

#### Plot axes 

In most cases it is also important to label your plot axes, and bokeh offers a simple way of doing this. Suppose you want to name the x axis 'time' and the y axis 'cosine over time'. You would write- 

```python 
# line plot 
p = figure(width=500, height=500, title="line plot of cosine function")
p.line(x, y, color="red", line_width=5)
p.xaxis.axis_label= "time" 
p.yaxis.axis_label ="cosine over time" 
show(p)
```
you should see this- 

![axis_label](images/axis_label.jpg)

The code above adds two new lines, and they are very similar. The line 

```python 
p.xaxis.axis_label= "time" 
```

sets the x-axis label to the string "time". What is important to note here is that you can modify many of the properties of the x axis by accessing attributes of 

```python 
p.xaxis  
```

**'axis_label'** happens to be one such property. For example, you can also change the font size of the axis label by doing- 

```python 
p.xaxis.axis_label_text_font_size="15pt"
```
where the x-axis property **'axis_label_text_font_size'** sets the font size of the label.

A full list of axis properties and functions can be found in the [bokeh.models.axes documentation](https://bokeh.pydata.org/en/latest/docs/reference/models/axes.html).
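
To give a feel for what that reference contains, here is a short sketch that sets a few more axis properties. The particular values are illustrative assumptions, not recommendations.

```python
import numpy as np
from bokeh.plotting import figure

x = np.linspace(-10, 10, 100)
y = np.cos(x)

p = figure(width=500, height=500, title="line plot of cosine function")
p.line(x, y, color="red", line_width=5)

p.xaxis.axis_label = "time"
p.xaxis.axis_label_text_font_size = "15pt"
# Thickness of the axis line itself
p.xaxis.axis_line_width = 2
# Color of the tick labels (the numbers along the axis)
p.xaxis.major_label_text_color = "gray"
# Rotate the y-axis tick labels
p.yaxis.major_label_orientation = "vertical"
```

Each assignment goes through 'p.xaxis' or 'p.yaxis', in the same way as 'axis_label'.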


A word of caution. The documentation is not always straightforward. For instance, if we go down to the part about 'axis_label_text_font_size', we will find - 

![bokehaxes](images/bokehaxis.jpg)

You must click on the "FontSizeSpec" link to find out which values the property 'axis_label_text_font_size' accepts. 

**Exercise** 

In the line plot, change the color of the y-axis label to red and its font size to 20 points. 

In [None]:
# Write your code below:


**Solution** 

Your code for this solution should be something like- 

```python
p = figure(width=500, height=500, title="line plot of cosine function")
p.line(x, y, color="red", line_width=5)
p.xaxis.axis_label= "time"
p.xaxis.axis_label_text_font_size="15pt"
p.yaxis.axis_label ="cosine over time" 
p.yaxis.axis_label_text_font_size="20pt"
p.yaxis.axis_label_text_color="red"
show(p)
```

As you can see we set the property 'axis_label_text_color' to 'red' and our y axis label became red. This gives us complete control over the look and feel of each plot. 

### Plotting one plot over another 

In order to compare two plots, we will routinely have to draw one on top of another. This is done simply by adding each plot type to the figure before calling show. For example, suppose we want a line plot and a scatter plot of the cosine wave in the same figure but with different colors. Then we would write- 

```python
p = figure(width=500, height=500, title="line plot of cosine function")
p.line(x, y, color="red", line_width=5)
p.circle(x, y, size=10, color="green", alpha=1)
p.xaxis.axis_label= "time"
p.xaxis.axis_label_text_font_size="15pt"
p.yaxis.axis_label ="cosine over time" 
p.yaxis.axis_label_text_font_size="15pt"
show(p)
```
Copy this code and run it. You should see the scatter plot superimposed on top of the red line plot. This comes from writing - 

```python 
p.line(x, y, color="red", line_width=5)
p.circle(x, y, size=10, color="green", alpha=1)
```
and then writing 

```python 
show(p)
```
We don't need the lines in between for the plot to work. The main idea is that we add the line first, then the circles, and only then show the plot. It would not work if we wrote the following 

```python
p = figure(width=500, height=500, title="line plot of cosine function")
p.line(x, y, color="red", line_width=5)
p.xaxis.axis_label= "time"
p.xaxis.axis_label_text_font_size="15pt"
p.yaxis.axis_label ="cosine over time" 
p.yaxis.axis_label_text_font_size="15pt"
show(p)
p.circle(x, y, size=10, color="green", alpha=1)
```

This fails because the show function displays the figure 'p' before the circle plot type is attached to it. There is generally no limit to how many plots you can superimpose on each other. 
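
One way to see what is attached to a figure at any moment is to inspect 'p.renderers'. The sketch below counts the entries after each plotting call. It uses 'p.scatter' as an assumed stand-in for 'p.circle', and the 'n_...' variable names are invented for this illustration.

```python
import numpy as np
from bokeh.plotting import figure

x = np.linspace(-10, 10, 100)
y = np.cos(x)

p = figure(width=500, height=500, title="line plot of cosine function")
n_start = len(p.renderers)          # nothing has been drawn yet

p.line(x, y, color="red", line_width=5)
n_after_line = len(p.renderers)     # the line is now attached

p.scatter(x, y, size=10, color="green", alpha=1)
n_after_scatter = len(p.renderers)  # line and markers are attached

print(n_start, n_after_line, n_after_scatter)
```

Whatever is in that list when 'show(p)' runs is what gets displayed.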

**Exercise** 

Plot a green line plot of the cosine function, with red dots representing the cosine function as well. Reduce the alpha of the dots to 0.5 and increase their size to 20. 

In [None]:
# Write your code below:


**Solution**:
```python
p = figure(width=500, height=500, title="line plot of cosine function")
p.line(x, y, color="green", line_width=5)
p.circle(x, y, size=20, color="red", alpha=0.5)
p.xaxis.axis_label= "time"
p.xaxis.axis_label_text_font_size="15pt"
p.yaxis.axis_label ="cosine over time" 
p.yaxis.axis_label_text_font_size="15pt"
show(p)
```
### Hover tool 

One of the most useful features when plotting data is being able to hover over a part of the plot and read off the data points. This is built directly into Plotly, but in bokeh you have to enable it yourself. The feature is called the hover tool. 

In order to activate the hover tool we can add 'tools="hover"' in figure: 

```python
p = figure(width=500, height=500, title="line plot of cosine function", tools="hover")
```
When you see the plot, you might notice that all the other tools have disappeared. This is because the 'tools' argument replaces the default set of tools rather than adding to it, so any tool you still want has to be listed explicitly. Let's try to do this
```python
tools_I_want = ["hover", "wheel_zoom", "pan", "reset", "box_zoom", "help", "save"]
p = figure(width=500, height=500, title="line plot of cosine function", tools=tools_I_want)
p.line(x, y, color="green", line_width=5)
p.xaxis.axis_label = "time"
p.xaxis.axis_label_text_font_size = "15pt"
p.yaxis.axis_label = "cosine over time" 
p.yaxis.axis_label_text_font_size = "15pt"
show(p)
```
Above, we have specified all the tools that we want on the plot and then shown it. The result looks like the following - 

![hover](images/hover.jpg)


You should be able to see the hover icon as the 2nd last icon. The order in which you write the tool names does not really matter, since the toolbar displays them in its own order. Bokeh offers many more tools; you can find them [here](https://bokeh.pydata.org/en/latest/docs/user_guide/tools.html).
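
The hover tool can also be configured to control what it displays. The sketch below is one way to do that, assuming the data is wrapped in a 'ColumnDataSource' so that the tooltip can refer to its columns as '@x' and '@y'. The tooltip labels, the number format and the 'vline' mode are illustrative choices.

```python
import numpy as np
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

x = np.linspace(-10, 10, 100)
y = np.cos(x)
source = ColumnDataSource(data=dict(x=x, y=y))

# Each tooltip row is (label, value); @name reads a column of the data source
hover = HoverTool(
    tooltips=[("time", "@x{0.00}"), ("cosine", "@y{0.00}")],
    mode="vline",  # report the point on the line directly above or below the cursor
)

# Tool objects and tool names can be mixed in the same list
p = figure(width=500, height=500, title="line plot of cosine function",
           tools=[hover, "pan", "wheel_zoom", "reset"])
p.line("x", "y", source=source, color="green", line_width=5)

tool_names = [type(t).__name__ for t in p.toolbar.tools]
```

Here 'tool_names' lists only the tools that were requested, which matches the earlier observation that the default tools are not added once 'tools' is given.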

After knowing these basic plot settings, we can start building **Bar graphs** and **Histograms**.

## Bar Graph
Suppose we have three labels in an iris dataset, "setosa", "versicolor", and "virginica". The counts for each category are [50, 20, 10]. To plot these as vertical bars, we can do the following-
```python
p = figure(x_range=["setosa", "versicolor", "virginica"], height=250, title="Iris plant type counts",
           toolbar_location="right", tools="hover, pan")

p.vbar(x=["setosa", "versicolor", "virginica"], top=[50,20,10], bottom=0, width=0.9, alpha=0.7, color="red")
show(p)
```
We can see that there are several differences from the previous code.

1) We choose to keep only 2 tools, hover and pan, and get rid of the other tools.<br>
2) We have the new option 'toolbar_location', which decides where the tool bar is placed. <br>
3) The argument 'x_range' lets us set the names of the ticks on the x axis. Since we have 3 ticks, we have 3 names.<br>

The main difference between a scatter plot, a line plot and a bar plot comes in the next line, where we add a plot of type 'vbar', which stands for vertical bar. Here you pass the category of each bar as 'x' and the count for each category as 'top=[50,20,10]'. This list gives the value at the top of each bar. The argument 'bottom' sets the value at which each bar starts. Currently it's set to 0. If we wanted a bar to run from 10 to 50 units, we would set 'bottom' to 10 and 'top' to 50. We can also specify the width of each bar with 'width'.
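
As a sketch of the 'bottom' and 'top' arguments, the bars below start at different values instead of 0. The 'bottom' values are invented for illustration, and 'species' is just a local name for the category list.

```python
from bokeh.plotting import figure

species = ["setosa", "versicolor", "virginica"]

p = figure(x_range=species, height=250, title="Floating bars (illustrative values)")
# Each bar spans from its 'bottom' value up to its 'top' value
p.vbar(x=species, bottom=[10, 5, 0], top=[50, 20, 10], width=0.9,
       color="red", alpha=0.7)
```

The first bar therefore runs from 10 to 50, as described above. Call 'show(p)' to display it.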

### Horizontal bar 
Suppose you want to take the data from the previous example and plot it as **horizontal bars** rather than vertical bars. You would then do- 

```python
p = figure(y_range=["setosa", "versicolor", "virginica"], height=250, title="Iris plant type counts",
           toolbar_location="right", tools="hover, pan")
p.hbar(y=["setosa", "versicolor", "virginica"], height=[0.5, 0.5, 0.5], right=[50, 20, 10], left=0)
show(p)

```
the plot would look like-

![hiris](images/hiris.jpg)

There are a few differences compared to the code for the vertical bar plot. First, rather than 'x_range' we have 'y_range', because in a horizontal bar plot the categories are placed on the y-axis. The rest of the 'figure' definition does not change. 

The method to add a horizontal bar is 'hbar', similar to 'vbar'. It takes an axis argument 'y', which should contain the names of the categories that we want to plot. We also have the argument 'height', which controls the thickness of the horizontal bars, just as 'width' does for vertical bars. I have passed it as a list to show that you can control the height of each bar individually. Similarly, for vertical bars you can write 'width = [1, 0.2, 0.6]' to give each bar a different thickness. As with vertical bars, we can specify where each horizontal bar starts and ends, in this case with 'left' and 'right'. So, if we want a horizontal bar running from 10 to 30, we would write 'left=10' and 'right=30'. 

Specifying bars this way gives us a great degree of control in bokeh.
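
The 'left', 'right' and per-bar 'height' arguments can be combined. The values in this sketch are arbitrary and only meant to show the effect.

```python
from bokeh.plotting import figure

species = ["setosa", "versicolor", "virginica"]

p = figure(y_range=species, height=250, title="Horizontal bars from 10 to 30")
# A scalar applies to every bar; a list gives one value per bar
p.hbar(y=species, left=10, right=30, height=[0.8, 0.5, 0.2])
```

All three bars cover the same interval on the x-axis but differ in thickness.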

### Adding legends to plots

Regardless of the type of chart, we want to be able to add a legend that identifies the different fields of data we are looking at. We held off doing this until now since it's easier to illustrate with bar charts. Suppose you have a vertical bar chart of the iris dataset. In the first bar chart above, all three categories have the same color. So how can we represent them differently? We can do the following:

```python
p = figure(x_range=["setosa", "versicolor", "virginica"], height=250, title="Iris plant type counts",
           toolbar_location="right", tools="hover, pan")

p.vbar(x=["setosa", "versicolor", "virginica"], top=[50,20,10], bottom=0, width=0.3, color=["red", "green", "blue"])
show(p)
```
You will see a plot with three bars colored red, green and blue. Now we want to add a legend so that we can easily see which category each color belongs to. Of course you can read it off the x-axis, but with a legend you no longer depend on the x-axis labels. You would add the legend like this- 

```python 
p = figure(x_range=["setosa", "versicolor", "virginica"], height=250, title="Iris plant type counts",
           toolbar_location="right", tools="hover, pan")
count_list = [50, 20, 10]
color=["red", "green", "blue"]
for name, count, clr in zip(["setosa", "versicolor", "virginica"], count_list, color) :
    p.vbar(x=[name], top=count, bottom=0, width=0.3, color=clr, legend_label=name )

p.legend.orientation="vertical"
show(p)
```

You will notice that in order to have one legend entry per bar we had to plot each of the bars individually, using a for loop. The 'zip' function combines the three lists into tuples, and the loop handles one tuple at a time. Each call passes 'legend_label' to name that bar's legend entry. Finally, we set the orientation of the legend through the 'p.legend' property.
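
A different technique from the loop above is to put the data in a 'ColumnDataSource' and let bokeh build the legend entries from a column, using the 'legend_field' argument. This is a sketch of that alternative, not the approach this tutorial uses; the column names 'species', 'count' and 'color' are made up for the example.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

source = ColumnDataSource(data=dict(
    species=["setosa", "versicolor", "virginica"],
    count=[50, 20, 10],
    color=["red", "green", "blue"],
))

p = figure(x_range=source.data["species"], height=250,
           title="Iris plant type counts",
           toolbar_location="right", tools="hover,pan")

# One vbar call draws all bars; column names are passed as strings
p.vbar(x="species", top="count", bottom=0, width=0.3, color="color",
       legend_field="species", source=source)

p.legend.orientation = "vertical"
```

This draws a single set of bars, and the legend entries are generated from the distinct values of the 'species' column when the plot is displayed with 'show(p)'.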

### Axis ranges 
When you run the above code, you will notice that the legend overlaps the bars. To avoid this we can extend the y axis so that the legend has room above the bars, and lay the legend out horizontally: 

```python 
categories = ["setosa", "versicolor", "virginica"]
count_list = [50, 20, 10]
color = ["red", "green", "blue"]
# y_range leaves headroom above the tallest bar so the legend does not cover it
p = figure(x_range=categories, y_range=(0, 70), height=250, title="Iris plant type counts",
           toolbar_location="right", tools="hover, pan")
for name, count, clr in zip(categories, count_list, color):
    p.vbar(x=[name], top=count, bottom=0, width=0.3, color=clr, legend_label=name)
    
p.legend.orientation="horizontal"
show(p)
```
you should see - 

![iris_legend](images/iris_legend.jpg)

You will find that the upper limit of the y axis has increased from 50 to 70. This is done with the 'y_range' argument of figure. You can set the extent of the x and y axes this way. The 'x_range', on the other hand, is set to the three category names. 

If you want to set a numeric range on the x axis instead, you pass numeric limits to 'x_range' in the same way. 
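
As a sketch with the cosine data from earlier, the figure below restricts both axes to numeric ranges. The limits are arbitrary choices for illustration.

```python
import numpy as np
from bokeh.plotting import figure

x = np.linspace(-10, 10, 100)
y = np.cos(x)

# Show only the part of the curve between x = 0 and x = 10
p = figure(width=500, height=500, x_range=(0, 10), y_range=(-1.5, 1.5),
           title="line plot of cosine function")
p.line(x, y, color="green", line_width=5)
```

The data outside the range is still part of the plot, so panning with the toolbar brings it back into view.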

**Exercise:** 

Plot the horizontal bar chart with a legend for the iris dataset. Adjust the range of the x axis to be in between 0 and 100. 

In [27]:
categories = ["setosa", "versicolor", "virginica"]
count_list = [50, 20, 10]
color = ["red", "green", "blue"]
# Categories go on the y axis; the numeric x axis runs from 0 to 100
p = figure(y_range=categories, x_range=(0, 100), height=250, title="Iris plant type counts",
           toolbar_location="right", tools="hover, pan")
for name, count, clr in zip(categories, count_list, color):
    p.hbar(y=[name], right=count, left=0, height=0.3, color=clr, legend_label=name)

p.legend.orientation = "vertical"
show(p)



In this notebook we have learned how to plot and modify simple line and scatter plots. Try out these plots with an equation of a line. 


The methods we learned around modifying title, axes and adding tools are general tools that apply to all bokeh plots so they will carry over in the next notebook as well. What will change is the type of plot that we are dealing with.  



In the next section we are going to deal with the following types of plots- 

1) Bar graph - vertical <br>
2) Bar graph - horizontal <br>
3) Histograms <br>
4) Patches <br>

