# Bokeh

An introduction to using Bokeh instead of Matplotlib. Bokeh does the same job as Matplotlib (it produces graphical output of data) but has certain advantages, particularly if you want to interact with the plot. Matplotlib in Jupyter notebooks can be a little frustrating, particularly when used interactively within a loop, as it tends to display all the plots at once, or to draw them all on the same surface.

Useful URL: https://nbviewer.jupyter.org/github/bokeh/bokeh-notebooks/blob/master/index.ipynb

This document introduces the basic features.



You'll need to import it, or specific functions from it, as you would any other module. For example:

    from bokeh.plotting import figure, output_notebook, show

This notebook will use Bokeh in conjunction with pandas DataFrames, but NumPy arrays or plain Python lists should work just as well.
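
As a minimal sketch of that last point, the same `figure` and `line` calls accept a NumPy array or a plain list. The sine-wave data and the name `p_demo` are made up purely for illustration.

```python
import numpy as np
from bokeh.plotting import figure

# Illustrative data only: a NumPy array for x, a plain Python list for y
x = np.linspace(0, 10, 50)
y = [float(np.sin(v)) for v in x]

p_demo = figure(title="NumPy array and list input",
                x_axis_label="x", y_axis_label="sin(x)")
p_demo.line(x, y)   # no DataFrame needed

# In a notebook, call output_notebook() once and then show(p_demo) to display it
```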

NOTE: We'll also import clear_output() from IPython, which lets us clear plots from the Jupyter notebook.

Finally, the two lines:

    from IPython.core.interactiveshell import InteractiveShell
    InteractiveShell.ast_node_interactivity = "all"

are not essential. They tell Jupyter to display the result of every expression in a cell rather than only the last one, so a cell containing both df.head() and df.tail() produces two nicely tabulated displays of the dataframe. Wrapping each call in print() works fine without this setting, although the output is then plain text.


In [1]:
import pandas as pd
from bokeh.plotting import figure, output_notebook, show
from bokeh.plotting import reset_output

from IPython.display import display, clear_output

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"


## 1. Get some data
As usual, read in the data and display it. We'll use some ARROW radio telescope spectra, covering observations along the Galactic plane at Galactic longitudes from 0 to 90 degrees in 10 degree intervals. The first column holds the radial velocities (already corrected to the local standard of rest, LSR); the remaining columns hold the spectrum measured at each longitude.


In [2]:
df = pd.read_csv('archive-spectra.csv', header=1, skip_blank_lines=True)
df.head()
df.tail()

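| Unnamed: 0 | velocity | l-000 | l-010 | l-020 | l-030 | l-040 | l-050 | l-060 | l-070 | l-080 | l-090 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -396.74 | 0.38 | -0.32 | 0.48 | 0.16 | -0.18 | 0.48 | 0.21 | 0.1 | 0.28 | 0.03 |
| 1 | -395.71 | 0.21 | -0.11 | -0.41 | 0.35 | -0.11 | 0.24 | 0.16 | -0.2 | 0.46 | -0.3 |
| 2 | -394.68 | 0.29 | 0.22 | -0.14 | -0.14 | -0.07 | 0.36 | 0.34 | -0.18 | -0.12 | 0.08 |
| 3 | -393.65 | 0.3 | -0.11 | 0.34 | 0.35 | 0.12 | 0.25 | -0.36 | 0.44 | -0.26 | -0.19 |
| 4 | -392.62 | -0.45 | 0.57 | 0.35 | -0.36 | 0.39 | -0.14 | 0.24 | -0.31 | -0.06 | 0.44 |

| Unnamed: 0 | velocity | l-000 | l-010 | l-020 | l-030 | l-040 | l-050 | l-060 | l-070 | l-080 | l-090 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 769 | 395.71 | 0.27 | -0.05 | -0.33 | -0.47 | 0.08 | 0.26 | -0.28 | -0.22 | 0.3 | -0.22 |
| 770 | 396.74 | -0.42 | -0.53 | -0.13 | -0.19 | 0.18 | -0.1 | -0.36 | 0.03 | 0.32 | -0.34 |
| 771 | 397.77 | 0.24 | 0.0 | -0.42 | -0.17 | 0.13 | 0.19 | 0.16 | 0.01 | 0.11 | -0.32 |
| 772 | 398.8 | 0.37 | -0.05 | -0.18 | -0.09 | -0.02 | -0.22 | -0.01 | -0.41 | -0.05 | 0.0 |
| 773 | 399.83 | -0.39 | -0.22 | 0.08 | -0.07 | -0.03 | -0.39 | -0.07 | -0.39 | 0.27 | 0.21 |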
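
If the `header=1` argument looks odd, the sketch below shows what it does on a small in-memory file. The file layout here (a title line, a blank line, then the column names) is an assumption made for illustration and may not match the real 'archive-spectra.csv'.

```python
import io
import pandas as pd

# Hypothetical file contents: one title line, a blank line, then the column names
raw = """ARROW archive spectra

velocity,l-000,l-010
-396.74,0.38,-0.32
-395.71,0.21,-0.11
"""

# Blank lines are skipped first, so header=1 points at the 'velocity,...' line
demo = pd.read_csv(io.StringIO(raw), header=1, skip_blank_lines=True)
print(list(demo.columns))
print(len(demo))
```

Anything above the header row is discarded, which is handy when a data file starts with a descriptive title.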


## 2.  Plotting

Now we can produce a plot pretty easily using the following simple steps.

1. Set up a 'figure' - titles, axis labels, size etc.
2. Give the figure the data to display and the way we want it displayed (a line plot in this case; Bokeh calls these plotting forms glyphs)
3. 'Show' it

NOTE: we need to call *output_notebook()* first so that the plot is rendered inside our notebook.


In [4]:
output_notebook()
p1 = figure(title = "Spectral data from Galactic longitude 30 degrees", 
          x_axis_label='Velocity (kms^-1)', 
          y_axis_label='Intensity')
p1.line(df['velocity'],df['l-030'])
show(p1)



Note that in the top right corner there is a series of icons:
![Bokeh Tools](../images/bokeh-glyph-1.png)
These allow you to interact with the plot. The light blue lines next to two of them indicate that those tools are active (toggled by a left mouse click); in this case, the 'pan' and 'mouse wheel zoom' tools.

These can be changed to get further functionality and we'll talk about this later.

### 2.1 Error Bars

If you've got the data you can do error bars using 'Whisker' from 'bokeh.models'. This is a bit convoluted, and you also need 'ColumnDataSource', which is in 'bokeh.models' too. You'll need something like this (but read the manuals for a fuller explanation).


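```
  from bokeh.models import ColumnDataSource, Whisker

  # One row per error bar: its x position and its lower/upper y limits
  src = ColumnDataSource(data=dict(
      x = x_vals,
      lower = y_vals - y_error_vals,
      upper = y_vals + y_error_vals))

  # base names the column holding the position; dimension='height' draws vertical bars
  w = Whisker(base='x', lower='lower', upper='upper', line_color='black', dimension='height', source=src)

  p.add_layout(w)
```

where x_vals, y_vals and y_error_vals are all pandas Series or NumPy arrays of the same length, and p is an existing figure.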
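
Here is a self-contained sketch of the same idea that can be run as it stands. The measurements and the constant error of 0.3 are made up, and `p_err` is just an illustrative name.

```python
import numpy as np
from bokeh.models import ColumnDataSource, Whisker
from bokeh.plotting import figure

# Made-up measurements: x positions, y values and a constant error on y
x_vals = np.array([1.0, 2.0, 3.0, 4.0])
y_vals = np.array([2.1, 3.9, 6.2, 7.8])
y_error_vals = np.full_like(y_vals, 0.3)

p_err = figure(title="Points with error bars",
               x_axis_label="x", y_axis_label="y")
p_err.scatter(x_vals, y_vals, size=8)

# One row per error bar: where it sits on x, and where it starts and ends on y
src = ColumnDataSource(data=dict(
    x=x_vals,
    lower=y_vals - y_error_vals,
    upper=y_vals + y_error_vals))

w = Whisker(base="x", lower="lower", upper="upper",
            dimension="height", line_color="black", source=src)
p_err.add_layout(w)

# In a notebook, show(p_err) would now draw the points with their error bars
```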


### 2.2 Multiple data on one plot - and some interactivity

You can of course add more than one set of data - and modify how each is displayed. 

Here we've also added a legend, which begins to demonstrate the interactivity. By including the line:

    p1.legend.click_policy="hide"

you can click on an entry in the legend to hide or show the corresponding data!


In [5]:
p1 = figure(title = "Spectral data from Galactic observations", 
          x_axis_label='Velocity (kms^-1)', 
          y_axis_label='Intensity')
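# legend_label replaces the older legend= keyword, which recent Bokeh releases no longer accept
p1.line(df['velocity'],df['l-030'], legend_label='l=30')
p1.line(df['velocity'],df['l-090'], color='red', line_dash="dashed", legend_label='l=90')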
p1.legend.location = "top_left"
p1.legend.click_policy="hide"
show(p1)
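
Hiding is not the only option. As a sketch of an alternative, the legend can fade a line instead of removing it by using the "mute" click policy together with `muted_alpha`. The two Gaussian "spectra" below are synthetic stand-ins, not the ARROW data.

```python
import numpy as np
from bokeh.plotting import figure

# Synthetic stand-ins for two spectra
v = np.linspace(-100, 100, 201)
spec_a = np.exp(-(v / 20.0) ** 2)
spec_b = 0.5 * np.exp(-((v - 30.0) / 15.0) ** 2)

p_mute = figure(title="Click a legend entry to fade its line",
                x_axis_label="Velocity (km/s)", y_axis_label="Intensity")
# muted_alpha sets how faint a line becomes once its legend entry is clicked
p_mute.line(v, spec_a, legend_label="A", muted_alpha=0.2)
p_mute.line(v, spec_b, color="red", legend_label="B", muted_alpha=0.2)
p_mute.legend.click_policy = "mute"
```

Muting keeps the faded line visible for comparison, which can be easier to read than hiding when the curves overlap.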



## 3. Adding Tools - particularly the HoverTool

As we mentioned above, there are a lot of interactive tools you can add to the plots in addition to the pan, zoom, save etc. Here we'll look at just one of them - the **HoverTool**. This allows you to inspect the data values.

First you need to:

    from bokeh.models.tools import HoverTool

Then just use the add_tools() function. Here we'll add the HoverTool and link it to a 'vertical' cursor (mode='vline'), which works well when we want to examine velocity values. Note that an icon has been added to the toolbar and enabled automatically.



In [6]:
from bokeh.models.tools import HoverTool

p1 = figure(title = "Spectral data from Galactic longitude at 90 degrees", 
          x_axis_label='Velocity (kms^-1)', 
          y_axis_label='Intensity')
p1.line(df['velocity'],df['l-090'])
p1.add_tools(HoverTool(mode='vline'))
show(p1)
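
The default hover box is fairly generic. As a sketch, the `tooltips` argument lets you choose the labels and number formats, provided the glyph reads its data from a `ColumnDataSource` with named columns. The DataFrame here is a synthetic stand-in that reuses the column naming from above.

```python
import numpy as np
import pandas as pd
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Synthetic stand-in for the spectra DataFrame (same column naming as above)
vel = np.linspace(-100, 100, 201)
demo_df = pd.DataFrame({"velocity": vel,
                        "l-090": np.exp(-(vel / 25.0) ** 2)})
source = ColumnDataSource(demo_df)

p_hover = figure(title="Custom hover labels",
                 x_axis_label="Velocity (km/s)", y_axis_label="Intensity")
p_hover.line("velocity", "l-090", source=source)

# @name looks up a column in the source; names containing '-' need braces
hover = HoverTool(mode="vline",
                  tooltips=[("Velocity", "@velocity{0.0}"),
                            ("Intensity", "@{l-090}{0.00}")])
p_hover.add_tools(hover)
```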

## 4. Sub-plots

You can, of course, display plots as a series of sub-plots.

You'll need to import 'gridplot' from bokeh.layouts.

1. Set up multiple figures with data
2. Create a grid with the figures
3. Show the grid

In [7]:
from bokeh.layouts import gridplot
from bokeh.models import Range1d

# width/height replace the older plot_width/plot_height, which recent Bokeh releases reject
s1 = figure(width=250, height=175, title='l = 20',
            x_axis_label='Velocity (kms^-1)',
            y_axis_label='Intensity')
s1.line(df['velocity'],df['l-020'], color='red')
s2 = figure(width=250, height=175, title='l = 30',
            x_axis_label='Velocity (kms^-1)',
            y_axis_label='Intensity')
s2.line(df['velocity'],df['l-030'], color='green')
s2.y_range = Range1d(0,100)  # You can use this to match the scales
s3 = figure(width=250, height=175, title='l = 50',
            x_axis_label='Velocity (kms^-1)',
            y_axis_label='Intensity')
s3.line(df['velocity'],df['l-050'], color='blue')
s4 = figure(width=250, height=175, title='l = 90',
            x_axis_label='Velocity (kms^-1)',
            y_axis_label='Intensity')
s4.line(df['velocity'],df['l-090'], color='purple')

grid = gridplot([[s1,s2],[s3,s4]])
show(grid)
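
Another way to match the scales is to let the panels share their range objects, so that panning or zooming one panel moves them all. The sketch below assumes synthetic data and a recent Bokeh release (it uses the `width` and `height` keywords); the names `figs` and `spectra` are illustrative.

```python
import numpy as np
from bokeh.layouts import gridplot
from bokeh.plotting import figure

# Synthetic stand-ins for four spectra, keyed by panel title
vel = np.linspace(-100, 100, 201)
spectra = {f"l = {lon}": np.exp(-((vel - lon) / 20.0) ** 2)
           for lon in (20, 30, 50, 90)}

figs = []
for title, intensity in spectra.items():
    kwargs = {}
    if figs:
        # Reuse the first panel's ranges so every panel shares one scale
        kwargs = dict(x_range=figs[0].x_range, y_range=figs[0].y_range)
    f = figure(width=250, height=175, title=title, **kwargs)
    f.line(vel, intensity)
    figs.append(f)

# ncols reshapes the flat list into rows of two
grid = gridplot(figs, ncols=2)
```

Building the figures in a loop also avoids repeating the same `figure(...)` call for every panel.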



### Exercise 4.1

Choose 3 data sets and produce a 3x1 (3 rows 1 column) display.


In [11]:
from bokeh.layouts import gridplot
from bokeh.models import Range1d

# width/height replace the older plot_width/plot_height, which recent Bokeh releases reject
s1 = figure(width=250, height=175, title='l = 20',
            x_axis_label='Velocity (kms^-1)',
            y_axis_label='Intensity')
s1.line(df['velocity'],df['l-020'], color='red')
s2 = figure(width=250, height=175, title='l = 30',
            x_axis_label='Velocity (kms^-1)',
            y_axis_label='Intensity')
s2.line(df['velocity'],df['l-030'], color='green')
s2.y_range = Range1d(0,100)  # You can use this to match the scales
s3 = figure(width=250, height=175, title='l = 50',
            x_axis_label='Velocity (kms^-1)',
            y_axis_label='Intensity')
s3.line(df['velocity'],df['l-050'], color='blue')

grid = gridplot([[s1],[s2],[s3]])
show(grid)


### Exercise 4.2

And a 1x3 (1 row 3 columns) display.

In [13]:
from bokeh.layouts import gridplot
from bokeh.models import Range1d

# width/height replace the older plot_width/plot_height, which recent Bokeh releases reject
s1 = figure(width=250, height=175, title='l = 20',
            x_axis_label='Velocity (kms^-1)',
            y_axis_label='Intensity')
s1.line(df['velocity'],df['l-020'], color='red')
s2 = figure(width=250, height=175, title='l = 30',
            x_axis_label='Velocity (kms^-1)',
            y_axis_label='Intensity')
s2.line(df['velocity'],df['l-030'], color='green')
s2.y_range = Range1d(0,100)  # You can use this to match the scales
s3 = figure(width=250, height=175, title='l = 50',
            x_axis_label='Velocity (kms^-1)',
            y_axis_label='Intensity')
s3.line(df['velocity'],df['l-050'], color='blue')

grid = gridplot([[s1,s2,s3]])
show(grid)


## 5. Looping through multiple data sets

So, putting all of this together, let's cycle through all of the data we have and inspect the values.



In [8]:
# How many spectra have we got? It's one less than the total number of columns.
spec_no = len(df.columns)-1
#print(spec_no)

# Cycle through the columns starting at column 1 (remember the index starts at 0,
# which is the velocity column)
for idx in range(1, spec_no+1):
    # Get the name of the column
    colname=df.columns[idx]
    # print (colname)
    p1 = figure(title = "Spectral data from spectrum column "+colname,
          x_axis_label='Velocity (kms^-1)',
          y_axis_label='Intensity')
    p1.line(df['velocity'],df[colname])
    p1.add_tools(HoverTool(mode='vline'))
    show(p1)
    # Wait for the user to press Enter before moving on to the next plot
    input('Next plot: ')
    # Clear the display before starting again - otherwise we get multiple plots
    clear_output(wait=True)
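
Looping by column position relies on the velocity column being first. As a sketch of a more defensive alternative, the spectrum columns can be picked out by name instead. This assumes they all start with 'l-', as in the table above; the tiny DataFrame is a made-up stand-in.

```python
import pandas as pd

# Tiny stand-in for the spectra DataFrame
demo_df = pd.DataFrame({"velocity": [-1.0, 0.0, 1.0],
                        "l-000": [0.1, 0.9, 0.2],
                        "l-010": [0.0, 0.7, 0.1]})

# Keep only the spectrum columns, whatever position they occupy
spectrum_cols = [c for c in demo_df.columns if c.startswith("l-")]

for colname in spectrum_cols:
    # Find the row where this spectrum is strongest and report its velocity
    peak_row = demo_df[colname].idxmax()
    print(colname, "peaks at velocity", demo_df.loc[peak_row, "velocity"])
```

The same list could drive the plotting loop above in place of `range(...)`.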
