# Bokeh Quick Reference Guide

This notebook is meant to serve as a very quick reference guide / tutorial for the Python visualization library [Bokeh](https://docs.bokeh.org/en/latest/index.html). For a more detailed introduction to the library, there is a tutorial available [here](https://github.com/bokeh/bokeh-notebooks/tree/7b6da26945e284b19df07daecc6beabdb7adbe81/tutorial), and there are several examples and detailed descriptions of various Bokeh components in the [User Guide](https://docs.bokeh.org/en/latest/docs/user_guide.html).

## Setting the output mode

When you create a Bokeh plot, you can save it to a file, display it in a Jupyter notebook, or both. Set the output mode using the `output_notebook()` and `output_file()` functions from `bokeh.plotting`.

* To display in notebook: Run `output_notebook()`
* To save to a file: Run `output_file(filename)`

Then, you can display the plot with `show()` or only save (without displaying) with `save()`. If you've run `output_file()`, `show()` will save the file and display it in a new browser tab.

Both `output_notebook()` and `output_file()` are persistent, so if you've run `output_file(filename)`, all subsequent plots will be saved to that file. To move on to a new plot without saving, run `reset_output()`.

In [1]:
from bokeh.plotting import output_notebook, show  # output_file, reset_output, save

In [2]:
output_notebook()
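
The file-output workflow described above can be sketched as follows. This is only an illustration: the plot, the temporary directory, and the name `demo_plot.html` are placeholders that are not part of this notebook.

```python
import os
import tempfile

from bokeh.plotting import figure, output_file, reset_output, save

# Hypothetical throwaway plot, used only for this illustration
p_demo = figure(title="output mode demo")
p_demo.line([1, 2, 3], [4, 6, 5])

# Placeholder location for the saved file
html_path = os.path.join(tempfile.mkdtemp(), "demo_plot.html")

output_file(html_path)   # later show()/save() calls now target this file
save(p_demo)             # write the HTML without opening a browser tab
reset_output()           # stop directing new plots to demo_plot.html
```

In a notebook session you would typically call `output_notebook()` again after `reset_output()` to go back to inline display.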

# Loading in sample data

We'll use some sample data located in '../data/rL_data.csv'. This file has measurements of $\log_{10}[\lambda L_\lambda(1350)/L_\odot]$, CIV emission line time lags $\tau$, and redshift $z$ for various AGNs taken from multiple references. The $L$ and $\tau$ measurements have upper and lower uncertainties, given by the suffixes "_hi" and "_lo".

You don't *need* to use Pandas to handle your data, but it works very well with Bokeh, so I would recommend it.

In [3]:
import pandas as pd

df = pd.read_csv('../data/rL_data.csv')
df.head()

| Unnamed: 0 | objname | reference | logL1350 | logL1350_hi | logL1350_lo | tau | tau_hi | tau_lo | z |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 32 | Grier at al. 2019 | 44.492 | 0.021 | 0.021 | 21.1 | 22.7 | 12.8 | 1.715 |
| 1 | 52 | Grier at al. 2019 | 45.499 | 0.002 | 0.002 | 32.6 | 6.9 | 6.6 | 2.305 |
| 2 | 181 | Grier at al. 2019 | 44.545 | 0.015 | 0.015 | 102.1 | 26.8 | 30.5 | 1.675 |
| 3 | 249 | Grier at al. 2019 | 44.984 | 0.01 | 0.01 | 22.8 | 31.3 | 13.5 | 1.717 |
| 4 | 256 | Grier at al. 2019 | 45.089 | 0.003 | 0.003 | 43.1 | 49.0 | 26.1 | 2.244 |


# Basic plotting

Most basic plotting functionality can be accessed through the higher-level `bokeh.plotting` module. Start by creating a blank figure with `p = figure()`, then build on top of it by adding "glyphs" (lines, circles, squares, etc.). Do this with, e.g., `p.line()` or `p.scatter()`, passing the x and y coordinates and the glyph properties. For our example, we'd like to plot the lags (tau) on a log scale, so we'll pass `y_axis_type="log"` in the figure call.

In [4]:
from bokeh.plotting import figure

x = df.logL1350
y = df.tau

p = figure(width=500, height=300, y_axis_type="log", x_axis_label="log(L1350/Lsun)", y_axis_label="CIV lag (days)")
p.scatter(x, y, marker='circle', size=15, line_color="navy", fill_color="orange", fill_alpha=0.5)
show(p)
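
Glyphs can be layered on the same figure. As a minimal sketch with made-up numbers (not taken from `rL_data.csv`), the following draws a line and square markers on one set of axes; each glyph call returns its own renderer.

```python
from bokeh.plotting import figure

# Made-up values purely for illustration
x_demo = [1, 2, 3, 4, 5]
y_demo = [2, 5, 3, 8, 6]

p_demo = figure(x_axis_label="x", y_axis_label="y")

# Each call adds one glyph renderer to the same figure
line_r = p_demo.line(x_demo, y_demo, line_width=2, line_color="navy")
pts_r = p_demo.scatter(x_demo, y_demo, marker="square", size=10, fill_color="orange")

# show(p_demo)  # uncomment to render the figure
```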

You can make subplots by piecing together multiple figures with `row`, `column`, or `gridplot` from the `bokeh.layouts` module:

In [5]:
from bokeh.layouts import row, column, gridplot

x = df.logL1350
y = df.tau
z = df.z

p1 = figure(width=400, height=300, y_axis_type="log", x_axis_label="log(L1350/Lsun)", y_axis_label="CIV lag (days)")
p1.scatter(x, y, marker='circle', size=10, line_color="navy", fill_color="orange", fill_alpha=0.5)
p2 = figure(width=400, height=300, x_axis_label="log(L1350/Lsun)", y_axis_label="redshift")
p2.scatter(x, z, marker='circle', size=10, line_color="navy", fill_color="orange", fill_alpha=0.5)

layout = row(p1, p2)
show(layout)
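
Layouts can also be nested. The sketch below, which uses three empty placeholder figures, puts two panels side by side with a third underneath.

```python
from bokeh.layouts import column, row
from bokeh.plotting import figure

# Three empty placeholder figures, used only to demonstrate the layout
panels = [figure(width=250, height=200, title=f"panel {i}") for i in range(3)]

top = row(panels[0], panels[1])       # two panels side by side
layout_demo = column(top, panels[2])  # third panel underneath the row

# show(layout_demo)  # uncomment to render the layout
```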

# Tools

There are a variety of tools available to interact with Bokeh plots. You can choose which tools are available by setting the `tools` argument when you create your figure. The default is `"pan,wheel_zoom,box_zoom,save,reset,help"`.

A list of all available tools can be found [here](https://docs.bokeh.org/en/latest/docs/user_guide/tools.html).

In [6]:
TOOLS = "pan,wheel_zoom,box_select,crosshair,hover,reset"

x = df.logL1350
y = df.tau
z = df.z

p1 = figure(
    width=400, height=300,
    y_axis_type="log",
    x_axis_label="log(L1350/Lsun)",y_axis_label="CIV lag (days)",
    tools=TOOLS
)
p1.scatter(x, y, marker='circle', size=10, line_color="navy", fill_color="orange", fill_alpha=0.5)

p2 = figure(
    width=400, height=300,
    x_axis_label="log(L1350/Lsun)",y_axis_label="redshift",
    tools=TOOLS
)
p2.scatter(x, z, marker='circle', size=10, line_color="navy", fill_color="orange", fill_alpha=0.5)

layout = gridplot([[p1, p2]])
show(layout)

# Connecting data across subplots

One of the great things about Bokeh is that it's easy to link data across multiple plots for interaction. To do so, we need to put the data into a `ColumnDataSource` object. You can do this by putting the data into a dictionary, or if you use Pandas, you can simply pass in the DataFrame.

In [7]:
from bokeh.models import ColumnDataSource

source = ColumnDataSource(df)
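
If you are not using Pandas, the dictionary route looks roughly like the sketch below. The values are placeholders rather than rows from `rL_data.csv`; each key becomes a column name, and all columns should have the same length.

```python
from bokeh.models import ColumnDataSource

# Placeholder values, not taken from rL_data.csv
data_demo = {
    "logL1350": [44.5, 45.0, 45.5],
    "tau": [20.0, 40.0, 80.0],
}

# Each dictionary key becomes a column that glyphs can refer to by name
source_demo = ColumnDataSource(data=data_demo)
print(source_demo.column_names)
```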

Now, when you create your glyph, you can give the column names as the x and y coordinates and provide the ColumnDataSource object. You'll notice that if you select points in one subplot (e.g., with the box select tool), the corresponding points are also highlighted in the second subplot.

In [8]:
TOOLS = "pan,wheel_zoom,box_select,crosshair,hover,reset"

p1 = figure(
    width=400, height=300,
    y_axis_type="log",
    x_axis_label="log(L1350/Lsun)",y_axis_label="CIV lag (days)",
    tools=TOOLS
)
p1.scatter('logL1350', 'tau', source=source, marker='circle', size=10, line_color="navy", fill_color="orange", fill_alpha=0.5)

p2 = figure(
    width=400, height=300,
    x_axis_label="log(L1350/Lsun)",y_axis_label="redshift",
    tools=TOOLS
)
p2.scatter('logL1350', 'z', source=source, marker='circle', size=10, line_color="navy", fill_color="orange", fill_alpha=0.5)

layout = gridplot([[p1, p2]])
show(layout)
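
The linking works because both glyphs point at the same `ColumnDataSource`, so a selection made in one plot is stored on the shared source. A small sketch with placeholder columns `a`, `b`, and `c` (not from the sample data):

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Placeholder data, used only to demonstrate the shared source
shared = ColumnDataSource(data=dict(a=[1, 2, 3], b=[4, 5, 6], c=[7, 8, 9]))

left = figure(tools="box_select,reset")
right = figure(tools="box_select,reset")
r_left = left.scatter("a", "b", source=shared)
r_right = right.scatter("a", "c", source=shared)

# Setting the selection by hand mimics what box_select does in the browser;
# both plots read the same list of selected row indices
shared.selected.indices = [0, 2]
```

If each plot were given its own `ColumnDataSource`, selections would no longer carry over between them.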

When using a ColumnDataSource, the HoverTool becomes much more powerful. We can set what information is displayed by passing the `tooltips` argument:


In [9]:
TOOLS = "pan,wheel_zoom,box_select,crosshair,hover,reset"
TOOLTIPS = [
    ("Object", "@objname"),
    ("Reference", "@reference"),
    ("log(L1350/Lsun)", "@logL1350"),
    ("Lag", "@tau"),
    ("z", "@z"),
]

p1 = figure(
    width=400, height=300,
    y_axis_type="log",
    x_axis_label="log(L1350/Lsun)",y_axis_label="CIV lag (days)",
    tools=TOOLS, tooltips=TOOLTIPS
)
p1.scatter('logL1350', 'tau', source=source, marker='circle', size=10, line_color="navy", fill_color="orange", fill_alpha=0.5)

p2 = figure(
    width=400, height=300,
    x_axis_label="log(L1350/Lsun)",y_axis_label="redshift",
    tools=TOOLS, tooltips=TOOLTIPS
)
p2.scatter('logL1350', 'z', source=source, marker='circle', size=10, line_color="navy", fill_color="orange", fill_alpha=0.5)

layout = gridplot([[p1, p2]])
show(layout)
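
Tooltip fields can also carry a number format in curly braces, and special `$` fields such as `$index` report the row number. The sketch below uses two placeholder rows; the particular formats are just one possible choice.

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Two placeholder rows, used only for this illustration
src = ColumnDataSource(data=dict(
    objname=["A", "B"], logL1350=[44.5, 45.0], tau=[21.1, 32.6],
))

TOOLTIPS_DEMO = [
    ("Object", "@objname"),                  # "@" looks up a column in the source
    ("log(L1350/Lsun)", "@logL1350{0.00}"),  # two decimal places
    ("Lag", "@tau{0.0} days"),               # one decimal place plus a unit
    ("Row", "$index"),                       # "$" fields are special variables
]

p_demo = figure(y_axis_type="log", tools="hover", tooltips=TOOLTIPS_DEMO)
p_demo.scatter("logL1350", "tau", source=src, size=10)

# The tooltips end up on the figure's HoverTool
hover_tools = [t for t in p_demo.tools if isinstance(t, HoverTool)]
```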

# Error bars

Unfortunately, Bokeh does not have a built-in `errorbar()`-style plotting function. However, you can write a quick function to compute the coordinates of each error bar line segment and plot them with `multi_line()`.

In [10]:
def addErrorBars(p, source, x, y, **kwargs):
    """Draw asymmetric x and y error bars using the '_lo' and '_hi' columns of source."""
    data = source.data
    errs_x, errs_y = [], []
    # Loop over each entry and set the coordinates of a horizontal (x) error bar
    for i in range(len(data[x])):
        errs_x.append([data[x][i] - data[x+'_lo'][i], data[x][i] + data[x+'_hi'][i]])
        errs_y.append([data[y][i], data[y][i]])
    # Repeat for the vertical (y) error bars
    for i in range(len(data[x])):
        errs_x.append([data[x][i], data[x][i]])
        errs_y.append([data[y][i] - data[y+'_lo'][i], data[y][i] + data[y+'_hi'][i]])
    # Each pair of entries in errs_x / errs_y is one line segment
    p.multi_line(errs_x, errs_y, **kwargs)

TOOLS = "pan,wheel_zoom,box_select,crosshair,hover,reset"
TOOLTIPS = [
    ("Object", "@objname"),
    ("Reference", "@reference"),
    ("log(L1350/Lsun)", "@logL1350"),
    ("Lag", "@tau"),
    ("z", "@z"),
]

p = figure(
    width=500, height=300,
    y_axis_type="log",
    x_axis_label="log(L1350/Lsun)",y_axis_label="CIV lag (days)",
    tools=TOOLS, tooltips=TOOLTIPS
)
p.scatter('logL1350', 'tau', source=source, marker='circle', size=8, line_color="navy", fill_color="orange", fill_alpha=0.5)
addErrorBars(p, source, 'logL1350', 'tau', color='navy')
show(p)
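
To see what kind of coordinates end up in `multi_line()`, here is a standalone sketch of the same idea in plain Python. The helper `error_bar_segments` and the numbers are made up for illustration; they are not part of Bokeh or of the sample data.

```python
def error_bar_segments(values, lo, hi, other):
    """Return the [start, end] span of each error bar along its own axis,
    plus the matching [pos, pos] pair along the other axis.
    (Hypothetical helper, written only for this illustration.)"""
    spans = [[v - l, v + h] for v, l, h in zip(values, lo, hi)]
    fixed = [[o, o] for o in other]
    return spans, fixed

# Made-up measurements with asymmetric uncertainties
x_val, x_lo, x_hi = [2.0, 4.0], [0.5, 0.25], [0.25, 0.5]
y_val, y_lo, y_hi = [10.0, 20.0], [2.0, 4.0], [4.0, 8.0]

xs_h, ys_h = error_bar_segments(x_val, x_lo, x_hi, y_val)  # horizontal bars
ys_v, xs_v = error_bar_segments(y_val, y_lo, y_hi, x_val)  # vertical bars

# multi_line() takes one list of x-coordinates and one of y-coordinates per segment
errs_x = xs_h + xs_v
errs_y = ys_h + ys_v
print(errs_x)  # [[1.5, 2.25], [3.75, 4.5], [2.0, 2.0], [4.0, 4.0]]
print(errs_y)  # [[10.0, 10.0], [20.0, 20.0], [8.0, 14.0], [16.0, 28.0]]
```

Two points give four segments: one horizontal and one vertical bar per point.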

# Legends

We can create a legend simply by adding the `legend_label` argument to `p.scatter()`. You can then set legend properties such as its location.

Here, we're going to group the points by their references, which we can access with `df.reference.unique()`. We'll use one of Bokeh's color palettes, `Category10`. The available palettes are listed [here](https://docs.bokeh.org/en/latest/docs/reference/palettes.html).

We'll also add an additional layer of interaction by setting `p.legend.click_policy = "mute"` which will let us use the legend to toggle the alpha of the points, set by the `muted_alpha` argument. Alternatively, we could choose `"hide"` which would hide the points completely.

In [11]:
from bokeh.palettes import Category10

TOOLS = "pan,wheel_zoom,box_select,crosshair,hover,reset"
TOOLTIPS = [
    ("Object", "@objname"),
    ("Reference", "@reference"),
    ("log(L1350/Lsun)", "@logL1350"),
    ("Lag", "@tau"),
    ("z", "@z"),
]

p = figure(width=700, height=400, y_axis_type="log", tools=TOOLS, tooltips=TOOLTIPS)
for ref, color in zip(df.reference.unique(), Category10[len(df.reference.unique())]):
    source = ColumnDataSource(df[df['reference'] == ref])
    p.scatter('logL1350', 'tau', fill_color=color, line_color='grey', size=8, fill_alpha=0.8, legend_label=ref, source=source, muted_alpha=0.1)
    addErrorBars(p, source, 'logL1350', 'tau',color=color, legend_label=ref, muted_alpha=0.1)

p.legend.location = "top_left"
p.legend.click_policy = "mute"

p.xaxis.axis_label = "log(L1350/Lsun)"
p.yaxis.axis_label = "CIV lag (days)"
show(p)
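
One caveat about the palette lookup: `Category10` is a dictionary keyed by the number of colors, and it only has entries for 3 through 10. If your data might contain fewer than three references, one possible workaround is to slice the 10-color palette instead. `pick_colors` below is a hypothetical helper, not a Bokeh function.

```python
from bokeh.palettes import Category10

def pick_colors(n):
    """Return n colors from the 10-color Category10 palette.
    (Hypothetical helper; supports 1 <= n <= 10.)"""
    if not 1 <= n <= 10:
        raise ValueError("Category10 provides between 1 and 10 colors")
    # Category10[10] is the full 10-color palette; take the first n entries
    return list(Category10[10][:n])

colors_demo = pick_colors(2)  # works even though Category10[2] does not exist
```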

# Higher-level interactions

On top of the basic level of interaction automatically built into all Bokeh plots, we can create more complex interactions that trigger code to manipulate the data that appear in our figures. These interactions can be based on widgets like slider bars, text boxes, and dropdown menus, or on, e.g., which points the user has selected.

* **JavaScript callbacks**: These will call lines of JavaScript code when the user interacts with the figure (such as changing a slider value) and update the figure accordingly. Figures created using JavaScript callbacks can be saved as standalone .html files with all of the interaction code embedded within the file. All calculations will be executed client-side in the user's browser (or other html viewer). You could easily send the file to anyone with a browser to open, or could easily embed the file on a website.

* **Python callbacks**: These call lines of Python code to update the figure and require a Bokeh server to run. In other words, you can't simply share an output file with others. They either need to have Bokeh installed so that they can run their own server, or you'll need to serve it on your own computer where they can access it (either via a website or ssh tunneling; more on that later).

## JavaScript callbacks

To use JavaScript callbacks, we'll use the `CustomJS` model. In the example below, we'll plot a sine wave with amplitude, frequency, and phase as tunable parameters.

In [12]:
import numpy as np
from bokeh.models import CustomJS
from bokeh.models.widgets import Slider

x = np.linspace(0,10,500)
y = 2.0 * np.sin(1.0 * x + np.pi)
source = ColumnDataSource(data=dict(x=x,y=y))

p = figure(width=400, height=200, y_range=(-5,5), x_range=(0,10))
p.line('x','y',source=source)

# Create the sliders
slider_amp = Slider(start=0, end=5, value=2, step=.1, title="Amplitude")
slider_freq = Slider(start=0.1, end=10, value=1.0, step=.1, title="Frequency")
slider_phase = Slider(start=0, end=2.0*np.pi, value=np.pi, step=.1, title="Phase")

# Set the code to run when the slider values are changed
callback = CustomJS(
    args=dict(source=source, amp=slider_amp, freq=slider_freq, phase=slider_phase),
    code="""
        // Get the data values and the slider values
        const data = source.data;
        const A = amp.value;
        const k = freq.value;
        const phi = phase.value;
        
        const x = data['x'];
        const y = data['y'];
        
        // Loop over x and update the y values
        for (var i = 0; i < x.length; i++) {
            y[i] = A * Math.sin(k * x[i] + phi);
        }
        
        // Need this to send back the updated values
        source.change.emit();
    """
)

# Let bokeh know that when the value of any of the sliders changes it should run the callback code
slider_amp.js_on_change('value', callback)
slider_freq.js_on_change('value', callback)
slider_phase.js_on_change('value', callback)

# Set the layout with the sliders and plot
layout = row(column(slider_amp, slider_freq, slider_phase), p)
show(layout)

If you want to send this to someone to use elsewhere, you can save the figure as an .html file, which can be opened in any browser.

<font color=red><strong>IMPORTANT: The .html file contains all of the JavaScript callback code, so keep this in mind if you want to share these figures with others</strong></font>

In [13]:
from bokeh.plotting import output_file, reset_output

output_file('output/sine_wave_sliders.html')  # the output/ directory must already exist
show(layout)

# Run these lines; otherwise you'll keep saving to 'sine_wave_sliders.html' every time you show a new plot
reset_output()
output_notebook()

## Python callbacks

To use Python callbacks within a Jupyter notebook, you'll first need to put all of the code into a function, e.g. `modify_doc(doc)`, and then display it with `show(modify_doc, notebook_url="http://localhost:8889")`. Here, `notebook_url` should point to the actual URL of your current Jupyter notebook session.

The rest is very similar: write a function to define how you'd like to update the figure, and attach it to a widget with, e.g., `slider.on_change('value', update_function)`

Here, we replicate the sine wave figure using a Python callback instead.

In [14]:
from bokeh.models.widgets import Slider

def modify_doc(doc):
    
    # First set the starting data and create the plot
    x = np.linspace(0,10,500)
    y = 2.0 * np.sin(1.0 * x + np.pi)
    source = ColumnDataSource(data=dict(x=x,y=y))

    p = figure(width=400, height=200, y_range=(-5,5), x_range=(0,10))
    p.line('x','y',source=source)
    
    # Create the sliders
    slider_amp = Slider(start=0, end=5, value=2, step=.1, title="Amplitude")
    slider_freq = Slider(start=0.1, end=10, value=1.0, step=.1, title="Frequency")
    slider_phase = Slider(start=0, end=2.0*np.pi, value=np.pi, step=.1, title="Phase")

    # Set the code to update the data when the sliders are changed
    def update(attr, old, new):
        A = slider_amp.value
        freq = slider_freq.value
        phase = slider_phase.value
        
        new_data = {}
        new_data['x'] = source.data['x']
        new_data['y'] = A * np.sin(freq * x + phase)

        source.data = new_data

    # Let bokeh know that when the value of any of the sliders changes it should run the callback code
    slider_amp.on_change('value', update)
    slider_freq.on_change('value', update)
    slider_phase.on_change('value', update)
    
    # Set the layout with the sliders and plot
    layout = row(column(slider_amp, slider_freq, slider_phase), p)

    # Add the layout to the document passed in by the server
    doc.add_root(layout)

# To display in the jupyter notebook, you need to provide the notebook_url
show(modify_doc, notebook_url="http://localhost:8889") 
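
The `(attr, old, new)` signature of `update` is what `on_change` expects: the name of the property that changed, its previous value, and its new value. Because the callback is an ordinary Python function, you can also call it by hand to check the logic without starting a server. The sketch below uses a single placeholder slider and a short `x` array.

```python
import numpy as np
from bokeh.models import ColumnDataSource, Slider

# Small placeholder data set and a single slider, for illustration only
x_demo = np.linspace(0, 10, 5)
source_demo = ColumnDataSource(data=dict(x=x_demo, y=np.zeros_like(x_demo)))
slider_demo = Slider(start=0, end=5, value=2, step=0.1, title="Amplitude")

def update(attr, old, new):
    # attr: name of the changed property ("value"); old/new: its values
    source_demo.data = dict(x=x_demo, y=new * np.sin(x_demo))

slider_demo.on_change("value", update)

# Call the callback directly, as if the slider had moved from 2 to 3
update("value", 2, 3.0)
```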

To run a bokeh server outside of the notebook, we need to change just a few things: First, put the code into a file, e.g., `sliderexample.py`. Then add `from bokeh.io import curdoc` and move everything out of the `modify_doc()` function into the main code. Then, get rid of the `show()` call and change `doc.add_root()` to `curdoc().add_root()`.

Your code should look like this:

```python
import numpy as np
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
from bokeh.models.widgets import Slider
from bokeh.layouts import row, column
from bokeh.io import curdoc

# First set the starting data and create the plot
x = np.linspace(0,10,500)
y = 2.0 * np.sin(1.0 * x + np.pi)
source = ColumnDataSource(data=dict(x=x,y=y))

p = figure(width=400, height=200, y_range=(-5,5), x_range=(0,10))
p.line('x','y',source=source)
    
# Create the sliders
slider_amp = Slider(start=0, end=5, value=2, step=.1, title="Amplitude")
slider_freq = Slider(start=0.1, end=10, value=1.0, step=.1, title="Frequency")
slider_phase = Slider(start=0, end=2.0*np.pi, value=np.pi, step=.1, title="Phase")

# Set the code to update the data when the sliders are changed
def update(attr, old, new):
    A = slider_amp.value
    freq = slider_freq.value
    phase = slider_phase.value
        
    new_data = {}
    new_data['x'] = source.data['x']
    new_data['y'] = A * np.sin(freq * x + phase)

    source.data = new_data

# Let bokeh know that when the value of any of the sliders changes it should run the callback code
slider_amp.on_change('value', update)
slider_freq.on_change('value', update)
slider_phase.on_change('value', update)
    
# Set the layout with the sliders and plot
layout = row(column(slider_amp, slider_freq, slider_phase), p)

# add the layout to curdoc
curdoc().add_root(layout)
```

We can now start a server by running `bokeh serve sliderexample.py`. By default, this will start the server at `localhost:5006`, and you can access the plot by navigating to `localhost:5006/sliderexample` in a web browser.

Some additional options:
* `--show`: Opens up the browser automatically
* `--port 5100`: Sets the server to run at `localhost:5100`

# Deployment options

In addition to using the figures for your own data exploration, you may want to share them with others in the group, external collaborators, or the general public.

## Save as .html file
The simplest way to share a plot is to save it to a `.html` file by calling `output_file("output.html")`, followed by `save()` or `show()`. This requires that you've written all of your callbacks in *JavaScript only*. You can then send this file to someone or put it on a website with, e.g., 
```html
<iframe src="/path/to/file.html"
    sandbox="allow-same-origin allow-scripts"
    width="100%"
    height="500"
    scrolling="no"
    seamless="seamless"
    frameborder="0">
</iframe>
```

Bokeh also has a `components()` function (in `bokeh.embed`) that returns the `<script>` and `<div>` tags that you can then insert into a web page.
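
A minimal sketch of the `components()` route, using a throwaway placeholder plot. Note that the page you paste the tags into also needs to load the BokehJS library itself (for example from the Bokeh CDN), since `components()` only returns the plot-specific pieces.

```python
from bokeh.embed import components
from bokeh.plotting import figure

# Throwaway placeholder plot, for illustration only
p_demo = figure(width=300, height=200)
p_demo.line([1, 2, 3], [3, 1, 2])

# script: a <script> tag holding the plot data and rendering code
# div:    a <div> tag marking where the plot appears on the page
script, div = components(p_demo)
```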

<font color=red><strong>REMINDER: The .html file contains all of the JavaScript callback code, and anyone with access to the file or website also has access to this code.</strong></font>

## Bokeh server with ssh tunneling
If you want to share something with others who have access to the group computers, you can start a bokeh server on the group computer and then everyone can access it via an ssh tunnel. For example, on rocher I've started a server on port 5100 with
```bash
bokeh serve --port 5100 caramelfiles
```
Anyone in the group can do 
```bash
ssh -NL localhost:5100:localhost:5100 yourusername@rocher.astro.ucla.edu
```
and navigate to localhost:5100 on their own machine to access the bokeh document.

## Bokeh server with a public website
I'm pretty sure this is possible, but I need to talk to Nick to figure out some of the details. This option may be desirable when:
* You would like to deploy some sort of tool, but you don't want to make the code public
* The operations are computationally heavy and shouldn't be performed client-side in a browser
* You need Python functions for your callbacks