# Bokeh Demos

*The "User Guide" linked on this page is a very good resource:* https://bokeh.pydata.org/en/latest/

To run this notebook on your computer, please be sure to install the `pandas` and `bokeh` libraries.  I recommend using conda and creating your own environment for this:
```
conda create -n bokeh-env python=3.10 jupyter pandas bokeh
conda activate bokeh-env
```

# Part 1

*An example with a scatter plot and connected table plus a callback to allow access to the selected values.*


## 1.1 Import the necessary libraries



In [None]:
# Import needed libraries.
import pandas as pd
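# explicit imports (instead of a wildcard) make it clear where each name comes from
from bokeh.plotting import figure, show, output_notebook, output_file
from bokeh.layouts import row, column, gridplot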
from bokeh.models import ColumnDataSource, Scatter, Select, CustomJS
from bokeh.models.widgets import DataTable, TableColumn

output_notebook()

# if you uncomment the output_file line below, the plot will be exported to an HTML file
# Note: if you include callbacks, this only works with JavaScript callbacks (see below)

# output_file("scatterSelect.html", title='scatter')

## 1.2 Read in the data

*I am using exoplanet data from the [NASA Exoplanet Archive](https://exoplanetarchive.ipac.caltech.edu/). A description of each column is provided at the top of the file.*
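
*As a small sketch of what the next cell does, the snippet below reads a CSV whose description lines start with `#` and keeps only the rows that have both a mass and a radius. The planet names and values are made up for illustration, and the layout of the `#` header lines is an assumption about the archive's file format. `dropna(subset=...)` is used here as a shorter equivalent of chaining `pd.notnull` conditions.*

```python
import io

import pandas as pd

# A made-up miniature of the archive file: '#' lines describe the columns.
csv_text = """# COLUMN pl_name: planet name (hypothetical header line)
# COLUMN pl_bmasse: planet mass [Earth masses]
# COLUMN pl_rade: planet radius [Earth radii]
pl_name,pl_bmasse,pl_rade
toy-b,1.0,1.0
toy-c,,2.5
toy-d,317.8,
toy-e,5.2,1.7
"""

# comment='#' tells pandas to skip the description lines
toy = pd.read_csv(io.StringIO(csv_text), comment="#")

# keep only rows with values in both columns, then renumber the rows
kept = toy.dropna(subset=["pl_bmasse", "pl_rade"]).reset_index(drop=True)
print(kept)
```

*Note that `reset_index(drop=True)` discards the old row numbers, whereas the plain `reset_index()` used in this notebook keeps them in a new `index` column.*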

In [None]:
# Read in (or create) data.
df = pd.read_csv('PS_2021.10.05_11.19.37.csv', comment='#')

# for this example, I will only keep rows that have values for mass and radius
usedf = df.loc[ (pd.notnull(df['pl_bmasse'])) & (pd.notnull(df['pl_rade']))].reset_index()
usedf

## 1.3 Define the `ColumnDataSource` and the plots.

*A `ColumnDataSource` holds your data as named columns, built from a Python dictionary or a pandas DataFrame, and it is the object that Bokeh plots and tables read from.*
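
*A minimal sketch of the two ways to build a `ColumnDataSource`, using a tiny made-up DataFrame (the values are illustrative only). Passing a dictionary lets you choose the column names yourself, while passing a DataFrame keeps the DataFrame's column names and adds the index as an extra column.*

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# made-up values, only for illustration
toy = pd.DataFrame({"mass": [1.0, 317.8, 95.2], "radius": [1.0, 11.2, 9.4]})

# from a dictionary: the keys ('x', 'y') become the column names
source_from_dict = ColumnDataSource(data=dict(x=toy["mass"], y=toy["radius"]))

# from a DataFrame: the DataFrame columns (plus the index) become the columns
source_from_df = ColumnDataSource(toy)

print(source_from_dict.column_names)
print(source_from_df.column_names)
```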

In [None]:
# create a column data source containing the mass and radius
source = ColumnDataSource(data=dict(x=usedf['pl_bmasse'], y=usedf['pl_rade']))

*Generate a simple figure using Bokeh.*

In [None]:
# define the tools you want to use
TOOLS = "pan,wheel_zoom,box_zoom,reset,save,box_select,lasso_select"

# create a new plot and renderer
f = figure(tools=TOOLS)
renderer = f.scatter('x', 'y', source=source, color='black', alpha=0.5, size=5, marker='circle')

# show the plot
show(f)

## 1.4 Your turn!  Create a new plot using different columns.  


In [None]:
# Define your ColumnDataSource with the columns that you want to plot

# Use that ColumnDataSource to create your figure


*I'm going to wrap the plotting commands in a function so that I can more easily recreate the plot later.  While I'm at it, I'll add some additional options.*

In [None]:
# create a column data source containing the mass and radius
source = ColumnDataSource(data=dict(x=usedf['pl_bmasse'], y=usedf['pl_rade']))
    
def createPlot(source, x='x', y='y', 
               xLabel='mass [Earth masses]', yLabel='radius [Earth radii]', 
               xRange=(0.4, 1e5), yRange=(0.3, 50),
               xAxisType='log', yAxisType='log',
               width=350, height=350, title=None):

    # define the tools you want to use
    TOOLS = "pan,wheel_zoom,box_zoom,reset,save,box_select,lasso_select"

    # create a new plot and renderer
    f = figure(tools=TOOLS, width=width, height=height, title=title, x_axis_type=xAxisType, y_axis_type=yAxisType, y_range=yRange, x_range=xRange)
    renderer = f.scatter(
                x, y, source=source, color='black', alpha=0.5, size=5, marker='circle', 
                 # (optional) define different colors for highlighted and non-highlighted markers
                selection_color="firebrick", selection_fill_alpha=0.8, selection_line_color = None,
                nonselection_fill_color="grey", nonselection_fill_alpha=0.2, nonselection_line_color = None,
                )
    f.xaxis.axis_label = xLabel
    f.yaxis.axis_label = yLabel

    return f

f = createPlot(source)

# show the plot
show(f)

## 1.5 Add a table 

*First let's look at how to create a simple `DataTable` in Bokeh.*

In [None]:
# create a list of columns that we want to include.  Each column is defined using Bokeh's TableColumn class
# here "field" is the label we gave the column in the ColumnDataSource
columns = []
columns.append(TableColumn(field = "x", title = "mass [Earth masses]"))
columns.append(TableColumn(field = "y", title = "radius [Earth radii]"))

t = DataTable(source=source, columns=columns)
show(t)

*Now, let's add this to our dashboard.  Selecting data on the Bokeh `DataTable` will highlight it on the scatter plot and vice versa.*
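
*The linking works because the plot and the table read from the same `ColumnDataSource`, and the selection is stored on that source. The sketch below uses made-up data and sets the selection from Python to imitate a user selecting rows; it is an illustration of the idea, not a replacement for the cells that follow.*

```python
from bokeh.models import ColumnDataSource, DataTable, TableColumn
from bokeh.plotting import figure

# one shared data source (made-up values)
shared = ColumnDataSource(data=dict(x=[1.0, 2.0, 3.0], y=[4.0, 5.0, 6.0]))

# the scatter renderer and the table both point at the same source
plot = figure(tools="box_select,reset")
renderer = plot.scatter("x", "y", source=shared)
table = DataTable(
    source=shared,
    columns=[TableColumn(field="x", title="x"), TableColumn(field="y", title="y")],
)

# the selection lives on the source, so both views see the same indices
shared.selected.indices = [0, 2]
same_source = renderer.data_source is table.source
print(same_source, shared.selected.indices)
```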

*Again, I am going to wrap this in a function so that I can use it later and add a few more features.*

In [None]:
# I want to send the labels to this function also so that it can be more versatile
# I will expect the labels to be a dict with a key for each column that I want to include and a value for its label
def createTable(source, labels, width=350, height=300):
    # create a table to hold the selections
    columns = []
    for field in labels:
        columns.append(TableColumn(field=field, title=labels[field]))

    t = DataTable(source=source, columns=columns, width=width, height=height)

    return t

t = createTable(source, dict(x="mass [Earth masses]", y="radius [Earth radii]"))

# create a gridplot layout and show the plot and table
layout = gridplot([[f, t]])

show(layout)

## 1.6 Add a second plot for "linked brushing"

When the user makes a selection on one plot (or the table), it will also be selected on the other plot and the table.  

In [None]:
# define a data source for two plots that contains all the columns I want to include
# first I will limit the original df to remove nans
usedf = df.loc[ (pd.notnull(df['pl_bmasse'])) & (pd.notnull(df['pl_rade'])) & 
                (pd.notnull(df['pl_orbeccen'])) & (pd.notnull(df['pl_orbper'])) ].reset_index()
# usedf = df
source = ColumnDataSource(data=dict(x1=usedf['pl_bmasse'], y1=usedf['pl_rade'], 
                                    x2=usedf['pl_orbper'], y2=usedf['pl_orbeccen']))

# define labels for the plots and table
labels = dict(x1="mass [Earth masses]", y1="radius [Earth radii]",
              x2="orbital period [days]", y2="eccentricity")

# create two figures
f1 = createPlot(source, x='x1', y='y1', xLabel=labels['x1'], yLabel=labels['y1'], xRange=(0.4, 1e5), yRange=(0.3, 50),)
f2 = createPlot(source, x='x2', y='y2', xLabel=labels['x2'], yLabel=labels['y2'], xRange=(0.1, 1e4), yRange=(0.0, 1), 
                yAxisType='linear')

t = createTable(source, labels, width=700)

# create a layout from rows and columns and show the plots and table
layout = column(
    row(f1, f2),
    row(t)
)

show(layout)

## 1.7 Your turn!  Create a plot and/or table.  

*You can use this existing dataset but plotting different columns, or you can use your own data or any dataset on this [GitHub repo](https://github.com/ageller/IDEAS_FSS-Vis/tree/main/datasets).*


In [None]:
# Import needed libraries.


In [None]:
# Read in (or create) data.


In [None]:
# Create your ColumnDataSource and generate a Bokeh plot and/or table.


# Part 2 : Add "widgets" and "callbacks" 

*Widgets refer to extra elements added to plots that can control how the data is displayed, including buttons, dropdowns, checkboxes, sliders, etc.  (See [examples](https://docs.bokeh.org/en/latest/docs/user_guide/interaction/widgets.html) and [documentation](https://docs.bokeh.org/en/latest/docs/reference/models/widgets.html#widgets) on Bokeh's website.)*

*A callback is a generic term for a function that is called after some event happens.  For instance, below we will write callback functions that will be called after a selection is made, and after a widget value is changed.*
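
*As a minimal sketch of this idea, the snippet below registers a Python function that Bokeh calls whenever the selected indices of a data source change. Bokeh passes three arguments to such a function: the name of the changed attribute, the old value, and the new value. The data and the `events` list are made up for illustration, and the change is made from Python here to imitate a user selection; in a notebook the selection would come from the browser.*

```python
from bokeh.models import ColumnDataSource

source = ColumnDataSource(data=dict(x=[1.0, 2.0, 3.0], y=[4.0, 5.0, 6.0]))

# a record of every call, so we can inspect what the callback received
events = []

def selection_handler(attr, old, new):
    # attr is the property name; old and new are its values before and after
    events.append((attr, list(old), list(new)))

# run selection_handler whenever source.selected.indices changes
source.selected.on_change("indices", selection_handler)

# imitate a user selecting the first and third points
source.selected.indices = [0, 2]
print(events)
```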

*Bokeh can work with callbacks in either Python or JavaScript.  When working fully within a Jupyter notebook, it probably makes most sense to write callbacks in Python.  However, if you want to export your interactive plot to a website, or you want to work in Google Colab, you would need to write all callbacks in JavaScript.  Let's work first fully in Python; then I will rewrite these examples with JavaScript callbacks below for reference.*

## 2.1 The pure Python approach

*This approach has the benefit of already being in a language you know (Python), but Python callbacks need a running Python process, so they cannot be used in a standalone .html file.  Only JavaScript callbacks can be used to create an interactive plot for your website.*

**Note: Python callbacks will not work in Google Colab.  If you are working in Colab, please see the JavaScript versions below (or in the [online Colab version here](https://colab.research.google.com/drive/1JiEgyzJWC547CmamXJWj-GGCHqRVZPC7?usp=sharing)).  On your local machine, they will only work on localhost:8888 (the default address for the first Jupyter notebook you open), unless you add a flag to your show command, like ```show(bkapp, notebook_url='localhost:8889')```**

### 2.1.1 Add a callback to get the selected indices for later use in the notebook

In [None]:
# It appears that in order for the Python callback to work, I need to redefine the plot and table within this cell
source = ColumnDataSource(data=dict(x=usedf['pl_bmasse'], y=usedf['pl_rade']))
f = createPlot(source)
t = createTable(source, dict(x="mass [Earth masses]", y="radius [Earth radii]"))

# define a global variable that will be modified within the callback
indices = []
# write the Python callback.  Note that all Python callbacks require these args.
def attachSelectionHandler(source):
    def selectionHandler(attr,old,new):
        global indices
        # the indices global variable will hold the indices of the selected elements
        indices = source.selected.indices

    # attach the callback to the data source to be run when the selection indices change
    source.selected.on_change("indices", selectionHandler)

attachSelectionHandler(source)

# create a gridplot layout and show the plot and table
layout = gridplot([[f, t]])

# in order to run a Python callback in a Jupyter notebook, you need to include the following
def bkapp(doc):
    doc.add_root(layout)

show(bkapp)

In [None]:
# test that we have access to the selected points
usedf.iloc[list(indices)]#['pl_bmasse']

### 2.1.2 Adding a dropdown widget

*I will work with the same Pandas DataFrame, but now I want to allow the user to be able to interactively select the data to plot on each axis from a few different columns.  First I will need to create a new `ColumnDataSource`.  Then I can use the function from above to create the plot.  Finally I will create the dropdown(s) and add them to the plot.* 
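
*The dropdown shows human-readable labels, but the `ColumnDataSource` is indexed by short keys, so the callback has to translate one into the other. The cell below does this with two parallel lists. An alternative sketch, shown here, reverses the dictionary once; `label_to_key` and `key_for_label` are illustrative names and not part of Bokeh.*

```python
# the same kind of key -> label dictionary used in this notebook (shortened)
labels = dict(
    mass="mass [Earth masses]",
    rad="radius [Earth radii]",
    ecc="eccentricity",
)

# reverse it once so a dropdown label can be looked up directly
label_to_key = {label: key for key, label in labels.items()}

def key_for_label(label):
    """Return the data-source key for the text shown in the dropdown."""
    return label_to_key[label]

# the parallel-list approach gives the same answer
options = list(labels.values())
keys = list(labels.keys())
selected = "eccentricity"
print(key_for_label(selected), keys[options.index(selected)])
```

*The reversed dictionary assumes every label is unique, which is also what the list-based lookup relies on.*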

In [None]:
# I will follow a very similar method as before but I will provide more columns to the ColumnDataSource
usedf = df.loc[ (pd.notnull(df['pl_bmasse'])) & (pd.notnull(df['pl_rade'])) & 
                (pd.notnull(df['pl_orbeccen'])) & (pd.notnull(df['pl_orbper'])) &
                (pd.notnull(df['st_teff'])) & (pd.notnull(df['sy_vmag']))].reset_index()

source = ColumnDataSource(data=dict(x=usedf['pl_bmasse'], y=usedf['pl_rade'], 
                                    mass=usedf['pl_bmasse'],
                                    rad=usedf['pl_rade'],
                                    ecc=usedf['pl_orbeccen'],
                                    per=usedf['pl_orbper'],
                                    teff=usedf['st_teff'],
                                    vmag=usedf['sy_vmag']))

#create a dict to hold all the keys and labels that I will want to use
labels = dict(mass="mass [Earth masses]",
              rad="radius [Earth radii]",
              ecc="eccentricity",
              per="orbital period [days]",
              teff="star Teff [K]",
              vmag="star V [mag]")

# use the function from above to create the plot
f = createPlot(source)

# now let's create a dropdown that will change the data plotted in the x axis
# Select is a Bokeh widget that I imported above. 
#    "value" is the starting value of the dropdown
#    "options" is a list that contains the text that I want to show up in the dropdowns (should contain value)

options = list(labels.values())
keys = list(labels.keys())
# a debugging suggestion to see what options and keys contain
# print('options = ', options) 
# print('keys = ', keys)

xSelect = Select(title="x axis", value=options[0], options=options)

# Python callback to change the data plotted in the x axis
def xCallback(attr,old,new):        
    # get the index in our lists of the new value
    index = options.index(new)
    
    # get the key for the new data
    key = keys[index]
          
    # set the x key in the source (shown in the plot) to that new column of data
    source.data['x'] = source.data[key]
    
    # update the axis label
    f.xaxis[0].axis_label = new

# attach the callback to the Select widgets
xSelect.on_change("value", xCallback)

# define the layout
layout = column(
    xSelect,
    f
)


# in order to run a Python callback in a Jupyter notebook, you need to include the following
def bkapp(doc):
    doc.add_root(layout)

show(bkapp)

### 2.1.3 Your turn!  Add in a similar dropdown widget for changing the y axis


In [None]:
# Hint: start by copying the code above into this cell.  Then duplicate my method for the x-axis widget but now referencing the y axis.


*To solve this exercise, I combined these widgets into a function and also added a feature to automatically set the plot bounds based on the data sent to each axis.*
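
*The bounds rule used in the cell below (half the minimum and twice the maximum, each kept above zero so that a log axis can display it) can be isolated as a small helper. This is only a sketch: `log_safe_bounds` and its `floor` parameter are names made up for illustration.*

```python
def log_safe_bounds(values, floor=0.0001):
    """Return (low, high) axis limits that pad the data and stay above zero.

    A log axis cannot show zero or negative limits, so both ends are
    clamped to at least `floor`.
    """
    low = max(0.5 * min(values), floor)
    high = max(2 * max(values), floor)
    return (low, high)

# eccentricity-like data that includes zero: the lower limit is clamped
print(log_safe_bounds([0.0, 0.5, 10.0]))
# strictly positive data: the limits are simply padded
print(log_safe_bounds([2.0, 8.0]))
```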

In [None]:
# I will follow a very similar method as before but I will provide more columns to the ColumnDataSource
usedf = df.loc[ (pd.notnull(df['pl_bmasse'])) & (pd.notnull(df['pl_rade'])) & 
                (pd.notnull(df['pl_orbeccen'])) & (pd.notnull(df['pl_orbper'])) &
                (pd.notnull(df['st_teff'])) & (pd.notnull(df['sy_vmag']))].reset_index()

source = ColumnDataSource(data=dict(x=usedf['pl_bmasse'], y=usedf['pl_rade'], 
                                    mass=usedf['pl_bmasse'],
                                    rad=usedf['pl_rade'],
                                    ecc=usedf['pl_orbeccen'],
                                    per=usedf['pl_orbper'],
                                    teff=usedf['st_teff'],
                                    vmag=usedf['sy_vmag']))

#create a dict to hold all the keys and labels that I will want to use
labels = dict(mass="mass [Earth masses]",
              rad="radius [Earth radii]",
              ecc="eccentricity",
              per="orbital period [days]",
              teff="star Teff [K]",
              vmag="star V [mag]")


# use the function from above to create the plot
f = createPlot(source)

# define the dropdowns.  
# Again, I will wrap this in a function
def createDropdowns(source, f, labels):
    # define widgets to change the x and y values to plot
    
    # I will create a few arrays here
    # "options" will be created from labels and will contain the text that I want to show up in the dropdowns
    # "keys" will be created from labels and will contain the actual key values that I defined in the source dict
    # "bounds" will contain axis limits for each key, note that since I'm using log scaling, I need to make these >0
    # (in principle this could be done as a single dict, but having the lists makes the javascript side easier)
    options = list(labels.values())
    keys = list(labels.keys())
    bounds = []
    for k in labels:
        bounds.append([max(0.5*min(source.data[k]), 0.0001), max(2*max(source.data[k]), 0.0001)])
        
    # Select is a Bokeh widget that I imported above.  I will create one for the x-axis and one for the y-axis.
    xSelect = Select(title="x axis", value=options[0], options=options)
    ySelect = Select(title="y axis", value=options[1], options=options)
    
    # Python callback
    # I'm going to create separate callbacks to handle each axis
    # There may be a cleaner way to do this with an individual callback (like in Javascript below), but it's not obvious to me.
    def xCallback(attr,old,new):        
        # get the index in our lists of the new value
        index = options.index(new)
        
        # get the key for the new data
        key = keys[index]
              
        # set the x key in the source (shown in the plot) to that new column of data
        source.data['x'] = source.data[key]
        
        # update the axis limits
        f.x_range.start = bounds[index][0]
        f.x_range.end = bounds[index][1]
        
        # update the axis label
        f.xaxis[0].axis_label = new
        
    def yCallback(attr,old,new):
        # get the index in our lists of the new value
        index = options.index(new)
        
        # get the key for the new data
        key = keys[index]
              
        # set the y key in the source (shown in the plot) to that new column of data
        source.data['y'] = source.data[key]
        
        # update the axis limits
        f.y_range.start = bounds[index][0]
        f.y_range.end = bounds[index][1]
        
        # update the axis label
        f.yaxis[0].axis_label = new
    
    # attach the callback to the Select widgets
    xSelect.on_change("value", xCallback)
    ySelect.on_change("value", yCallback)
    
    return xSelect, ySelect
   
xSelect, ySelect = createDropdowns(source, f, labels)

layout = row(
    f,
    column(xSelect,ySelect)
)

# in order to run a Python callback in a Jupyter notebook, you need to include the following
def bkapp(doc):
    doc.add_root(layout)

show(bkapp)


### 2.1.4 Add the table and the selection handler back in

In [None]:
# you always need to redefine the data source in the cell (even if it is unchanged) <-- this is a Bokeh thing
source = ColumnDataSource(data=dict(x=usedf['pl_bmasse'], y=usedf['pl_rade'], 
                                    mass=usedf['pl_bmasse'],
                                    rad=usedf['pl_rade'],
                                    ecc=usedf['pl_orbeccen'],
                                    per=usedf['pl_orbper'],
                                    teff=usedf['st_teff'],
                                    vmag=usedf['sy_vmag']))

f = createPlot(source, width=500)

t = createTable(source, labels, width=800)

xSelect, ySelect = createDropdowns(source, f, labels)

attachSelectionHandler(source)

layout = column(
    row(column(xSelect,ySelect), f,),
    row(t)
)

# in order to run a Python callback in a Jupyter notebook, you need to include the following
def bkapp(doc):
    doc.add_root(layout)

show(bkapp)

In [None]:
# test that we have access to the selected points
usedf.iloc[list(indices)]#['pl_bmasse']

## 2.2 The JavaScript approach

*In the example below, I am writing the callback functions in JavaScript, using Bokeh's `CustomJS`.  This will allow us to show the result within a Colab notebook and also to save the resulting plot as a standalone .html file that could be used on your website.  (But you need to learn a little JavaScript.)  Here I only recreate the dropdowns in JS.  If you are using Google Colab and want a callback to return indices of selected points, please see the [online Colab version here](https://colab.research.google.com/drive/1JiEgyzJWC547CmamXJWj-GGCHqRVZPC7?usp=sharing).*
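
*Before the full version, here is a stripped-down sketch of the same idea: a single `Select` widget with a `CustomJS` callback, rendered to a standalone HTML string with `bokeh.embed.file_html`. The data, the column names `a` and `b`, and the page title are made up for illustration. Because the callback is JavaScript, the resulting HTML stays interactive without a running Python process.*

```python
from bokeh.embed import file_html
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, CustomJS, Select
from bokeh.plotting import figure
from bokeh.resources import CDN

# made-up data: 'x' is what gets plotted, 'a' and 'b' are the choices
source = ColumnDataSource(data=dict(x=[1, 2, 3], y=[1, 4, 9], a=[1, 2, 3], b=[3, 2, 1]))

plot = figure(width=300, height=300)
plot.scatter("x", "y", source=source)

select = Select(title="x axis", value="a", options=["a", "b"])

# cb_obj is the widget that triggered the callback (here, the Select)
callback = CustomJS(args=dict(source=source), code="""
    source.data["x"] = source.data[cb_obj.value];
    source.change.emit();
""")
select.js_on_change("value", callback)

# build a complete HTML page as a string; write it to a file to share it
html = file_html(column(select, plot), CDN, "CustomJS sketch")
print(len(html))
```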

In [None]:
# Define the Javascript callback for the dropdowns
# Again, I will wrap this in a function
def createDropdownsJS(source, f, labels):
    # define widgets to change the x and y values to plot
    
    # I will create a few arrays here
    # "options" will be created from labels and will contain the text that I want to show up in the dropdowns
    # "keys" will be created from labels and will contain the actual key values that I defined in the source dict
    # "bounds" will contain axis limits for each key, note that since I'm using log scaling, I need to make these >0
    # (in principle this could be done as a single dict, but having the lists makes the javascript side easier)
    options = list(labels.values())
    keys = list(labels.keys())
    bounds = []
    for k in labels:
        bounds.append([max(0.5*min(source.data[k]), 0.0001), max(2*max(source.data[k]), 0.0001)])
        
    # Select is a Bokeh widget that I imported above.  I will create one for the x-axis and one for the y-axis.
    xSelect = Select(title="x axis", value=options[0], options=options)
    ySelect = Select(title="y axis", value=options[1], options=options)

    # Javascript callback
    # I'm going to create a single callback to handle both axes
    callback = CustomJS(args=dict(source=source, keys=keys, options=options, bounds=bounds,
                                 axes={"x":f.xaxis[0], "y":f.yaxis[0]}, 
                                 ranges={"x":f.x_range, "y":f.y_range} ), 
                        code="""
        //get the value from the dropdown 
        //Note: "this" is like Python's "self"; here it will contain the select element (Bokeh also exposes it as cb_obj).
        var val = this.value;

        //now find the index within the options array so that I can find the correct key to use
        var index = options.indexOf(val);
        var key = keys[index];

        //check which axis this is
        var ax = "x";
        if (this.title == "y axis") ax = "y";
        console.log(this.title, ax)

        //change the data being plotted
        source.data[ax] = source.data[key];
        source.change.emit();

        //change the axis label
        axes[ax].axis_label = val;

        //change the bounds
        ranges[ax].start = bounds[index][0];
        ranges[ax].end = bounds[index][1];

    """)
    
    # attach the callback to the Select widgets
    xSelect.js_on_change("value", callback)
    ySelect.js_on_change("value", callback)
    
    return xSelect, ySelect

In [None]:
# create the data, widgets, figure and table
# I will follow a very similar method as before but I will use the Javascript callbacks
usedf = df.loc[ (pd.notnull(df['pl_bmasse'])) & (pd.notnull(df['pl_rade'])) & 
                (pd.notnull(df['pl_orbeccen'])) & (pd.notnull(df['pl_orbper'])) &
                (pd.notnull(df['st_teff'])) & (pd.notnull(df['sy_vmag']))].reset_index()
source = ColumnDataSource(data=dict(x=usedf['pl_bmasse'], y=usedf['pl_rade'], 
                                    mass=usedf['pl_bmasse'],
                                    rad=usedf['pl_rade'],
                                    ecc=usedf['pl_orbeccen'],
                                    per=usedf['pl_orbper'],
                                    teff=usedf['st_teff'],
                                    vmag=usedf['sy_vmag']))

#create a dict to hold all the keys and labels that I will want to use
labels = dict(mass="mass [Earth masses]",
              rad="radius [Earth radii]",
              ecc="eccentricity",
              per="orbital period [days]",
              teff="star Teff [K]",
              vmag="star V [mag]")


f = createPlot(source, width=500)

t = createTable(source, labels, width=800)

xSelect, ySelect = createDropdownsJS(source, f, labels)

layout = column(
    row(column(xSelect,ySelect), f,),
    row(t)
)

# show the plot
show(layout)


In [None]:
# if you uncomment the lines below, the plot will be exported to an html file
# output_file("scatterSelect.html", title='scatter')
# show(layout)

## 2.3 Your turn!  Create a Bokeh plot and/or table of some data with some widget.  

*If you have your own data set, please use that.  If you need a multi-dimensional data set to explore, you can look [here.](https://github.com/ageller/IDEAS_FSS-Vis/tree/master/datasets)*

*Include the standard pan, wheel_zoom, box_zoom, reset, save, box_select, and lasso_select tools (if creating a figure), and also [a widget](https://docs.bokeh.org/en/latest/docs/user_guide/interaction/widgets.html) (e.g., slider, button, etc.) that controls some aspect of the plot.*

*Remember, if you are working in a Jupyter notebook, your callback can be in either Python or JavaScript, but in Colab your callback must be in JavaScript.*

In [None]:
# Import needed libraries.

In [None]:
# Read in (or create) data.

In [None]:
# Create your ColumnDataSource and generate a Bokeh plot and/or table with a widget.