----------------------------------
# Bokeh overview
----------------------------------

### * What is Bokeh?


Bokeh is a Python library designed for creating interactive visualizations that are rendered in web browsers, without requiring direct JavaScript programming.


### * Basic workflow



1. Define a basic figure. You can reuse the data preparation from a previous matplotlib plot

2. Customize the elements to assemble the plots that you need for your specific analysis

### * How to install Bokeh

To install Bokeh on your local machine:

In [None]:
! pip install bokeh

or

In [None]:

! conda install bokeh

You can find more details in the Bokeh Installation Guide:
https://docs.bokeh.org/en/latest/docs/first_steps/installation.html

#### * Required dependencies for this tutorial



For basic usage, Bokeh requires the following libraries:

Jinja2 >=2.9

contourpy >=1.2

narwhals>=1.13

numpy >=1.16

packaging >=16.8

pandas >=1.2

pillow >=7.1.0

PyYAML >=3.10

tornado >=6.2; sys_platform != ‘emscripten’

xyzservices >=2021.09.1


***They are automatically installed when using conda or pip.

### * How to run Bokeh in JupyterLab



Install the jupyter_bokeh extension: 
https://github.com/bokeh/jupyter_bokeh

For more details see:
https://docs.bokeh.org/en/latest/docs/user_guide/output/jupyter.html

### * How to display Bokeh in Jupyter notebooks
Throughout this tutorial, we are going to display our figures inside this notebook, rather than in a separate web browser tab. 
If you prefer to open your figures in a separate browser tab, skip the next cell

In [24]:
from bokeh.io import output_notebook

output_notebook()

----------------------------------
# Preparing our sample 
----------------------------------

In this tutorial we are going to use the data from the paper: 

[Amélie Saintonge et al. 2011. COLD GASS, an IRAM legacy survey of molecular gas in massive galaxies – I. Relations between H2, H I, stellar content and structural properties.](https://ui.adsabs.harvard.edu/abs/2011MNRAS.415...32S/abstract)

The data is separated into Table 1 and Table 2. They can be downloaded in PDF format from the ["Supplementary Data" section](https://oup.silverchair-cdn.com/oup/backfile/Content_public/Journal/mnras/415/1/10.1111_j.1365-2966.2011.18677.x/2/mnras0415-0032_Supplementary_Data.zip?Expires=1761483016&Signature=Mussq95MvBkKrXB4RJdByLWw9z1J6H0-2rgCn~K72~ZKTJlv5EDC8~5hz2qF-DNttE0akYea7caS7yBidqgWYUeBQDTZE~EtJM3VZYH9ha8rVbUfn~OwJRqLr7WZSMPFv~~lngyuVGktHGGbHea1vj84wtByfDymZoq1l7NOB7Ee-gbAu642xWUqQhgq3plj4JTLMF4wpslSS0S8uSO9orWNyEDnd9I0cZJzYku9KJBppsFk3qngfv7~Twcb6LmujNqBUg2RW6aJASZNGHvPbbVokYL9BzPhXBgVciX6tIYo6MP3sOhnu0bCvI720Qd4IBfbWN9H4ZBEuXmnswHhrA__&Key-Pair-Id=APKAIE5G5CRDK6RD3PGA)


## ! Read data from pdf and convert to csv table

We are going to use the Python library camelot. It is published on PyPI as camelot-py, so run the following command in your terminal:

pip install camelot-py

More information:
https://camelot-py.readthedocs.io/en/master/

In [2]:
import camelot

### Table 1: stellar masses, optical photometry from SDSS

Reading a table from a PDF requires some manual handling. In our case, the pretty format that is often used for publications does not include lines as separators for the cells in the table. 

This makes the work of camelot more challenging. Its approach is to read everything it finds and place it into a cell, so we will have to find the headers and remove them from the data ourselves.


First read the table. The output is going to be a 'TableList', with one table for each page of the document

In [4]:
table1 = camelot.read_pdf('Data_Saintogne2013/mnras0415-0032-SD1.pdf', pages='1-4', flavor='stream')

Now we check the dimensions (rows,cols) of the first Table

In [5]:
table1[0] #It is going to have 69 rows and 9 columns

<Table shape=(69, 9)>

Next, we need to append the data from the different tables. For this, we create an empty list and append the dataframe of every Table stored in our TableList. Then we can combine them with the pd.concat() utility of pandas.

In [6]:
import pandas as pd

dfs = [] # Empty list that will hold one dataframe per page
for file in table1: # Read every table in the TableList
    df = file.df # Convert to dataframe
    dfs.append(df) #Append to the list of dataframes

combined_df = pd.concat(dfs, ignore_index=True) # Concatenate all rows

Let's have a look at the dataframe. The first 2-3 lines will be the information from the header, that we need to store and remove from the rows.

In [7]:
combined_df 

Unnamed: 0,0,1,2,3,4,5,6,7,8
0,,,Table 1: Optical and UV parameters of the COLD...,,,,,,
1,GASS ID,SDSS ID,zSDSS,M∗,µ∗,D25,R90/R50,NUV−r,r
2,,,,[log M(cid:12)],[log M(cid:12)kpc−2],[”],,[mag],[mag]
3,11956,J000820.76+150921.6,0.0395,10.09,8.48,22.5,2.15,3.04,16.28
4,12025,J001934.54+161215.0,0.0366,10.84,9.13,34.3,3.03,5.93,14.73
...,...,...,...,...,...,...,...,...,...
226,11514,J232326.53+152510.4,0.0428,10.27,9.16,19.6,2.98,4.27,16.26
227,11386,J232611.29+140148.1,0.0462,10.56,9.08,25.7,3.46,4.69,15.77
228,11808,J235257.31+154244.8,0.0479,10.78,8.69,36.0,2.61,4.97,14.97
229,11845,J235644.47+135435.4,0.0363,10.60,8.65,42.5,2.47,4.10,14.98


We are interested in the second row, so we read it and change the name of the columns

In [8]:
new_header = combined_df.iloc[1].values # Read second row
combined_df.columns = new_header # Replace the column names
combined_df[68:73] # Check the change was successful, and that indeed the header has been introduced multiple times into the table

Unnamed: 0,GASS ID,SDSS ID,zSDSS,M∗,µ∗,D25,R90/R50,NUV−r,r
68,18686,J095302.62+075029.3,0.0411,10.55,8.48,38.0,2.10,2.99,14.88
69,GASS ID,SDSS ID,zSDSS,M∗,µ∗,D25,R90/R50,NUV−r,r
70,,,,[log M(cid:12)],[log M(cid:12)kpc−2],[”],,[mag],[mag]
71,20292,J095349.23+091137.6,0.0299,10.67,9.02,31.3,2.41,4.47,14.38
72,20286,J095439.45+092640.7,0.0346,10.53,8.89,33.8,2.63,4.11,15.74


Now we are going to mask the rows that have text from the header in them. Let's choose the column M∗, where the title is stored, and mask every instance of the header

In [9]:
mask1 = combined_df['M∗'] != '[log M(cid:12)]'
mask2 = combined_df['M∗'] != 'M∗'
mask3 = combined_df['M∗'] != 'Table 1: Optical and UV parameters of the COLD GASS galaxies'
mask4 = combined_df['zSDSS'] != 'Table 1: Optical and UV parameters of the COLD GASS galaxies'
filtered_df1 = combined_df[mask1&mask2&mask3&mask4]
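An alternative to writing one mask per header string is to keep only the rows whose ID cell looks like a number. The following is a minimal sketch on a toy dataframe (a few values copied from the table above, with a simplified units row); the names `raw`, `is_data_row` and `clean` are illustrative and not part of the tutorial.

```python
import pandas as pd

# Toy stand-in for the concatenated camelot output
raw = pd.DataFrame(
    [
        ["GASS ID", "zSDSS", "M∗"],      # header row from the first page
        ["", "", "[log Msun]"],          # units row (simplified)
        ["11956", "0.0395", "10.09"],
        ["GASS ID", "zSDSS", "M∗"],      # header repeated on the next page
        ["", "", "[log Msun]"],
        ["12025", "0.0366", "10.84"],
    ]
)

# Promote the first header row to column names
raw.columns = raw.iloc[0].values

# Keep only rows whose 'GASS ID' cell is a plain integer string;
# header, units and title rows fail this test and are dropped
is_data_row = raw["GASS ID"].str.fullmatch(r"\d+")
clean = raw[is_data_row].reset_index(drop=True)
print(clean)
```

This approach assumes that every genuine data row has a purely numeric ID, which holds for the rows shown here but should be checked against the full table.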

The final form of the table is as follows

In [10]:
filtered_df1[68:73]

Unnamed: 0,GASS ID,SDSS ID,zSDSS,M∗,µ∗,D25,R90/R50,NUV−r,r
73,14831,J100530.26+054019.4,0.0444,11.21,9.13,51.2,2.72,4.25,14.59
74,14943,J101600.20+061505.2,0.0458,11.33,9.23,56.7,3.37,6.44,13.74
75,26221,J101638.39+123438.5,0.0317,10.98,8.88,75.4,2.6,3.71,13.99
76,26368,J101941.29+125034.7,0.0329,10.27,8.53,57.3,2.84,3.07,15.67
77,18900,J102001.61+083053.6,0.0453,10.92,9.28,33.6,3.16,5.39,14.63


In [11]:
filtered_df1[0:3]

Unnamed: 0,GASS ID,SDSS ID,zSDSS,M∗,µ∗,D25,R90/R50,NUV−r,r
3,11956,J000820.76+150921.6,0.0395,10.09,8.48,22.5,2.15,3.04,16.28
4,12025,J001934.54+161215.0,0.0366,10.84,9.13,34.3,3.03,5.93,14.73
5,12002,J002504.00+145815.2,0.0367,10.48,9.41,24.2,3.17,6.25,15.46


Let's save it for future use

In [12]:
filtered_df1.to_csv('Data_Saintogne2013/mnras0415-0032-SD1.csv')

### Table 2: cold gas masses

Here we repeat the whole procedure for the second table

In [13]:
# Read the TableList
table2 = camelot.read_pdf('Data_Saintogne2013/mnras0415-0032-SD2.pdf', pages='1-4', flavor='stream')

# Assemble the dataframe
dfs = [] # Empty list that will hold one dataframe per page
for file in table2: # Read every table in the TableList
    df = file.df # Convert to dataframe
    dfs.append(df) #Append to the list of dataframes

# Concatenate all rows
combined_df = pd.concat(dfs, ignore_index=True) 

#View the first rows
combined_df[0:10]

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10
0,,Table 2: Molecular gas masses and CO(1-0) para...,,,,,,,,,
1,,galaxies,,,,,,,,,
2,GASS ID,σ,S/N,"SCO,obs",,"SCO,cor",,fof f,MH2,MH2/M∗,Flag
3,,[mK],,[Jy km s−1,],[Jy km s−1,],,[log M⊙],,
4,11956,1.07,2.35,1.16,,1.25,,...,8.46,0.024,1
5,12025,1.06,...,...,,...,,...,8.78,0.009,2
6,12002,1.18,...,...,,...,,...,8.79,0.021,2
7,11989,1.07,...,...,,...,,...,8.79,0.013,2
8,27167,1.17,...,...,,...,,...,8.74,0.023,2
9,3189,1.24,6.69,3.19,,3.88,,...,8.93,0.076,1


In [14]:
#Replace column names
new_header = combined_df.iloc[2].values
combined_df.columns = new_header

#Mask header from inside table
mask1 = combined_df['σ'] != 'Table 2: Molecular gas masses and CO(1-0) parameters for the COLD GASS'
mask2 = combined_df['σ'] != 'galaxies'
mask3 = combined_df['σ'] != 'σ'
mask4 = combined_df['σ'] != '[mK]'
filtered_df2 = combined_df[mask1&mask2&mask3&mask4]

#View filtered table
filtered_df2[0:5]


Unnamed: 0,GASS ID,σ,S/N,"SCO,obs",Unnamed: 5,"SCO,cor",Unnamed: 7,fof f,MH2,MH2/M∗,Flag
4,11956,1.07,2.35,1.16,,1.25,,...,8.46,0.024,1
5,12025,1.06,...,...,,...,,...,8.78,0.009,2
6,12002,1.18,...,...,,...,,...,8.79,0.021,2
7,11989,1.07,...,...,,...,,...,8.79,0.013,2
8,27167,1.17,...,...,,...,,...,8.74,0.023,2


Save

In [15]:
filtered_df2.to_csv('Data_Saintogne2013/mnras0415-0032-SD2.csv')

### Join table 1 and table 2

In [16]:
# Join both tables
data = filtered_df1.join(filtered_df2.set_index('GASS ID'), on='GASS ID')

# Reset the index, dropping the old index
data = data.reset_index(drop=True)
data[0:5]

Unnamed: 0,GASS ID,SDSS ID,zSDSS,M∗,µ∗,D25,R90/R50,NUV−r,r,σ,S/N,"SCO,obs",Unnamed: 13,"SCO,cor",Unnamed: 15,fof f,MH2,MH2/M∗,Flag
0,11956,J000820.76+150921.6,0.0395,10.09,8.48,22.5,2.15,3.04,16.28,1.07,2.35,1.16,,1.25,,...,8.46,0.024,1
1,12025,J001934.54+161215.0,0.0366,10.84,9.13,34.3,3.03,5.93,14.73,1.06,...,...,,...,,...,8.78,0.009,2
2,12002,J002504.00+145815.2,0.0367,10.48,9.41,24.2,3.17,6.25,15.46,1.18,...,...,,...,,...,8.79,0.021,2
3,11989,J002558.89+135545.8,0.0419,10.69,9.18,23.7,3.02,5.79,15.13,1.07,...,...,,...,,...,8.79,0.013,2
4,27167,J003921.66+142811.5,0.038,10.37,9.14,21.1,2.77,4.48,15.49,1.17,...,...,,...,,...,8.74,0.023,2
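A join like the one above leaves empty cells when an ID is missing from the second table. As a hedged sketch of how to check for that, the toy example below uses `DataFrame.merge` with `indicator=True`; the tables `left` and `right` and their values are invented for illustration.

```python
import pandas as pd

# Toy tables with a shared key; 'C' has no match in the second table
left = pd.DataFrame({"GASS ID": ["A", "B", "C"], "Ms": [10.1, 10.8, 10.5]})
right = pd.DataFrame({"GASS ID": ["A", "B"], "MH2": [8.5, 8.8]})

# Left merge with an indicator column that records where each row came from
merged = left.merge(right, on="GASS ID", how="left", indicator=True)

# Rows flagged 'left_only' found no partner in the second table
unmatched = merged[merged["_merge"] == "left_only"]
print(unmatched["GASS ID"].tolist())
```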


Save

In [17]:
data.to_csv('Data_Saintogne2013/data.csv')

## ! Alternatively, read the table that is provided with the tutorial 

In [18]:
import pandas as pd
# Read the dataframe
data = pd.read_csv('Data_Saintogne2013/data.csv')

In [19]:
#Explore the columns and the data
data[0:5]

Unnamed: 0.1,Unnamed: 0,GASS ID,SDSS ID,zSDSS,M∗,µ∗,D25,R90/R50,NUV−r,r,σ,S/N,"SCO,obs",Unnamed: 13,"SCO,cor",Unnamed: 15,fof f,MH2,MH2/M∗,Flag
0,0,11956,J000820.76+150921.6,0.0395,10.09,8.48,22.5,2.15,3.04,16.28,1.07,2.35,1.16,,1.25,,...,8.46,0.024,1
1,1,12025,J001934.54+161215.0,0.0366,10.84,9.13,34.3,3.03,5.93,14.73,1.06,...,...,,...,,...,8.78,0.009,2
2,2,12002,J002504.00+145815.2,0.0367,10.48,9.41,24.2,3.17,6.25,15.46,1.18,...,...,,...,,...,8.79,0.021,2
3,3,11989,J002558.89+135545.8,0.0419,10.69,9.18,23.7,3.02,5.79,15.13,1.07,...,...,,...,,...,8.79,0.013,2
4,4,27167,J003921.66+142811.5,0.038,10.37,9.14,21.1,2.77,4.48,15.49,1.17,...,...,,...,,...,8.74,0.023,2


In [20]:
#Remove extra columns
data = data.drop(['Unnamed: 0','Unnamed: 13', 'Unnamed: 15'], axis=1)
data[0:5]

Unnamed: 0,GASS ID,SDSS ID,zSDSS,M∗,µ∗,D25,R90/R50,NUV−r,r,σ,S/N,"SCO,obs","SCO,cor",fof f,MH2,MH2/M∗,Flag
0,11956,J000820.76+150921.6,0.0395,10.09,8.48,22.5,2.15,3.04,16.28,1.07,2.35,1.16,1.25,...,8.46,0.024,1
1,12025,J001934.54+161215.0,0.0366,10.84,9.13,34.3,3.03,5.93,14.73,1.06,...,...,...,...,8.78,0.009,2
2,12002,J002504.00+145815.2,0.0367,10.48,9.41,24.2,3.17,6.25,15.46,1.18,...,...,...,...,8.79,0.021,2
3,11989,J002558.89+135545.8,0.0419,10.69,9.18,23.7,3.02,5.79,15.13,1.07,...,...,...,...,8.79,0.013,2
4,27167,J003921.66+142811.5,0.038,10.37,9.14,21.1,2.77,4.48,15.49,1.17,...,...,...,...,8.74,0.023,2


Since masking in pandas is a pain, I am going to convert the dataframe to an astropy table. Feel free to keep using the dataframe throughout the tutorial if you prefer

In [21]:
from astropy.table import Table
import numpy as np

data = Table.from_pandas(data)

data['log(MH2/Ms)'] = np.log10(data['MH2/M∗'])
# data['S/N'][data['S/N'] == '...'] = np.nan
# data['S/N'] = data['S/N'].astype(float)
# data['D25'][data['D25'] == '...'] = np.nan
# data['D25'] = data['D25'].astype(float)
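The commented-out lines above hint at turning the '...' placeholders into NaN so that a column can be stored as floats. One possible way to do this on the pandas side, before converting to an astropy table, is `pd.to_numeric` with `errors="coerce"`. This is only a sketch on a toy column; `sn_text` and `sn` are illustrative names.

```python
import numpy as np
import pandas as pd

# Toy column mimicking 'S/N': numbers stored as text, '...' for non-detections
sn_text = pd.Series(["2.35", "...", "6.69"])

# errors="coerce" turns anything that cannot be parsed as a number into NaN
sn = pd.to_numeric(sn_text, errors="coerce")

print(sn.dtype)
print(np.nanmax(sn))
```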

----------------------------------
# Basics of Bokeh 
----------------------------------

Now that we have the data, let's start building our first Bokeh plots

### * Basic Building Blocks

The structure of this tutorial consists of progressively adding the following building blocks to a display:

1. Plot, Renderers and Glyphs: graphical representation of data. Examples include scatters, wedges, bars, or tiles

2. ColumnDataSource: central data source object used throughout Bokeh

3. Tooltips: features that allow you to interact with your plot

4. Exporting your plots (spoiler alert: screenshots are the simplest option)

5. Layouts and linked interactions: Combination of multiple elements (plots) in one document, usually connected to one another through features like hover tooltips

6. Widgets: interactive elements that allow you to control or automate aspects of your visualization. Examples include menus, sliders,...


# 1. Plots, Renderers and Glyphs

The primary interface of Bokeh is bokeh.plotting. It lets you attach glyphs to data, and automatically assembles plots with default elements such as axes, grids, and tools.

Glyphs are the fundamental graphical shapes, like points, lines, bars, and polygons, that are used to represent data in a plot. 

Renderers are the models responsible for drawing the elements of a plot: glyph renderers draw the glyphs (scatters, lines, ...), while other renderers draw elements such as axes, grids, legends and titles.
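To make the relation between glyphs and renderers concrete, here is a minimal sketch with made-up points. It relies on the fact that glyph methods such as `scatter` return the renderer they add to the figure; `p_demo` and `r` are illustrative names.

```python
from bokeh.models import GlyphRenderer
from bokeh.plotting import figure

p_demo = figure()

# Each glyph method returns the renderer it added to the figure
r = p_demo.scatter([1, 2, 3], [4, 5, 6], size=10)

print(type(r).__name__)        # the renderer model
print(type(r.glyph).__name__)  # the glyph that the renderer draws
print(r in p_demo.renderers)   # the renderer is attached to the figure
```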


### 1.1. Creating and customizing your first plot
We are going to represent the molecular gas content of the COLDGASS sample of galaxies vs their stellar mass content. Additionally, we are going to fit a trend for the detections.

Let's start with the most basic fit

In [22]:
import numpy as np

#Define the data. 
x = data[data['SCO,obs']!='...']['M∗'] #Mask the non-detections of the sample
y = np.log10(data[data['SCO,obs']!='...']['MH2/M∗']) #Mask the non-detections of the sample

#fits
datafit = np.polyfit(x, y, 1)

# np.polyfit returns the coefficients from highest to lowest degree: datafit[0] is the slope, datafit[1] the intercept
fitline = datafit[1] + datafit[0] * x
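As a quick check of what `np.polyfit` returns, the sketch below fits synthetic points that lie exactly on a known line (not the COLD GASS data) and builds a legend label from the coefficients. The names `x_demo`, `y_demo`, `slope`, `intercept` and `label` are illustrative.

```python
import numpy as np

# Synthetic points lying exactly on y = 2x - 1
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = 2.0 * x_demo - 1.0

# For degree 1, polyfit returns [slope, intercept]
slope, intercept = np.polyfit(x_demo, y_demo, 1)

# Evaluate the line and build a legend label from the coefficients
fit_demo = slope * x_demo + intercept
label = f"y = {slope:.3f}*x {intercept:+.3f}"
print(label)
```

Formatting the coefficients with `:.3f` keeps the label short without altering the values used to draw the line.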

Now we can create our first plot.

NB: If Bokeh runs without any warning or error, but no plot is produced, do not panic. Check the formatting of the data, making sure you are providing lists of the same size.

In [25]:
# Import the plotting module of bokeh
from bokeh.plotting import figure, show

#------------------------------------------------------------------------------------
# Format the data as lists
#------------------------------------------------------------------------------------
# One of the quirks of Bokeh is how it serializes data: it accepts lists, NumPy arrays and pandas Series, but other containers, such as astropy table columns, may result in a blank output
# Let's create variables that contain the data that we want to represent. Converting them to lists is the safe option
# Observed objects
xlist1 = list(data[data['SCO,obs']!='...']['M∗'])
ylist1 = list(np.log10(data[data['SCO,obs']!='...']['MH2/M∗']))
# Upper limits
xlist2 = list(data[data['SCO,obs']=='...']['M∗'])
ylist2 = list(np.log10(data[data['SCO,obs']=='...']['MH2/M∗']))
# Fitline
fitline = list(fitline)

#------------------------------------------------------------------------------------
# Create figure
#------------------------------------------------------------------------------------
# create a new plot with the figure() function
p = figure()
# Name the axes
p.yaxis.axis_label = 'log(MH_2/M_star)'
p.xaxis.axis_label = 'log(M_star/M_sun)'

#------------------------------------------------------------------------------------
# Plot the data
#------------------------------------------------------------------------------------
# Add scatter renderers and legend entries to the plot
p.scatter(xlist1,ylist1,line_color="#3288bd", fill_color="white",size = 10,legend_label='detections') # The default marker of p.scatter is a circle
# Different markers can be selected with the 'marker' argument of p.scatter (see [Bokeh markers](https://docs.bokeh.org/en/3.6.2/docs/reference/models/glyphs/scatter.html))
# Older Bokeh versions also offer one method per marker (p.y, p.square, ...); recent versions favor p.scatter
p.scatter(xlist2,ylist2,marker='y',line_color="red", fill_color="white",size = 10,legend_label='upper limits') # marker='y' draws Y-shaped markers
# Add the fitline; datafit[0] is the slope and datafit[1] the intercept
p.line(xlist1, fitline, legend_label=f'y = {datafit[0]:.3f}*x {datafit[1]:+.3f}')

#------------------------------------------------------------------------------------
# Customize legend
#------------------------------------------------------------------------------------
# Let's move the legend so it does not fall on top of the scatter
# Display legend in bottom left corner (default is top right corner)
p.legend.location = "bottom_left"
# Add a title to your legend
p.legend.title = "Saintonge+2011:COLDGASS"
# Change appearance of legend text
p.legend.label_text_font = "times"
p.legend.label_text_font_style = "italic"
p.legend.label_text_color = "black"
# Change border and background of legend
p.legend.border_line_width = 2
p.legend.border_line_color = "gray"
p.legend.border_line_alpha = 0.8
p.legend.background_fill_color = "gray"
p.legend.background_fill_alpha = 0.1

#------------------------------------------------------------------------------------
# Customize title
#------------------------------------------------------------------------------------
# Set the title text
p.title.text = "A. Saintonge et al. 2011: COLDGASS. Figure 5 A"

#------------------------------------------------------------------------------------
# Display the results
#------------------------------------------------------------------------------------
show(p)




# 2. The ColumnDataSource in Bokeh

ColumnDataSource is a data structure that serves as the primary way to provide data to your plots and glyphs.

ColumnDataSource only takes pandas dataframes or dictionaries as input, which is why astropy table users, such as the one who wrote this guide, try to avoid it. But it is really convenient, so here we are going to use it to include a colormap in the previous plot.
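The two input types can be sketched as follows, using made-up numbers. The names `source_dict`, `df_demo` and `source_df` are illustrative.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# From a dictionary: keys become column names
source_dict = ColumnDataSource({"x": [1, 2, 3], "y": [4, 5, 6]})

# From a DataFrame: each column becomes a column of the source
# (the DataFrame index is added as an extra column)
df_demo = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})
source_df = ColumnDataSource(df_demo)

print(source_dict.column_names)
print(source_df.column_names)
```

Glyph methods can then refer to columns by name, for example `p.scatter('x', 'y', source=source_df)`.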


First, let's be fancy and create our own colormaps using the matplotlib library

In [26]:
from bokeh.colors import RGB
from matplotlib import cm

m_autumn_rgb = (255 * cm.autumn(range(256))).astype('int')
autumn = [RGB(*tuple(rgb)).to_hex() for rgb in m_autumn_rgb]

m_winter_rgb = (255 * cm.winter_r(range(256))).astype('int')
winter = [RGB(*tuple(rgb)).to_hex() for rgb in m_winter_rgb]
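A possible alternative that skips the manual scaling to 0-255 is to let matplotlib produce the hex strings directly. This sketch assumes a matplotlib version that provides the `matplotlib.colormaps` registry (3.5 or later); `samples` and `winter_alt` are illustrative names, chosen so they do not overwrite the palettes defined above.

```python
import numpy as np
from matplotlib import colormaps
from matplotlib.colors import to_hex

# Sample 256 evenly spaced colors from matplotlib's reversed 'winter' map
samples = colormaps["winter_r"](np.linspace(0, 1, 256))

# to_hex drops the alpha channel by default and returns '#rrggbb' strings
winter_alt = [to_hex(rgba) for rgba in samples]
print(winter_alt[:3])
```

The resulting list can be passed as the `palette` of a color mapper in the same way as the palettes built above.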

Update the code from section 1.1.

In [27]:
# Import the plotting module of bokeh
from bokeh.plotting import figure, show
from bokeh.transform import transform
from bokeh.models import LinearColorMapper, ColorBar, ColumnDataSource

#------------------------------------------------------------------------------------
# (NEW) Create a ColumnDataSource object to pass the data to Bokeh
#------------------------------------------------------------------------------------
# Here we use the .to_pandas() method from astropy tables to transform our astropy table into a pretty dataframe
# We create 2 sources for the 2 different populations that we defined in the previous plot: detections and non-detections
source_detections = ColumnDataSource(data[data['SCO,obs']!='...'].to_pandas())
source_nondetections = ColumnDataSource(data[data['SCO,obs']=='...'].to_pandas())

#------------------------------------------------------------------------------------
# (NEW) Create colorbar parameterisation
#------------------------------------------------------------------------------------
# Define the range for the colorbar
min_q = np.nanmin(data[data['SCO,obs']!='...']['S/N'].astype(float))
max_q = np.nanmax(data[data['SCO,obs']!='...']['S/N'].astype(float))/5 #Divide by 5 to center the colorbar
# Create the mapper
mapper = LinearColorMapper(palette=winter, low=min_q, high=max_q)
color_bar = ColorBar(color_mapper = mapper,
                     label_standoff = 14,
                     location = (0,0),
                     title = 'S/N')

#------------------------------------------------------------------------------------
# Create figure
#------------------------------------------------------------------------------------
# create a new plot with the figure() function
p = figure()
# Name the axes
p.yaxis.axis_label = 'log(MH_2/M_star)'
p.xaxis.axis_label = 'log(M_star/M_sun)'

#------------------------------------------------------------------------------------
# (UPDATED) Plot the data. Indicate the source and the columns you want to represent
#------------------------------------------------------------------------------------
# Add scatter renderers and legend entries to the plot; columns are referenced by name
p.scatter('M∗','log(MH2/Ms)', source=source_detections, color = transform('S/N', mapper), fill_alpha=0.8,line_alpha=0.6, line_color='gray', size = 10,legend_label='detections') # The default marker of p.scatter is a circle
# Different markers can be selected with the 'marker' argument of p.scatter (see [Bokeh markers](https://docs.bokeh.org/en/3.6.2/docs/reference/models/glyphs/scatter.html))
p.scatter('M∗','log(MH2/Ms)',marker='y',line_color="red", fill_color="white", source=source_nondetections,size = 10,legend_label='upper limits') # marker='y' draws Y-shaped markers
# Add the fitline; datafit[0] is the slope and datafit[1] the intercept
p.line(xlist1, fitline, legend_label=f'y = {datafit[0]:.3f}*x {datafit[1]:+.3f}')

#------------------------------------------------------------------------------------
# (NEW) Include colorbar
#------------------------------------------------------------------------------------
p.add_layout(color_bar, 'right')

#------------------------------------------------------------------------------------
# Customize legend
#------------------------------------------------------------------------------------
# Let's move the legend so it does not fall on top of the scatter
# Display legend in bottom left corner (default is top right corner)
p.legend.location = "bottom_left"
# Add a title to your legend
p.legend.title = "Saintonge+2011:COLDGASS"
# Change appearance of legend text
p.legend.label_text_font = "times"
p.legend.label_text_font_style = "italic"
p.legend.label_text_color = "black"
# Change border and background of legend
p.legend.border_line_width = 2
p.legend.border_line_color = "gray"
p.legend.border_line_alpha = 0.8
p.legend.background_fill_color = "gray"
p.legend.background_fill_alpha = 0.1

#------------------------------------------------------------------------------------
# Customize title
#------------------------------------------------------------------------------------
#Set the title text
p.title.text = "A. Saintonge et al. 2011: COLDGASS. Figure 5 A"

#------------------------------------------------------------------------------------
# Display the results
#------------------------------------------------------------------------------------
show(p)

# 3. Add tooltips in Bokeh

The previous plots were basic Bokeh plots. 

To the right of each plot you will see a toolbar. Its tools let you visit the Bokeh main page, pan the plot, select and zoom into regions of the plot, download the plot, and reset the image to the default view.

Try playing with them a bit. Keep in mind that you can use more than one tool at the same time.

Once you are happy, you can add tooltips that help you with your specific analysis. You can find more details in the [Bokeh Tooltips documentation](https://docs.bokeh.org/en/latest/docs/user_guide/interaction/tooltips.html)

In [28]:
from bokeh.models import ColumnDataSource, HoverTool
# ------------------------------------------------------------------------------------
# (NEW) Customize tooltips
#------------------------------------------------------------------------------------
TOOLS = "crosshair,pan,wheel_zoom,box_zoom,box_select,reset,hover,save"

TOOLTIPS = [
            ('GASS ID','@GASS ID'),
            ('zSDSS','@zSDSS'),
            ('M∗','@M∗'),
            ('log(MH2/Ms)','@log(MH2/Ms)'),
            ('S/N','@S/N'),
           ]

#------------------------------------------------------------------------------------
# Create a ColumnDataSource object to pass the data to Bokeh
#------------------------------------------------------------------------------------
source_detections = ColumnDataSource(data[data['SCO,obs']!='...'].to_pandas())
source_nondetections = ColumnDataSource(data[data['SCO,obs']=='...'].to_pandas())

#------------------------------------------------------------------------------------
# Create colorbar parameterisation
#------------------------------------------------------------------------------------
# Define the range for the colorbar
min_q = np.nanmin(data[data['SCO,obs']!='...']['S/N'].astype(float))
max_q = np.nanmax(data[data['SCO,obs']!='...']['S/N'].astype(float))/5 #Divide by 5 to center the colorbar
# Create the mapper
mapper = LinearColorMapper(palette=winter, low=min_q, high=max_q)
color_bar = ColorBar(color_mapper = mapper,
                     label_standoff = 14,
                     location = (0,0),
                     title = 'S/N')

#------------------------------------------------------------------------------------
# Create figure
#------------------------------------------------------------------------------------
# create a new plot with the figure() function
p = figure()
# Name the axes
p.yaxis.axis_label = 'log(MH_2/M_star)'
p.xaxis.axis_label = 'log(M_star/M_sun)'

#------------------------------------------------------------------------------------
# Plot the data
#------------------------------------------------------------------------------------
# Add scatter renderers and legend entries to the plot
p.scatter('M∗','log(MH2/Ms)',source=source_detections, color = transform('S/N', mapper), fill_alpha=0.8,line_alpha=0.6, line_color='gray',  size = 10,legend_label='detections')
p.scatter('M∗','log(MH2/Ms)',source=source_nondetections, marker='y',line_color="red", fill_color="white", size = 10,legend_label='upper limits')
# Add the fitline; datafit[0] is the slope and datafit[1] the intercept
p.line(xlist1, fitline, legend_label=f'y = {datafit[0]:.3f}*x {datafit[1]:+.3f}')

#------------------------------------------------------------------------------------
# Include colorbar
#------------------------------------------------------------------------------------
p.add_layout(color_bar, 'right')

#------------------------------------------------------------------------------------
# Customize legend
#------------------------------------------------------------------------------------
# Let's move the legend so it does not fall on top of the scatter
# Display legend in bottom left corner (default is top right corner)
p.legend.location = "bottom_left"
# Add a title to your legend
p.legend.title = "Saintonge+2011:COLDGASS"
#change appearance of legend text
p.legend.label_text_font = "times"
p.legend.label_text_font_style = "italic"
p.legend.label_text_color = "black"
# change border and background of legend
p.legend.border_line_width = 2
p.legend.border_line_color = "gray"
p.legend.border_line_alpha = 0.8
p.legend.background_fill_color = "gray"
p.legend.background_fill_alpha = 0.1

#------------------------------------------------------------------------------------
# Customize title
#------------------------------------------------------------------------------------
#Set the title text
p.title.text = "A. Saintonge et al. 2011: COLDGASS. Figure 5 A"

#------------------------------------------------------------------------------------
# (NEW) Hovertool
#------------------------------------------------------------------------------------
# Create and add the HoverTool
hover = HoverTool(tooltips=TOOLTIPS)
p.add_tools(hover)

#------------------------------------------------------------------------------------
# Display the results
#------------------------------------------------------------------------------------
show(p)

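In the previous plot, hovering over a datapoint pops up a window with the fields specified in TOOLTIPS. 

Bokeh is quite picky about column names in tooltips: a reference such as @GASS ID or @S/N is not resolved, because names that contain spaces or special characters must be wrapped in curly braces (for example @{GASS ID}). Here we take the other route and rename the columns.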

In [29]:
data.columns

<TableColumns names=('GASS ID','SDSS ID','zSDSS','M∗','µ∗','D25','R90/R50','NUV−r','r','σ','S/N','SCO,obs','SCO,cor','fof f','MH2','MH2/M∗','Flag','log(MH2/Ms)')>

In [30]:
data['GASS ID'].name = 'GASS_ID'
data['SDSS ID'].name = 'SDSS_ID'
data['M∗'].name = 'Ms'
data['µ∗'].name = 'µs'
data['R90/R50'].name = 'R90_R50'
data['NUV−r'].name = 'NUV_r'
data['S/N'].name = 'SN'
data['MH2/M∗'].name = 'MH2_div_M∗'
data['log(MH2/Ms)'].name = 'logMH2_logMs'
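As an alternative to renaming, the curly-brace syntax of Bokeh tooltips can reference the original column names. The sketch below uses a toy source with invented coordinates; `source_demo`, `tooltips_demo` and `p_demo` are illustrative names.

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Toy source that keeps the awkward column names
source_demo = ColumnDataSource(
    {
        "x": [10.1, 10.8],
        "y": [-1.6, -2.0],
        "GASS ID": [11956, 3189],
        "S/N": [2.35, 6.69],
    }
)

# Curly braces let a tooltip reference names with spaces or symbols
tooltips_demo = [("GASS ID", "@{GASS ID}"), ("S/N", "@{S/N}")]

p_demo = figure()
p_demo.scatter("x", "y", source=source_demo, size=10)
p_demo.add_tools(HoverTool(tooltips=tooltips_demo))
```

Renaming, as done above, has the advantage that the same short names also work as plain column references elsewhere in the code.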

In [32]:
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.models import LinearColorMapper, ColorBar, ColumnDataSource
from bokeh.plotting import figure, show
from bokeh.transform import transform
# ------------------------------------------------------------------------------------
# Customize tooltips
#------------------------------------------------------------------------------------
TOOLS = "crosshair,pan,wheel_zoom,box_zoom,box_select,reset,hover,save"

TOOLTIPS = [
            ('GASS ID','@GASS_ID'),
            ('zSDSS','@zSDSS'),
            ('M∗','@Ms'),
            ('µ∗','@µs'),
            ('log(MH2/Ms)','@logMH2_logMs'),
            ('S/N','@SN'),
           ]

#------------------------------------------------------------------------------------
# Create a ColumnDataSource object to pass the data to Bokeh
#------------------------------------------------------------------------------------
source_detections = ColumnDataSource(data[data['SCO,obs']!='...'].to_pandas())
source_nondetections = ColumnDataSource(data[data['SCO,obs']=='...'].to_pandas())

#------------------------------------------------------------------------------------
# Create colorbar parameterisation
#------------------------------------------------------------------------------------
# Define the range for the colorbar
min_q = np.nanmin(data[data['SCO,obs']!='...']['SN'].astype(float))
max_q = np.nanmax(data[data['SCO,obs']!='...']['SN'].astype(float))/5 #Divide by 5 to center the colorbar
# Create the mapper
mapper = LinearColorMapper(palette=winter, low=min_q, high=max_q)
color_bar = ColorBar(color_mapper = mapper,
                     label_standoff = 14,
                     location = (0,0),
                     title = 'S/N')

#------------------------------------------------------------------------------------
# Create figure
#------------------------------------------------------------------------------------
# create a new plot with the figure() function
p = figure()
# Name the axes
p.yaxis.axis_label = 'log(MH_2/M_star)'
p.xaxis.axis_label = 'log(M_star/M_sun)'

#------------------------------------------------------------------------------------
# Plot the data
#------------------------------------------------------------------------------------
# Add scatter renderers and legend entries to the plot
p.scatter('Ms','logMH2_logMs', color = transform('SN', mapper), fill_alpha=0.8,line_alpha=0.6, line_color='gray', source=source_detections, size = 10,legend_label='detections') 
p.scatter('Ms','logMH2_logMs',marker='y',line_color="red", fill_color="white", source=source_nondetections,size = 10,legend_label='upper limits') 
# Add the fitline; datafit[0] is the slope and datafit[1] the intercept
p.line(xlist1, fitline, legend_label=f'y = {datafit[0]:.3f}*x {datafit[1]:+.3f}')

#------------------------------------------------------------------------------------
# Include colorbar
#------------------------------------------------------------------------------------
p.add_layout(color_bar, 'right')

#------------------------------------------------------------------------------------
# Customize legend
#------------------------------------------------------------------------------------
# Display legend in bottom left corner (default is top right corner)
p.legend.location = "bottom_left"
# Add a title to your legend
p.legend.title = "Saintonge+2011:COLDGASS"
# Change appearance of legend text
p.legend.label_text_font = "times"
p.legend.label_text_font_style = "italic"
p.legend.label_text_color = "black"
# Change border and background of legend
p.legend.border_line_width = 2
p.legend.border_line_color = "gray"
p.legend.border_line_alpha = 0.8
p.legend.background_fill_color = "gray"
p.legend.background_fill_alpha = 0.1

#------------------------------------------------------------------------------------
# Customize title
#------------------------------------------------------------------------------------
# Set the title text
p.title.text = "A. Saintonge et al. 2011: COLDGASS. Figure 5 A"

#------------------------------------------------------------------------------------
# Hovertool
#------------------------------------------------------------------------------------
# Create and add the HoverTool
hover = HoverTool(tooltips=TOOLTIPS)
p.add_tools(hover)

#------------------------------------------------------------------------------------
# Display the results
#------------------------------------------------------------------------------------
show(p)

# 4. Saving your plot 

### 4.1. Exporting your plot
You can export your plot to an HTML file and then click on it to open it in a web browser.

In [33]:
from bokeh.plotting import output_file, save
# set output to static HTML file
output_file(filename="my_first_interactive_plot.html", title="A. Saintonge et al. 2011: COLDGASS. Figure 5 A")
# save the results to a file
save(p)

'/Users/clara/Documents/Doctorado/ESO/PythonCoffee/October/my_first_interactive_plot.html'
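
If you would rather not change the notebook's global output settings, `save` also accepts the target file directly. The sketch below is a minimal, self-contained variant with a toy figure; the file name `toy_plot.html`, the temporary directory and the toy data are placeholders chosen for illustration, not part of the tutorial's dataset.

```python
import os
import tempfile

from bokeh.plotting import figure, save
from bokeh.resources import CDN

# Toy figure standing in for the plot built above
toy = figure(width=300, height=300, title="Toy plot")
toy.scatter([1, 2, 3], [4, 5, 6], size=10)

# Write into a temporary directory (placeholder location)
out_dir = tempfile.mkdtemp()
out_path = save(toy, filename=os.path.join(out_dir, "toy_plot.html"),
                resources=CDN, title="Toy plot")

print(out_path)  # path of the HTML file that was written
```

Passing `resources=CDN` means the HTML file loads BokehJS from the internet when it is opened, which keeps the file small.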

Bokeh provides a couple more tools to export your plots in other formats. They require additional libraries and are not covered in this tutorial. For more information, check the [Bokeh export guide](https://docs.bokeh.org/en/latest/docs/first_steps/first_steps_7.html).

### 4.2 Embedding your documents and applications into webpages

If you are interested in embedding your interactive plots into a webpage, the [Bokeh Webpage Documentation](https://docs.bokeh.org/en/latest/docs/user_guide/output/embed.html) offers a brief guideline on how to proceed.

We will not cover this topic in this basic tutorial, but if enough people show interest, it could be the topic of a future Python coffee.
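
Just to give a flavour of what that guide describes, here is a minimal sketch using `components` from `bokeh.embed`, which splits a plot into a `<script>` block and a `<div>` placeholder. The toy figure is an assumption for illustration; how the two strings are placed into a page depends on your own HTML template.

```python
from bokeh.embed import components
from bokeh.plotting import figure

# Toy figure (placeholder data)
toy = figure(width=300, height=300, title="Toy plot")
toy.line([1, 2, 3], [1, 4, 9])

# components() returns a <script> block with the plot data
# and a <div> element marking where the plot should appear
script, div = components(toy)

# Both strings are meant to be pasted into your own HTML page,
# which must also load the matching BokehJS files
print(div)
```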

# 5. Layouts
Layouts arrange multiple plots and widgets into interactive dashboards or applications.
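
Before applying layouts to the galaxy sample, here is a minimal sketch of the three most common layout helpers on toy data. The helper `toy_figure` and its data are hypothetical and only serve the illustration.

```python
from bokeh.layouts import column, gridplot, row
from bokeh.plotting import figure


def toy_figure(power):
    """Small line plot of y = x**power (placeholder data)."""
    xs = [0, 1, 2, 3]
    f = figure(width=250, height=250, title=f"y = x^{power}")
    f.line(xs, [x**power for x in xs])
    return f


# row() places figures left to right, column() top to bottom
side_by_side = row(toy_figure(1), toy_figure(2))
stacked = column(toy_figure(1), toy_figure(2))

# gridplot() takes a list of rows; None leaves a cell empty
grid = gridplot([[toy_figure(1), toy_figure(2)], [toy_figure(3), None]])

# In a notebook, display any of them with show(), e.g. show(grid)
```

By default `gridplot` also merges the toolbars of its figures into a single shared toolbar, which is why it is often preferred over nesting `row` and `column` for grids of plots.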

In [35]:
# This is where astropy table users start having compatibility issues and have to give in to the supremacy of the dataframes
data_df = data.to_pandas()
data_df = data_df.replace('...', np.nan)
data_df['SCO,obs']=data_df['SCO,obs'].astype('float')

In [37]:
from bokeh.layouts import gridplot
# ------------------------------------------------------------------------------------
# (NEW) Customize tooltips
#------------------------------------------------------------------------------------
#Add the lasso_select tool
TOOLS = "crosshair,pan,wheel_zoom,box_zoom,box_select,lasso_select,reset,hover,save"

TOOLTIPS = [
            ('GASS ID','@GASS_ID'),
            ('zSDSS','@zSDSS'),
            ('M∗','@Ms'),
            ('µ∗','@µs'),
            ('log(MH2/Ms)','@logMH2_logMs'),
            ('S/N','@SN'),
            ('r mag','@r'),
           ]

#------------------------------------------------------------------------------------
# Create a ColumnDataSource object to pass the data to Bokeh
#------------------------------------------------------------------------------------
# Create 2 sources for the 2 different populations that we defined in the previous plot: detections and non-detections
source_detections = ColumnDataSource(data_df[data_df['SCO,obs'].notna()])
source_nondetections = ColumnDataSource(data_df[~data_df['SCO,obs'].notna()])

#------------------------------------------------------------------------------------
# Create colorbar parameterisation
#------------------------------------------------------------------------------------
# Define the range for the colorbar
min_q = np.nanmin(data[data['SCO,obs']!='...']['SN'].astype(float))
max_q = np.nanmax(data[data['SCO,obs']!='...']['SN'].astype(float))/5 #Divide by 5 to center the colorbar
# Create the mapper
mapper = LinearColorMapper(palette=winter, low=min_q, high=max_q)
color_bar = ColorBar(color_mapper = mapper,
                     label_standoff = 14,
                     location = (0,0),
                     title = 'S/N')

#------------------------------------------------------------------------------------
# Create figure 1
#------------------------------------------------------------------------------------
# create a new plot with the figure() function
p1 = figure(height=400, width=450, tools=TOOLS,tooltips=TOOLTIPS)
# Name the axes
p1.yaxis.axis_label = 'log(MH_2/M_star)'
p1.xaxis.axis_label = 'log(M_star/M_sun)'

#------------------------------------------------------------------------------------
# Create figure 2
#------------------------------------------------------------------------------------
# create a new plot with the figure() function
p2 = figure(height=400, width=450, tools=TOOLS,tooltips=TOOLTIPS)
# Name the axes
p2.yaxis.axis_label = 'log(MH_2/M_star)'
p2.xaxis.axis_label = 'r magnitude [mag]'

#------------------------------------------------------------------------------------
# (UPDATED) Plot the data
#------------------------------------------------------------------------------------
# Add a hover_color to the scatter
p1.scatter('Ms','logMH2_logMs', source=source_detections, color = transform('SN', mapper), fill_alpha=0.8,line_alpha=0.6, hover_color="magenta", line_color='gray', size = 10,legend_label='detections')
p1.scatter('Ms','logMH2_logMs', source=source_nondetections,marker='y',line_color="red", fill_color="white", hover_color="blue",size = 10,legend_label=f'upper limits') 

# Add a hover_color to the scatter
p2.scatter('r','logMH2_logMs', source=source_detections, color = transform('SN', mapper), fill_alpha=0.8,line_alpha=0.6, hover_color="magenta", line_color='gray', size = 10,legend_label='detections')
p2.scatter('r','logMH2_logMs',source=source_nondetections,marker='y',line_color="red", fill_color="white",  hover_color="blue",size = 10,legend_label=f'upper limits')

#------------------------------------------------------------------------------------
# Include colorbar
#------------------------------------------------------------------------------------
p1.add_layout(color_bar, 'right')
p2.add_layout(color_bar, 'right')

#------------------------------------------------------------------------------------
# Customize legend
#------------------------------------------------------------------------------------
p1.legend.location = "bottom_left"
p2.legend.location = "bottom_left"

#------------------------------------------------------------------------------------
# Customize title
#------------------------------------------------------------------------------------
#Set the title text
p1.title.text = "A. Saintonge et al. 2011: COLDGASS. Figure 5 A"
p2.title.text = "A. Saintonge et al. 2011: COLDGASS. Custom Figure"

#------------------------------------------------------------------------------------
# Create and add the HoverTool
#------------------------------------------------------------------------------------
hover = HoverTool(tooltips=TOOLTIPS)
p1.add_tools(hover)
p2.add_tools(hover)

#------------------------------------------------------------------------------------
# (UPDATED) Display the results
#------------------------------------------------------------------------------------
# Arrange the two figures side by side in a grid
show(gridplot([[p1,p2]]))

Bokeh can also draw histograms: bin the data with NumPy and draw each bin as a rectangle with `quad`.
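
As a warm-up, the histogram recipe can be sketched on its own with toy data. The random sample below is a stand-in and not the COLD GASS stellar masses; the seed and distribution parameters are arbitrary.

```python
import numpy as np
from bokeh.plotting import figure

# Toy sample standing in for a real column (fixed seed for reproducibility)
rng = np.random.default_rng(0)
sample = rng.normal(loc=10.5, scale=0.3, size=500)

# np.histogram returns the counts per bin and the n+1 bin edges
hist, edges = np.histogram(sample, bins=15)

h = figure(width=400, height=300, title="Toy histogram")
# Each bar is a rectangle from edges[i] to edges[i+1] with height hist[i]
h.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:],
       fill_color="skyblue", line_color="white")
```

Passing `density=True` to `np.histogram`, as the cell below does, normalises the bar heights so that the histogram integrates to one instead of showing raw counts.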

In [110]:
from bokeh.layouts import gridplot
# ------------------------------------------------------------------------------------
# Customize tooltips
#------------------------------------------------------------------------------------
TOOLS = "crosshair,pan,wheel_zoom,box_zoom,box_select,lasso_select,reset,hover,save"

TOOLTIPS = [
            ('GASS ID','@GASS_ID'),
            ('zSDSS','@zSDSS'),
            ('M∗','@Ms'),
            ('µ∗','@µs'),
            ('log(MH2/Ms)','@logMH2_logMs'),
            ('S/N','@SN'),
            ('r mag','@r'),
           ]

#------------------------------------------------------------------------------------
# Create a ColumnDataSource object to pass the data to Bokeh
#------------------------------------------------------------------------------------
#We create 2 sources for the 2 different populations that we defined in the previous plot: detections and non-detections
source_detections = ColumnDataSource(data_df[data_df['SCO,obs'].notna()])
source_nondetections = ColumnDataSource(data_df[~data_df['SCO,obs'].notna()])

#------------------------------------------------------------------------------------
# Create colorbar parameterisation
#------------------------------------------------------------------------------------
# Define the range for the colorbar
min_q = np.nanmin(data[data['SCO,obs']!='...']['SN'].astype(float))
max_q = np.nanmax(data[data['SCO,obs']!='...']['SN'].astype(float))/5 #Divide by 5 to center the colorbar
# Create the mapper
mapper = LinearColorMapper(palette=winter, low=min_q, high=max_q)
color_bar = ColorBar(color_mapper = mapper,
                     label_standoff = 14,
                     location = (0,0),
                     title = 'S/N')

#------------------------------------------------------------------------------------
# Create figure 0: histogram
#------------------------------------------------------------------------------------
x = list(data['Ms'])

p0 = figure(width=500, height=350, toolbar_location=None, title="Stellar mass histogram", tools=TOOLS,tooltips=TOOLTIPS)

# Histogram
bins = np.linspace(np.min(x), np.max(x), 15)
hist, edges = np.histogram(x, density=True, bins=bins)
p0.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:],
         fill_color="skyblue", line_color="white",
         legend_label="Ms",hover_color="blue")

#------------------------------------------------------------------------------------
# Create figure 1: scatter MH_2 vs Ms
#------------------------------------------------------------------------------------
# create a new plot with the figure() function
p1 = figure(height=400, width=500, tools=TOOLS,tooltips=TOOLTIPS)
# Name the axes
p1.yaxis.axis_label = 'log(MH_2/M_star)'
p1.xaxis.axis_label = 'log(M_star/M_sun)'

#------------------------------------------------------------------------------------
# Create figure 2: scatter MH_2 vs r mag
#------------------------------------------------------------------------------------
# create a new plot with the figure() function
p2 = figure(height=400, width=450, tools=TOOLS,tooltips=TOOLTIPS)
# Name the axes
p2.yaxis.axis_label = 'log(MH_2/M_star)'
p2.xaxis.axis_label = 'r magnitude [mag]'

#------------------------------------------------------------------------------------
# Plot the data
#------------------------------------------------------------------------------------
# Add the scatter renderers and legend labels to figure 1
p1.scatter('Ms','logMH2_logMs', source=source_detections, color = transform('SN', mapper), fill_alpha=0.8,line_alpha=0.6, hover_color="magenta", line_color='gray', size = 10,legend_label='detections')
p1.scatter('Ms','logMH2_logMs', source=source_nondetections,marker='y',line_color="red", fill_color="white", hover_color="blue",size = 10,legend_label=f'upper limits')

# Add the scatter renderers and legend labels to figure 2
p2.scatter('r','logMH2_logMs', source=source_detections, color = transform('SN', mapper), fill_alpha=0.8,line_alpha=0.6, hover_color="magenta", line_color='gray', size = 10,legend_label='detections')
p2.scatter('r','logMH2_logMs',source=source_nondetections,marker='y',line_color="red", fill_color="white",  hover_color="blue",size = 10,legend_label=f'upper limits')

#------------------------------------------------------------------------------------
# Include colorbar
#------------------------------------------------------------------------------------
p1.add_layout(color_bar, 'right')
p2.add_layout(color_bar, 'right')

#------------------------------------------------------------------------------------
# Customize legend
#------------------------------------------------------------------------------------
# Display legend in bottom left corner (default is top right corner)
p1.legend.location = "bottom_left"
p2.legend.location = "bottom_left"

#------------------------------------------------------------------------------------
# Customize title
#------------------------------------------------------------------------------------
#Set the title text
p1.title.text = "A. Saintonge et al. 2011: COLDGASS. Figure 5 A"
p2.title.text = "A. Saintonge et al. 2011: COLDGASS. Custom Figure"

# Create and add the HoverTool
hover = HoverTool(tooltips=TOOLTIPS)
p0.add_tools(hover)
p1.add_tools(hover)
p2.add_tools(hover)

#------------------------------------------------------------------------------------
# (UPDATED) Display the results
#------------------------------------------------------------------------------------
# Arrange the figures in a grid: histogram on top, scatter plots below
show(gridplot([[p0, None], [p1, p2]]))

Homework: customize the histogram, turn off its hover info, and link it to the scatter plots.
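
A hint for the linking part, offered as one possible starting point rather than a solution: figures that share a range object pan and zoom together, and renderers that share a `ColumnDataSource` share their selection. The toy columns `a`, `b`, `c` below are placeholders.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Toy table (placeholder values); one shared source enables linked selection
shared = ColumnDataSource(data=dict(a=[1, 2, 3, 4], b=[4, 3, 2, 1], c=[2, 4, 6, 8]))

left = figure(width=300, height=300, tools="pan,wheel_zoom,box_select,reset")
left.scatter("a", "b", source=shared, size=10)

# Reusing left.x_range links panning and zooming along the x axis
right = figure(width=300, height=300, tools="pan,wheel_zoom,box_select,reset",
               x_range=left.x_range)
right.scatter("a", "c", source=shared, size=10)
```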

# 6. Widgets

Widgets are interactive elements, such as sliders, buttons, and text inputs, that you can add to Bokeh visualizations to control them, provide input, and create dynamic applications.

A complete list of widgets with examples can be found in [Bokeh Widgets and DOM elements](https://docs.bokeh.org/en/latest/docs/user_guide/interaction/widgets.html).
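
The smallest useful widget example is probably a slider tied to a single property. The sketch below uses `js_link` to connect a `Slider` to the marker size of a toy scatter plot; the data and the names `size_slider` and `widget_layout` are placeholders.

```python
from bokeh.layouts import column
from bokeh.models import Slider
from bokeh.plotting import figure

# Toy scatter plot (placeholder data)
w = figure(width=300, height=300)
points = w.scatter([1, 2, 3], [3, 1, 2], size=10)

size_slider = Slider(start=2, end=30, value=10, step=1, title="Marker size")
# js_link copies the slider value onto the glyph's size in the browser,
# so no running Python process is needed once the plot is displayed
size_slider.js_link("value", points.glyph, "size")

widget_layout = column(size_slider, w)
# In a notebook, display it with show(widget_layout)
```

`js_link` only covers one-to-one property updates. Anything more involved, such as filtering the data, needs a `CustomJS` callback like the one in the next cell.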

In [38]:
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource, RangeSlider, CustomJS
from bokeh.layouts import column

# ------------------------------------------------------------------------------------
# (UPDATED) Customize tooltips
#-------------------------------------------------------------------------------------
TOOLS = "crosshair,pan,wheel_zoom,box_zoom,box_select,lasso_select,reset,hover,save"

TOOLTIPS = [
            ('M∗','@x'),
            ('log(MH2/Ms)','@y'),
            ('S/N','@value_to_filter'),
           ]

#------------------------------------------------------------------------------------
# (UPDATED) Create a ColumnDataSource object to pass the data to Bokeh
#------------------------------------------------------------------------------------
# This data source will be updated interactively by the widget
gals = {
    'x': data_df['Ms'].astype('float'),
    'y': data_df['logMH2_logMs'].astype('float'),
    'value_to_filter': data_df['SN'].astype('float') # This is the column you'll filter by
}
source = ColumnDataSource(data=gals)

#------------------------------------------------------------------------------------
# (UPDATED) Create figure and plot the data
#------------------------------------------------------------------------------------
# Create the figure
p = figure(width=600, height=400, x_range=(9.9, 11.5), y_range=(-2.6, -0.5),title="Scatter Plot with Range Slider Filter")
p.yaxis.axis_label = 'log(MH_2/M_star)'
p.xaxis.axis_label = 'log(M_star/M_sun)'
# Create the scatter with the colorbar
p.scatter('x','y', source=source, color = transform('value_to_filter', mapper), fill_alpha=0.8,line_alpha=0.6, hover_color="magenta", line_color='gray', size = 10)
p.add_layout(color_bar, 'right')
# Create and add the HoverTool
hover = HoverTool(tooltips=TOOLTIPS)
p.add_tools(hover)

#------------------------------------------------------------------------------------
# (NEW) Add widget
#------------------------------------------------------------------------------------
range_slider = RangeSlider(
    start = np.nanmin(data_df['SN'].astype('float')),
    end = np.nanmax(data_df['SN'].astype('float')),
    value = (np.nanmin(data_df['SN'].astype('float')), np.nanmax(data_df['SN'].astype('float'))), # Initial range
    step = 1,
    title = "Filter by Value"
)

#-----------------------------------------------------------------------------------------
# (NEW) Custom callback function. It will update the data every time we change the slider
#-----------------------------------------------------------------------------------------
# Keep an untouched copy of the full table. The callback always filters from this copy,
# so points reappear when the slider range is widened again
full_source = ColumnDataSource(data=gals)

callback = CustomJS(args=dict(source=source, full_source=full_source, slider=range_slider), code="""
    const gals = full_source.data;
    const x = gals['x'];
    const y = gals['y'];
    const value_to_filter = gals['value_to_filter'];
    const [start, end] = slider.value;

    const new_x = [];
    const new_y = [];
    const new_value_to_filter = [];

    // Keep only the rows whose S/N lies inside the selected range
    for (let i = 0; i < value_to_filter.length; i++) {
        if (value_to_filter[i] >= start && value_to_filter[i] <= end) {
            new_x.push(x[i]);
            new_y.push(y[i]);
            new_value_to_filter.push(value_to_filter[i]);
        }
    }

    // Assigning a new object to source.data makes Bokeh redraw the plot
    source.data = {
        x: new_x,
        y: new_y,
        value_to_filter: new_value_to_filter
    };
""")

range_slider.js_on_change('value', callback)

#------------------------------------------------------------------------------------
# (UPDATED) Display
#------------------------------------------------------------------------------------
layout = column(range_slider, p)
show(layout)
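
The loop inside the JavaScript callback amounts to a simple row selection. Written in plain Python, purely to make the logic easy to check, it looks like the sketch below; the function name `filter_by_range` and the toy values are illustrative and not part of Bokeh.

```python
def filter_by_range(columns, key, start, end):
    """Keep the rows whose columns[key] lies in [start, end] (inclusive)."""
    keep = [i for i, v in enumerate(columns[key]) if start <= v <= end]
    return {name: [values[i] for i in keep] for name, values in columns.items()}


# Toy table with the same column names as the widget example
toy = {
    "x": [10.0, 10.4, 10.9, 11.2],
    "y": [-1.2, -1.5, -1.9, -2.1],
    "value_to_filter": [3.0, 8.0, float("nan"), 15.0],
}

filtered = filter_by_range(toy, "value_to_filter", 5, 20)
print(filtered["x"])  # [10.4, 11.2]
```

Rows with a missing (NaN) value never pass the comparison, in Python as in JavaScript, so they disappear from the plot as soon as the slider is moved.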

# 7. Next steps
Visit the [Bokeh homepage](https://bokeh.org), work through the advanced tutorials, and learn to create your own functions and websites powered by Bokeh!

If the documentation falls short, the best place to find help or examples for your ideas is the [Bokeh Community forum](https://discourse.bokeh.org).