# SMM635 - DV Topic 9 in Week 10 Term 1
 - Interactive visualizations for the web
 - Laboratory session on interactive visualizations (Bokeh)
 - Link to Week 11 Term 1: Case study # 2 (on interactive visualizations)

## (1) Interactive visualizations for the web
  - We have already learnt some powerful Python libraries for "static" data viz, such as Matplotlib and Seaborn.
  - Now we will look at libraries that let users interact with the data viz, such as Bokeh and Plotly.
  - In my own experience, this topic is the most useful thing that I learnt from Simone.
  - This field has been developing extremely fast; new and improved tools appear every year.
  - Basically, users want to play with the data viz (plots or dashboards) on the web via their smartphones/tablets.

### (1.a) Main tools (recently) in Python and/or R
  - Bokeh: https://bokeh.org/
  - Plotly: https://plotly.com/
  - Plotly's R graphing library: https://plotly.com/r/
  - Shiny (R): https://shiny.rstudio.com/
  - Shinydashboard (R): https://rstudio.github.io/shinydashboard/
  - R flexdashboard: https://pkgs.rstudio.com/flexdashboard/
  - rbokeh: https://hafen.github.io/rbokeh/
  - Python visualization landscape: https://pyviz.org/overviews/
  - Dashboarding tools: https://pyviz.org/dashboarding/index.html
  - See comparisons of these tools: https://towardsdatascience.com/are-dashboards-for-me-7f66502986b1
  

### (1.b) When considering "live" interactive data viz
  - Know your audience and users 
  - Know how often they will use/maintain the tools
  - Know your data and how fast it must be processed
  - Know the interconnected (Python and/or R) libraries
  - Know the requirements for data security
  - Of course, more and more users now explore data viz on smart devices
  - Perhaps Virtual Reality (VR) will reach interactive data viz very soon...


### (1.c) Examples
- On the libraries' websites
  - Bokeh demo: https://demo.bokeh.org/
  - Plotly demo: https://dash.gallery/dash-word-arithmetic/
  - Plotly in R: https://plotly.com/r/sunburst-charts/
  - Shiny in R: https://shiny.rstudio.com/gallery/crime-watch.html
- Relating to some of my research projects
  - GWAS: https://msesia.shinyapps.io/knockoffzoom/
  - Covid: https://my.locuszoom.org/gwas/881707/region/?chrom=3&start=45634967&end=46034967
  - A few other examples from R code...

### (1.d) Common implementation steps
  - Discuss with your project manager: answer all questions in (1.b) When considering "live" interactive data viz
    - If your main audience and users do not have specific programming experience;
    - If your main audience and users do not have specific smart device or platform;
    - If your main audience and users do not have specific internal datasets;
    - If the new data viz tools will be updated regularly;
    - If the new data viz tools will be integrated with other existing tools;
    - If the data is large and/or updated in real time;
    - If the methods require heavy computation and/or refresh the web page very frequently;
    - If the methods depend on other Python and/or R libraries that may not be updated regularly;
    - If the data is sensitive and should be kept private;
    - Remember, this field has been evolving extremely fast...
  - Decide a main programming language: Python or R
    - Depending on the "if conditions" above and your project team;
    - Think about possible interconnections with other main tools (e.g. Machine Learning);
  - Then, choose a suitable library...
    - Based on your research question or project goal, check existing plots online;
    - Compare your target plots with the examples in the graph gallery of the library;
    - Re-use the existing code in the library and modify it to fit your own purpose;
    - Note: it is important to use NumPy/SciPy/Pandas (in Python) or Tidyverse (in R) to clean/arrange your data.
  - Finally, consider these detailed items
    - User interface
    - Input widgets
    - Output panels
    - Server, hosting and deployment
    - etc.
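
The note above about cleaning and arranging the data before plotting can be sketched as follows. This is only a minimal illustration, assuming pandas and Bokeh are installed; the DataFrame `raw`, its column names, and its values are made up for the example and do not come from any of the course datasets.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# Hypothetical raw data with an inconsistent label and a missing value
raw = pd.DataFrame({
    "city": ["London", "london ", "Leeds", "Leeds"],
    "price": [500.0, 520.0, None, 220.0],
    "rooms": [2, 3, 2, 3],
})

# Clean: drop rows with a missing price, normalise the labels,
# and derive the column we actually want to plot
tidy = (
    raw.dropna(subset=["price"])
       .assign(city=lambda d: d["city"].str.strip().str.title(),
               price_per_room=lambda d: d["price"] / d["rooms"])
       .reset_index(drop=True)
)

# Wrap the tidy table so Bokeh glyphs can refer to its columns by name
source = ColumnDataSource(tidy)
print(source.column_names)
```

Once the data sits in a `ColumnDataSource`, glyphs, hover tools, and linked selections can all refer to the same named columns.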

## (2) Laboratory session on interactive visualizations (Bokeh)
  - Examples from the Bokeh gallery: http://docs.bokeh.org/en/latest/docs/gallery.html
  - Examples from the Bokeh demo: https://demo.bokeh.org/
  - Tutorials of Bokeh in mybinder.org (by googling "Bokeh and MyBinder"): https://mybinder.org/
  
  - Example 1: NLP Topic Modelling
  - Example 2: Cambridge Uni Project
  - Example 3: MBA case 1 Rightmove Project
  - Example 4: MBA case 2 ESG (Environmental, Social, and Governance) Project

### (2.a)--- A scatter plot using Fisher’s Iris dataset.
  - Reference: http://docs.bokeh.org/en/latest/docs/gallery/iris.html

In [3]:
# Check that all of the following libraries are installed in your virtual environment
# Bokeh
# Numpy
# Pandas
# Matplotlib
# Seaborn
# Scikit-learn
# (optional) Sometimes you may need the 'xlrd' package to read Excel files

In [4]:
# Dataset
from bokeh.sampledata.iris import flowers

flowers # data in the bokeh.sampledata.iris

Unnamed: 0,sepal_length,sepal_width,petal_length,petal_width,species
0,5.1,3.5,1.4,0.2,setosa
1,4.9,3.0,1.4,0.2,setosa
2,4.7,3.2,1.3,0.2,setosa
3,4.6,3.1,1.5,0.2,setosa
4,5.0,3.6,1.4,0.2,setosa
...,...,...,...,...,...
145,6.7,3.0,5.2,2.3,virginica
146,6.3,2.5,5.0,1.9,virginica
147,6.5,3.0,5.2,2.0,virginica
148,6.2,3.4,5.4,2.3,virginica


In [5]:
# Standalone example: refresh the notebook
# Install the usual Python libraries, such as NumPy, Pandas, Bokeh
from bokeh.plotting import figure, show
from bokeh.sampledata.iris import flowers

colormap = {'setosa': 'red', 'versicolor': 'green', 'virginica': 'blue'}
colors = [colormap[x] for x in flowers['species']]

p = figure(title="Iris Morphology")
p.xaxis.axis_label = 'Petal Length'
p.yaxis.axis_label = 'Petal Width'

p.scatter(flowers["petal_length"], flowers["petal_width"],
          color=colors, fill_alpha=0.2, size=10)

show(p) # a standalone HTML page should open in a new browser tab

In [6]:
# Display in the notebook directly
from bokeh.io import output_notebook, show
from bokeh.plotting import figure
from bokeh.sampledata.iris import flowers

output_notebook()

colormap = {'setosa': 'red', 'versicolor': 'green', 'virginica': 'blue'}
colors = [colormap[x] for x in flowers['species']]

p_1 = figure(title="Iris Morphology")
p_1.xaxis.axis_label = 'Petal Length'
p_1.yaxis.axis_label = 'Petal Width'

p_1.scatter(flowers["petal_length"], flowers["petal_width"],
            color=colors, fill_alpha=0.2, size=10)

show(p_1)
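
The cells above build the colour list by hand with a Python dictionary. As an alternative sketch, Bokeh's `factor_cmap` can do the species-to-colour mapping itself when the data is held in a `ColumnDataSource`. To keep the example self-contained, it uses only a few rows copied from the table printed above (so no versicolor rows appear); the names `p_cmap`, `species`, and `palette` are my own choices for illustration.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.transform import factor_cmap

# A few rows copied from the table above; swap in the full `flowers`
# DataFrame when bokeh.sampledata is available.
source = ColumnDataSource(data=dict(
    petal_length=[1.4, 1.4, 1.3, 5.2, 5.0],
    petal_width=[0.2, 0.2, 0.2, 2.3, 1.9],
    species=["setosa", "setosa", "setosa", "virginica", "virginica"],
))

species = ["setosa", "versicolor", "virginica"]
palette = ["red", "green", "blue"]

p_cmap = figure(title="Iris Morphology", width=400, height=300)
p_cmap.xaxis.axis_label = "Petal Length"
p_cmap.yaxis.axis_label = "Petal Width"

# factor_cmap maps each category in the 'species' column to a palette colour
renderer = p_cmap.scatter(
    "petal_length", "petal_width", source=source,
    color=factor_cmap("species", palette=palette, factors=species),
    fill_alpha=0.2, size=10, legend_field="species",
)

# show(p_cmap)  # uncomment in a notebook after output_notebook()
```

Because the mapping lives in the plot rather than in a Python list, `legend_field="species"` can also build a legend from the same column.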

### (2.b)--- Interactive hover text

In [7]:
# Plot a complex chart with interactive hover in a few lines of code

from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure
from bokeh.io import output_notebook, show
from bokeh.sampledata.iris import flowers

output_notebook()

colormap = {'setosa': 'red', 'versicolor': 'green', 'virginica': 'blue'}
colors = [colormap[x] for x in flowers['species']]

p_1 = figure(title="Iris Morphology")
p_1.xaxis.axis_label = 'Petal Length'
p_1.yaxis.axis_label = 'Petal Width'

p_1.scatter(flowers["petal_length"], flowers["petal_width"],
            color=colors, fill_alpha=0.2, size=10)

p_1.add_tools(HoverTool(tooltips=[("Petal Length", "@x"), ("Petal Width", "@y")])) # check this line carefully

show(p_1)
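
In the cell above, the tooltips use `@x` and `@y` because, as far as I understand, Bokeh names the columns after the glyph arguments when the data is passed in directly. The sketch below shows the other common pattern: put the data in a `ColumnDataSource` (imported above but not used) and refer to the columns by their own names, which also lets the tooltip show the species. The rows are copied from the table in (2.a); `p_hover` and the `color` column are names chosen for this illustration.

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# A few rows from the iris table in (2.a), plus a precomputed colour column
source = ColumnDataSource(data=dict(
    petal_length=[1.4, 1.4, 1.3, 5.2, 5.0],
    petal_width=[0.2, 0.2, 0.2, 2.3, 1.9],
    species=["setosa", "setosa", "setosa", "virginica", "virginica"],
    color=["red", "red", "red", "blue", "blue"],
))

p_hover = figure(title="Iris Morphology", width=400, height=300,
                 tools="pan,wheel_zoom,reset")

# String arguments refer to column names in the shared source
p_hover.scatter("petal_length", "petal_width", source=source,
                color="color", fill_alpha=0.2, size=10)

# "@name" looks up the hovered point's value in the column called `name`;
# the {0.0} suffix formats the number with one decimal place
hover = HoverTool(tooltips=[
    ("Species", "@species"),
    ("Petal Length", "@petal_length{0.0}"),
    ("Petal Width", "@petal_width{0.0}"),
])
p_hover.add_tools(hover)

# show(p_hover)  # uncomment in a notebook after output_notebook()
```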

In [8]:
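import bokeh.sampledata
# Note: recent Bokeh releases ship the sample data in the separate `bokeh_sampledata` package,
# in which case this download step may be unnecessary.
bokeh.sampledata.download()
from bokeh.io import show
from bokeh.models.tools import HoverTool
from bokeh.plotting import figure
from bokeh.sampledata.glucose import data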

subset = data.loc['2010-10-06']

x, y = subset.index.to_series(), subset['glucose']

# Basic plot setup
p_2 = figure(width=600, height=300, x_axis_type="datetime", title='Hover over points')

p_2.line(x, y, line_dash="4 4", line_width=1, color='gray')

cr = p_2.circle(x, y, size=20,
              fill_color="grey", hover_fill_color="firebrick",
              fill_alpha=0.05, hover_alpha=0.3,
              line_color=None, hover_line_color="white")

p_2.add_tools(HoverTool(tooltips=None, renderers=[cr], mode='hline'))

show(p_2)

Creating C:\Users\Mattheus\.bokeh directory
Creating C:\Users\Mattheus\.bokeh\data directory
Using data directory: C:\Users\Mattheus\.bokeh\data
Downloading: CGM.csv (1589982 bytes)
   1589982 [100.00%]
Downloading: US_Counties.zip (3171836 bytes)
   3171836 [100.00%]
Unpacking: US_Counties.csv
Downloading: us_cities.json (713565 bytes)
    713565 [100.00%]
Downloading: unemployment09.csv (253301 bytes)
    253301 [100.00%]
Downloading: AAPL.csv (166698 bytes)
    166698 [100.00%]
Downloading: FB.csv (9706 bytes)
      9706 [100.00%]
Downloading: GOOG.csv (113894 bytes)
    113894 [100.00%]
Downloading: IBM.csv (165625 bytes)
    165625 [100.00%]
Downloading: MSFT.csv (161614 bytes)
    161614 [100.00%]
Downloading: WPP2012_SA_DB03_POPULATION_QUINQUENNIAL.zip (4816256 bytes)
   4816256 [100.00%]
Unpacking: WPP2012_SA_DB03_POPULATION_QUINQUENNIAL.csv
Downloading: gapminder_fertility.csv (64346 bytes)
     64346 [100.00%]
Downloading: gapminder_population.csv (94509 bytes)
     94509 [10

### (2.c)--- Adjust size or plot_options

In [9]:
# Plot a complex chart with interactive hover in a few lines of code
# Adjust top right toolbox

from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure
from bokeh.io import output_notebook, show
from bokeh.sampledata.iris import flowers

output_notebook()

plot_options = dict(width=350, height=350, tools='pan, wheel_zoom, reset')
# https://docs.bokeh.org/en/latest/docs/user_guide/tools.html 

colormap = {'setosa': 'red', 'versicolor': 'green', 'virginica': 'blue'}
colors = [colormap[x] for x in flowers['species']]

p_1 = figure(title="Iris Morphology", **plot_options)
p_1.xaxis.axis_label = 'Petal Length'
p_1.yaxis.axis_label = 'Petal Width'

p_1.scatter(flowers["petal_length"], flowers["petal_width"],
            color=colors, fill_alpha=0.2, size=10)

p_1.add_tools(HoverTool(tooltips=[("Petal Length", "@x"), ("Petal Width", "@y")])) # check this line carefully

show(p_1)

In [10]:
# Exporting plots in other figure formats
# https://docs.bokeh.org/en/latest/docs/user_guide/export.html
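
The export page linked above covers several output formats. As a minimal sketch of the simplest one, the code below writes a plot to a standalone HTML file with `bokeh.io.save`. The plot, the file name `export_example.html`, and the temporary directory are all hypothetical choices for this illustration.

```python
import os
import tempfile

from bokeh.io import save
from bokeh.plotting import figure
from bokeh.resources import CDN

p_export = figure(title="Export example", width=300, height=300)
p_export.line([1, 2, 3, 4, 5], [6, 7, 2, 4, 5], line_width=3)

# Hypothetical output location; any writable path works
out_path = os.path.join(tempfile.mkdtemp(), "export_example.html")

# Write a standalone HTML file that loads BokehJS from the CDN
saved = save(p_export, filename=out_path, resources=CDN, title="Export example")
print(saved)
```

For static images, Bokeh also provides `export_png` in `bokeh.io`, but it needs extra dependencies (Selenium and a browser driver), so check the export page before relying on it.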

In [11]:
# Combine different plots together
from bokeh.layouts import gridplot

p = gridplot([[p_2, p_1]])

show(p)

### (2.d)--- Linking selections

In [12]:
# Linking selections is accomplished in a similar way, by sharing data sources between plots. 
from bokeh.models import ColumnDataSource

x = list(range(-20, 21))
y0, y1 = [abs(xx) for xx in x], [xx**2 for xx in x]

# create a column data source for the plots to share
source = ColumnDataSource(data=dict(x=x, y0=y0, y1=y1))

TOOLS = "box_select, lasso_select, help"

# create a new plot and add a renderer
left = figure(tools=TOOLS, width=300, height=300)
left.circle('x', 'y0', source=source)

# create another new plot and add a renderer
right = figure(tools=TOOLS, width=300, height=300)
right.circle('x', 'y1', source=source)

p = gridplot([[left, right]])

show(p)
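
Sharing a data source links the selections. A related technique, sketched here under my own variable names (`left_pan`, `right_pan`), is linked panning: if two figures share the same range object, panning or zooming one moves the other as well.

```python
from bokeh.layouts import gridplot
from bokeh.plotting import figure

x = list(range(-20, 21))
y0 = [abs(xx) for xx in x]
y1 = [xx**2 for xx in x]

left_pan = figure(tools="pan,wheel_zoom,reset", width=300, height=300)
left_pan.scatter(x, y0, size=6)

# Reuse the left plot's x_range object so both plots pan and zoom together in x
right_pan = figure(tools="pan,wheel_zoom,reset", width=300, height=300,
                   x_range=left_pan.x_range)
right_pan.scatter(x, y1, size=6)

linked = gridplot([[left_pan, right_pan]])

# show(linked)  # uncomment in a notebook after output_notebook()
```

Only the x-axis is shared here, so each plot keeps its own y scale; passing `y_range=left_pan.y_range` as well would link both axes.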

### (2.e)--- Control widgets

In [13]:
# Control widget: https://docs.bokeh.org/en/latest/docs/user_guide/interaction/widgets.html
from bokeh.models.widgets import Slider

slider = Slider(start=0, end=10, value=1, step=.1, title="foo")

show(slider)

In [14]:
# Slider widget example (NOTE: CustomJS for Property changes)
from bokeh.layouts import column
from bokeh.models import CustomJS, ColumnDataSource, Slider

x = [x*0.005 for x in range(0, 201)]

source = ColumnDataSource(data=dict(x=x, y=x))

plot = figure(width=400, height=400)
plot.line('x', 'y', source=source, line_width=3, line_alpha=0.6)

slider = Slider(start=0.1, end=6, value=1, step=.1, title="power")

update_curve = CustomJS(args=dict(source=source, slider=slider), code="""
    var data = source.data;
    var f = slider.value;
    var x = data['x']
    var y = data['y']
    for (var i = 0; i < x.length; i++) {
        y[i] = Math.pow(x[i], f)
    }
    
    // necessary because we mutated source.data in-place
    source.change.emit();
""")
slider.js_on_change('value', update_curve)

show(column(slider, plot))

In [15]:
# Tab panes allow multiple plots or layouts to be shown in selectable tabs:
# Reference: https://docs.bokeh.org/en/latest/docs/user_guide/interaction/widgets.html
from bokeh.io import show
from bokeh.models import Panel, Tabs
from bokeh.plotting import figure

p1 = figure(width=300, height=300)
p1.circle([1, 2, 3, 4, 5], [6, 7, 2, 4, 5], size=20, color="navy", alpha=0.5)
tab1 = Panel(child=p1, title="circle")

p2 = figure(width=300, height=300)
p2.line([1, 2, 3, 4, 5], [6, 7, 2, 4, 5], line_width=3, color="navy", alpha=0.5)
tab2 = Panel(child=p2, title="line")

show(Tabs(tabs=[tab1, tab2]))
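
One caveat about the cell above: to my knowledge, `Panel` was renamed to `TabPanel` in Bokeh 3.x, so the import may fail on newer installations. The sketch below is one possible way to write the same tabs example so that it runs on either version; the fallback import is my own workaround, not something taken from the Bokeh documentation.

```python
from bokeh.models import Tabs
from bokeh.plotting import figure

# Assumption: Bokeh 3.x exposes TabPanel, while Bokeh 2.x exposes Panel
try:
    from bokeh.models import TabPanel
except ImportError:
    from bokeh.models import Panel as TabPanel

p_a = figure(width=300, height=300)
p_a.scatter([1, 2, 3, 4, 5], [6, 7, 2, 4, 5], size=20, color="navy", alpha=0.5)

p_b = figure(width=300, height=300)
p_b.line([1, 2, 3, 4, 5], [6, 7, 2, 4, 5], line_width=3, color="navy", alpha=0.5)

tabs = Tabs(tabs=[
    TabPanel(child=p_a, title="circle"),
    TabPanel(child=p_b, title="line"),
])

# show(tabs)  # uncomment in a notebook after output_notebook()
```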

In [16]:
# A numeric spinner widget:
import numpy as np

from bokeh.io import show
from bokeh.layouts import column, row
from bokeh.models import Spinner
from bokeh.plotting import figure

x = np.random.rand(10)
y = np.random.rand(10)

p = figure(x_range=(0, 1), y_range=(0, 1))
points = p.scatter(x=x, y=y, size=4)

spinner = Spinner(title="Glyph size", low=1, high=40, step=0.5, value=4, width=80)
spinner.js_link('value', points.glyph, 'size')

show(row(column(spinner, width=100), p))
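
`js_link` in the cell above is a shortcut for "copy this property onto that one whenever it changes". To make that explicit, the sketch below does the same job with a hand-written `CustomJS` callback, in the style of the slider example. The fixed point coordinates and the names `p_link` and `copy_size` are made up for the illustration.

```python
from bokeh.layouts import column, row
from bokeh.models import CustomJS, Spinner
from bokeh.plotting import figure

# Fixed illustrative points instead of random ones
x = [0.1, 0.3, 0.5, 0.7, 0.9]
y = [0.8, 0.2, 0.6, 0.4, 0.5]

p_link = figure(x_range=(0, 1), y_range=(0, 1), width=300, height=300)
points = p_link.scatter(x=x, y=y, size=4)

spinner = Spinner(title="Glyph size", low=1, high=40, step=0.5, value=4, width=80)

# Explicit callback: copy the spinner's new value onto the glyph's size.
# cb_obj is the model that triggered the change (here, the spinner).
copy_size = CustomJS(args=dict(glyph=points.glyph),
                     code="glyph.size = cb_obj.value")
spinner.js_on_change("value", copy_size)

layout = row(column(spinner, width=100), p_link)

# show(layout)  # uncomment in a notebook after output_notebook()
```

The explicit form is longer, but it is the one to reach for when the update needs some logic (for example, scaling the value) rather than a straight copy.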

### (2.f)--- Reference and more detailed tutorials

In [17]:
# Tutorials of Bokeh on mybinder.org (found by googling "Bokeh and MyBinder Tutorials"): https://mybinder.org/
# Example session URL (session-specific, so it may no longer resolve):
# https://hub.gke2.mybinder.org/user/bokeh-bokeh-notebooks-p7plluni/notebooks/tutorial/00%20-%20Introduction%20and%20Setup.ipynb


## (3) Link to Week 11 Term 1: Case study # 2 (on interactive visualizations)
  - Case study 2: https://github.com/simoneSantoni/data-viz-smm635/tree/master/caseStudies/inequalityInCreativeIndustries