# <span style="color:purple"> MATH 210- Project 1</span>

## Interactive Data Visualization with `Bokeh`

## Contents:
1. Introduction
2. Set up
    * importing
3. Basic plots
    * scatter plots
    * line plots
4. Implementing ipywidgets
    * sliders
    * drop-down menu
5. Weather analysis
    * weather patterns for two Canadian cities

## 1. Introduction

**`Bokeh`** is one of the leading Python libraries for *interactive data visualization* (see [documentation](http://bokeh.pydata.org/en/latest/docs/user_guide.html)). Bokeh offers three interfaces for plotting data, listed here from the lowest to the highest level of abstraction:

* **bokeh.models**: low-level
* **bokeh.plotting**: intermediate-level
* **bokeh.charts**: high level

In this project I will focus on the intermediate interface `bokeh.plotting`, which allows us to create interactive plots with more customization than the high-level `bokeh.charts` interface and with less boilerplate than the low-level `bokeh.models` interface.

My goal in this tutorial is to explore the different types of plots Bokeh can create within the `bokeh.plotting` interface and what interactive tools you can easily add to make the plots interesting. Specifically I will explore the basics of creating scatter and line plots and then use these, along with various widgets, to plot Canadian weather data.

## 2. Set Up

Just like with any Python package, we must first import it in order to access all of its glorious content. The most common imports for `bokeh.plotting` are:

* `figure`: creates the figure we will plot the data in
* `output_notebook`: tells Bokeh to display output inside of the notebook
* `push_notebook`: allows us to update the figure with new data
* `show`: displays the figure

In [1]:
import numpy as np
from bokeh.io import output_notebook, push_notebook 
from bokeh.plotting import figure, show

Since we are working in a Jupyter notebook, we must run `output_notebook()` once for our plots to appear inline inside the notebook. 

When imported correctly, the message **BokehJS successfully loaded** should appear.

In [2]:
output_notebook()

## 3. Basic Plots
Now that we have Bokeh loaded in our notebook, it's time to start plotting!

When we want to plot something, it's a good idea to start by creating the *figure*. The `figure` command we imported earlier has default arguments that we can customize by passing new values in the call. Running `help(figure)` in a cell will show a comprehensive list of all the arguments `figure` takes. Some of these arguments include `title`, `plot_height`, `plot_width`, `x_axis_type`, and `tools`.

The `tools` argument is what gives the plots in Bokeh their interactivity. The basic tools are:

* `pan`
* `wheel_zoom`
* `box_zoom`
* `resize`

The separate `toolbar_location` argument of `figure` controls where the toolbar is drawn.

If we do not specify `tools` in the figure command, the default toolbar, which includes pan, box zoom, resize, and wheel zoom, appears in the upper right corner of the plot.
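
As a small sketch of how the `tools` argument is passed (the title and the particular tool selection are illustrative choices, not taken from the examples in this tutorial), a comma-separated string of tool names can be given to `figure`, and `toolbar_location` moves the toolbar:

```python
from bokeh.plotting import figure

# Illustrative figure: the title and the choice of tools are assumptions for this sketch.
p_tools = figure(
    title="Custom toolbar",
    tools="pan,wheel_zoom,box_zoom,reset",  # only these tools appear in the toolbar
    toolbar_location="left",                # e.g. "above", "below", "left" or "right"
)

# Each tool name becomes a tool object attached to the figure's toolbar.
tool_names = [type(tool).__name__ for tool in p_tools.toolbar.tools]
print(tool_names)
```

Listing the tool objects this way is only a quick check that the string was understood; in a notebook you would normally just call `show(p_tools)` and look at the toolbar.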

### Scatter plots

One of the simplest plots we can create is a scatter plot, so let's start with that. We will generate random integer x and y data using `np.random.randint` and plot it in the default figure.

In [3]:
p = figure(title="Scatter Plot 1") # creates the figure

x = np.random.randint(0,20,50) # x data points
y = np.random.randint(0,20,50) # y data points
p.circle(x,y) # plotting data using the marker type circle

show(p); # plots the figure inside notebook

As you can see, this plot is relatively straightforward and somewhat boring. Let's step it up a little and create another scatter plot with the same random data points but with some more interesting features, such as a customized toolbar, labeled axes and randomly sized points:

In [4]:
# creating figure with a specified toolbar
p = figure(plot_height=400, plot_width=400, tools='pan,box_zoom,hover,reset,tap')

# customized titles, sizes, axes
# ->remember these could also be passed inside the figure command above without the p.
p.title="Scatter Plot 2"
p.title_text_font="times"
p.xaxis.axis_label="X"
p.yaxis.axis_label="Y"
p.xaxis.minor_tick_line_color=None
p.yaxis.minor_tick_line_color=None
radii = np.random.random(size=len(x)) * 1.5 # one random radius per data point

p.circle(x,y,color="teal",radius=radii,alpha=0.5)

show(p);

In the last two examples I have been using the marker `p.circle`. In Bokeh, a marker is the shape each data point is rendered as. The following are all markers that are commonly used:

* `circle()`
* `diamond()`
* `cross()`
* `square()`
* `x()`
* `triangle()`

**Note**: Be aware that these markers may take different arguments than the `circle` marker we have been using thus far. For example, `cross` may take x and y data as well as a size and an angle.

### Line plots

Line plots are created just like scatter plots, but instead of a marker they use the glyph method `p.line`, which draws a line connecting all the data points.

In [5]:
x = [1,2,3,4,5]
y = [10,2,8,6,9]

p = figure(plot_height=400, plot_width=400)

p.title="Line Plot"
p.xaxis.axis_label="X"
p.yaxis.axis_label="Y"

p.line(x,y,color='purple') # calls the glyph method `line` to plot a line connecting the data points

show(p);

Let's create a line plot of $\sin(x)$ and $\cos(x)$ on the same plot with a legend:

In [6]:
x = np.linspace(-8*np.pi,8*np.pi,1000)
y = np.sin(x)
z = np.cos(x)

p = figure(title="Trigonometric Functions",x_range=(-2*np.pi,2*np.pi),y_range=(-1.5,1.5),
          plot_height=300, plot_width=500)

p.line(x,y,color="blue",legend="sin(x)",line_dash=[4,4])
p.line(x,z,color="red",legend="cos(x)")

p.legend.location="bottom_right" # top_right is the default legend location

show(p);

## 4. Implementing `ipywidgets`

There is another Python package called **ipywidgets** that connects seamlessly to Bokeh. It allows us to add a variety of interactive controls that change the parameters of our functions (see [documentation](https://ipywidgets.readthedocs.io/en/latest/examples/Widget%20List.html#Complete-list)).

For now I will import the `interact` function and explore the slider and drop-down menu widgets.

To import this section of the package:

In [7]:
import ipywidgets as widgets
from ipywidgets import interact

### Sliders

For this example I will plot the [torus knot](https://en.wikipedia.org/wiki/Trefoil_knot) with two changing parameters $a$ and $b$:

$$
\begin{aligned}
x(t) &= (2+\cos(at))\cos(bt) \\
y(t) &= (2+\cos(at))\sin(bt)
\end{aligned}
$$

To use the slider, we create a plot using the Bokeh commands we learned previously, but wrapped in a function definition that takes the input parameters $a$ and $b$. Outside this function is where we use `interact`.

`interact` takes the function we have defined and the two changing parameter variables. This is where we define what their values may be by using `widgets.IntSlider`, which takes a minimum value, a maximum value, a step size and an initial value.

In this example the slider for $a$ is set to `min=0, max=10, step=1, value=3`.
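
Before wiring up the sliders, it can help to check the parametrization on its own. The following is a minimal sketch using only NumPy; the helper name `torus_knot_xy` and the variable names are made up for this illustration. Since $x^2 + y^2 = (2+\cos(at))^2$, every point of the curve should lie between distance 1 and distance 3 from the origin, whatever values the sliders take.

```python
import numpy as np

def torus_knot_xy(a, b, n=1000):
    """Return x(t), y(t) of the curve above for t in [0, 2*pi] (hypothetical helper)."""
    t = np.linspace(0, 2 * np.pi, n)
    r = 2 + np.cos(a * t)  # distance from the origin, always between 1 and 3
    return r * np.cos(b * t), r * np.sin(b * t)

# Same initial slider values as in this section: a=3, b=2.
x_knot, y_knot = torus_knot_xy(a=3, b=2)

# Distance of every sampled point from the origin.
dist = np.hypot(x_knot, y_knot)
print(dist.min(), dist.max())
```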

In [8]:
# creates the initial figure by defining a function that takes the changing parameters a and b
def torus(a,b):
    t = np.linspace(0,2*np.pi,1000)
    
    x = (2+np.cos(a*t))*np.cos(b*t)
    y = (2+np.cos(a*t))*np.sin(b*t)

    p=figure(plot_height=400, plot_width=400)
    
    p.title="Torus Knot with Changing Parameters"
    p.title_text_font_size="10pt"
    p.ygrid.grid_line_dash = [6, 4]
    p.xgrid.grid_line_dash = [6, 4]

    p.line(x,y,color="green", line_width=3)
    show(p);

# call interact with the function defined above and create sliders for a and b
interact(torus, a=widgets.IntSlider(min=0,max=10,step=1,value=3),
         b=widgets.IntSlider(min=0,max=10,step=1,value=2));

### Drop-down menu

The drop-down menu in `ipywidgets` allows you to change the function being plotted. The `update` function defined below changes the curve that appears in the plot, depending on what is selected in the drop-down menu. In this example, I will also add a parameter $a$ that can be changed using a slider.

In [18]:
x = np.linspace(0,10,1000)
y = np.sin(x) # the function that initially displays

p = figure(plot_height=300, plot_width=400, title="Sample Curves")
p.xaxis.axis_label="X"
p.yaxis.axis_label="Y"

r = p.line(x, y, line_width=3, color="indigo")


# create a function definition to change the display based on what is chosen
def update(f,a=1): 
    if   f == "Sine":
        func = np.sin
    elif f == "Trigonometric curve 1":
        def curve(x):  # Can define functions within update to plot more interesting functions
            return np.cos(-x) * np.cos(a/10*np.pi*x)
        func = curve
    elif f == "Trigonometric curve 2":
        def curve_2(x):
            return (2+np.cos(a*x))*np.sin(x)
        func = curve_2
    r.data_source.data['y'] = func(a*x)
    push_notebook()
    
show(p);

In [19]:
# call the ipywidgets command `interact` with the function `update`, the possible strings for
# the drop-down menu, and the possible values for the slider `a`
interact(update, f=["Sine", "Trigonometric curve 1", "Trigonometric curve 2"], 
         a=widgets.IntSlider(min=1,max=6,step=1,value=1));
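
As an alternative to the `if`/`elif` chain in `update`, the menu labels could be kept in a lookup table. This is only a sketch: the names `CURVES`, `evaluate_curve`, `x_demo` and `y_demo` are made up here, and the sketch applies `a` once inside each formula, which differs slightly from the cell above, where the selected curve is also evaluated at `a*x`.

```python
import numpy as np

# Hypothetical lookup table: each menu label maps to a function of (x, a).
CURVES = {
    "Sine": lambda x, a: np.sin(a * x),
    "Trigonometric curve 1": lambda x, a: np.cos(-x) * np.cos(a / 10 * np.pi * x),
    "Trigonometric curve 2": lambda x, a: (2 + np.cos(a * x)) * np.sin(x),
}

def evaluate_curve(label, x, a=1):
    """Return the y values for the curve chosen in the drop-down menu."""
    return CURVES[label](x, a)

x_demo = np.linspace(0, 10, 1000)
y_demo = evaluate_curve("Trigonometric curve 2", x_demo, a=3)
```

Inside `update`, the result could be assigned to `r.data_source.data['y']` before calling `push_notebook()`, and `list(CURVES)` could serve as the list of menu options passed to `interact`. Adding a new curve then only requires a new dictionary entry.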

## 5. Weather Analysis


The following weather pattern examples take data available from Statistics Canada for the two cities [Calgary, AB](http://climate.weather.gc.ca/climate_data/daily_data_e.html?timeframe=2&Year=2015&Month=1&Day=19&hlyRange=2008-12-22%7C2017-03-18&dlyRange=1999-05-01%7C2017-03-18&mlyRange=2000-06-01%7C2007-11-01&StationID=27211&Prov=AB&urlExtension=_e.html&searchType=stnName&optLimit=specDate&StartYear=1840&EndYear=2017&selRowPerPage=25&Line=0&searchMethod=contains&txtStationName=Calgary) and
[Vancouver, BC](http://climate.weather.gc.ca/climate_data/daily_data_e.html?timeframe=2&Year=2015&Month=1&Day=1&hlyRange=2013-06-11%7C2017-03-19&dlyRange=2013-06-13%7C2017-03-19&mlyRange=%7C&StationID=51442&Prov=BC&urlExtension=_e.html&searchType=stnName&optLimit=specDate&StartYear=1840&EndYear=2017&selRowPerPage=25&Line=2&searchMethod=contains&txtStationName=Vancouver). These examples make use of the built-in Bokeh axis type "datetime" to render the x axis with the proper format, which changes from month to day as you zoom in.

I will use **Pandas** to read the data from a csv file saved as a separate Jupyter text file. In order to use Pandas we must import it as follows:

In [11]:
import pandas as pd

### Temperatures for Calgary, AB

For this first example I will plot the maximum and minimum temperatures in Calgary.

In [12]:
# create a data frame using pandas to read the csv file that contains our data
# `pd.read_csv` takes the csv file and (when plotting with dates) a `parse_dates` argument
#  that takes the names of the columns containing the dates in the form YYYY-MM-DD
df1=pd.read_csv("calgary_2015.csv", parse_dates=["Date/Time"])

t=figure(plot_width = 700, plot_height=400, x_axis_type="datetime")
t.title="Temperature Extremes for Calgary, AB - 2015"
t.title_text_font_size="10pt"
t.xaxis.axis_label="Date"
t.yaxis.axis_label="Temperature (C)"

# when plotting, the x and y values are given within the csv file. In order to access them we
# need to call them from the data frame previously defined. 
# the x values correspond to the column titled "Date/Time"
# the y values correspond to the column titled "Min Temp (C)" or "Max Temp (C)"
t.line(df1["Date/Time"],df1["Min Temp (C)"],color="blue",legend="Minimum Temperature(C)")
t.line(df1["Date/Time"],df1["Max Temp (C)"],color="red",legend="Maximum Temperature(C)")

show(t);
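
To see what `parse_dates` does without needing the real csv file, here is a minimal sketch that reads a few rows from an in-memory string. The temperature values are made up purely for illustration, and the names `csv_text` and `df_demo` are hypothetical; only the column names follow the ones used in the cell above.

```python
import io
import pandas as pd

# Made-up rows for illustration only; the real file has many more rows and columns.
csv_text = io.StringIO(
    "Date/Time,Max Temp (C),Min Temp (C)\n"
    "2015-01-01,1.5,-8.0\n"
    "2015-01-02,-3.2,-12.4\n"
    "2015-01-03,0.4,-9.9\n"
)

df_demo = pd.read_csv(csv_text, parse_dates=["Date/Time"])

# The date column is now a datetime column rather than plain strings,
# which is what lets a datetime x axis format its ticks properly.
print(df_demo.dtypes)
print(df_demo["Date/Time"].dt.month.tolist())
```

Without `parse_dates`, the "Date/Time" column would be read as ordinary text and the `.dt` accessor used in the last line would not be available.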

Now let's use the same data but incorporate the interact drop-down menu to add some interactivity. The drop-down menu will change the temperature data being plotted.

In [13]:
y = df1["Max Temp (C)"] # initialize the data that will display in the figure before we run
                        # the `interact` function

p = figure(plot_width = 700, plot_height=400,title="Temperatures in Calgary, AB - 2015",
           x_axis_type="datetime")
p.xaxis.axis_label="Date"
p.yaxis.axis_label="Temperature (C)"

r = p.line(df1["Date/Time"], y)


# create a function definition to change the temperatures based on what is chosen in the
# drop-down menu
def update(Data): 
    if   Data == "Daily Low":
        func = df1["Min Temp (C)"]
    elif Data == "Daily High":
        func = df1["Max Temp (C)"]
    elif Data == "Daily Average":
        func = df1["Mean Temp (C)"]
    r.data_source.data['y'] = func # redefine the variable y
    push_notebook()
    
show(p);

In [14]:
interact(update, Data=["Daily High","Daily Low","Daily Average"]);

### Precipitation for Vancouver, BC

Since Vancouver is known for its rainy climate, let's create a precipitation plot!

In [15]:
df2=pd.read_csv("vancouver_2015.csv", parse_dates=["Date/Time"])

t=figure(plot_width = 700, plot_height=400, x_axis_type="datetime")
t.title="Precipitation in Vancouver, BC - 2015"
t.xaxis.axis_label="Date"
t.yaxis.axis_label="Precipitation (mm)"

t.line(df2["Date/Time"],df2["Total Precip (mm)"])

show(t);

Now that we know how to plot this data, let's plot the total precipitation for Vancouver and Calgary on the same plot to compare.

In [16]:
r = figure(plot_width = 700, plot_height=400, x_axis_type="datetime")

r.title="Total Precipitation for Two Canadian Cities"
r.xaxis.axis_label="Date"
r.yaxis.axis_label="Precipitation (mm)"

r.line(df1["Date/Time"],df1["Total Precip (mm)"], color="red", legend="Calgary")
r.line(df2["Date/Time"],df2["Total Precip (mm)"], color="blue", legend="Vancouver")
r.legend.location="top_center"

show(r);

From this plot we can clearly see the trends in precipitation patterns, with Calgary getting more rain in the summer and Vancouver getting the majority of its rain from November to April.
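
One way to back up a visual impression like this with numbers is to add up the daily precipitation by month. The following is a rough sketch, not part of the original analysis: the helper `monthly_totals` is hypothetical, and the small data frame contains made-up values so that the example runs without the csv files.

```python
import pandas as pd

def monthly_totals(df, date_col="Date/Time", value_col="Total Precip (mm)"):
    """Sum daily precipitation by calendar month (hypothetical helper)."""
    return df.groupby(df[date_col].dt.month)[value_col].sum()

# Made-up daily values for illustration only, not real measurements.
demo = pd.DataFrame({
    "Date/Time": pd.to_datetime(["2015-01-01", "2015-01-02", "2015-07-01", "2015-07-02"]),
    "Total Precip (mm)": [10.0, 5.0, 0.0, 1.5],
})

totals = monthly_totals(demo)
print(totals)
```

Calling `monthly_totals(df1)` and `monthly_totals(df2)` would give one total per month for each city, assuming both data frames contain the "Date/Time" and "Total Precip (mm)" columns used in the plots above.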