# Introduction to bokeh

<img src='./images/logos.3.600.wide.png' height='250' width='300' style="float:right">

## Who am I? Who are you? What is bokeh?
---

### Me...
* Chalmer Lowe
* @chalmer_lowe
* chalmer@darkartofcoding.com


### What I do...
* Founder: Dark Art of Coding
* Founder: PyHawaii
* Senior Computer Scientist: Booz Allen Hamilton
* Chair/Co-Chair: Pycon Education Summit
* Introduction to Sprinting guy

## Tell me about yourself...

* Your background with:
  * Python
  * Visualization libraries like matplotlib, seaborn, bokeh
* What would you like to visualize with bokeh?


"Bokeh is a Python interactive visualization library that targets modern web browsers for presentation. Its goal is to provide elegant, concise construction of novel graphics in the style of D3.js, and to extend this capability with high-performance interactivity over very large or streaming datasets. Bokeh can help anyone who would like to quickly and easily create interactive plots, dashboards, and data applications."

<br>
<div style="text-align: right">Source: http://bokeh.pydata.org/en/latest/</div>

# Today's agenda
---

* Install bokeh and other pertinent libraries
* The history of bokeh
* Your first basic graph

    Break (5 mins) -------------------------------------------


* Top 5 fun and entertaining plots      **<<< Midway point...**
* Wait, what? Why doesn't that work?

    Break (5 mins) -------------------------------------------


* Bokeh server app
* Charting your course
* But wait, there's more...

# Installing Bokeh and other pertinent libraries
---

## Installing miniconda

For this tutorial, we will

* be using **Python version 3.5 (or above)** 
* be installing and using a number of packages/libraries, including: bokeh, numpy, scipy, etc.

The version of Python you normally use and already have installed on your computer may be different from the one we use. The libraries you already have installed may also be very different from what we will use. 

To minimize conflicts AND to keep everyone on nearly the same setup for troubleshooting purposes, we will:

1. Install miniconda
1. Create a virtual environment to separate the software and libraries we install for the purposes of the tutorial from the software/libraries you may already have installed.
1. Install Python and the needed libraries in the virtual environment

### About miniconda vs Anaconda:

**Miniconda** is produced by Continuum Analytics and gives you access to a wide variety of libraries used to perform data analytics (including, but not limited to numpy, pandas, bokeh, matplotlib, IPython, scikit-learn, etc). Miniconda focuses on letting you pick and choose the libraries you want.

**Anaconda** is also produced by Continuum Analytics AND installs, by default, 100+ data analysis libraries in one fell swoop. If you prefer to install Anaconda instead of miniconda, that is fine, just be forewarned that it can take a lot of space on your hard drive.

IF you already have miniconda installed OR Anaconda installed, skip to the step below: **Using miniconda to create a virtualenv**

### miniconda installation process

To install miniconda, follow the instructions for your operating system in the [miniconda quickstart guide](http://conda.pydata.org/docs/install/quick.html).

### Testing your install

To confirm that the conda package manager that miniconda installs is working correctly:

In a **command prompt** type `conda list`. If conda is installed properly, you will see a summary of the packages installed by conda that looks something like this (simplified for clarity):

    jarvis:~ tonystark$ conda list
    # packages in environment at /Users/tonystark/miniconda3:
    #
    conda                     4.3.14                   py35_0
    conda-env                 2.6.0                         0
    ipython                   5.3.0                    py35_0
    pip                       8.1.2                    py35_0
    python                    3.5.2                         0
    readline                  6.2                           2
    setuptools                23.0.0                   py35_0
    simplegeneric             0.8.1                    py35_1
    six                       1.10.0                   py35_0

For more information on using conda, try these resources:

[Using conda](http://conda.pydata.org/docs/using/index.html): A tutorial on how to use conda

[conda cheatsheet](https://conda.io/docs/_downloads/conda-cheatsheet.pdf): A cheatsheet of the most common conda commands


# Virtual Environments

* Enable you to create a standalone environment for your project
* Minimize conflicts between one project and another in terms of: 
  0. Python versions
  1. Versions of other libraries that your project might depend upon


## What is a virtual environment?

Virtual environments (also called virtualenvs) are tools used to keep projects separate, especially in terms of keeping different Python versions and different library versions apart. Virtualenvs prevent Python's `site-packages` folder from getting disorganized and cluttered AND prevent problems that arise when one project needs `version x.x` of a library but another project needs `version y.y` of the same library. At their core, virtualenvs are glorified directories that use scripts and metadata to organize and control the environment. You can have as many virtualenvs as you like. And as you will see, they are very easy to create using command line tools such as conda.

## When should we use a virtual environment?

As noted above, anytime you have more than one project and there is a possibility of conflicts between your libraries, it is a good time to use a virtualenv. Having said that, many programmers use virtual environments for **all but the most trivial** programming tasks. Especially for beginners, using virtualenvs early on in your learning career will build a valuable skill AND help eliminate sneaky bugs related to version discrepancies, which can be hard to diagnose.



## Using miniconda to create a virtualenv

Presuming you have `conda` installed, the following command will enable you to create your first virtual environment.

```bash
$ mkdir graphstuff
$ cd graphstuff
$ conda create -n graphs python=3
```

`conda` runs the conda program.

`create` tells it to create a virtualenv

`-n` identifies the name of the virtualenv, in this case, `graphs`

`python=3` tells conda that you want to install Python version 3 in this virtualenv

**NOTE**: with `python=3`, conda will default to the most recent Python 3 release available to it. If you need to select a specific minor version of Python, use the following syntax:

`python=3.5`

Conda will prepare to install Python and any dependencies that Python relies upon. It will display output similar to the following. 

    jarvis:intro_to_bokeh tonystark$ conda create -n graphs python=3
    Fetching package metadata .......
    Solving package specifications: ..........
    
    Package plan for installation in environment /Users/tonystark/miniconda3/envs/graphs:
    
    The following packages will be downloaded:
    
        package                    |            build
        ---------------------------|-----------------
        openssl-1.0.2k             |                1         3.0 MB
        python-3.6.0               |                0        11.7 MB
        setuptools-27.2.0          |           py36_0         523 KB
        wheel-0.29.0               |           py36_0          87 KB
        pip-9.0.1                  |           py36_1         1.7 MB
        ------------------------------------------------------------
                                               Total:        17.0 MB
    
    The following NEW packages will be INSTALLED:
    
        openssl:    1.0.2k-1
        pip:        9.0.1-py36_1
        python:     3.6.0-0
        readline:   6.2-2
        setuptools: 27.2.0-py36_0
        sqlite:     3.13.0-0
        tk:         8.5.18-0
        wheel:      0.29.0-py36_0
        xz:         5.2.2-1
        zlib:       1.2.8-3
    
    Proceed ([y]/n)?

To finish the creation of the virtualenv and install the software, press:

`y`

## Activating your virtualenv 

Once you have created a virtualenv, you will need to activate it. Activation has several side effects:

* It temporarily changes your `$PATH` variable so calls to the `python` executable (and similar commands) will look first in the virtualenv's bin/ directory. 
* It temporarily changes your shell prompt to show which virtualenv you are using. Your prompt will likely look something like this, with the name of your virtualenv in parentheses in front of the prompt:
    * `(graphs) jarvis:graphstuff tonystark$` 
    * `(graphs) C:\graphstuff>`

To activate your virtualenv, run the appropriate command for your operating system:

### Linux/Mac OSX version

```bash
$ source activate graphs
```
### Windows version

```bat
C:\> activate graphs
```
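
Once the environment is active, the side effects described above can be checked from Python itself. The following is a minimal sketch, not part of the original workshop material: `describe_environment` is a hypothetical helper name, and the sketch assumes that conda sets the `CONDA_DEFAULT_ENV` environment variable on activation.

```python
import os
import shutil
import sys


def describe_environment():
    """Collect a few facts about the Python environment running this code."""
    return {
        # Name of the active conda environment, if conda set it (None otherwise).
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # Directory the running interpreter treats as its installation root.
        "prefix": sys.prefix,
        # First `python` executable found on PATH (None if there is none).
        "python_on_path": shutil.which("python"),
    }


for key, value in describe_environment().items():
    print("{}: {}".format(key, value))
```

If the `graphs` environment is active, the paths printed should typically point inside that environment's directory rather than at your system-wide Python.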



## Using miniconda to install bokeh

### Adding software to your virtualenv 

To add more software to the virtualenv, you can use `conda` to install the software. The maintainers of conda provide access to many Python libraries, but not all of them. If conda cannot install a particular library that you need, you can generally use `pip` to install it instead (covering pip is outside the scope of this workshop).

Please install IPython using the following `conda` command:

```
conda install ipython
```

Conda will prepare to install IPython and any dependencies that IPython relies upon. It will display output similar to the following (truncated to save space).


    Fetching package metadata .......
    Solving package specifications: ..........
    
    Package plan for installation in environment /Users/chalmerlowe/miniconda3:
    
    The following packages will be downloaded:
    
        package                    |            build
        ---------------------------|-----------------
        conda-env-2.6.0            |                0          601 B
        ...
        ipython-5.3.0              |           py35_0        1021 KB
        conda-4.3.14               |           py35_0         505 KB
        ------------------------------------------------------------
                                               Total:         3.8 MB
    
    The following NEW packages will be INSTALLED:
    
        appnope:          0.1.0-py35_0
        ...
        wcwidth:          0.1.7-py35_0
    
    The following packages will be UPDATED:
    
        conda:            4.1.11-py35_0 --> 4.3.14-py35_0
        conda-env:        2.5.2-py35_0  --> 2.6.0-0
        requests:         2.10.0-py35_0 --> 2.13.0-py35_0
    
    Proceed ([y]/n)?

To finish the installation of IPython and its dependencies, press:

`y`




# Install the rest of the packages

The rest of the packages for this tutorial can be installed using the following command (separate the name of each package with a space):

`conda install bokeh jupyter numpy pandas scipy`
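
After the install finishes, a quick check from Python can confirm that the packages are visible in the active environment. This is a small sketch using only the standard library; `REQUIRED` and `missing_packages` are hypothetical names chosen for illustration, and the package list simply mirrors the importable libraries named in the command above.

```python
import importlib.util

# Importable package names taken from the conda command above.
REQUIRED = ["bokeh", "numpy", "pandas", "scipy"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All tutorial packages are importable.")
```

If anything is reported as not installed, make sure the `graphs` virtualenv is active and rerun the `conda install` command.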

# Using this or similar notebooks

Download this notebook and associated files from github:

https://github.com/chalmerlowe/bokeh_tutorial

Unzip the content if necessary.

Run Jupyter Notebook using the `jupyter notebook` command from within the `graphstuff` folder:

`(graphs) jarvis:graphstuff tonystark$ jupyter notebook`

This should open a dashboard in your browser showing the contents of the `graphstuff` folder.


## Leaving the virtualenv when you are done

When you are **done** working in your virtualenv, you can deactivate it using the following command.
For now, leave your virtualenv active so we can proceed together.

### Linux/Mac OSX version

```bash
(graphs) $ source deactivate
```

### Windows version

```bat
(graphs) C:\> deactivate
```

# Bokeh, where we came from
---

## The history of bokeh

Photographers use the Japanese word “bokeh” to describe the blurring of the out-of-focus parts of an image.

Its aesthetic quality can greatly enhance a photograph, and photographers artfully use focus to draw attention to subjects of interest. 

“Good bokeh” contributes visual interest to a photograph and places its subjects in context. 

The `bokeh` library was so named because it allows users the flexibility to focus on the most important data without losing track of the rich context that allows it to be understood.

<br>
<div style="text-align: right">Source: http://bokehplots.com/pages/technical-vision.html</div>

## The philosophy of bokeh (and datashader)

### How do we look at all the data?

What are the best perceptual approaches to honestly and accurately represent the data to domain experts and SMEs so they can apply their intuition to the data?

Are there automated approaches to accurately reduce large datasets so that outliers and anomalies are still visible, while we meaningfully represent baselines and backgrounds? How can we do this without “washing away” all the interesting bits during a naive downsampling?

If we treat the pixels and topology of pixels on a screen as a bottleneck in the I/O channel between hard drives and an analyst’s visual cortex, what are the best compression techniques at all levels of the data transformation pipeline?

### How can scientists and data analysts be empowered to use visualization fluidly, not merely as an output facility or one stage of a pipeline, but as an entire mode of engagement with data and models?

Are language-based approaches for expressing mathematical modeling and data transformations the best way to compose novel interactive graphics?

What data-oriented interactions (besides mere linked brushing/selection) are useful for fluid, visually-enabled analysis?

One guiding principle for the development of our visualization tools is to provide useful software for people, while incorporating novel ideas from the academic world of visualization research. Additionally, as modular and open-source projects, we hope that Bokeh and datashader will enable many other projects to build a rich suite of domain-specific applications that change existing, legacy paradigms of data processing workflow.

<br>
<div style="text-align: right">Source: http://bokehplots.com/pages/technical-vision.html</div>

## The two faces of bokeh

Bokeh has two primary means of use: on a server and in the browser.

### Server-based apps
The Bokeh server provides an environment where data can be updated and manipulated in Python so that the changes update the visualization, and where user interface and selection events in the browser can trigger visual updates.
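
To make the idea concrete, here is a minimal sketch of what a server app might look like. It is an illustration under stated assumptions rather than the workshop's own app: the data, the `update` function name, and the file name `app.py` mentioned below are all hypothetical.

```python
from math import sin

from bokeh.io import curdoc
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, Slider
from bokeh.plotting import figure

# Illustrative data: x from 0 to 10 in steps of 0.1.
xs = [i / 10 for i in range(101)]
source = ColumnDataSource(data=dict(x=xs, y=[sin(x) for x in xs]))

plot = figure(title="Server-side update sketch")
plot.line("x", "y", source=source)

amplitude = Slider(start=0.1, end=10, value=1, step=0.1, title="Amplitude")


def update(attr, old, new):
    # Runs in Python on the server each time the slider value changes.
    source.data = dict(x=xs, y=[new * sin(x) for x in xs])


amplitude.on_change("value", update)

# Register the layout with the document that the Bokeh server sends to the browser.
curdoc().add_root(column(amplitude, plot))
```

Saved as `app.py`, a script like this would typically be started with `bokeh serve --show app.py`. The key difference from a standalone plot is that the callback is ordinary Python, so it can use any library available on the server.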

### Client-based visualizations
Bokeh also creates standalone visualizations viewable in browsers with no use of the Bokeh server. These plots have many interactive tools and features, such as panning, brushing, hover, etc.
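
As a rough sketch of the standalone route, the snippet below renders a small plot to a single HTML page that any browser can open without a running server. The data and the file name `standalone.html` are hypothetical; `file_html` and `CDN` come from `bokeh.embed` and `bokeh.resources`.

```python
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

p = figure(title="Standalone plot")
p.line([0, 1, 2, 3], [0, 1, 4, 9])

# Render the plot as a single HTML page; the BokehJS library is loaded from a CDN.
html = file_html(p, CDN, "Standalone plot")

# "standalone.html" is a hypothetical file name chosen for this sketch.
with open("standalone.html", "w", encoding="utf-8") as f:
    f.write(html)
```

Opening the resulting file in a browser should show the plot with its default interactive tools, with no Python process involved.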

# Picture this, your first basic graph
---

## Creating a basic graph

In [None]:
from bokeh.plotting import figure, show, output_notebook

x_values = [0, 1, 2, 3, 4, 5, 6]
y_values = [0.0, 1.0, 1.4, 1.7, 2.0, 2.2, 2.4]

p = figure()

p.line(x_values, y_values)

output_notebook()
show(p)

## Getting help

In [None]:
# p? 

# OR

help(p)

In [None]:
# Don't know what functions are available?

# p.<tab-complete>

In [None]:
# Want insight into what a particular function can do?

help(p.line)

## Let's do it again, but moar!

In [None]:
from bokeh.plotting import figure, show, output_notebook
from math import sqrt

x_values = range(100)                        # figures can handle range objects...
y_values = [sqrt(num) for num in x_values]

p = figure()
p.line(x_values, y_values)

output_notebook()
show(p)

## Even moar!

In [None]:
from bokeh.plotting import figure, show, output_file, output_notebook
import numpy as np

moar = figure()

x = np.linspace(0.1, 5, 100)
y = x

# Let's use circular markers and a dotted and dashed line

moar.circle(x, y, legend="y=x")
moar.line(x, np.sqrt(x), legend="y=sqrt(x)", line_color="#663399", line_dash="dotdash")

# And add a legend built off the legend attributes of each item

moar.legend.location = "top_left"

output_notebook()
show(moar)

# Colors:

* any of the 147 named CSS colors, e.g., 'green', 'indigo'
* an RGB(A) hex value, e.g., '#FF0000', '#44444444'
* a 3-tuple of integers (r,g,b) between 0 and 255
* a 4-tuple of (r,g,b,a) where r, g, b are integers between 0 and 255 and a is a floating point value between 0 and 1
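
The hex and tuple forms describe the same colors in different notation. As a small illustration (the helper name `rgb_to_hex` is hypothetical and not part of bokeh), this sketch converts an `(r, g, b)` tuple into the equivalent hex string:

```python
def rgb_to_hex(rgb):
    """Convert an (r, g, b) tuple of 0-255 integers to a '#rrggbb' string."""
    r, g, b = rgb
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must be between 0 and 255")
    return "#%02x%02x%02x" % (r, g, b)


print(rgb_to_hex((255, 0, 0)))      # prints #ff0000 (the named color 'red')
print(rgb_to_hex((102, 51, 153)))   # prints #663399 (the purple used for the sqrt line above)
```

Either form can be passed wherever a plot accepts a color, such as `line_color`.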

# What colors?
[https://en.wikipedia.org/wiki/Web_colors](https://en.wikipedia.org/wiki/Web_colors)




# Moarest! adding multiple features to a basic graph

In [None]:
from bokeh.plotting import figure, show, output_file, output_notebook
import numpy as np

x = np.linspace(0.1, 5, 100)
y = x

# Add a title and define the type of y-axis, include the range of values for the y-axis

p = figure(title="log axis example", y_axis_type="log", y_range=(0.001, 10**22))

p.circle(x, y, legend="y=x")
p.line(x, np.sqrt(x), legend="y=sqrt(x)", line_color='red', line_dash="dotdash")

# Set the fill color for our circles to None

p.circle(x, x**2, legend="y=x**2", fill_color=None, line_color="olivedrab")

In [None]:
# increase the line width to 4

p.line(x, 10**x, legend="y=10^x", line_color="gold", line_width=4)

In [None]:
# Add a line composed of diamonds
# Use hex this time to color the line

p.diamond(x, 10**(x**2), legend="y=10^(x^2)",
       line_color="#663399", line_dash="solid", line_width=4)

# Move the legend location
# Change the legend font

p.legend.location = "center_left"
p.legend.label_text_font='times'

# Change characteristics of the title text (color, font, font-style)

p.title.text_color = "olive"
p.title.text_font = "times"
p.title.text_font_style = "italic"

# Set the background color

p.background_fill_color = "whitesmoke"

output_notebook()
show(p)

In [None]:
help(p.diamond)

In [None]:
from bokeh.palettes import Purples

# Purples is a dictionary indexed by integer...
# for example:
# Purples[5] >>> ['#54278f', '#756bb1', '#9e9ac8', '#cbc9e2', '#f2f0f7']

p.line(x, np.sqrt(x) * 100, legend="y=100*sqrt(x)", line_color=Purples[5][1], line_dash="dotdash")
show(p)

# Break (5 minutes)
---

# Top 5 fun and entertaining plots
---

Next, we will take a look at five sample plots that show diverse examples of how bokeh can be used for awesome data visualizations.
   0. scatterplot
   1. histogram with pdf and cdf curves
   2. graph with interactive sliders to set parameters
   3. visualization tools: pan, zoom, hover, crosshairs, lasso select, box select, poly select, save, undo, redo, tap
   4. geographical map
 

## Simple scatterplot

In [None]:
from bokeh.plotting import figure, show, output_file, output_notebook
from bokeh.sampledata.iris import flowers

flowers


In [None]:
colormap = {'setosa': 'red', 'versicolor': 'green', 'virginica': 'blue'}
colors = [colormap[x] for x in flowers['species']]

colors

In [None]:
print(flowers['petal_length'], flowers['petal_width'])


In [None]:
p = figure(title = "Iris Morphology")
p.xaxis.axis_label = 'Petal Length'
p.yaxis.axis_label = 'Petal Width'

p.circle(flowers["petal_length"], flowers["petal_width"],
         color=colors, fill_alpha=0.2, size=10)

# output_file("iris.html", title="iris.py example")
output_notebook()
show(p)

# http://bokeh.pydata.org/en/latest/docs/gallery/iris.html

## Histogram: with pdf and cdf curves

In [None]:
import numpy as np
import scipy.special

from bokeh.layouts import gridplot
from bokeh.plotting import figure, show, output_file, output_notebook

p1 = figure(title="Normal Distribution (μ=0, σ=0.5)",tools="save",
            background_fill_color="#E8DDCB")

mu, sigma = 0, 0.5

measured = np.random.normal(mu, sigma, 1000)
hist, edges = np.histogram(measured, density=True, bins=50)

x = np.linspace(-2, 2, 1000)
pdf = 1/(sigma * np.sqrt(2*np.pi)) * np.exp(-(x-mu)**2 / (2*sigma**2))
cdf = (1+scipy.special.erf((x-mu)/np.sqrt(2*sigma**2)))/2

p1.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:],
        fill_color="#036564", line_color="#033649")
p1.line(x, pdf, line_color="#D95B43", line_width=8, alpha=0.7, legend="PDF")
p1.line(x, cdf, line_color="white", line_width=2, alpha=0.7, legend="CDF")

p1.legend.location = "top_left"
p1.xaxis.axis_label = 'x'
p1.yaxis.axis_label = 'Pr(x)'

# output_file('histogram.html', title="histogram.py example")
output_notebook()
show(p1)

# http://bokeh.pydata.org/en/latest/docs/gallery/histogram.html
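
The `pdf` and `cdf` expressions in the cell above are the usual formulas for a normal distribution. As a sanity check, the sketch below re-implements them with only the standard library (the function names are hypothetical) and confirms two properties: the CDF equals 0.5 at the mean, and the area under the PDF over the plotted range matches the change in the CDF.

```python
import math


def normal_pdf(x, mu=0.0, sigma=0.5):
    """Probability density of a normal distribution at x."""
    return 1 / (sigma * math.sqrt(2 * math.pi)) * math.exp(-(x - mu) ** 2 / (2 * sigma ** 2))


def normal_cdf(x, mu=0.0, sigma=0.5):
    """Cumulative probability of a normal distribution up to x."""
    return (1 + math.erf((x - mu) / math.sqrt(2 * sigma ** 2))) / 2


# Half of the probability mass lies below the mean.
print(normal_cdf(0.0))

# A simple Riemann sum of the PDF over [-2, 2] should be close to cdf(2) - cdf(-2).
step = 0.001
area = sum(normal_pdf(-2 + i * step) * step for i in range(4000))
print(area, normal_cdf(2.0) - normal_cdf(-2.0))
```

The same relationship explains why the white CDF curve in the plot levels off near 1 at the right edge of the histogram.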

## Using interactive sliders

In [None]:
import numpy as np

from bokeh.layouts import row, widgetbox
from bokeh.models import CustomJS, Slider
from bokeh.plotting import figure, output_file, output_notebook, show, ColumnDataSource

x = np.linspace(0, 10, 500)
y = np.sin(x)

source = ColumnDataSource(data=dict(x=x, y=y))

plot = figure(y_range=(-10, 10), plot_width=400, plot_height=400)

plot.line('x', 'y', source=source, line_width=3, line_alpha=0.6)

callback = CustomJS(args=dict(source=source), code="""
    var data = source.data;
    var A = amp.value;
    var k = freq.value;
    var phi = phase.value;
    var B = offset.value;
    var x = data['x'];
    var y = data['y'];
    for (var i = 0; i < x.length; i++) {
        y[i] = B + A*Math.sin(k*x[i]+phi);
    }
    source.trigger('change');
""")

amp_slider = Slider(start=0.1, end=10, value=1, step=.1,
                    title="Amplitude", callback=callback)
callback.args["amp"] = amp_slider

freq_slider = Slider(start=0.1, end=10, value=1, step=.1,
                     title="Frequency", callback=callback)
callback.args["freq"] = freq_slider

phase_slider = Slider(start=0, end=6.4, value=0, step=.1,
                      title="Phase", callback=callback)
callback.args["phase"] = phase_slider

offset_slider = Slider(start=-5, end=5, value=0, step=.1,
                       title="Offset", callback=callback)
callback.args["offset"] = offset_slider

layout = row(
    plot,
    widgetbox(amp_slider, freq_slider, phase_slider, offset_slider),
)

# output_file("slider.html", title="slider.py example")
output_notebook()
show(layout)

# http://bokeh.pydata.org/en/latest/docs/gallery/slider.html

## Using visualization tools

In [None]:
import numpy as np
from bokeh.plotting import figure, show, output_file, output_notebook

In [None]:
number = 4000
x = np.random.random(size=number) * 100
y = np.random.random(size=number) * 100

In [None]:
radii = np.random.random(size=number) * 1.5  # `number` is defined in the previous cell
colors = [
    "#%02x%02x%02x" % (int(r), int(g), 150) for r, g in zip(50+2*x, 30+2*y)
]

TOOLS="hover,crosshair,pan,wheel_zoom,zoom_in,zoom_out,box_zoom,undo,redo,reset,tap,save,box_select,poly_select,lasso_select,"

p = figure(tools=TOOLS)

p.scatter(x, y, radius=radii,
          fill_color=colors, fill_alpha=0.6,
          line_color=None)

# output_file("color_scatter.html", title="color_scatter.py example")
output_notebook()
show(p)  # display the plot inline in the notebook

# http://bokeh.pydata.org/en/latest/docs/gallery/color_scatter.html

## Geographical map

In [None]:
from bokeh.io import show
from bokeh.models import (
    ColumnDataSource,
    HoverTool,
    LogColorMapper
)
from bokeh.palettes import Viridis6 as palette
from bokeh.plotting import figure, output_notebook

from bokeh.sampledata.us_counties import data as counties
from bokeh.sampledata.unemployment import data as unemployment

palette.reverse()

counties = {
    code: county for code, county in counties.items() if county["state"] == "tx"
}

county_xs = [county["lons"] for county in counties.values()]
county_ys = [county["lats"] for county in counties.values()]

county_names = [county['name'] for county in counties.values()]
county_rates = [unemployment[county_id] for county_id in counties]
color_mapper = LogColorMapper(palette=palette)

source = ColumnDataSource(data=dict(
    x=county_xs,
    y=county_ys,
    name=county_names,
    rate=county_rates,
))

TOOLS = "pan,wheel_zoom,box_zoom,reset,hover,save"

p = figure(
    title="Texas Unemployment, 2009", tools=TOOLS,
    x_axis_location=None, y_axis_location=None
)
p.grid.grid_line_color = None

p.patches('x', 'y', source=source,
          fill_color={'field': 'rate', 'transform': color_mapper},
          fill_alpha=0.7, line_color="white", line_width=0.5)

hover = p.select_one(HoverTool)
hover.point_policy = "follow_mouse"
hover.tooltips = [
    ("Name", "@name"),
    ("Unemployment rate", "@rate%"),
    ("(Long, Lat)", "($x, $y)"),
]

output_notebook()
show(p)

# http://bokeh.pydata.org/en/latest/docs/gallery/texas.html

# Wait, what? Why doesn't that work?
---

Sometimes, things just don't work out...

# Break (5 minutes)
---

# Bokeh server app
---

# Charting your course
---

# But wait, there's more...
---

## Resources

In [None]:
# urls
# 