# Bokeh Tutorial

## Introduction

Welcome to the Bokeh Tutorial! This tutorial will guide you through the basic steps of how to visualize your data using Bokeh. By the end of this tutorial, you should be armed with the necessary foundations to create as many complex, interactive data visualizations as your heart desires.

We will be using Pandas to prepare our data and Bokeh to visualize it. Bokeh offers three interfaces at different levels of abstraction: bokeh.models, bokeh.plotting, and bokeh.charts. I will mostly be covering bokeh.charts and bokeh.plotting, with a brief look at bokeh.models and at adding custom widgets.

### What is Bokeh?

From the official [Bokeh website](http://bokeh.pydata.org/en/latest/): Bokeh is a Python interactive visualization library that targets modern web browsers for presentation. Its goal is to provide elegant, concise construction of novel graphics in the style of D3.js, and to extend this capability with high-performance interactivity over very large or streaming datasets. Bokeh can help anyone who would like to quickly and easily create interactive plots, dashboards, and data applications.

### Motivation

There are already plenty of data visualization libraries for Python. Why should we prefer to use Bokeh over all the other libraries? 

Unlike Pandas, Seaborn, and ggplot, Bokeh does not depend on matplotlib to generate its visualizations. It's great for creating interactive, web-ready plots that can easily be output as JSON objects, HTML documents, or interactive web applications.

## Installation

There are many ways to install Bokeh, but I will highlight the two most common methods. If you prefer to use `pip`, you may install it with the following at the command line:

```bash
   pip install bokeh
```

Note that this assumes that you have installed all needed dependencies, such as NumPy.

Another common method is to use the Anaconda Python distribution and to enter this command at the command line:

```bash
   conda install bokeh
```

If you receive any permission errors while attempting the installation, you may need to prefix the command with `sudo`.

After running the installation, verify that the following commands work for you:

In [1]:
import pandas as pd
import bokeh
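Since the available modules differ between Bokeh releases, it can also help to print the installed versions. A minimal sketch, assuming both packages expose the usual `__version__` attribute:

```python
import bokeh
import pandas as pd

# Print the installed versions so you know which Bokeh API you are working with
print("Bokeh version:", bokeh.__version__)
print("Pandas version:", pd.__version__)
```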

## Download and Prepare the Data

For this tutorial I will be using a "Pokedex" .csv file that can be downloaded [here](https://www.dropbox.com/s/iwanjx0vsz9q7fr/pokedex.csv?dl=0). (It was modified from [here](https://www.dropbox.com/s/2r4lmw9bv42co2b/Pokemon%20XY%20%26%20ORAS%20Pokedex%20%26%20Guide%20Book.xls?dl=0), which was linked from [here](https://www.reddit.com/r/pokemon/comments/2vucy7/useful_pokedex_excel_file_for_pokemon_xy_oras/)).

Now we can use Pandas to prepare the data.

In [2]:
'''
Since the data do not start until the third row,
we must specify the skiprows parameter and
provide new column names
'''

newColNames = [
    'caught', 'number', 'name', 'form',
    'type1', 'type2', 'ability1', 'ability2', 
    'hiddenAbility', 'evolution', 'egg1', 'egg2',
    'hp', 'atk', 'def', 'spatk', 'spdef', 'spd',
    'total', 'pokexy', 'pokeoras'
]

df = pd.read_csv('pokedex.csv', skiprows=2, names=newColNames)
df.head()

| | caught | number | name | form | type1 | type2 | ability1 | ability2 | hiddenAbility | evolution | ... | egg2 | hp | atk | def | spatk | spdef | spd | total | pokexy | pokeoras |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | | #001 | Bulbasaur | ----- | Grass | Poison | Overgrow | ----- | Chlorophyll | Level Up - 16 | ... | Grass | 45 | 49 | 49 | 65 | 65 | 45 | 318 | YES | NO |
| 1 | | #002 | Ivysaur | ----- | Grass | Poison | Overgrow | ----- | Chlorophyll | Level Up - 32 | ... | Grass | 60 | 62 | 63 | 80 | 80 | 60 | 405 | YES | NO |
| 2 | | #003 | Venusaur | ----- | Grass | Poison | Overgrow | ----- | Chlorophyll | ----- | ... | Grass | 80 | 82 | 83 | 100 | 100 | 80 | 525 | YES | NO |
| 3 | | #003 | Venusaur | Mega | Grass | Poison | Thick Fat | ----- | ----- | ----- | ... | ----- | 80 | 100 | 123 | 122 | 120 | 80 | 625 | YES | YES |
| 4 | | #004 | Charmander | ----- | Fire | ----- | Blaze | ----- | Solar Power | Level Up - 16 | ... | Dragon | 39 | 52 | 43 | 60 | 50 | 65 | 309 | YES | NO |
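To see what `skiprows` and `names` do without downloading the file, here is a minimal sketch on a tiny in-memory CSV. The two junk header lines and the three column names are made up for illustration; only the two data rows echo values from the table above.

```python
import io
import pandas as pd

# Hypothetical miniature of the file: two lines we want to skip, then the data
raw = "\n".join([
    "Pokedex export,,",
    "Number,Name,Total",
    "#001,Bulbasaur,318",
    "#002,Ivysaur,405",
])

# skiprows=2 drops the first two lines; names supplies our own column labels
sample = pd.read_csv(io.StringIO(raw), skiprows=2, names=["number", "name", "total"])
print(sample)
```

Because `names` is given, Pandas does not look for a header row in the remaining lines, so the first data row is kept as data.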


Now we have the raw data. But this is still not quite ready to be visualized. For the purposes of this tutorial, we want to drop the first column ("caught"), and replace all the dashed cells with empty strings.

In [3]:
df.drop('caught', axis=1, inplace=True)  # Drop first column
df.replace('-----', '', inplace=True)    # Replace hyphens with empty strings
df.head()

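| | number | name | form | type1 | type2 | ability1 | ability2 | hiddenAbility | evolution | egg1 | egg2 | hp | atk | def | spatk | spdef | spd | total | pokexy | pokeoras |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | #001 | Bulbasaur | | Grass | Poison | Overgrow | | Chlorophyll | Level Up - 16 | Monster | Grass | 45 | 49 | 49 | 65 | 65 | 45 | 318 | YES | NO |
| 1 | #002 | Ivysaur | | Grass | Poison | Overgrow | | Chlorophyll | Level Up - 32 | Monster | Grass | 60 | 62 | 63 | 80 | 80 | 60 | 405 | YES | NO |
| 2 | #003 | Venusaur | | Grass | Poison | Overgrow | | Chlorophyll | | Monster | Grass | 80 | 82 | 83 | 100 | 100 | 80 | 525 | YES | NO |
| 3 | #003 | Venusaur | Mega | Grass | Poison | Thick Fat | | | | | | 80 | 100 | 123 | 122 | 120 | 80 | 625 | YES | YES |
| 4 | #004 | Charmander | | Fire | | Blaze | | Solar Power | Level Up - 16 | Monster | Dragon | 39 | 52 | 43 | 60 | 50 | 65 | 309 | YES | NO |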
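If you would rather not modify the DataFrame in place, the same cleanup can be sketched as a chain that returns a new DataFrame. The small `raw` frame below is a hypothetical stand-in for `df`, with a few values borrowed from the table above.

```python
import pandas as pd

# Hypothetical stand-in for the raw Pokedex DataFrame
raw = pd.DataFrame({
    "caught": [None, None],
    "name": ["Venusaur", "Charmander"],
    "form": ["Mega", "-----"],
    "type2": ["Poison", "-----"],
})

# drop() and replace() both return new objects, so `raw` stays untouched
cleaned = raw.drop(columns="caught").replace("-----", "")
print(cleaned)
```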


## Displaying the Data
### bokeh.charts
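Now I will demonstrate how to use bokeh.charts, Bokeh's high-level interface. bokeh.charts provides a fast, convenient way to create common statistical charts with minimal code. Wherever possible, the interface is geared to be extremely simple to use in conjunction with Pandas, by accepting a DataFrame and names of columns directly to specify data. Note that later Bokeh releases removed bokeh.charts from the core library and moved it to the separate `bkcharts` package, so the examples in this section assume a release that still includes it.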

Let's say we want to see the distribution of the total base stats of a Pokemon using a histogram.

In [4]:
from bokeh.charts import Histogram, output_file
from bokeh.io import output_notebook, show
from bokeh.layouts import row

output_notebook()  # Allows us to view the graph directly inside of our notebook

hist = Histogram(df, values='total', title="Total Base Stats Histogram", plot_width=400)

# Uncommenting the following line of code will save the graph to an HTML file
# output_file('hist.html')

show(hist)
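If bokeh.charts is not available in your installation, a similar histogram can be approximated with bokeh.plotting. This is only a sketch of one possible approach, not what `Histogram` does internally: it bins the values with `numpy.histogram` and draws one rectangle per bin with `quad`. The five totals are the ones from the rows shown earlier, and the choice of four bins is arbitrary.

```python
import numpy as np
from bokeh.plotting import figure

# Totals of the five Pokemon rows shown earlier (a stand-in for df['total'])
totals = np.array([318, 405, 525, 625, 309])

# Bin the values ourselves; edges has one more entry than counts
counts, edges = np.histogram(totals, bins=4)

hist_fig = figure(title="Total Base Stats Histogram", width=400, height=300)
# One rectangle per bin, spanning [left, right] horizontally and [0, count] vertically
hist_fig.quad(top=counts, bottom=0, left=edges[:-1], right=edges[1:],
              fill_color="navy", line_color="white", alpha=0.6)

# from bokeh.io import show; show(hist_fig)  # uncomment to display the plot
```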

We can get fancier and create a histogram that compares the totals for the different Pokemon types. You can also specify the number of bins with the `bins` parameter; if you leave it at its default of None, Bokeh selects the number of bins automatically. Neat!


In [5]:
hist2 = Histogram(df, values='total', label='type1', color='type1', legend='top_right',
                  title="Total Base Stats Histogram by Type 1 Count", plot_width=800)
show(hist2)

Of course, since there are so many different types of Pokemon, our histogram is too hard to read now. Let's try a scatter plot.

In [6]:
from bokeh.charts import Scatter

tooltips=[
    ('Pokemon', '@name'),
    ('Type 1', '@type1'),
    ('Total Base Stats', '@total'),
    ('Type 2', '@type2'),
]

s1 = Scatter(df, x='type1', y='total', marker='type1',
             title="Total Base Stats vs Type 1 (marked by Type 1)", xlabel="Type 1",
             ylabel="Total Base Stats", legend=None, tooltips=tooltips)

s2 = Scatter(df, x='type1', y='total', color='type1', title="Total Base Stats vs. Type 1",
             xlabel="Type 1", ylabel="Total Base Stats", legend=None,
             tooltips=tooltips)

Wow! We're doing a lot of things here. Let's break it down.

`tooltips` is a parameter that can be used to add information to the hover tool, and is supported for the scatter and line charts. In this case, when we hover over a data point on the graph, it will allow us to see the Pokemon, the two types of the Pokemon, and its total base stats.

The two scatter plots that I create above demonstrate that you can distinguish the data points by marker shape or by color (or even both!). Since I am already plotting Type 1, these are not necessary, but in other cases being able to distinguish the different types of data you are plotting makes it easier to visualize.

Notice that bokeh.charts makes it extremely easy for us to plot our data. The functions take in a DataFrame directly, and we can specify each parameter by the column names in our DataFrame.

Now let's see the charts side by side! I also highly encourage you to play around with the chart options on the right side of each plot.

In [7]:
show(row(s1, s2))
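For comparison, here is a rough sketch of a similar categorical scatter built with bokeh.plotting, which may be useful if bokeh.charts is not available to you. It is not a drop-in replacement for `Scatter`: the four rows are copied from the table above, and the two colors are arbitrary picks.

```python
import pandas as pd
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure
from bokeh.transform import factor_cmap

# Tiny stand-in for df, using rows shown earlier in the tutorial
sample = pd.DataFrame({
    "name": ["Bulbasaur", "Ivysaur", "Venusaur", "Charmander"],
    "type1": ["Grass", "Grass", "Grass", "Fire"],
    "type2": ["Poison", "Poison", "Poison", ""],
    "total": [318, 405, 525, 309],
})
types = sorted(set(sample["type1"].tolist()))  # categories for the x-axis
source = ColumnDataSource(sample)

scatter_fig = figure(x_range=types, width=400, height=300,
                     title="Total Base Stats vs Type 1")
# factor_cmap assigns one color per category of the 'type1' column
scatter_fig.scatter("type1", "total", size=10, source=source,
                    color=factor_cmap("type1", palette=["firebrick", "seagreen"],
                                      factors=types))
# '@column' in a tooltip is filled in from the data source on hover
scatter_fig.add_tools(HoverTool(tooltips=[
    ("Pokemon", "@name"),
    ("Type 1", "@type1"),
    ("Total Base Stats", "@total"),
    ("Type 2", "@type2"),
]))

# from bokeh.io import show; show(scatter_fig)  # uncomment to display the plot
```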

That's about it for bokeh.charts. Let's move on to bokeh.plotting!

### bokeh.plotting
The bokeh.plotting interface is a "mid-level" interface. The main idea for this interface is:

Starting from simple default figures (with sensible default tools, grids and axes), add markers and other shapes whose visual attributes are tied directly to data. bokeh.plotting provides these options with "glyphs," including square, circle, segment, ray, wedge, and even arc. It is also possible to mix multiple glyphs together.

As in bokeh.charts, it is still possible to customize and change all of the defaults, but having them means that it is possible to get up and running very quickly. Since the functions are still relatively straightforward, I will not cover bokeh.plotting in too much detail. 

Now, let's create another scatter plot comparing attack stats with defense stats.

In [8]:
import numpy as np
from bokeh.plotting import figure

In [9]:
x = df['atk']
y = df['def']
radii = np.random.random(size=len(y)) * 5  # Randomize circle sizes for extra visual flair
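# Create gradient of colors based on x and y values. Each channel is clipped to
# 0-255 so that high stats (e.g. def=123 gives 30+2*123=276) still yield valid hex colors
colors = ["#%02x%02x%02x" % (int(r), int(g), 150)
          for r, g in zip(np.clip(50 + 2*x, 0, 255), np.clip(30 + 2*y, 0, 255))]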

# Different types of tools that allow for more interaction with plot
TOOLS = ('resize, crosshair, pan, wheel_zoom, '
         'box_zoom, reset, tap, previewsave, '
         'box_select, poly_select, lasso_select')

p = figure(width=750, title='Attack vs. Defense', tools=TOOLS)
p.scatter(x, y, radius=radii, fill_color=colors, fill_alpha=0.5, line_color="pink", line_alpha=0.4)

show(p)

Wow, pretty! Notice the plethora of tools that we've added to the graph. Feel free to play around with them, and bask in the glory of how simple it was to add them, enabling us to interact with our data even more!
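A quick aside on the color gradient used in the cell above: `%02x` only produces a valid color while each channel stays between 0 and 255. The helper below is a hypothetical, pure-Python sketch of that mapping (it is not part of Bokeh) with the clamping written out explicitly.

```python
def stat_color(atk, defense, blue=150):
    """Map attack/defense stats to a hex color, clamping each channel to 0-255."""
    red = max(0, min(255, 50 + 2 * atk))
    green = max(0, min(255, 30 + 2 * defense))
    return "#%02x%02x%02x" % (red, green, blue)

print(stat_color(49, 49))    # Bulbasaur: prints #948096
print(stat_color(100, 123))  # Mega Venusaur: green would be 276, clamped to 255; prints #faff96
```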

But remember the `tooltips` that we added to our charts when we were using bokeh.charts? This plot is not as helpful if we can't tell which circle represents which Pokemon. Let's try adding them back in.

In [10]:
tooltips = [
    ('Pokemon', '@name'),
    ('Attack', '@atk'),
    ('Defense', '@def')
]

p.scatter(x,y, radius=radii, fill_color=colors, fill_alpha=0.5, line_color="pink", line_alpha=0.4, tooltips=tooltips)

show(p)

AttributeError: unexpected attribute 'tooltips' to Circle, possible attributes are angle, angle_units, fill_alpha, fill_color, line_alpha, line_cap, line_color, line_dash, line_dash_offset, line_join, line_width, name, radius, radius_dimension, radius_units, size, tags, visible, x or y

Unfortunately, bokeh.plotting glyphs do not have the same `tooltips` attribute as bokeh.charts does. This is where we can turn to bokeh.models to help us out a little bit. Although I will not be covering bokeh.models as much in this tutorial, here is a brief overview:

### bokeh.models

bokeh.models is Bokeh's "low-level" interface. Regardless of how the plot creation code is spelled in a language, the result is an object graph that encompasses all the visual and data aspects of the scene. Furthermore, this scene graph is to be serialized, and it is this serialized graph that the client library BokehJS uses to render the plot. The low-level objects that comprise a Bokeh scene graph are called Models.
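To get a feel for that serialized graph, here is a small sketch that turns a figure into JSON with `bokeh.embed.json_item`. It assumes a Bokeh release that ships `json_item`; the target id `"myplot"` and the plotted numbers are placeholders.

```python
import json
from bokeh.embed import json_item
from bokeh.plotting import figure

fig = figure(width=300, height=300, title="Scene graph demo")
fig.line([1, 2, 3], [4, 6, 5])

# json_item bundles the serialized object graph plus the id of the
# HTML element that BokehJS should render into
item = json_item(fig, "myplot")
payload = json.dumps(item)

print(sorted(item.keys()))
print(len(payload), "characters of JSON")
```

The exact layout of the `doc` entry varies between Bokeh versions, so treat it as something for BokehJS to read rather than a stable format.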


In [11]:
from bokeh.plotting import ColumnDataSource
from bokeh.models import HoverTool, BoxSelectTool, CrosshairTool, TapTool, ResetTool

# Set up source for data so we have access to the individual data points.
# Every per-point value (including radius and color) goes into the source:
# supplying a user-defined data source AND iterable values to glyph methods
# is deprecated (see https://github.com/bokeh/bokeh/issues/2056).
source = ColumnDataSource(
    data=dict(
        x=x,
        y=y,
        names=df['name'],
        radius=radii,
        color=colors
    )
)

# Customize hover tool
hover = HoverTool()
hover.tooltips = [
    ('index', '$index'),
    ("name", '@names'),
    ("(x,y)", "(@x, @y)"),  # x = attack, y = defense
]
NEW_TOOLS = [BoxSelectTool(), hover, CrosshairTool(), TapTool(), ResetTool()]


p1 = figure(tools=NEW_TOOLS, title="Attack vs Defense By Pokemon", width=800)
p1.scatter('x', 'y', radius='radius', fill_color='color', fill_alpha=0.5,
           line_color="pink", line_alpha=0.4, source=source)

show(p1)


As you can see, bokeh.models allows us to make even more refined customizations to our graphs, opening even more possibilities for further data exploration. Before we move on, there is one more type of graph that bokeh.plotting makes easy to construct: a plot of time series data.

Since our Pokemon data doesn't have data that involve dates or times, we will be using data collected about Apple stocks.

In [12]:
appleDf = pd.read_csv(
        "http://ichart.yahoo.com/table.csv?s=AAPL&a=0&b=1&c=2000&d=0&e=1&f=2010",
        parse_dates=['Date'])

# Create a new plot with a datetime axis type
apples = figure(width=800, height=350, x_axis_type="datetime")

apples.line(appleDf['Date'], appleDf['Close'], color='navy', alpha=0.5)

show(apples)

Wow! What's cooler is that the Bokeh documentation notes that "Future versions of Bokeh will attempt to auto-detect situations when datetime axes are appropriate, and add them automatically by default."
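If the Yahoo URL above is not reachable from your machine, you can still try out the datetime axis with generated data. The sketch below uses a made-up random walk as a stand-in for closing prices; none of the numbers are real stock data.

```python
import numpy as np
import pandas as pd
from bokeh.models import ColumnDataSource, DatetimeAxis
from bokeh.plotting import figure

# Synthetic "prices": a random walk over 250 business days (not real data)
dates = pd.date_range("2009-01-01", periods=250, freq="B")
rng = np.random.default_rng(0)
close = 100 + rng.normal(0, 1, size=len(dates)).cumsum()

ts_source = ColumnDataSource(pd.DataFrame({"date": dates, "close": close}))

# x_axis_type="datetime" makes Bokeh format the tick labels as dates
ts = figure(width=800, height=350, x_axis_type="datetime",
            title="Synthetic closing prices")
ts.line("date", "close", source=ts_source, color="navy", alpha=0.5)

# from bokeh.io import show; show(ts)  # uncomment to display the plot
```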

On to our last topic of the tutorial: Widgets!

## Widgets
Widgets are interactive controls that can be added to Bokeh applications to provide a front-end user interface to a visualization. They allow users to make more complex interactions with their Bokeh applications, such as making new computations, updating plots, and connecting to other programmatic functionality. When used with the Bokeh server, widgets can run arbitrary Python code, enabling complex applications. They can also be used without the Bokeh server in standalone HTML documents through the browser’s JavaScript runtime.

Suppose we want to make a table comparing whether each Pokemon is available for catching in Pokemon X/Y or in Pokemon Omega Ruby/Alpha Sapphire.

In [13]:
from bokeh.layouts import widgetbox
from bokeh.models import ColumnDataSource
from bokeh.models.widgets import DataTable, TableColumn

data = dict(
        pokemon=df['name'],
        xy=df['pokexy'],
        oras=df['pokeoras']
)  # Set fields to use
source = ColumnDataSource(data)

columns = [
        TableColumn(field="pokemon", title="Pokemon"),
        TableColumn(field="xy", title="In XY"),
        TableColumn(field="oras", title="In ORAS")]  # Define columns

dataTable = DataTable(source=source, columns=columns, width=400, height=500)  # Initialize data table

show(widgetbox(dataTable))

Pretty straightforward! We can even create tab panes for our graphs and charts too.

In [14]:
from bokeh.models.widgets import Panel, Tabs

# Using p and p1 from above (the graphs plotting attack base stats vs defense base stats)
tab1 = Panel(child=p, title="Attack vs Defense 1")
tab2 = Panel(child=p1, title="Attack vs Defense 2")
tab3 = Panel(child=apples, title="Apple Stocks")

tabs = Tabs(tabs=[tab1, tab2, tab3])

show(tabs)
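One portability note, offered as a sketch rather than a definitive recipe: Bokeh 3.x renamed `Panel` to `TabPanel`, so the import above fails on newer releases. A guarded import like the one below lets the same tab code run on either naming; the two small figures are placeholders.

```python
try:
    from bokeh.models import TabPanel, Tabs  # Bokeh 3.x naming
except ImportError:
    # Older releases, as used in this tutorial, call it Panel
    from bokeh.models.widgets import Panel as TabPanel, Tabs
from bokeh.plotting import figure

# Two placeholder figures to put into tabs
line_fig = figure(width=300, height=300)
line_fig.line([1, 2, 3], [4, 6, 5])
dot_fig = figure(width=300, height=300)
dot_fig.scatter([1, 2, 3], [5, 4, 6], size=10)

demo_tabs = Tabs(tabs=[
    TabPanel(child=line_fig, title="Line"),
    TabPanel(child=dot_fig, title="Dots"),
])

# from bokeh.io import show; show(demo_tabs)  # uncomment to display the tabs
```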

The last widget I will cover today is the Radio Button Group. Adding these widgets to a web application could have a variety of uses, such as toggling between different charts or data.

In [16]:
from bokeh.models.widgets import RadioButtonGroup

radio_button_group = RadioButtonGroup(
        labels=["Click me", "Click me!", "Click ME!"], active=0)  # 'active' defines initial state of button

show(widgetbox(radio_button_group))
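To make the buttons actually do something without a Bokeh server, you can attach a small JavaScript callback. The sketch below toggles between two lines; it assumes a Bokeh release that provides `CustomJS` and `js_on_change`, and the plotted values are just a few attack and defense stats from the rows shown earlier.

```python
from bokeh.layouts import column
from bokeh.models import CustomJS, RadioButtonGroup
from bokeh.plotting import figure

toggle_fig = figure(width=400, height=300, title="Toggle between two series")
atk_line = toggle_fig.line([1, 2, 3, 4], [49, 62, 82, 100], color="firebrick")
def_line = toggle_fig.line([1, 2, 3, 4], [49, 63, 83, 123], color="navy")
def_line.visible = False  # start with only the attack line showing

toggle = RadioButtonGroup(labels=["Attack", "Defense"], active=0)

# The JavaScript runs in the browser; cb_obj is the widget that changed
callback = CustomJS(
    args=dict(atk_line=atk_line, def_line=def_line),
    code="""
    atk_line.visible = (cb_obj.active === 0);
    def_line.visible = (cb_obj.active === 1);
    """,
)
toggle.js_on_change("active", callback)

layout = column(toggle, toggle_fig)

# from bokeh.io import show; show(layout)  # uncomment to display the layout
```

Because the callback is plain JavaScript, the resulting page keeps working when saved as a standalone HTML file.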

That's it! Thank you for reading through this tutorial; I hope this helped you understand the basics of Bokeh. Now go off into the world and plot those data! 

Want to learn even more about Bokeh? Provided below are some links for further enlightenment.

## Further Resources

* [bokeh.charts](http://bokeh.pydata.org/en/latest/docs/user_guide/charts.html)
* [bokeh.plotting](http://bokeh.pydata.org/en/latest/docs/user_guide/plotting.html)
* [bokeh.models](http://bokeh.pydata.org/en/latest/docs/reference/models.html#bokeh-models)
* [Widgets](http://bokeh.pydata.org/en/latest/docs/user_guide/interaction/widgets.html)
* [Running a Bokeh Server](http://bokeh.pydata.org/en/latest/docs/user_guide/server.html)
* [Developing with JS](http://bokeh.pydata.org/en/latest/docs/user_guide/bokehjs.html#userguide-bokehjs)