The Bokeh Plot Library
======================

I am [paddy_mullen](https://twitter.com/paddy_mullen). I work for [Continuum Analytics](http://continuum.io/), where we write the [Bokeh](http://bokeh.pydata.org) open source plotting library.  This tutorial walks you through the basic Bokeh plotting API and shows you some of the advanced possibilities.  Peter Wang, Bryan Van de Ven, Hugo Shi, and I are the primary contributors.


Installation instructions for tutorial
======================================

If you have conda installed, run the following shell commands:

    mkdir bokeh_example
    cd bokeh_example/
    git clone https://github.com/paddymul/bokeh_tutorial.git
    conda create -n bokeh_tutorial bokeh ipython-notebook pyyaml pyaudio anaconda=1.8 --yes
    source activate bokeh_tutorial
    cd bokeh_tutorial
    ipython notebook

Then in the IPython notebook, open the bokeh_tutorial notebook.

If you are executing this notebook, please use the menu and select Cell -> All Output -> Clear, then reload the page.  This quirk will be going away in 0.3.

In [2]:
import numpy as np
from bokeh.plotting import output_notebook
import pandas as pd
output_notebook()


Here is a simple plot.

In [11]:
from bokeh.plotting import figure, output_file, show
output_notebook()

# prepare some data
x = [0.1, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
y0 = [i**2 for i in x]
y1 = [10**i for i in x]
y2 = [10**(i**2) for i in x]

# output to static HTML file
#output_file("log_lines.html")

# create a new plot
p = figure(
   tools="pan,box_zoom,reset,save",
   y_axis_type="log", y_range=[0.001, 10**11], title="log axis example",
   x_axis_label='sections', y_axis_label='particles'
)

# add some renderers
p.line(x, x, legend="y=x")
p.circle(x, x, legend="y=x", fill_color="white", size=8)
p.line(x, y0, legend="y=x^2", line_width=3)
p.line(x, y1, legend="y=10^x", line_color="red")
p.circle(x, y1, legend="y=10^x", fill_color="red", line_color="red", size=6)
p.line(x, y2, legend="y=10^x^2", line_color="orange", line_dash="4 4")

# show the results
show(p)

<bokeh.models.renderers.GlyphRenderer at 0x68a77b8>

In [16]:
from bokeh.plotting import figure, show

# 20 evenly spaced samples of sin(x) over two periods
x = np.linspace(0, 4*np.pi, 20)
y = np.sin(x)

F = figure()
F.line(x, y)
show(F)

Take a moment to play around with a simple line plot.


Let's try some glyphs
=====================

Bokeh is built around glyphs and plot objects that can be composed into plots.  This is simple and powerful.  The whole system can be manipulated from Python without the need to write JavaScript or HTML code.

[Gallery](http://bokeh.pydata.org/gallery.html)
===============================================

There are many types of glyphs.
Here is a list:

- line 
- multi_line 
- annular_wedge 
- annulus 
- arc 
- bezier 
- oval 
- patch 
- patches 
- ray 
- quad 
- quadratic 
- rect 
- segment 
- text 
- wedge 

Let's look at combining two glyph renderers onto the same plot.

To do this we use the `hold()` function, which allows us to combine renderers onto the same plot.

In [20]:
from bokeh.plotting import figure, show

# plot_width, plot_height and tools are figure options, not glyph options;
# passing them to rect() raises an AttributeError
f = figure(plot_width=400, plot_height=400, tools="")
f.rect([10,20,30], [10,20,30], width=2, height=5)
show(f)

Let's try combining glyphs:
===========================

Due to a bug, multiple plots will show up here; just look at the first two.

In [6]:
from bokeh.plotting import annular_wedge, hold, figure, rect, show
figure()  #create a new figure
hold(False)

annular_wedge(
    [10,20,30], [30,25,10], 10, 20, 0.6, 4.1,
    inner_radius_units="screen", outer_radius_units = "screen",
    color="#8888ee", tools=[])
hold(True)
rect([10,20,30], [10,20,30], width=2, height=5, plot_width=400, plot_height=400, tools=[])
show()

So at this point we have a flexible, powerful plotting system.  What about our data ranges?
They are automatically configured for us.  Notice that we don't have to specify pixels, only data sizes.
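
To make the idea of an automatically configured range concrete, here is a minimal sketch in plain Python.  It is not Bokeh's implementation: `auto_range` and `to_screen` are hypothetical helpers, and the 10% padding is an assumed default chosen for illustration.

```python
def auto_range(values, padding=0.1):
    """Return (start, end) covering values, widened by a padding fraction."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid a zero-width range for constant data
    return lo - padding * span, hi + padding * span


def to_screen(value, data_range, pixels):
    """Linearly map a data value onto a pixel axis of the given length."""
    start, end = data_range
    return (value - start) / (end - start) * pixels


xs = [10, 20, 30]
x_range = auto_range(xs)  # derived from the data, not typed in by hand
print(x_range)
print([round(to_screen(x, x_range, 400)) for x in xs])
```

The glyph sizes stay in data units; only the final mapping step knows how many pixels the plot has.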

Tools
=====

Bokeh ships with existing tools for pan, zoom, preview, save, resize, and embed.
Tools are added with the `tools` kwarg of plots, like this:

In [7]:
from bokeh.plotting import hold, line, show
hold(False)
x = np.linspace(0, 4*np.pi, 20)
y = np.sin(x)
#select doesn't work on lines, so use a scatter if you need selection
line(x,y, color="#0000FF", tools="pan, zoom, resize, select, save")
show()

BokehJS object graph
====================

BokehJS renders plots based on their object graph.  Objects such as renderers are described inside this graph.

Plots have renderers, axes, grids, and tools.  Renderers (Circle, Quad, Line, ...) have references to DataRanges and DataSources.
Data ranges operate on a DataSource to describe which portion of the data space should be rendered.
Data sources contain the actual data to be displayed.  Multiple columns are aligned on the same x-axis in data sources.

Now here is the really cool thing.  Since plots don't have an attribute of min x and max x, but instead they have a reference to a data range, two separate plots can share the same data range.  This means that they will pan and zoom together.
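
Here is a minimal sketch of why sharing a range links two plots.  It uses plain Python objects rather than Bokeh classes; `ToyRange` and `ToyPlot` are hypothetical names invented for this illustration.

```python
class ToyRange(object):
    """A mutable view window: the visible start and end of one axis."""
    def __init__(self, start, end):
        self.start = start
        self.end = end

    def pan(self, delta):
        self.start += delta
        self.end += delta


class ToyPlot(object):
    """Holds a reference to a range instead of its own min and max."""
    def __init__(self, name, x_range):
        self.name = name
        self.x_range = x_range

    def visible_x(self):
        return (self.x_range.start, self.x_range.end)


shared = ToyRange(0, 10)
top = ToyPlot("top", shared)
bottom = ToyPlot("bottom", shared)            # same object, not a copy
independent = ToyPlot("other", ToyRange(0, 10))

top.x_range.pan(5)                            # pan only the first plot
print(top.visible_x(), bottom.visible_x(), independent.visible_x())
```

Panning `top` also moves `bottom`, because both read the same range object, while the plot with its own range stays where it was.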


In [None]:
from IPython.display import Image
Image(filename='bokeh_objects.png')

Let's see what a simple line plot looks like if we build up the object graph by hand:

In [9]:
from numpy import pi, arange, sin, cos
import numpy as np
import os.path

from bokeh.objects import (Plot, DataRange1d, LinearAxis, 
        ObjectArrayDataSource, ColumnDataSource, Glyph, GridPlot,
        PanTool, ZoomTool)
from bokeh.glyphs import Line, Rect
from bokeh import session
x = np.linspace(-2*pi, 2*pi, 100)
y = sin(x)
z = cos(x)
widths = np.ones_like(x) * 0.02
heights = np.ones_like(x) * 0.2

In [16]:
#I'm putting all of this into a function so that we don't pollute the global namespace
def simple_line_object():
    from bokeh.plotting import curplot
    source = ColumnDataSource(data=dict(x=x,y=y,z=z,widths=widths,
            heights=heights))
    xdr = DataRange1d(sources=[source.columns("x")])
    ydr = DataRange1d(sources=[source.columns("y")])
    line_glyph = Line(x="x", y="y", line_color="blue")
    renderer = Glyph(data_source = source,
        xdata_range = xdr, ydata_range = ydr,
        glyph = line_glyph)

    plot = Plot(x_range=xdr, y_range=ydr, data_sources=[source], 
        border=50, height=300, width=300)
    xaxis = LinearAxis(plot=plot, dimension=0, location="bottom")
    yaxis = LinearAxis(plot=plot, dimension=1, location="left")

    pantool = PanTool(dataranges = [xdr, ydr], dimensions=["width","height"])
    zoomtool = ZoomTool(dataranges=[xdr,ydr], dimensions=("width","height"))

    plot.renderers.append(renderer)
    plot.tools = [pantool, zoomtool]
    sess = curplot()._session
    sess.add(plot, renderer, xaxis, yaxis, source, xdr, ydr, pantool, zoomtool)
    sess.plotcontext.children.append(plot)
    
simple_line_object()
show()

Linked panning
==============
Now we will create two plots that share the same DataRange object.

<bokeh.session.NotebookSession at 0x102342c90>

In [18]:
from bokeh.glyphs import Wedge, Rect
def simple_linked():
    from bokeh.plotting import curplot
    source = ColumnDataSource(data=dict(x=x,y=y,z=z,widths=widths,
            heights=heights))

    xdr = DataRange1d(sources=[source.columns("x")])
    ydr = DataRange1d(sources=[source.columns("y")])

    line_glyph = Line(x="x", y="y", line_color="blue")
    #FIXME, I can't seem to get other glyph styles to work
    rect_glyph = Rect(x="x", y="y", height=.5, width=.05, angle=30)
    wedge_glyph = Wedge(x="x", y="y",  radius=np.pi/4, 
        start_angle= np.pi/6, end_angle=np.pi/2, direction="clock", color="red")

    renderer = Glyph(data_source = source,  xdata_range = xdr,
        ydata_range = ydr, glyph = line_glyph)
    
    plot = Plot(x_range=xdr, y_range=ydr, data_sources=[source], 
        border=50, height=300, width=300)
    plot.renderers.append(renderer)

    renderer2 = Glyph(data_source = source, xdata_range = xdr,
        ydata_range = ydr, glyph = line_glyph)

    plot2 = Plot(x_range=xdr, y_range=ydr, data_sources=[source], 
        border=50, height=300, width=300)
    pantool2 = PanTool(dataranges = [xdr, ydr], dimensions=["width","height"])
    zoomtool2 = ZoomTool(dataranges=[xdr,ydr], dimensions=("width","height"))

    plot2.renderers.append(renderer2)
    plot2.tools = [pantool2, zoomtool2]

    sess = curplot()._session
    sess.add(plot, renderer, source, xdr, ydr)
    sess.plotcontext.children.append(plot)
    show()
    sess.add(plot2, renderer2, pantool2, zoomtool2)
    sess.plotcontext.children.append(plot2)
simple_linked()
show()

In [19]:

def line_advanced():
    from bokeh.plotting import curplot
    source = ColumnDataSource(data=dict(x=x,y=y,z=z,widths=widths,
                heights=heights))
    
    xdr = DataRange1d(sources=[source.columns("x")])
    xdr2 = DataRange1d(sources=[source.columns("x")])
    ydr = DataRange1d(sources=[source.columns("y")])
    ydr2 = DataRange1d(sources=[source.columns("y")])
    
    line_glyph = Line(x="x", y="y", line_color="blue")
    wedge_glyph = Wedge(x="x", y="y",  radius=np.pi/14, 
        start_angle= 3*np.pi/6, end_angle=4*np.pi/4, direction="clock")
    
    renderer = Glyph(data_source = source,  xdata_range = xdr,
            ydata_range = ydr, glyph = line_glyph)
    pantool = PanTool(dataranges = [xdr, ydr], dimensions=["width","height"])
    zoomtool = ZoomTool(dataranges=[xdr,ydr], dimensions=("width","height"))
    
    plot = Plot(x_range=xdr, y_range=ydr, data_sources=[source], 
            border=50, height=400, width=400)
    plot.tools = [pantool, zoomtool]
    plot.renderers.append(renderer)
    
    #notice that these two have a different y data range
    renderer2 = Glyph(data_source = source, xdata_range = xdr,
            ydata_range = ydr2, glyph = line_glyph)
    
    plot2 = Plot(x_range=xdr, y_range=ydr2, data_sources=[source], 
            border=50, height=400, width=400)
    
    plot2.renderers.append(renderer2)
    
    #notice that these two have a different x data range
    renderer3 = Glyph(data_source = source, xdata_range = xdr2,
            ydata_range = ydr, glyph = line_glyph)
    
    plot3 = Plot(x_range=xdr2, y_range=ydr, data_sources=[source], 
            border=50, height=400, width=400)
    
    plot3.renderers.append(renderer3)
    
    #this is a dummy plot with no renderers
    plot4 = Plot(x_range=xdr2, y_range=ydr, data_sources=[source], 
            border=50, height=400, width=400)
    
    
    sess = curplot()._session
    sess.add(plot, renderer, source, xdr, ydr, pantool, zoomtool)
    
    sess.add(plot2, renderer2, ydr2, xdr2, renderer3, plot3, plot4)
    grid = GridPlot(children=[[plot, plot2], [plot3, plot4 ]], name="linked_advanced")
    
    sess.add(grid)
    sess.plotcontext.children.append(grid)
line_advanced()
show()


There are 3 distinct components to the bokeh plotting library.


- The Python bokeh client library.  This is the API that we are using in the talk to generate plots.
- The bokeh plot server.  This keeps track of which plots are in which documents,
  and it is a web server that communicates with the browser.
- The bokehjs JavaScript library.  This renders the plots and communicates with the plot server.

This architecture lets us do some remarkable things.

It is possible to run bokeh without the plot server.  The file-based examples that we have seen output static JavaScript that includes everything needed for bokehjs to display the plot.  It is also important to understand that almost all of bokeh works the same way regardless of how the plot is output.
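
As a rough illustration of what "static output that includes everything" can mean, the sketch below inlines a plot description as JSON inside a single HTML page.  This is not the format Bokeh writes: `render_static_page`, the `plot-data` id, and the shape of `spec` are all assumptions made up for this example.

```python
import json


def render_static_page(plot_spec):
    """Return one HTML string with the plot description inlined as JSON."""
    payload = json.dumps(plot_spec, sort_keys=True)
    # "</" inside a script element could end it early, so escape it
    payload = payload.replace("</", "<\\/")
    return (
        "<html><body>\n"
        '<script type="application/json" id="plot-data">%s</script>\n'
        "</body></html>" % payload
    )


spec = {"type": "line", "x": [0, 1, 2], "y": [0, 1, 4]}
page = render_static_page(spec)
print(page)
```

Because the data travels inside the page, a browser-side renderer would need no server round trip to draw it.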

Embedding
=========


In [None]:
from bokeh.plotting import hold, line
hold(False)
x = np.linspace(0, 4*np.pi, 20)
y = np.sin(x)
hold(True)
line_plot = line(x,y, color="#0000FF", tools="pan, zoom, preview, resize, select, embed, save")

line_snippet =  line_plot.inject_snippet()
print(line_snippet)
hold(False)

In [None]:
import webbrowser
import os
#ok let's create an html page with that snippet

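# the with block closes the file, so the page is fully written before the browser opens it
with open("foo.html", "w") as f:
    f.write("""
<html>
<body>
<h1> Embed example </h1>
%s
<h2> after embed </h2>
</body>
</html>""" % line_snippet)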

webbrowser.open("file://" + os.path.abspath("foo.html"))


Animation
=========
Since plots are first-class objects in bokehjs and the bokeh Python system, they can be modified.  Because the bokeh plot server communicates updates to the browser, we can animate plots from Python.  For these demos to work, you must be running the plot server.

    $ bokeh-server

The bokeh plot server does not yet work on Windows.
Once you have started the server, navigate to the [plot server http://localhost:5006/bokeh](http://localhost:5006/bokeh) in another browser tab.  Due to a bug in bokeh, all of the plots created start out zoomed in; you must zoom out to see the whole animation.

The IPython kernel runs the animation.  To interrupt the kernel, type `CTRL-m i`.

In [None]:
print("Go to http://localhost:5006/bokeh to view this plot")

import numpy as np
from numpy import pi, cos, sin, linspace
from bokeh.plotting import *

colors = ("#A6CEE3", "#1F78B4", "#B2DF8A")
N = 36
r_base = 8
theta = linspace(0, 2*pi, N)
r_x = linspace(0, 6*pi, N-1)
rmin = r_base - cos(r_x) - 1
rmax = r_base + sin(r_x) + 1

output_server("wedge animate")

cx = cy = np.ones_like(rmin)
annular_wedge(cx, cy, 
        rmin, rmax, theta[:-1], theta[1:],
        inner_radius_units="data",
        outer_radius_units="data",
        color = colors[0], 
        line_color="black", tools="pan,zoom,resize")
#show()

import time
from bokeh.objects import GlyphRenderer
renderer = [r for r in curplot().renderers if isinstance(r, GlyphRenderer)][0]
ds = renderer.data_source
while True:
    for i in np.linspace(-2*np.pi, 2*np.pi, 50):
        rmin = ds.data["inner_radius"]
        rmin = np.roll(rmin, 1)
        ds.data["inner_radius"] = rmin
        rmax = ds.data["outer_radius"]
        rmax = np.roll(rmax, -1)
        ds.data["outer_radius"] = rmax
        ds._dirty = True
        session().store_obj(ds)
        time.sleep(.25)
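
The animation loop above shifts the inner radii one slot forward and the outer radii one slot back on every frame, then pushes the changed data source to the server.  Here is a minimal sketch of just the shifting step, using `collections.deque` from the standard library in place of `np.roll`; the four-element lists are made-up sample data.

```python
from collections import deque

inner = deque([1, 2, 3, 4])
outer = deque([10, 20, 30, 40])

frames = []
for _ in range(3):
    inner.rotate(1)    # same shift as np.roll(inner, 1)
    outer.rotate(-1)   # same shift as np.roll(outer, -1)
    frames.append((list(inner), list(outer)))

for frame in frames:
    print(frame)
```

Each wedge inherits its neighbour's radius, so the shape appears to spin even though the angles never change.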


Spectrogram demo
================

In [None]:
import numpy as np
from numpy import pi, cos, sin, linspace, zeros, \
        short, fromstring, hstack, transpose
# numpy's fft is a plain function; in newer SciPy releases scipy.fft is a module
from numpy.fft import fft
import time
from bokeh.plotting import *

NUM_SAMPLES = 1024
SAMPLING_RATE = 44100
MAX_FREQ = SAMPLING_RATE // 8
FREQ_SAMPLES = NUM_SAMPLES // 8
SPECTROGRAM_LENGTH = 400

_stream = None
def read_mic():
    import pyaudio
    global _stream
    if _stream is None:
        pa = pyaudio.PyAudio()
        _stream = pa.open(format=pyaudio.paInt16, channels=1, rate=SAMPLING_RATE,
                     input=True, frames_per_buffer=NUM_SAMPLES)
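    try:
        audio_data  = np.frombuffer(_stream.read(NUM_SAMPLES), dtype=short)
        normalized_data = audio_data / 32768.0
        # keep the first half of the spectrum; for real input the rest mirrors it
        return (abs(fft(normalized_data))[:NUM_SAMPLES//2], normalized_data)
    except Exception:
        # a failed read is skipped; KeyboardInterrupt still stops the loop
        return None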

def get_audio_data(interval=0.05):
    time.sleep(interval)
    starttime = time.time()
    while time.time() - starttime < interval:
        data = read_mic()
        if data is not None:
            return data
    return None

output_server("spectrogram")

# Create the base plot
N = 36
theta = linspace(0, 2*pi, N+1)
rmin = 10
rmax = 20 * np.ones(N)
cx = cy = np.ones(N)
annular_wedge(cx, cy, rmin, rmax, theta[:-1], theta[1:],
        inner_radius_units = "data",
        outer_radius_units = "data",
        color = "#A6CEE3", line_color="black", 
        tools="pan,zoom,resize")
show()

from bokeh.objects import GlyphRenderer
renderer = [r for r in curplot().renderers if isinstance(r, GlyphRenderer)][0]
ds = renderer.data_source
while True:
    data = get_audio_data()
    if data is None:
        continue
    else:
        data = data[0]
    # Zoom in to a frequency range:
    data = data[:len(data)//2]
    histdata = (np.histogram(data, N, density=True)[0] * 5) + rmin
    ds.data["outer_radius"] = histdata
    ds._dirty = True
    session().store_obj(ds)
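
`read_mic` above returns the magnitude of the FFT for the first half of the bins.  To show what those magnitudes mean, here is a small sketch that uses a naive pure-Python DFT instead of a real FFT; `magnitude_spectrum` is a hypothetical helper, and the 64-sample test tone is synthetic data rather than microphone input.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive O(n^2) DFT; returns magnitudes for the first n // 2 bins."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2):
        total = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n)
        )
        spectrum.append(abs(total))
    return spectrum


n = 64
cycles = 8  # the tone completes 8 full periods inside the window
tone = [math.sin(2 * math.pi * cycles * t / n) for t in range(n)]
mags = magnitude_spectrum(tone)
peak_bin = max(range(len(mags)), key=mags.__getitem__)
print(peak_bin)
```

A pure tone puts nearly all of its energy into one bin, and the bin index equals the number of periods that fit in the sample window.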


Future of Bokeh
===============

There are a lot of exciting things in store for bokeh.  These include:

- better IPython notebook support.  We are looking to integrate with their new widget model.
- better embedding support
- built-in S3 uploading
- static animation that doesn't require the plot server
- performance enhancements
- abstract rendering
- grammar of graphics style plotting
- ease of use
