## Bokeh Demo

https://bokeh.pydata.org/en/latest/

Bokeh is an interactive visualization library for Python that renders its plots in web browsers. Explore the gallery for more examples: https://bokeh.pydata.org/en/latest/docs/gallery.html#gallery

The basic steps to creating plots with the bokeh.plotting interface are:

 * Prepare some data
  * This can be plain Python lists, NumPy arrays, or pandas Series.
 * Tell Bokeh where to generate output
  * Use output_file() to write an HTML file, or output_notebook() to display inline in Jupyter notebooks.
 * Call figure()
  * This creates a plot with typical default options and easy customization of title, tools, and axis labels.
 * Add renderers
  * In this case, we use line() for our data, specifying visual customizations like colors, legends and widths.
 * Ask Bokeh to show() or save() the results.
  * These functions save the plot to an HTML file and optionally display it in a browser.
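
The steps above can be sketched as follows. This is a minimal example with made-up data, not part of the map tutorial; the file name `lines.html` and the sample values are placeholders.

```python
from bokeh.plotting import figure, output_file, save

# 1. Prepare some data (plain Python lists, invented for illustration)
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

# 2. Tell Bokeh where to generate output
output_file("lines.html", title="line plot example")

# 3. Call figure() to create a plot with default tools and labelled axes
p = figure(title="simple line example", x_axis_label="x", y_axis_label="y")

# 4. Add a renderer - here a line glyph with a custom width and color
p.line(x, y, line_width=2, line_color="navy")

# 5. Ask Bokeh to save the result; save() returns the path it wrote to
saved_path = save(p)
```

Swapping `save(p)` for `show(p)` writes the same file and also opens it in a browser.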


Bokeh can also be used to create map-based visualizations. In this tutorial we will use the Google Maps API to visualize our data on top of Google Maps.

First things first: we're going to need a Google Maps API key. Sign up for one here: https://developers.google.com/maps/documentation/javascript/get-api-key

Save your API key somewhere secure - you're going to need it soon.

Now, we import our dataset into a pandas dataframe. Since we already did some data exploration in the previous tutorial, we'll skip right to creating the visualization.

In [2]:
import pandas as pd

In [3]:
data = pd.read_csv('./data/metadata.csv') 

In [4]:
data.head()

Unnamed: 0,Cell Cgi,Cell Tower Location,Comm Identifier,Comm Timedate String,Comm Type,Latitude,Longitude
0,50501015388B9,REDFERN TE,f1a6836c0b7a3415a19a90fdd6f0ae18484d6d1e,4/1/14 9:40,Phone,-33.892933,151.202296
1,50501015388B9,REDFERN TE,62157ccf2910019ffd915b11fa037243b75c1624,4/1/14 9:42,Phone,-33.892933,151.202296
2,505010153111F,HAYMARKET #,c8f92bd0f4e6fb45ed7fce96fc831b283db2b642,4/1/14 13:13,Phone,-33.880329,151.20569
3,505010153111F,HAYMARKET #,f1a6836c0b7a3415a19a90fdd6f0ae18484d6d1e,4/1/14 13:13,Phone,-33.880329,151.20569
4,5.05E+106,HAYMARKET #,f1a6836c0b7a3415a19a90fdd6f0ae18484d6d1e,4/1/14 17:27,Phone,-33.880329,151.20569


In [5]:
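# for our map, we'll select just a few days in our dataset
# note: the column still holds strings at this point, so the comparison below
# is alphabetical (lexicographic), not chronological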
data = data[(data['Comm Timedate String'] > '4/1/14 00:00') & (data['Comm Timedate String'] < '4/4/14 00:00')]

In [6]:
data['Comm Timedate String'].min(), data['Comm Timedate String'].max()

('4/1/14 13:13', '4/3/14 14:36')
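
The minimum reported above is '4/1/14 13:13' even though the first rows of the dataset are stamped 9:40 on the same day. That is because the column still contains strings, which compare character by character. The sketch below uses a small invented sample in the same format to show the difference between comparing text and comparing parsed datetimes. The format string is an assumption based on the rows shown by `data.head()`.

```python
import pandas as pd

# Hypothetical sample in the same 'M/D/YY H:MM' format as the dataset
sample = pd.DataFrame({
    'Comm Timedate String': [
        '4/1/14 9:40', '4/1/14 13:13', '4/3/14 14:36', '4/10/14 8:00',
    ],
})
as_text = sample['Comm Timedate String']

# Comparing strings: '4/10/14 8:00' slips through, because '1' sorts before '4'
text_mask = (as_text > '4/1/14 00:00') & (as_text < '4/4/14 00:00')
text_kept = as_text[text_mask].tolist()

# Comparing datetimes: parse first, then filter chronologically
as_time = pd.to_datetime(as_text, format='%m/%d/%y %H:%M')
time_mask = (as_time >= pd.Timestamp('2014-04-01')) & (as_time < pd.Timestamp('2014-04-04'))
time_kept = as_text[time_mask].tolist()

print(text_kept)  # all four rows, including April 10
print(time_kept)  # only the rows from April 1 to April 3
```

Converting the column with `pd.to_datetime` before filtering avoids the problem; the tutorial performs that conversion a few cells later.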

In [7]:
data.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 222 entries, 0 to 10475
Data columns (total 7 columns):
Cell Cgi                222 non-null object
Cell Tower Location     222 non-null object
Comm Identifier         135 non-null object
Comm Timedate String    222 non-null object
Comm Type               222 non-null object
Latitude                222 non-null float64
Longitude               222 non-null float64
dtypes: float64(2), object(5)
memory usage: 13.9+ KB


Install Bokeh and import the packages we need:

In [5]:
!pip install bokeh



In [8]:
from bokeh.io import output_file, show, save, curdoc
from bokeh.models import (
    GMapPlot, GMapOptions, ColumnDataSource, Circle, Range1d, PanTool,
    WheelZoomTool, BoxSelectTool, LinearColorMapper, CategoricalColorMapper,
    HoverTool, Plot, LinearAxis, Text, SingleIntervalTicker, Slider,
    CustomJS, Select,
)
from bokeh.palettes import Spectral6
from bokeh.layouts import column, row, widgetbox

In [9]:
# we want an actual datetime to use for our map
data['Comm Timedate String'] = pd.to_datetime(data['Comm Timedate String'])

In [10]:
# now we'll grab the hour and date as separate columns
data['hour'] = data['Comm Timedate String'].dt.hour
data['date'] = data['Comm Timedate String'].dt.date

In [11]:
data.head()

Unnamed: 0,Cell Cgi,Cell Tower Location,Comm Identifier,Comm Timedate String,Comm Type,Latitude,Longitude,hour,date
0,50501015388B9,REDFERN TE,f1a6836c0b7a3415a19a90fdd6f0ae18484d6d1e,2014-04-01 09:40:00,Phone,-33.892933,151.202296,9,2014-04-01
1,50501015388B9,REDFERN TE,62157ccf2910019ffd915b11fa037243b75c1624,2014-04-01 09:42:00,Phone,-33.892933,151.202296,9,2014-04-01
2,505010153111F,HAYMARKET #,c8f92bd0f4e6fb45ed7fce96fc831b283db2b642,2014-04-01 13:13:00,Phone,-33.880329,151.20569,13,2014-04-01
3,505010153111F,HAYMARKET #,f1a6836c0b7a3415a19a90fdd6f0ae18484d6d1e,2014-04-01 13:13:00,Phone,-33.880329,151.20569,13,2014-04-01
4,5.05E+106,HAYMARKET #,f1a6836c0b7a3415a19a90fdd6f0ae18484d6d1e,2014-04-01 17:27:00,Phone,-33.880329,151.20569,17,2014-04-01


In [12]:
# set the data source - we'll pick certain columns from the dataframe that we want to visualize on a map

source = ColumnDataSource(data={
    'long': data['Longitude'],
    'lat': data['Latitude'],
    'loc': data['Cell Tower Location'],
    'timedate': data['Comm Timedate String'],
    'type': data['Comm Type'],
    'hour': data['hour'],
    'date': data['date'],
})

In [13]:
# let's see what the distribution of lats and longs looks like, to decide where to center our map

data.describe()

Unnamed: 0,Latitude,Longitude,hour
count,222.0,222.0,222.0
mean,-33.887321,151.20398,15.783784
std,0.010343,0.007586,4.820592
min,-33.892933,151.202296,0.0
25%,-33.892933,151.202296,12.0
50%,-33.892933,151.202296,17.0
75%,-33.880329,151.20569,19.0
max,-33.79661,151.285293,23.0


In [68]:
# set map options - we'll center the map close to where most of the points fall
# (near the mean and quartile values above, rather than the far-away maximum)

map_options = GMapOptions(lat=-33.890648, lng=151.212921, map_type="roadmap", zoom=14)

In [69]:
# initiate our plot with the map options we just defined

plot = GMapPlot(x_range=Range1d(), y_range=Range1d(), 
                map_options=map_options, plot_width=600, plot_height=700)

plot.title.text = "cell tower data"

In [70]:
# Fill in your API key here. Better yet, keep it in an external file
# (or an environment variable) and read it in, so the key never ends up in the notebook.
#import my_google_key

plot.api_key = 'YOUR_API_KEY'
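
One way to keep the key out of the notebook is sketched below. It is only an illustration: the helper name `load_api_key`, the environment variable `GOOGLE_API_KEY`, and the file name `google_key.txt` are all assumptions, not names defined by Bokeh or Google.

```python
import os

def load_api_key(env_var="GOOGLE_API_KEY", key_file="google_key.txt"):
    """Return the key from an environment variable, else from a local file."""
    key = os.environ.get(env_var)
    if key:
        return key.strip()
    if os.path.exists(key_file):
        # The file is expected to contain nothing but the key itself
        with open(key_file) as handle:
            return handle.read().strip()
    raise RuntimeError(f"No API key found in ${env_var} or {key_file}")
```

With this in place the cell above would become `plot.api_key = load_api_key()`. If the key lives in a file, that file should be excluded from version control.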

In [71]:
list(set(data['Comm Type']))

['Phone', 'SMS', 'Internet']

In [72]:
# map colors to comm type
mapper = CategoricalColorMapper(
    palette=Spectral6,
    factors=list(set(data['Comm Type']))
)
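
Conceptually, a categorical mapper pairs each factor with the palette color at the same position. Since `list(set(...))` does not promise the same ordering from one run to the next, the colors assigned to each type could change between runs. The plain-Python sketch below illustrates the pairing with sorted factors; the helper name and the three hex colors are placeholders, and this is not Bokeh's own implementation.

```python
def pair_factors_with_colors(factors, palette):
    """Pair each distinct factor with a palette color, in sorted order."""
    ordered = sorted(set(factors))
    return dict(zip(ordered, palette))

# Placeholder colors, standing in for the first entries of a palette
example_palette = ['#3288bd', '#99d594', '#e6f598']

color_for = pair_factors_with_colors(
    ['Phone', 'SMS', 'Internet', 'Phone'], example_palette)
print(color_for)
# {'Internet': '#3288bd', 'Phone': '#99d594', 'SMS': '#e6f598'}
```

Passing `factors=sorted(set(data['Comm Type']))` to the mapper would give the same stable ordering in the tutorial code.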

In [73]:
# define circles
circle = Circle(x="long", y="lat", 
                fill_color={'field': 'type', 'transform': mapper}, 
                fill_alpha=1, line_color=None, size=14)

In [74]:
# set hover tips - text to appear when the mouse hovers over a point
hover = HoverTool(tooltips=[
    ("datetime", '@timedate'),
    ("location", '@loc'),
])

In [75]:
# add interactive tools to plot
plot.add_tools(PanTool(), WheelZoomTool(), BoxSelectTool(), hover)

# add circles and source to plot
plot.add_glyph(source, circle)

In [76]:
show(plot)

ERROR:/Users/emmafreeman/anaconda3/lib/python3.6/site-packages/bokeh/core/validation/check.py:E-1005 (MISSING_GOOGLE_API_KEY): Google now requires API keys for all Google Maps usage: GMapPlot(id='17082018-61a1-40d2-a5cf-60abc6481d46', ...)


In [111]:
# Make a slider object for the hour
slider = Slider(start=0, end=23, value=0, step=1, title="Time")

# make a dropdown object for the date
select = Select(
    options=['4/1', '4/2', '4/3'], 
    value='4/1', title="Date")

In [113]:
# Map each dropdown label to the calendar date it stands for
date_options = {
    '4/1': pd.Timestamp('2014-04-01').date(),
    '4/2': pd.Timestamp('2014-04-02').date(),
    '4/3': pd.Timestamp('2014-04-03').date(),
}

def make_source_data(df):
    # Use the same keys as the original ColumnDataSource so the glyph
    # and the hover tooltips keep working after an update
    return {
        'long': df['Longitude'],
        'lat': df['Latitude'],
        'loc': df['Cell Tower Location'],
        'timedate': df['Comm Timedate String'],
        'type': df['Comm Type'],
        'hour': df['hour'],
        'date': df['date'],
    }

# Define the callback - what happens when the slider or the dropdown changes
def update_plot(attr, old, new):
    # Keep only the rows for the selected date and hour
    selected = data[(data['hour'] == slider.value) &
                    (data['date'] == date_options[select.value])]
    source.data = make_source_data(selected)

# Attach the callback to the 'value' property of both widgets
slider.on_change('value', update_plot)
select.on_change('value', update_plot)

# Make a layout of plot, dropdown and slider and add it to the current document
layout = column(plot, select, slider)
curdoc().add_root(layout)

In [115]:
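# show() on its own only renders the map. The slider and dropdown callbacks are
# Python functions, so they need a running Bokeh server: save this code as a
# script and start it with `bokeh serve --show <your_script>.py`
show(plot)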

