## Visualizing Spatial Data with Pandas and Bokeh

[Bokeh](http://bokeh.pydata.org/en/latest/) is a relatively new interactive visualization library for Python. It renders plots in the browser with JavaScript, is modeled after D3, and is intended to handle millions of data points.

>Bokeh is a Python interactive visualization library that targets modern web browsers for presentation. Its goal is to provide elegant, concise construction of novel graphics in the style of D3.js, and to extend this capability with high-performance interactivity over very large or streaming datasets. Bokeh can help anyone who would like to quickly and easily create interactive plots, dashboards, and data applications. ([Bokeh Website](http://bokeh.pydata.org/en/latest/))

The advantage of Bokeh over matplotlib is that the visualizations are interactive: because they are rendered with JavaScript in the browser, the reader can pan, zoom, and hover over the data.

From the U.K. accident data, we can plot the location of accidents for which latitude and longitude values are provided.

In [12]:
import os
import sqlite3 as sqlite
DATADIR = os.path.join(os.path.expanduser("~"),"DATA",
                       "Misc")
print(os.path.exists(DATADIR))
import pandas as pd
import numpy as np
import folium

True


In [None]:
from bokeh.io import output_notebook

In [2]:
data = pd.read_csv(os.path.join(DATADIR,
                               "Accidents7904.csv"),
                  skiprows = lambda index: index > 0 and index <= 4883216)



### This enables drawing directly in the notebook

In [None]:
output_notebook()

### Read in the data

In [8]:
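# Read only the columns needed for the map. The skiprows callable keeps the
# header (line 0) and skips the first 4,883,216 data rows of the file.
data = pd.read_csv(os.path.join(DATADIR, "Accidents7904.csv"),
                   usecols=["Longitude", "Latitude", "Date", "Time",
                            "Number_of_Casualties"],
                   skiprows=lambda index: 0 < index <= 4883216)
data.head()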

| | Longitude | Latitude | Number_of_Casualties | Date | Time |
|---|---|---|---|---|---|
| 0 | -0.271752 | 51.715661 | 1 | 25/12/1999 | 09:30 |
| 1 | -0.239977 | 51.695136 | 1 | 17/12/1999 | 18:38 |
| 2 | -0.270037 | 51.715096 | 2 | 15/12/1999 | 18:04 |
| 3 | -0.263233 | 51.711309 | 1 | 02/12/1999 | 04:10 |
| 4 | -0.227225 | 51.6882 | 3 | 04/12/1999 | 09:51 |
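
When `skiprows` is given a callable, pandas calls it with each line index of the file and skips the line whenever the callable returns `True`. A minimal sketch of that behavior, using a made-up six-row CSV (the column names `id` and `value` are illustrative and not from the accident data):

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for the much larger accident file.
csv_text = "id,value\n1,a\n2,b\n3,c\n4,d\n5,e\n6,f\n"

# Line 0 is the header, so it is kept; file lines 1 through 4 are skipped.
skipped = pd.read_csv(io.StringIO(csv_text),
                      skiprows=lambda index: 0 < index <= 4)
print(skipped)
```

Only the rows with `id` 5 and 6 remain, which mirrors how the cell above keeps the header while discarding the earliest records.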


In [4]:
data.dtypes

Longitude               float64
Latitude                float64
Number_of_Casualties      int64
Date                     object
dtype: object
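
The `Date` column is read as `object`, meaning plain strings. If real dates are needed later, one possible approach is `pd.to_datetime` with an explicit day-first format. This is a sketch on three date strings copied from the output above, not a step the notebook itself performs:

```python
import pandas as pd

# Three date strings taken from the head() output shown earlier.
dates = pd.Series(["25/12/1999", "17/12/1999", "02/12/1999"])

# The explicit format tells pandas the day comes before the month.
parsed = pd.to_datetime(dates, format="%d/%m/%Y")
print(parsed.dtype)
print(parsed.dt.day.tolist())
```

Giving the format explicitly avoids any ambiguity between day-first and month-first interpretations of values such as `02/12/1999`.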

In [9]:
data2 = data.dropna()
data2.head()

| | Longitude | Latitude | Number_of_Casualties | Date | Time |
|---|---|---|---|---|---|
| 0 | -0.271752 | 51.715661 | 1 | 25/12/1999 | 09:30 |
| 1 | -0.239977 | 51.695136 | 1 | 17/12/1999 | 18:38 |
| 2 | -0.270037 | 51.715096 | 2 | 15/12/1999 | 18:04 |
| 3 | -0.263233 | 51.711309 | 1 | 02/12/1999 | 04:10 |
| 4 | -0.227225 | 51.6882 | 3 | 04/12/1999 | 09:51 |
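
`dropna()` with no arguments removes every row that has a missing value in any column. When only the coordinates matter, the `subset` argument restricts the check to those columns. A small sketch on a made-up three-row frame (the values are illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical rows: one complete, one missing longitude, one missing latitude.
toy = pd.DataFrame({
    "Longitude": [-0.27, np.nan, -0.24],
    "Latitude": [51.71, 51.69, np.nan],
    "Number_of_Casualties": [1, 2, 3],
})

# Keep only rows with both coordinates; .copy() makes the result an
# independent DataFrame, so later column assignments do not trigger a warning.
complete = toy.dropna(subset=["Longitude", "Latitude"]).copy()
print(len(toy), len(complete))
```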


#### We can use the [``sample``](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.sample.html) method to get a random subset of a DataFrame

In [7]:
from datetime import datetime
tmp = datetime.strptime("09:30", "%H:%M")
print(tmp.time())

09:30:00


In [10]:
# Parse the "HH:MM" strings into datetime.time objects in one vectorized call.
# assign returns a new DataFrame, which avoids pandas' SettingWithCopyWarning.
data2 = data2.assign(Time=pd.to_datetime(data2["Time"], format="%H:%M").dt.time)


In [None]:
# Sample from data2 so that every sampled row has both coordinates.
subdata = data2.sample(2000)
mean_long = np.mean(subdata['Longitude'])
mean_lat  = np.mean(subdata['Latitude'])
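
`sample` draws different rows on every run. If a repeatable map is wanted, passing `random_state` fixes the draw. A minimal sketch on a made-up frame of 100 coordinates (the value ranges are illustrative, not taken from the accident data):

```python
import numpy as np
import pandas as pd

# Hypothetical coordinates standing in for the accident locations.
toy = pd.DataFrame({
    "Longitude": np.linspace(-3.0, 1.0, 100),
    "Latitude": np.linspace(50.0, 56.0, 100),
})

# The same seed yields the same rows each time.
first = toy.sample(10, random_state=0)
second = toy.sample(10, random_state=0)
print(first.index.equals(second.index))

# Mean position of the sample, usable as a map center.
center = first[["Latitude", "Longitude"]].mean()
print(center)
```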


In [None]:
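from bokeh.io import show
from bokeh.models import (
    GMapPlot, GMapOptions, ColumnDataSource, Circle, DataRange1d,
    PanTool, WheelZoomTool, BoxSelectTool, HoverTool
)

# Center the Google map on the mean position of the sampled accidents.
map_options = GMapOptions(lat=mean_lat,
                          lng=mean_long,
                          map_type="roadmap", zoom=6)

# Note: more recent Bokeh releases also require a Google Maps API key,
# set through the plot's api_key property.
plot = GMapPlot(
    x_range=DataRange1d(),
    y_range=DataRange1d(),
    map_options=map_options
)
plot.title.text = "U.K. Road Accidents"

# Bokeh looks up glyph coordinates by column name in this data source.
source = ColumnDataSource(
    data=dict(
        lat=subdata['Latitude'],
        lon=subdata['Longitude'],
    )
)

# Show the row index and coordinates of the point under the cursor.
hover = HoverTool(tooltips=[('index', '$index'),
                            ('(lon, lat)', '(@lon, @lat)')])

# Draw each accident as a small blue dot with no outline.
circle = Circle(x="lon", y="lat", size=2,
                fill_color="blue", fill_alpha=0.8,
                line_color=None)
plot.add_glyph(source, circle)

plot.add_tools(PanTool(), WheelZoomTool(), BoxSelectTool(), hover)
show(plot)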