## Overview

In this assignment, you will analyze criminal incident data from Seattle or San Francisco to visualize patterns and, if desired, contrast and compare patterns across the two cities.

You will produce a blog-post-style visual narrative consisting of a series of visualizations interspersed with sufficient descriptive text to make a convincing argument.

The assignment will be assessed by peer review. The rubric for assessment will include questions about the effectiveness and clarity of your argument, your use of visualization, and the completeness of your analysis. Reproducibility will also be considered, but will be evaluated subjectively -- peer reviewers will not be asked to recreate your results.

## Project Ideas

You may want to consider one or more of the following types of questions when developing your submission.

- For either city, how do incidents vary by time of day? Which incidents are most common in the evening? During what periods of the day are robberies most common?
- For either city, how do incidents vary by neighborhood? Which incidents are most common in the city center? In what areas or neighborhoods are robberies or thefts most common?
- For either city, how do incidents vary month to month in the Summer 2014 dataset?
- For either city, which incident types tend to correlate with each other on a day-by-day basis?
- **Advanced**  What can we infer broadly about the differences in crime patterns between Seattle and San Francisco? Does one city tend to have more crime than the other, per capita? Do the relative frequencies of types of incidents change materially between the two cities? (NOTE: The two datasets do not have the same schema, so comparisons will require some work and some assumptions. This will require extra work, but you will be working at the forefront of what is known!)
- **Advanced**  For either city, do certain crimes correlate with environmental factors such as temperature? (To answer this kind of question, you will need to identify and use external data sources!)
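
For the day-by-day correlation question in the list above, one possible starting point is to pivot the incidents into a date-by-category count table and correlate its columns. The sketch below is only an illustration: the counts are made up, and only the `Date` and `Category` column names are taken from the San Francisco file.

```python
import pandas as pd

# Made-up daily counts per category (illustrative only, not real data)
toy_counts = {
    '06/01/2014': {'ASSAULT': 1, 'ROBBERY': 1, 'LARCENY/THEFT': 2},
    '06/02/2014': {'ASSAULT': 2, 'ROBBERY': 2, 'LARCENY/THEFT': 1},
    '06/03/2014': {'ASSAULT': 3, 'ROBBERY': 3, 'LARCENY/THEFT': 1},
}

# Expand the counts into one row per incident, like the raw CSV
rows = [(date, category)
        for date, per_category in toy_counts.items()
        for category, n in per_category.items()
        for _ in range(n)]
toy = pd.DataFrame(rows, columns=['Date', 'Category'])

# One row per day, one column per category, cell = number of incidents
daily_counts = toy.groupby(['Date', 'Category']).size().unstack(fill_value=0)

# Pearson correlation between the daily count series of each category pair
category_corr = daily_counts.corr()
print(category_corr.round(2))
```

With only three days the coefficients are meaningless; on the real data each series would have roughly one value per day of the summer.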


In [1]:
import pandas as pd
import numpy as np
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
output_notebook()
np.random.seed(123456)

from bokeh.palettes import Spectral11, OrRd9, PuBu9, Greys9, Blues9, Purples9, RdYlGn9
myColors = Spectral11


In [2]:
category_selection = ['LARCENY/THEFT', 'ASSAULT', 'VEHICLE THEFT', 'WARRANTS', 
                       'DRUG/NARCOTIC', 'MISSING PERSON', 'WEAPON LAWS',
                       'ROBBERY']

In [3]:
sanfrancisco_incidents = pd.read_csv('sanfrancisco_incidents_summer_2014.csv')


In [52]:
#sanfrancisco_incidents_selectedcategories[sanfrancisco_incidents_selectedcategories['Category'] == 'ROBBERY']

Unnamed: 0,IncidntNum,Category,Descript,DayOfWeek,Date,Time,PdDistrict,Resolution,Address,X,Y,Location,PdId,DateTime,HourOfDay
108,140733551,ROBBERY,ROBBERY OF A CHAIN STORE WITH BODILY FORCE,Sunday,08/31/2014,18:18,CENTRAL,"ARREST, BOOKED",300 Block of POWELL ST,-122.408384,37.787827,"(37.7878271137225, -122.408384275542)",14073355103044,2014-08-31 18:18:00,18
118,140733545,ROBBERY,ROBBERY ON THE STREET WITH A KNIFE,Sunday,08/31/2014,17:50,MISSION,NONE,500 Block of DOLORES ST,-122.426037,37.759420,"(37.759419804663, -122.426036566526)",14073354503012,2014-08-31 17:50:00,17
247,140749558,ROBBERY,ROBBERY ON THE STREET WITH A DANGEROUS WEAPON,Sunday,08/31/2014,06:30,INGLESIDE,NONE,0 Block of ARLETA AV,-122.403663,37.712696,"(37.7126960034611, -122.40366319182)",14074955803013,2014-08-31 06:30:00,6
259,140732581,ROBBERY,ROBBERY ON THE STREET WITH A KNIFE,Sunday,08/31/2014,05:00,BAYVIEW,NONE,LANE ST / SHAFTER AV,-122.390730,37.731039,"(37.7310386910994, -122.390729985082)",14073258103012,2014-08-31 05:00:00,5
303,140731834,ROBBERY,ROBBERY OF A SERVICE STATION WITH BODILY FORCE,Saturday,08/30/2014,23:34,NORTHERN,NONE,1800 Block of LOMBARD ST,-122.431978,37.800479,"(37.8004790843579, -122.431977680518)",14073183403034,2014-08-30 23:34:00,23
318,140731680,ROBBERY,ROBBERY ON THE STREET WITH A GUN,Saturday,08/30/2014,22:09,INGLESIDE,NONE,500 Block of ALEMANY BL,-122.421305,37.732164,"(37.7321638759455, -122.421305348396)",14073168003011,2014-08-30 22:09:00,22
714,140732973,ROBBERY,ROBBERY OF A RESIDENCE WITH BODILY FORCE,Friday,08/29/2014,21:00,BAYVIEW,NONE,3RD ST / PALOU AV,-122.390972,37.734015,"(37.7340152180723, -122.390971734551)",14073297303054,2014-08-29 21:00:00,21
715,140732973,ROBBERY,ATTEMPTED ROBBERY RESIDENCE WITH BODILY FORCE,Friday,08/29/2014,21:00,BAYVIEW,NONE,3RD ST / PALOU AV,-122.390972,37.734015,"(37.7340152180723, -122.390971734551)",14073297303454,2014-08-29 21:00:00,21
725,140728148,ROBBERY,ATTEMPTED ROBBERY ON THE STREET WITH A KNIFE,Friday,08/29/2014,20:15,TENDERLOIN,NONE,300 Block of TURK ST,-122.414901,37.782742,"(37.7827419623926, -122.414900544562)",14072814803412,2014-08-29 20:15:00,20
986,140726108,ROBBERY,ROBBERY ON THE STREET WITH A DANGEROUS WEAPON,Friday,08/29/2014,04:51,TENDERLOIN,"ARREST, BOOKED",ELLIS ST / LARKIN ST,-122.417710,37.784236,"(37.7842362877887, -122.417710344726)",14072610803013,2014-08-29 04:51:00,4


In [5]:
sanfrancisco_incidents.head(1)

Unnamed: 0,IncidntNum,Category,Descript,DayOfWeek,Date,Time,PdDistrict,Resolution,Address,X,Y,Location,PdId
0,140734311,ARSON,ARSON OF A VEHICLE,Sunday,08/31/2014,23:50,BAYVIEW,NONE,LOOMIS ST / INDUSTRIAL ST,-122.405647,37.738322,"(37.7383221869053, -122.405646994567)",14073431126031


### Number of incidents over time and HourOfDay

- For either city, how do incidents vary by time of day? Which incidents are most common in the evening? During what periods of the day are robberies most common?

In [6]:
sanfrancisco_incidents[sanfrancisco_incidents['Category'] == 'LARCENY/THEFT']

Unnamed: 0,IncidntNum,Category,Descript,DayOfWeek,Date,Time,PdDistrict,Resolution,Address,X,Y,Location,PdId
2,146177923,LARCENY/THEFT,GRAND THEFT FROM LOCKED AUTO,Sunday,08/31/2014,23:30,SOUTHERN,NONE,1000 Block of MISSION ST,-122.409795,37.780036,"(37.7800356268394, -122.409795194505)",14617792306244
3,146177531,LARCENY/THEFT,GRAND THEFT FROM LOCKED AUTO,Sunday,08/31/2014,23:30,RICHMOND,NONE,FULTON ST / 26TH AV,-122.485263,37.772518,"(37.7725176473142, -122.485262988324)",14617753106244
11,146178410,LARCENY/THEFT,PETTY THEFT OF PROPERTY,Sunday,08/31/2014,23:00,CENTRAL,NONE,300 Block of BAY ST,-122.412782,37.805665,"(37.8056654684523, -122.412782236976)",14617841006372
22,140734822,LARCENY/THEFT,PETTY THEFT FROM LOCKED AUTO,Sunday,08/31/2014,22:30,CENTRAL,NONE,MONTGOMERY ST / PACIFIC AV,-122.403678,37.797297,"(37.7972972029479, -122.403678112414)",14073482206241
23,146178476,LARCENY/THEFT,GRAND THEFT FROM LOCKED AUTO,Sunday,08/31/2014,22:30,SOUTHERN,NONE,HARRISON ST / 11TH ST,-122.412483,37.770631,"(37.7706305910776, -122.41248326348)",14617847606244
24,146179355,LARCENY/THEFT,GRAND THEFT FROM UNLOCKED AUTO,Sunday,08/31/2014,22:18,NORTHERN,NONE,LARKIN ST / PINE ST,-122.418845,37.789830,"(37.7898297669592, -122.418844748098)",14617935506224
30,146177741,LARCENY/THEFT,GRAND THEFT FROM LOCKED AUTO,Sunday,08/31/2014,22:00,RICHMOND,NONE,2500 Block of MCALLISTER ST,-122.453871,37.775582,"(37.7755820573955, -122.453870632501)",14617774106244
31,146179054,LARCENY/THEFT,GRAND THEFT FROM LOCKED AUTO,Sunday,08/31/2014,22:00,SOUTHERN,NONE,400 Block of NATOMA ST,-122.406941,37.780679,"(37.7806787734071, -122.406941270083)",14617905406244
34,140735610,LARCENY/THEFT,PETTY THEFT FROM LOCKED AUTO,Sunday,08/31/2014,22:00,MISSION,"ARREST, BOOKED",600 Block of FLORIDA ST,-122.411094,37.761048,"(37.7610475742041, -122.411093822635)",14073561006242
35,140734173,LARCENY/THEFT,GRAND THEFT FROM LOCKED AUTO,Sunday,08/31/2014,22:00,NORTHERN,NONE,POST ST / LAGUNA ST,-122.428151,37.785783,"(37.7857828233879, -122.428151140162)",14073417306244


In [7]:
# Set and align DateTime and Category information
def getHourOfDay(x):
    return int(x.hour)   
def getDate(x):
    if int(x.day) > 9:
        return str(x.month) + '/' + str(x.day) + '/' + str(x.year)
    else:
        return str(x.month) + '/0' + str(x.day) + '/' + str(x.year)

sanfrancisco_incidents['DateTime'] = pd.to_datetime(sanfrancisco_incidents['Date'] + ' ' + sanfrancisco_incidents['Time'])
sanfrancisco_incidents['HourOfDay'] = sanfrancisco_incidents['DateTime'].apply(getHourOfDay)


sanfrancisco_incidents_selectedcategories = sanfrancisco_incidents.loc[sanfrancisco_incidents['Category'].isin(category_selection)]
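
The hour can also be derived without a helper function by using the `.dt` accessor, and passing an explicit `format` to `pd.to_datetime` removes any ambiguity about month/day order. A minimal sketch on three `Date`/`Time` pairs copied from the robbery rows printed earlier:

```python
import pandas as pd

# Three Date/Time pairs taken from the robbery output above
sample = pd.DataFrame({
    'Date': ['08/31/2014', '08/31/2014', '08/30/2014'],
    'Time': ['18:18', '06:30', '23:34'],
})

# Parse with an explicit format instead of letting pandas guess it
sample['DateTime'] = pd.to_datetime(sample['Date'] + ' ' + sample['Time'],
                                    format='%m/%d/%Y %H:%M')

# Vectorized equivalent of applying getHourOfDay row by row
sample['HourOfDay'] = sample['DateTime'].dt.hour
print(sample)
```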




In [8]:
# Group Data
# San Francisco

# by date (MM/DD/YYYY strings, e.g. 08/31/2014)
sf_byDateGB = sanfrancisco_incidents.groupby('Date')
sf_byDate_CategoryGB = sanfrancisco_incidents.groupby(['Date', 'Category'])
# by hour of the day (0, 1, 2 ... 23)
sf_byHourOfDayGB = sanfrancisco_incidents_selectedcategories.groupby('HourOfDay')
sf_byHourOfDay_CategoryGB = sanfrancisco_incidents_selectedcategories.groupby(['HourOfDay', 'Category'])
# by day of week
sf_byDayOfWeekGB = sanfrancisco_incidents_selectedcategories.groupby('DayOfWeek')
sf_byDayOfWeek_CategoryGB = sanfrancisco_incidents_selectedcategories.groupby(['DayOfWeek', 'Category'])

# Total


sf_byDate_Total  = sf_byDateGB.size().reset_index().rename(columns = { 0 : 'numIncidents'})
sf_byDate_Total['DateS'] = sf_byDate_Total['Date']
sf_byDate_Total['Date'] = pd.to_datetime(sf_byDate_Total['Date'])

#by Category
sf_byCategoryGB = sanfrancisco_incidents.groupby('Category')
sf_byCategory_Total  = sf_byCategoryGB.size().reset_index().rename(columns = { 0 : 'numIncidents'})

sf_byCategory_HourOfDayGB = sanfrancisco_incidents_selectedcategories.groupby(['Category' , 'HourOfDay'])
sf_byCategory_HourOfDay = sf_byCategory_HourOfDayGB.size()
sf_byCategory_HourOfDay_Total  = sf_byCategory_HourOfDayGB.size().reset_index().rename(columns = { 0 : 'numIncidents'})



sf_byHourOfDay_CategoryGB = sanfrancisco_incidents_selectedcategories.groupby(['HourOfDay', 'Category'])
sf_byHourOfDay_Category_Total  = sf_byHourOfDay_CategoryGB.size().reset_index().rename(columns = { 0 : 'numIncidents'})
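
The `size().reset_index().rename(columns={0: ...})` pattern can be shortened with the `name` argument of `Series.reset_index`. Combinations that never occur are simply absent from the grouped result, so when a wide table is needed, `unstack(fill_value=0)` makes the zero counts explicit. A sketch on made-up rows (only the column names come from the notebook):

```python
import pandas as pd

# Made-up rows for illustration
toy = pd.DataFrame({
    'HourOfDay': [0, 0, 0, 1, 1, 2],
    'Category': ['ASSAULT', 'ASSAULT', 'ROBBERY', 'ASSAULT', 'ROBBERY', 'ASSAULT'],
})

grouped = toy.groupby(['HourOfDay', 'Category'])

# Long format: one row per (hour, category) pair that actually occurs
long_counts = grouped.size().reset_index(name='numIncidents')

# Wide format: hours as rows, categories as columns, missing pairs become 0
wide_counts = grouped.size().unstack(fill_value=0)

print(long_counts)
print(wide_counts)
```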


In [9]:
#sf_byCategory_Total

In [10]:
sf_byCategory_HourOfDayGB.size()

Category       HourOfDay
ASSAULT        0            169
               1            115
               2            126
               3             68
               4             39
               5             52
               6             47
               7             70
               8             90
               9            103
               10           114
               11           117
               12           154
               13           112
               14           125
               15           135
               16           155
               17           182
               18           138
               19           130
               20           164
               21           153
               22           178
               23           146
DRUG/NARCOTIC  0             56
               1             30
               2             31
               3             24
               4              3
               5              8
               

## Top Categories

In [11]:
#sanfrancisco_incidents[sanfrancisco_incidents['Category'] == 'BURGLARY']
sf_byCategory_Total[sf_byCategory_Total['Category'] == 'ROBBERY']

Unnamed: 0,Category,numIncidents
23,ROBBERY,308


In [87]:
from bokeh.charts import Bar, show, defaults
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.models.ranges import FactorRange
from bokeh.plotting import figure, show
from bokeh.embed import file_html
from bokeh.resources import CDN
output_notebook()

defaults.width = 1024
defaults.height = 1024

TOOLS = "pan,wheel_zoom,reset,save,resize,hover"

sf_byCategory_Total.sort_values(by = 'numIncidents', ascending = False, inplace=True)



p = Bar(sf_byCategory_Total, 'Category', values='numIncidents', title="Crime Incidents per Category",
        xlabel = 'Category', ylabel = 'Incidents', tools = TOOLS, color = 'blue', logo = None,
        plot_width=1024, plot_height=1024)
p.x_range = FactorRange(factors=sf_byCategory_Total['Category'].tolist())

hover = p.select(dict(type=HoverTool))
hover.tooltips=[
            ("Incidents", "@numIncidents"),
            ("Category", "@Category"),
        ]



html = file_html(p, CDN, "SFPD Incidents per Category")

file_ = open('scripts/sfincidentsbycategory.html', 'w')
file_.write(html)
file_.close()

show(p)

<bokeh.io._CommsHandle at 0x394d40db70>

In [13]:
sf_byCategory_Total[['Category', 'numIncidents']]

Unnamed: 0,Category,numIncidents
15,LARCENY/THEFT,9466
20,OTHER OFFENSES,3567
19,NON-CRIMINAL,3023
1,ASSAULT,2882
31,VEHICLE THEFT,1966
32,WARRANTS,1782
6,DRUG/NARCOTIC,1345
28,SUSPICIOUS OCC,1300
18,MISSING PERSON,1266
25,SECONDARY CODES,442
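
Raw counts are often easier to compare as shares. The sketch below uses only the ten rows printed above, so the percentages are relative to those ten categories and not to the full dataset.

```python
import pandas as pd

# Counts copied from the ten rows printed above
top10 = pd.Series({
    'LARCENY/THEFT': 9466,
    'OTHER OFFENSES': 3567,
    'NON-CRIMINAL': 3023,
    'ASSAULT': 2882,
    'VEHICLE THEFT': 1966,
    'WARRANTS': 1782,
    'DRUG/NARCOTIC': 1345,
    'SUSPICIOUS OCC': 1300,
    'MISSING PERSON': 1266,
    'SECONDARY CODES': 442,
}, name='numIncidents')

# Percentage of the listed incidents that falls into each category
share_pct = (top10 / top10.sum() * 100).round(1)
print(share_pct)
```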


## Incidents over Time

In [86]:
from bokeh.plotting import figure, show
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.resources import CDN
from bokeh.embed import autoload_static
from bokeh.embed import file_html

#TOOLS="resize,crosshair,pan,wheel_zoom,box_zoom,reset,tap,previewsave,box_select,poly_select,lasso_select,hover"
TOOLS="resize,pan,wheel_zoom,reset"



sf_source = ColumnDataSource(
        data=dict(
            x = sf_byDate_Total['Date'],
            y = sf_byDate_Total['numIncidents'],
            date = sf_byDate_Total['DateS'],
        )
    )


hover = HoverTool(
        tooltips=[
            ("Date", "@date"),
            ("#Incidents", "@y"),
        ]
    )

p = figure(plot_width=1024, plot_height=1024, x_axis_type="datetime", tools=TOOLS,
           title = 'Number of San Francisco Crime Incidents Summer 2014',
           x_axis_label = 'Date',
           y_axis_label =  '#Incidents',
           logo = None
          )

p.line('x', 'y', source = sf_source , color = 'navy', line_width=4)
p.circle('x', 'y', source = sf_source , color = 'navy', fill_color="white", size=8)

#add hover tool
p.add_tools(hover)

js, tag = autoload_static(p, CDN, "scripts/sfincidentsovertime.js")
html = file_html(p, CDN, "SFPD Incidents")

file_ = open('scripts/sfincidentsovertime.js', 'w')
file_.write(js)
file_.close()
file_ = open('scripts/sfincidentsovertime.tag', 'w')
file_.write(tag)
file_.close()
file_ = open('scripts/sfincidentsovertime.html', 'w')
file_.write(html)
file_.close()




show(p)

<bokeh.io._CommsHandle at 0x394c092d68>
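
Daily totals tend to carry a weekly rhythm, which can hide a longer trend. One common way to smooth it out is a centered 7-day rolling mean. The sketch uses made-up counts that repeat every week; on the real data the series would come from `sf_byDate_Total` indexed by its `Date` column.

```python
import pandas as pd

# Made-up daily counts with a repeating weekly pattern (two weeks)
one_week = [310, 295, 300, 305, 330, 350, 340]
dates = pd.date_range('2014-06-01', periods=14, freq='D')
daily = pd.Series(one_week * 2, index=dates, name='numIncidents')

# Centered 7-day window: each value is the mean of the day and its 3 neighbours on each side
weekly_mean = daily.rolling(window=7, center=True).mean()
print(weekly_mean)
```

Because the toy pattern repeats exactly, the smoothed series is flat; the first and last three days are `NaN` since a full window is not available there.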

In [15]:
tag

'\n<script\n    src="scripts/sfincidentsovertime.js"\n    id="8421b6bf-b1fd-4bfd-8dda-17d381877da2"\n    data-bokeh-model-id="9e8d336f-cd32-4128-bcc8-ca43a25b810f"\n    data-bokeh-doc-id="1aa558e5-34fe-4350-9a52-98313f8af1d1"\n></script>'

## By Category

In [16]:
#sf_byHourOfDay_Category = sf_byHourOfDay_CategoryGB.size().unstack()

In [17]:
#def calcPerentage(row):
#    return float(row)  

In [18]:
#sf_byHourOfDay_Category.sum(1)
#sf_byHourOfDay_Category['_TOTAL_'] = sf_byHourOfDay_Category.sum(1)
#sf_byHourOfDay_Category.apply()


In [19]:
#sf_byHourOfDay_Category.fillna(0,inplace=True)
#sf_byHourOfDay_Category = sf_byHourOfDay_Category.reset_index()
#sf_byHourOfDay_Category = sf_byHourOfDay_Category.sort_values(by = 'HourOfDay',inplace=True)

In [20]:
#sf_byHourOfDay_Category.head(4)

## By DayOfWeek

In [84]:
from bokeh.charts import Bar, show, Line, defaults
from bokeh.models import HoverTool 
from bokeh.resources import CDN
from bokeh.embed import autoload_static
from bokeh.embed import file_html

output_notebook()
defaults.width = 1024
defaults.height = 1024

import matplotlib as mpl
import matplotlib.cm as cm
import numpy as np

#colormap =cm.get_cmap("Dark2") #choose any matplotlib colormap here
#bokehpalette = [mpl.colors.rgb2hex(m) for m in colormap(np.arange(colormap.N))]

TOOLS = "pan,wheel_zoom,reset,save,resize,hover"

my_BarDataHOD = sanfrancisco_incidents_selectedcategories.loc[:,('DayOfWeek', 'IncidntNum' ,'Category')]

p = Bar(my_BarDataHOD, label = 'DayOfWeek', values='IncidntNum', stack = 'Category', agg = 'count', title="Incidents per weekday",
        xlabel = 'weekday', ylabel = 'Incidents', tools = TOOLS,
        palette=myColors, logo = None,
        plot_width=1024, plot_height=1024
        )

hover = p.select(dict(type=HoverTool))
hover.tooltips=[
            ("Weekday", "@DayOfWeek"),
            ("Category", "@Category"),
        ]

html = file_html(p, CDN, "SFPD Incidents per weekday")

file_ = open('scripts/sfincidentsoverweekday.html', 'w')
file_.write(html)
file_.close()


show(p)

<bokeh.io._CommsHandle at 0x394bdf3828>
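
Weekday names sort alphabetically by default, which places Friday first. One way to get calendar order is an ordered categorical. The rows below are made up and the Monday-first order is a choice made for this sketch.

```python
import pandas as pd

weekday_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
                 'Friday', 'Saturday', 'Sunday']

# Made-up rows for illustration
toy = pd.DataFrame({
    'DayOfWeek': ['Sunday', 'Friday', 'Monday', 'Friday', 'Sunday', 'Sunday'],
})

# An ordered categorical makes sorting follow the calendar, not the alphabet
toy['DayOfWeek'] = pd.Categorical(toy['DayOfWeek'],
                                  categories=weekday_order, ordered=True)

# Weekdays with no incidents are kept with a count of 0
per_weekday = toy['DayOfWeek'].value_counts().sort_index()
print(per_weekday)
```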

## Overall by HourOfDay

In [22]:
sf_byHourOfDay_Category = sf_byHourOfDay_CategoryGB.size().unstack()

In [23]:
#sf_byHourOfDay_Category.head()

In [24]:
#sf_byHourOfDay_Category_T = sf_byHourOfDay_Category.transpose()

In [25]:
#sf_byHourOfDay_Category_T.head()

In [26]:
sf_byHourOfDay_CategoryGB_Frame = sf_byHourOfDay_CategoryGB.size().reset_index()

In [27]:
sf_byHourOfDay_CategoryGB_Frame = sf_byHourOfDay_CategoryGB_Frame.rename(columns = { 0 : 'numIncidents'})

In [28]:
sf_byHourOfDay_CategoryGB_Frame.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 192 entries, 0 to 191
Data columns (total 3 columns):
HourOfDay       192 non-null int64
Category        192 non-null object
numIncidents    192 non-null int64
dtypes: int64(2), object(1)
memory usage: 6.0+ KB


In [29]:
#sf_byHourOfDay_CategoryGB_Frame

In [30]:
#sf_byHourOfDay_CategoryGB_Frame

In [83]:
from bokeh.charts import Bar, show, Line, defaults
from bokeh.models import HoverTool 
from bokeh.resources import CDN
from bokeh.embed import autoload_static
from bokeh.embed import file_html


output_notebook()
defaults.width = 1024
defaults.height = 1024

TOOLS = "pan,wheel_zoom,reset,save,resize,hover"

my_BarDataHOD = sanfrancisco_incidents_selectedcategories.loc[:,('HourOfDay', 'IncidntNum' ,'Category')]

p = Bar(my_BarDataHOD, label = 'HourOfDay', values='IncidntNum', stack = 'Category', agg = 'count',
        title="Incidents in hour of day",
        palette=myColors,
        xlabel = 'hour of the day', ylabel = 'Incidents', tools = TOOLS, logo = None,
        plot_width=1024, plot_height=1024)
#p.legend.location = 'top_left'
hover = p.select(dict(type=HoverTool))
hover.tooltips=[
            ("Hour", "@HourOfDay"),
            ("Category", "@Category"),
        ]

html = file_html(p, CDN, "SFPD Incidents in hour of day")

file_ = open('scripts/sfincidentsoverhourofday.html', 'w')
file_.write(html)
file_.close()

show(p)


<bokeh.io._CommsHandle at 0x394b9c5cf8>
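
The stacked bars show the overall shape, but the question of which incidents dominate the evening can also be answered numerically. The sketch below ranks categories within an evening window; the 18:00 to 23:59 window is an assumption, and the rows are made up.

```python
import pandas as pd

# Made-up rows for illustration
toy = pd.DataFrame({
    'HourOfDay': [9, 18, 19, 20, 22, 23, 23, 3],
    'Category': ['ASSAULT', 'LARCENY/THEFT', 'LARCENY/THEFT', 'ASSAULT',
                 'LARCENY/THEFT', 'ROBBERY', 'ASSAULT', 'ROBBERY'],
})

# Assumption: "evening" means hours 18 through 23
EVENING_HOURS = range(18, 24)

evening = toy[toy['HourOfDay'].isin(EVENING_HOURS)]

# Categories ranked by number of evening incidents, most frequent first
evening_ranking = evening['Category'].value_counts()
print(evening_ranking)
```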

## For either city, how do incidents vary by neighborhood? Which incidents are most common in the city center? In what areas or neighborhoods are robberies or thefts most common?

In [82]:
output_notebook()
defaults.width = 1024
defaults.height = 1024

TOOLS = "pan,wheel_zoom,reset,save,resize,hover"

my_BarDataNB = sanfrancisco_incidents_selectedcategories.loc[:,('PdDistrict', 'IncidntNum' ,'Category')]

p = Bar(my_BarDataNB, label = 'PdDistrict', values='IncidntNum', stack = 'Category', 
        agg = 'count', title="Incidents per District",
        palette=myColors,
        xlabel = 'District', ylabel = 'Incidents', tools = TOOLS, logo = None,
        plot_width=1024, plot_height=1024)
#p.legend.location = 'top_left'
hover = p.select(dict(type=HoverTool))
hover.tooltips=[
            ("District", "@PdDistrict"),
            ("Category", "@Category"),
        ]

html = file_html(p, CDN, "SFPD Incidents per district")

file_ = open('scripts/sfincidentsoverdistrict.html', 'w')
file_.write(html)
file_.close()


show(p)


<bokeh.io._CommsHandle at 0x394a4cb208>
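
Because districts differ in total volume, it can help to look at both absolute counts and the category mix within each district. A sketch with `pd.crosstab` on made-up rows (district and category names are real, the counts are not):

```python
import pandas as pd

# Made-up rows for illustration
toy = pd.DataFrame({
    'PdDistrict': ['CENTRAL'] * 4 + ['MISSION'] * 3 + ['BAYVIEW'] * 4,
    'Category': ['LARCENY/THEFT', 'LARCENY/THEFT', 'LARCENY/THEFT', 'ROBBERY',
                 'LARCENY/THEFT', 'ROBBERY', 'ROBBERY',
                 'ROBBERY', 'ROBBERY', 'ROBBERY', 'LARCENY/THEFT'],
})

# Absolute number of incidents per district and category
counts = pd.crosstab(toy['PdDistrict'], toy['Category'])

# Share of each category within a district (every row sums to 1)
row_share = pd.crosstab(toy['PdDistrict'], toy['Category'], normalize='index')

# District with the most robberies in absolute terms
top_robbery_district = counts['ROBBERY'].idxmax()
print(counts)
print(row_share.round(2))
print(top_robbery_district)
```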

## City Center

In [33]:
output_notebook()
defaults.width = 1024
defaults.height = 1024

TOOLS = "pan,wheel_zoom,reset,save,resize,hover"
my_BarDataNB = sanfrancisco_incidents_selectedcategories.loc[:,('PdDistrict', 'IncidntNum' ,'Category')]
my_BarDataNB = my_BarDataNB[my_BarDataNB['PdDistrict'] == 'CENTRAL']
p = Bar(my_BarDataNB, label = 'PdDistrict', values='IncidntNum', 
        stack = 'Category', agg = 'count', title="Incidents in CENTRAL district",
        palette=myColors,
        xlabel = 'CENTRAL District', ylabel = 'Incidents', tools = TOOLS, logo = None)



hover = p.select(dict(type=HoverTool))
hover.tooltips=[
            ("District", "@PdDistrict"),
            ("Category", "@Category"),
        ]

html = file_html(p, CDN, "SFPD Incidents in CENTRAL district")

file_ = open('scripts/sfincidentscentraldistrict.html', 'w')
file_.write(html)
file_.close()

show(p)

<bokeh.io._CommsHandle at 0x394141a4e0>

## Robbery over HourOfDay

In [34]:
sf_byHourOfDay_Category_Total

Unnamed: 0,HourOfDay,Category,numIncidents
0,0,ASSAULT,169
1,0,DRUG/NARCOTIC,56
2,0,LARCENY/THEFT,354
3,0,MISSING PERSON,39
4,0,ROBBERY,18
5,0,VEHICLE THEFT,66
6,0,WARRANTS,62
7,0,WEAPON LAWS,26
8,1,ASSAULT,115
9,1,DRUG/NARCOTIC,30


In [35]:
from bokeh.charts import Line
output_notebook()
defaults.width = 1024
defaults.height = 1024

TOOLS = "pan,wheel_zoom,reset,save,resize,hover"
my_BarDataHOD = sf_byHourOfDay_Category_Total[sf_byHourOfDay_Category_Total['Category'] == 'ROBBERY']

p = Line(my_BarDataHOD, x = 'HourOfDay', y = 'numIncidents', title="Robberies per Hour Of Day",
        xlabel = 'HourOfDay', ylabel = 'Robberies', tools = TOOLS, logo = None)
p.x_range = FactorRange(factors=my_BarDataHOD['HourOfDay'].tolist())

#hover = p.select(dict(type=HoverTool))
#hover.tooltips=[
#            ("Hour", "@HourOfDay"),
#            ("Category", "@Category"),
#        ]

html = file_html(p, CDN, "ROBBERY over hour of day")

file_ = open('scripts/sfincidentsrobberyoverhourofday.html', 'w')
file_.write(html)
file_.close()

show(p)

<bokeh.io._CommsHandle at 0x39415b8518>

## Other Style

In [36]:
my_data = sf_byCategory_HourOfDay.unstack()
my_data.loc['ROBBERY'].tolist()

[18,
 20,
 19,
 10,
 7,
 6,
 7,
 5,
 10,
 10,
 6,
 9,
 21,
 10,
 10,
 7,
 10,
 15,
 16,
 15,
 16,
 18,
 21,
 22]
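
The hourly robbery counts printed above are enough to answer the "when are robberies most common" question directly. The sketch below finds the peak hour and the share of robberies in a late-night window; the 21:00 to 02:59 window is chosen only for illustration.

```python
# Hourly robbery counts copied from the output above (index = hour of day)
robberies_by_hour = [18, 20, 19, 10, 7, 6, 7, 5, 10, 10, 6, 9,
                     21, 10, 10, 7, 10, 15, 16, 15, 16, 18, 21, 22]

total = sum(robberies_by_hour)

# Hour with the highest number of robberies
peak_hour = max(range(24), key=lambda h: robberies_by_hour[h])

# Illustrative late-night window, wrapping around midnight
late_night_hours = [21, 22, 23, 0, 1, 2]
late_night = sum(robberies_by_hour[h] for h in late_night_hours)

print(total, peak_hour, round(late_night / total, 3))  # 308 23 0.383
```

The total of 308 matches the ROBBERY count reported in the Top Categories section.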

In [37]:
from bokeh.palettes import Spectral11, OrRd9, PuBu9, Greys9, Blues9
from bokeh.plotting import figure, show, output_file
output_notebook()

my_data = sf_byCategory_HourOfDay.unstack()
my_data = my_data.loc['ROBBERY']

my_x = my_data.index.values
my_y = my_data.tolist()
legend = "ROBBERY"

p = figure(plot_width=1024, plot_height=1024, tools=TOOLS,
           title = 'Robbery over Hour Of Day, San Francisco Crime Incidents Summer 2014',
           x_axis_label = 'Time',
           y_axis_label =  '#Incidents',
           logo = None
           )

p.line(x = my_x, y = my_y, color='red', legend = legend, line_width = 5)
p.circle(x = my_x, y = my_y, color='red', size = 8, fill_color='white')

p.legend.location = 'top_left'


html = file_html(p, CDN, "Robbery over Hour Of Day")

file_ = open('scripts/sfincidentsrobberyoverhourofday.html', 'w')
file_.write(html)
file_.close()

show(p)

<bokeh.io._CommsHandle at 0x39417515c0>

In [81]:
from bokeh.palettes import Spectral11, OrRd9, PuBu9, Greys9, Blues9, Spectral11, Purples9
from bokeh.plotting import figure, show, output_file
output_notebook()

my_data = sf_byCategory_HourOfDay.unstack()

category_list = my_data.index.values
my_xs = []
my_ys = []
legend = category_list

# One y-series per category (row of the unstacked table)
for (index, row) in my_data.iterrows():
    my_ys.append(row)

# Every line shares the same numeric x values: the hours 0..23
numlines = len(category_list)
my_xs = [my_data.columns.values] * numlines

#myColors = OrRd9 + PuBu9 + Greys9 + Blues9
#myColors = Spectral11 + Purples9
mypalette = myColors[0:numlines]

#p = figure(width=1024, height=768, x_axis_type="datetime", x_range=(x_index[0], x_index[my_index_length - 1])) 
p = figure(plot_width=1024, plot_height=1024, tools=TOOLS,
           title = 'Categories over Hour Of Day, San Francisco Crime Incidents Summer 2014',
           x_axis_label = 'Time',
           y_axis_label =  '#Incidents',
           logo = None
           )


for i in range(0, numlines):
    p.line(x = my_xs[i], y = my_ys[i], color=mypalette[i], legend = legend[i], line_width = 5)
    #p.line(x = my_xs[i], y = my_ys[i], color=mypalette[i], line_width = 5, legend='top_left')
    p.circle(x = my_xs[i], y = my_ys[i], color=mypalette[i], size = 8, fill_color='white')

p.legend.location = 'top_left'

# The glyph methods store the plotted values in columns named 'x' and 'y'
hover = p.select(dict(type=HoverTool))
hover.tooltips=[
            ("Hour", "@x"),
            ("Incidents", "@y"),
        ]
    
#p.multi_line(xs = my_xs[0],
#                ys = my_ys[0],
#                line_color = mypalette[0],
#                line_width = 5)


html = file_html(p, CDN, "Categories over Hour Of Day")

file_ = open('scripts/sfincidentscategoryoverhourofday.html', 'w')
file_.write(html)
file_.close()

show(p)

SyntaxError: keyword argument repeated (<ipython-input-81-7c0f2f7d31ec>, line 31)

In [39]:
my_data = sanfrancisco_incidents_selectedcategories.loc[:,('PdDistrict', 'IncidntNum' ,'Category','X','Y')]

In [40]:
my_data.head(5)

Unnamed: 0,PdDistrict,IncidntNum,Category,X,Y
2,SOUTHERN,146177923,LARCENY/THEFT,-122.409795,37.780036
3,RICHMOND,146177531,LARCENY/THEFT,-122.485263,37.772518
5,SOUTHERN,140734349,DRUG/NARCOTIC,-122.416578,37.773907
6,SOUTHERN,140734349,DRUG/NARCOTIC,-122.416578,37.773907
10,CENTRAL,140738711,VEHICLE THEFT,-122.415822,37.787293


## Try Geo

In [97]:
from bokeh.io import show
from bokeh.models import (
  GMapPlot, GMapOptions, ColumnDataSource, Circle, DataRange1d, PanTool, WheelZoomTool, BoxSelectTool, HoverTool,
    ResizeTool, ResetTool, PreviewSaveTool
)
output_notebook()

robberies = sanfrancisco_incidents_selectedcategories[sanfrancisco_incidents_selectedcategories['Category'] == 'ROBBERY']

my_data = robberies.loc[:,('PdDistrict', 'IncidntNum' ,'Category','X','Y', 'Address', 'Date')]

#roadmap, satellite
map_options = GMapOptions(lat=37.77, lng=-122.44, map_type="roadmap", zoom=13)



p = GMapPlot(
    x_range=DataRange1d(), y_range=DataRange1d(), map_options=map_options, title="San Francisco", logo = None,
    plot_width=1024, plot_height=1024
)


x = my_data['X']
y = my_data['Y']
c = my_data['Category']
d = my_data['PdDistrict']
a = my_data['Address']
date = my_data['Date']

# Every column in a ColumnDataSource must have one entry per row,
# so a fixed-size colour palette cannot be added as a column here
source = ColumnDataSource(
    data=dict(
        lat=y,
        lon=x,
        cat=c,
        district=d,
        address = a,
        date = date,
    )
)

circle = Circle(x='lon', y='lat', size=8, fill_color='blue', fill_alpha=0.8, line_color=None)
p.add_glyph(source, circle)

p.add_tools(PanTool(), WheelZoomTool(), HoverTool(), ResizeTool(), ResetTool(), PreviewSaveTool())

hover = p.select(dict(type=HoverTool))
hover.tooltips=[
            ("Category", "@cat"),
            ("District", "@district"),
            ("Address", "@address"),
            ("Date", "@date")
        ]



html = file_html(p, CDN, "GEO")
file_ = open('scripts/geomaprobberies.html', 'w')
file_.write(html)
file_.close()

show(p)


<bokeh.io._CommsHandle at 0x395008fc88>
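
If the map should show several categories in different colours, each row needs its own colour value, because a data source expects all columns to have the same length. One way to do this is to map every row's category to a colour before building the source. In the sketch below the hex colours are placeholders, the `color` column name is an assumption, and the three coordinate pairs are taken from rows printed earlier in the notebook.

```python
import pandas as pd

category_selection = ['LARCENY/THEFT', 'ASSAULT', 'VEHICLE THEFT', 'WARRANTS',
                      'DRUG/NARCOTIC', 'MISSING PERSON', 'WEAPON LAWS', 'ROBBERY']

# Placeholder hex colours, one per category (any 8-colour palette would do)
palette = ['#d53e4f', '#f46d43', '#fdae61', '#fee08b',
           '#e6f598', '#abdda4', '#66c2a5', '#3288bd']
color_by_category = dict(zip(category_selection, palette))

# Three rows with coordinates copied from the outputs above
rows = pd.DataFrame({
    'Category': ['ROBBERY', 'ROBBERY', 'LARCENY/THEFT'],
    'X': [-122.408384, -122.426037, -122.409795],
    'Y': [37.787827, 37.759420, 37.780036],
})

# One colour per row, so this column has the same length as all the others
rows['color'] = rows['Category'].map(color_by_category)

columns = {'lon': rows['X'], 'lat': rows['Y'],
           'cat': rows['Category'], 'color': rows['color']}
lengths = {name: len(values) for name, values in columns.items()}
print(lengths)
```

A dictionary like `columns` could then be passed as the `data` of the `ColumnDataSource`, with the glyph's `fill_color` pointing at the `color` column instead of a fixed colour name.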

## Example

In [95]:
from bokeh.io import output_file, show
from bokeh.models import (
  GMapPlot, GMapOptions, ColumnDataSource, Circle, DataRange1d, PanTool, WheelZoomTool, BoxSelectTool, PreviewSaveTool
)

map_options = GMapOptions(lat=30.29, lng=-97.73, map_type="roadmap", zoom=11)

plot = GMapPlot(
    x_range=DataRange1d(), y_range=DataRange1d(), map_options=map_options, title="Austin"
)

source = ColumnDataSource(
    data=dict(
        lat=[30.29, 30.20, 30.29],
        lon=[-97.70, -97.74, -97.78],
    )
)

circle = Circle(x="lon", y="lat", size=15, fill_color="blue", fill_alpha=0.8, line_color=None)
plot.add_glyph(source, circle)

plot.add_tools(PanTool(), WheelZoomTool(), BoxSelectTool(), PreviewSaveTool())
#output_file("gmap_plot.html")
show(plot)


<bokeh.io._CommsHandle at 0x3950494160>