# Berkeley Vehicle-related Crime Heatmap
By <a href="https://ayip.io">Angus Yip</a>  


### TL;DR: avoid parking in the red areas
---
<img src="./images/ucb_parking_heatmap.png" />
GitHub can't display a `gmaps` figure yet, so here's a screenshot of the end result.  

### Introduction
---
The purpose of this notebook is to find the safer areas to park in Berkeley.  
The Berkeley crime dataset found <a href="https://data.cityofberkeley.info/Public-Safety/Berkeley-PD-Calls-for-Service/k2nh-s5h5">here</a> contains a list of calls for service to the Berkeley Police Department.  
I extracted calls related to vehicles using tools from the `pandas` library, then put the resulting DataFrame through the `gmaps` heatmap feature.  
The resulting heatmap indicates where it's generally safer to park in Berkeley.  
I'm in no way, shape, or form responsible if you follow this guide for parking and still get broken into; data is powerful but it ain't Jesus.  

### Imports
---

In [1]:
import pandas as pd
from sodapy import Socrata

Standard stuff.  
`sodapy` provides Python bindings for the Socrata Open Data API.  
The docs can be found <a href="https://dev.socrata.com/foundry/data.cityofberkeley.info/s24d-wsnp">here</a>.  

### Getting the dataset into a pd.DataFrame
---
The following snippet is just a minor modification of a code example in the docs.  

In [2]:
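# None means no app token: requests are unauthenticated and may be throttled
client = Socrata("data.cityofberkeley.info", None)
# limit caps the number of rows returned
results = client.get("s24d-wsnp", limit=10000)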

df = pd.DataFrame.from_records(results)
df.tail(10)



| | blkaddr | block_location | block_location_address | block_location_city | block_location_state | caseno | city | cvdow | cvlegend | eventdt | eventtm | indbdate | offense | state |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5634 | 2300 TELEGRAPH AVE | {'type': 'Point', 'coordinates': [-122.259189,... | 2300 TELEGRAPH AVE | Berkeley | CA | 17041740 | Berkeley | 2 | LARCENY | 2017-05-30T00:00:00.000 | 16:18 | 2017-11-14T03:30:14.000 | THEFT MISD. (UNDER $950) | CA |
| 5635 | 1400 PARKER ST | {'type': 'Point', 'coordinates': [-122.282847,... | 1400 PARKER ST | Berkeley | CA | 17090975 | Berkeley | 6 | LARCENY | 2017-05-20T00:00:00.000 | 00:05 | 2017-11-14T03:30:12.000 | THEFT FELONY (OVER $950) | CA |
| 5636 | 2500 TELEGRAPH AVE | {'type': 'Point', 'coordinates': [-122.258767,... | 2500 TELEGRAPH AVE | Berkeley | CA | 17039900 | Berkeley | 1 | LARCENY | 2017-07-10T00:00:00.000 | 07:25 | 2017-11-14T03:30:13.000 | THEFT MISD. (UNDER $950) | CA |
| 5637 | TELEGRAPH AVENUE & CARLETON ST | {'type': 'Point', 'coordinates': [-122.258784,... | TELEGRAPH AVENUE & CARLETON ST | Berkeley | CA | 17066551 | Berkeley | 3 | LARCENY | 2017-11-01T00:00:00.000 | 17:49 | 2017-11-14T03:30:19.000 | THEFT MISD. (UNDER $950) | CA |
| 5638 | KITTREDGE STREET & SHATTUCK AVE | {'type': 'Point', 'coordinates': [-122.26795, ... | KITTREDGE STREET & SHATTUCK AVE | Berkeley | CA | 17051954 | Berkeley | 4 | BURGLARY - VEHICLE | 2017-08-31T00:00:00.000 | 21:46 | 2017-11-14T03:30:17.000 | BURGLARY AUTO | CA |
| 5639 | 2300 FOURTH ST | {'type': 'Point', 'coordinates': [-122.298368,... | 2300 FOURTH ST | Berkeley | CA | 17091479 | Berkeley | 0 | BURGLARY - VEHICLE | 2017-07-30T00:00:00.000 | 18:00 | 2017-11-14T03:30:14.000 | BURGLARY AUTO | CA |
| 5640 | 2400 DURANT AVE | {'type': 'Point', 'coordinates': [-122.26127, ... | 2400 DURANT AVE | Berkeley | CA | 17028183 | Berkeley | 4 | BURGLARY - VEHICLE | 2017-05-18T00:00:00.000 | 15:00 | 2017-11-14T03:30:12.000 | BURGLARY AUTO | CA |
| 5641 | 2000 ALLSTON WAY | {'type': 'Point', 'coordinates': [-122.270137,... | 2000 ALLSTON WAY | Berkeley | CA | 17056730 | Berkeley | 3 | VANDALISM | 2017-09-20T00:00:00.000 | 08:00 | 2017-11-14T03:30:18.000 | VANDALISM | CA |
| 5642 | 400 SPRUCE ST | {'type': 'Point', 'coordinates': [-122.270263,... | 400 SPRUCE ST | Berkeley | CA | 17092098 | Berkeley | 3 | VANDALISM | 2017-10-18T00:00:00.000 | 17:30 | 2017-11-14T03:30:19.000 | VANDALISM | CA |
| 5643 | 2000 PARKER ST | {'type': 'Point', 'coordinates': [-122.269644,... | 2000 PARKER ST | Berkeley | CA | 17057001 | Berkeley | 3 | BURGLARY - VEHICLE | 2017-09-20T00:00:00.000 | 21:00 | 2017-11-14T03:30:18.000 | BURGLARY AUTO | CA |


It's not the cleanest looking df, and we need a df that only has `['latitude', 'longitude']` as columns for the heatmap in `gmaps` to work.  
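
For anyone without network access, here is a rough sketch of what happens in the cell above: `client.get` hands back a plain list of dicts, and `pd.DataFrame.from_records` turns it into rows and columns. The field names match the output shown above, but the two records and their values are made up for illustration.

```python
import pandas as pd

# Toy records in the list-of-dicts shape that `client.get` returns.
# Field names match the table above; the values are made up.
records = [
    {
        "cvlegend": "BURGLARY - VEHICLE",
        "offense": "BURGLARY AUTO",
        "block_location": {"type": "Point", "coordinates": [-122.27, 37.87]},
    },
    {
        "cvlegend": "VANDALISM",
        "offense": "VANDALISM",
        # no block_location key here: pandas fills the gap with NaN
    },
]

toy_df = pd.DataFrame.from_records(records)
print(toy_df.shape)                     # (2, 3)
print(toy_df["block_location"].isna())  # False for the first row, True for the second
```

The missing key in the second record is why a `dropna()` step is worth having before unpacking the coordinates.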

### Parsing
---
Here we can see that the `cvlegend` column contains a brief description of the crime.  
Upon further inspection, we find that each crime that could happen to our parked car conveniently contains the string `"VEHICLE"`.  

In [3]:
df[['block_location', 'cvlegend']][df['cvlegend'].str.contains("VEHICLE")].tail()

| | block_location | cvlegend |
|---|---|---|
| 5631 | {'type': 'Point', 'coordinates': [-122.298043,... | BURGLARY - VEHICLE |
| 5638 | {'type': 'Point', 'coordinates': [-122.26795, ... | BURGLARY - VEHICLE |
| 5639 | {'type': 'Point', 'coordinates': [-122.298368,... | BURGLARY - VEHICLE |
| 5640 | {'type': 'Point', 'coordinates': [-122.26127, ... | BURGLARY - VEHICLE |
| 5643 | {'type': 'Point', 'coordinates': [-122.269644,... | BURGLARY - VEHICLE |
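
A small sketch of the same filter on toy data, in case `cvlegend` ever comes back with missing values. The labels below are made up. Passing `na=False` makes `str.contains` treat a missing legend as "no match", so the mask stays purely boolean; `regex=False` just says we want a literal substring.

```python
import pandas as pd

# Toy legends (made up), including one missing value.
toy = pd.DataFrame(
    {"cvlegend": ["BURGLARY - VEHICLE", "VANDALISM", None, "THEFT FROM VEHICLE"]}
)

# na=False: rows with a missing legend count as non-matches.
# regex=False: match "VEHICLE" as a literal substring.
mask = toy["cvlegend"].str.contains("VEHICLE", na=False, regex=False)
vehicle_rows = toy[mask]

print(vehicle_rows.index.tolist())  # [0, 3]
```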


Now we keep only the `block_location` column of those rows and drop the `NaN` ones.

In [4]:
df = df[['block_location']][df['cvlegend'].str.contains("VEHICLE")].dropna()

Almost there. Now we just need to convert the `dict` inside of every row into two separate columns.  
The coordinates are nested two levels deep, so we need to `.apply(pd.Series)` twice.  

In [5]:
df = df['block_location'].apply(pd.Series)['coordinates'].apply(pd.Series)
df.head()

| | 0 | 1 |
|---|---|---|
| 6 | -122.269974 | 37.881525 |
| 27 | -122.297995 | 37.879708 |
| 38 | -122.269762 | 37.866426 |
| 40 | -122.251978 | 37.883202 |
| 43 | -122.250875 | 37.888241 |
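
As an aside, the double `.apply(pd.Series)` can be slow on big frames. One possible alternative, sketched here on two rows rebuilt from the output above, is to pull the `coordinates` list out of each dict and build the frame in one go. This is not what the notebook does, and the names `toy`, `coords`, `longitude`, and `latitude` are my own choices.

```python
import pandas as pd

# Two rows rebuilt from the output above, in the original nested shape.
toy = pd.DataFrame(
    {
        "block_location": [
            {"type": "Point", "coordinates": [-122.269974, 37.881525]},
            {"type": "Point", "coordinates": [-122.297995, 37.879708]},
        ]
    },
    index=[6, 27],
)

# Pull out each [longitude, latitude] pair and build the frame directly,
# keeping the original index.
coords = pd.DataFrame(
    [loc["coordinates"] for loc in toy["block_location"]],
    index=toy.index,
    columns=["longitude", "latitude"],
)
print(coords)
```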


`gmaps` isn't gonna be happy with long lat; it wants lat long.

In [6]:
df = df[[1, 0]] # long lat to lat long

Perfection.
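
If the bare `1` and `0` column labels bother you, here is a sketch of the same swap with named columns plus a cheap sanity check. The toy values come from the output above; the `locations` name and the check itself are my additions, not part of the original notebook.

```python
import pandas as pd

# Two (long, lat) rows from the output above, with the default integer labels.
coords = pd.DataFrame(
    {0: [-122.269974, -122.297995], 1: [37.881525, 37.879708]},
    index=[6, 27],
)

# Swap to (lat, long) and give the columns readable names.
locations = coords[[1, 0]].rename(columns={1: "latitude", 0: "longitude"})

# Valid latitudes lie in [-90, 90]. Berkeley's longitudes (about -122) do not,
# so this fails loudly if the columns were left in the wrong order.
assert locations["latitude"].between(-90, 90).all()
print(list(locations.columns))  # ['latitude', 'longitude']
```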

### One last import
---
Here are the `gmaps` <a href="https://github.com/pbugnion/gmaps#installing-gmaps-with-pip">installation guide</a> and <a href="http://jupyter-gmaps.readthedocs.io/en/latest/tutorial.html">docs</a>.  
You'll need a <a href="https://developers.google.com/maps/documentation/javascript/">Google Maps Web JavaScript API</a> key for this to work.  

In [8]:
import gmaps

gmaps.configure(api_key=API_KEY)  # API_KEY is your own Google Maps JavaScript API key
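
One way to get `API_KEY` without pasting the key into the notebook is to read it from an environment variable. This is only a sketch: the helper `load_api_key` and the variable name `GOOGLE_MAPS_API_KEY` are hypothetical, so pick whatever name you like.

```python
import os


def load_api_key(env_var="GOOGLE_MAPS_API_KEY"):
    """Return the API key stored in the given environment variable.

    Both this helper and the default variable name are hypothetical.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Usage in the notebook (after exporting the variable in your shell):
# API_KEY = load_api_key()
```

Keeping the key out of the notebook also keeps it out of the repository when you push.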

Then we set the center and zoom level of our `gmaps.figure`.

In [9]:
UCB_LAT_LONG = (37.8719, -122.2585)
fig = gmaps.figure(center=UCB_LAT_LONG, zoom_level=14)

Then we put our parsed DataFrame into a `gmaps.heatmap_layer` and turn it on.  

In [10]:
heatmap_layer = gmaps.heatmap_layer(df)
fig.add_layer(heatmap_layer)

Final aesthetic stuff and voilà...

In [11]:
heatmap_layer.gradient = [
    (255,255,0,0),  # RGBA with alpha 0: fully transparent where density is lowest
    'red',
    'red'
]
heatmap_layer.point_radius = 30
fig

### Uh oh GitHub can't display a gmaps widget yet :(
---
It should be fine though if you clone this repository, follow the installation guide for `gmaps`, and run the cells again.  
The screenshot at the top is the output of the last cell minus the interactivity that a `gmaps` JavaScript figure provides.  
Thanks for reading and good luck :)