# Berkeley Vehicle-related Crime Heatmap
By <a href="https://ayip.io">Angus Yip</a>  
### Introduction
---
The purpose of this notebook is to find out where the safer areas for parking are in Berkeley.  
The Berkeley crime dataset found <a href="https://data.cityofberkeley.info/Public-Safety/Berkeley-PD-Calls-for-Service/k2nh-s5h5">here</a> contains a list of calls for service to the Berkeley Police Department.  
I extracted calls related to vehicles using tools from the `pandas` library, then put the resulting DataFrame through the heatmap feature of `gmaps`.  
The resulting heatmap indicates, in general, where it's safer to park in Berkeley.  
I'm in no way, shape, or form responsible if you follow this guide for parking and still get broken into; data is powerful but it ain't Jesus.

### Imports
---

In [2]:
import pandas as pd
from sodapy import Socrata

Standard stuff.  
`sodapy` provides Python bindings for the Socrata Open Data API.  
The docs can be found <a href="https://dev.socrata.com/foundry/data.cityofberkeley.info/s24d-wsnp">here</a>.  

### Getting the dataset into a pd.DataFrame
---
The following snippet is a minor modification of a code example in the docs.  

In [3]:
client = Socrata("data.cityofberkeley.info", None)  # None = no app token (unauthenticated access)
results = client.get("s24d-wsnp")  # returns a list of dicts, one per call for service

df = pd.DataFrame.from_records(results)
df.sample(10)



| | :@computed_region_3ini_iehf | :@computed_region_5bih_7r3y | :@computed_region_5s6d_2f32 | :@computed_region_x3q3_gi3e | blkaddr | block_location | block_location_address | block_location_city | block_location_state | caseno | city | cvdow | cvlegend | eventdt | eventtm | indbdate | offense | state |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 976 | 13.0 | 19.0 | 1496.0 | 6.0 | COLLEGE AVENUE & PARKER ST | {'type': 'Point', 'coordinates': [-122.253693,... | COLLEGE AVENUE & PARKER ST | Berkeley | CA | 17062574 | Berkeley | 0 | LARCENY | 2017-10-15T00:00:00.000 | 21:00 | 2017-11-13T03:30:13.000 | THEFT MISD. (UNDER $950) | CA |
| 392 | 6.0 | 11.0 | 1502.0 | 5.0 | 200 MARINA BLVD | {'type': 'Point', 'coordinates': [-122.313656,... | 200 MARINA BLVD | Berkeley | CA | 17053284 | Berkeley | 2 | BURGLARY - VEHICLE | 2017-09-05T00:00:00.000 | 10:50 | 2017-11-13T03:30:19.000 | BURGLARY AUTO | CA |
| 639 | 5.0 | 11.0 | 1502.0 | 8.0 | 800 BANCROFT WAY | {'type': 'Point', 'coordinates': [-122.296893,... | 800 BANCROFT WAY | Berkeley | CA | 17040961 | Berkeley | 5 | ASSAULT | 2017-07-14T00:00:00.000 | 19:49 | 2017-11-13T03:30:14.000 | ASSAULT/BATTERY FEL. | CA |
| 278 | NaN | NaN | NaN | NaN | 2488 MARKET AVENUE | NaN | 2488 MARKET AVENUE | Berkeley | CA | 17031217 | Berkeley | 4 | DRUG VIOLATION | 2017-06-01T00:00:00.000 | 09:43 | 2017-11-13T03:30:14.000 | NARCOTICS | CA |
| 39 | 2.0 | 13.0 | 1495.0 | 7.0 | 2900 OTIS ST | {'type': 'Point', 'coordinates': [-122.270198,... | 2900 OTIS ST | Berkeley | CA | 17091854 | Berkeley | 6 | VANDALISM | 2017-09-16T00:00:00.000 | 09:00 | 2017-11-13T03:30:20.000 | VANDALISM | CA |
| 549 | NaN | NaN | NaN | NaN | 800 BOLIVAR DR | NaN | 800 BOLIVAR DR | Berkeley | CA | 17040770 | Berkeley | 4 | FRAUD | 2017-07-13T00:00:00.000 | 23:50 | 2017-11-13T03:30:14.000 | IDENTITY THEFT | CA |
| 669 | 8.0 | 6.0 | 1495.0 | 5.0 | 1800 MCGEE AVE | {'type': 'Point', 'coordinates': [-122.277834,... | 1800 MCGEE AVE | Berkeley | CA | 17044514 | Berkeley | 1 | BURGLARY - RESIDENTIAL | 2017-07-31T00:00:00.000 | 13:15 | 2017-11-13T03:30:15.000 | BURGLARY RESIDENTIAL | CA |
| 107 | 14.0 | 32.0 | 1496.0 | 1.0 | 2100 CHANNING WAY | {'type': 'Point', 'coordinates': [-122.267244,... | 2100 CHANNING WAY | Berkeley | CA | 17059824 | Berkeley | 2 | MOTOR VEHICLE THEFT | 2017-09-19T00:00:00.000 | 12:00 | 2017-11-13T03:30:20.000 | VEHICLE STOLEN | CA |
| 529 | 2.0 | 13.0 | 1495.0 | 7.0 | ADELINE STREET & OREGON ST | {'type': 'Point', 'coordinates': [-122.268263,... | ADELINE STREET & OREGON ST | Berkeley | CA | 17066599 | Berkeley | 3 | LARCENY | 2017-11-01T00:00:00.000 | 23:00 | 2017-11-13T03:30:13.000 | THEFT MISD. (UNDER $950) | CA |
| 523 | 12.0 | 25.0 | 1497.0 | 6.0 | 3000 TELEGRAPH AVE | {'type': 'Point', 'coordinates': [-122.259841,... | 3000 TELEGRAPH AVE | Berkeley | CA | 17061282 | Berkeley | 2 | BURGLARY - COMMERCIAL | 2017-10-10T00:00:00.000 | 04:49 | 2017-11-13T03:30:13.000 | BURGLARY COMMERCIAL | CA |


It's not the cleanest looking df, and we need a df that only has `['latitude', 'longitude']` as columns for the heatmap in `gmaps` to work.  
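
One thing worth checking in the fetch step: as far as I know, the Socrata API returns at most 1,000 rows per request unless a `limit` is passed, so the DataFrame above may only be a slice of the dataset. Below is a rough sketch of a variant, not what this notebook actually ran. The helper name `fetch_vehicle_calls`, the `limit` value, and the SoQL `where` filter are all hypothetical, and it assumes the `cvlegend` field can be filtered server-side.

```python
def fetch_vehicle_calls(client, dataset_id="s24d-wsnp", limit=5000):
    """Ask Socrata for vehicle-related rows only (hypothetical helper).

    `client` is expected to be a sodapy.Socrata instance; `where` and
    `limit` are passed through as SoQL query parameters.
    """
    return client.get(
        dataset_id,
        where="cvlegend like '%VEHICLE%'",  # server-side substring match (assumed to work on this field)
        limit=limit,                        # ask for more rows than the default page size
    )

# Usage, assuming the `client` from the cell above:
# results = fetch_vehicle_calls(client)
# df = pd.DataFrame.from_records(results)
```

Filtering on the server would also make the `str.contains` step further down redundant, so treat this as an alternative rather than an addition.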

### Parsing
---
Here we can see that the `cvlegend` column contains a brief description of the crime.  
Upon further inspection, we find that each crime that could happen to our parked car conveniently contains the string `"VEHICLE"`.  
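
Here is a small sketch of that substring filter on a toy DataFrame (the rows are made up for illustration). It assumes some rows might have no `cvlegend` at all, in which case `str.contains` would put `NaN` into the mask; passing `na=False` treats those rows as non-matches. Using `.loc` is just one way to pick rows and columns in a single step.

```python
import pandas as pd

# Toy stand-in for the real DataFrame; these rows are made up for illustration.
toy = pd.DataFrame({
    "cvlegend": ["MOTOR VEHICLE THEFT", "LARCENY", None, "BURGLARY - VEHICLE"],
    "block_location": [{"type": "Point"}, {"type": "Point"}, None, None],
})

# na=False treats a missing legend as "no match" instead of propagating NaN
# into the boolean mask.
is_vehicle = toy["cvlegend"].str.contains("VEHICLE", na=False)

# .loc selects the matching rows and the wanted columns in one step.
vehicle_calls = toy.loc[is_vehicle, ["block_location", "cvlegend"]]
print(vehicle_calls["cvlegend"].tolist())
# ['MOTOR VEHICLE THEFT', 'BURGLARY - VEHICLE']
```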

In [4]:
df[['block_location', 'cvlegend']][df['cvlegend'].str.contains("VEHICLE")].head()

| | block_location | cvlegend |
|---|---|---|
| 10 | {'type': 'Point', 'coordinates': [-122.271395,... | MOTOR VEHICLE THEFT |
| 15 | {'type': 'Point', 'coordinates': [-122.266137,... | MOTOR VEHICLE THEFT |
| 18 | NaN | BURGLARY - VEHICLE |
| 21 | {'type': 'Point', 'coordinates': [-122.270493,... | MOTOR VEHICLE THEFT |
| 22 | {'type': 'Point', 'coordinates': [-122.289035,... | BURGLARY - VEHICLE |


Now we keep only the `block_location` column and drop the `NaN` rows.

In [5]:
df = df[['block_location']][df['cvlegend'].str.contains("VEHICLE")].dropna()
df.head()

| | block_location |
|---|---|
| 10 | {'type': 'Point', 'coordinates': [-122.271395,... |
| 15 | {'type': 'Point', 'coordinates': [-122.266137,... |
| 21 | {'type': 'Point', 'coordinates': [-122.270493,... |
| 22 | {'type': 'Point', 'coordinates': [-122.289035,... |
| 23 | {'type': 'Point', 'coordinates': [-122.296893,... |


Almost there. Now we just need to convert the `dict` inside every row into two separate columns.  
The coordinates are nested two levels deep, so we need to `.apply(pd.Series)` twice.  

In [6]:
df = df['block_location'].apply(pd.Series)['coordinates'].apply(pd.Series)
df.head()

| | 0 | 1 |
|---|---|---|
| 10 | -122.271395 | 37.877951 |
| 15 | -122.266137 | 37.88248 |
| 21 | -122.270493 | 37.895099 |
| 22 | -122.289035 | 37.85922 |
| 23 | -122.296893 | 37.879996 |


`gmaps` expects (latitude, longitude) pairs, so it isn't gonna be happy with long lat.

In [7]:
df = df[[1, 0]]  # reorder columns: (long, lat) -> (lat, long)

Perfection.
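
For comparison, the last two steps (unnesting the `dict` and swapping to lat long) can also be sketched in plain Python. This is only an alternative illustration, not what the notebook ran: the helper name `to_lat_long` is hypothetical, and the two sample points are copied from the outputs above. As far as I can tell from the `gmaps` docs, the heatmap layer also accepts a list of (latitude, longitude) tuples, but treat that as an assumption to verify.

```python
def to_lat_long(points):
    """Turn GeoJSON-style Point dicts into (latitude, longitude) tuples."""
    pairs = []
    for point in points:
        if not isinstance(point, dict):  # skip missing locations (None, NaN, ...)
            continue
        longitude, latitude = point["coordinates"][:2]  # stored as [long, lat]
        pairs.append((latitude, longitude))
    return pairs

sample = [
    {"type": "Point", "coordinates": [-122.271395, 37.877951]},
    None,
    {"type": "Point", "coordinates": [-122.289035, 37.85922]},
]
print(to_lat_long(sample))
# [(37.877951, -122.271395), (37.85922, -122.289035)]
```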

### One last import
---
The `gmaps` docs for Jupyter can be found <a href="http://jupyter-gmaps.readthedocs.io/en/latest/tutorial.html">here</a>.  

In [8]:
import gmaps

# API_KEY must hold your own Google Maps API key; it is not defined in this notebook.
gmaps.configure(api_key=API_KEY)
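
Since `API_KEY` has to come from somewhere, here is one possible way to supply it without pasting the key into the notebook. It's a minimal sketch: the helper `load_api_key` and the environment variable name `GMAPS_API_KEY` are both hypothetical, so use whatever name you like.

```python
import os

def load_api_key(env_var="GMAPS_API_KEY"):
    """Read the key from an environment variable (the name is hypothetical)."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key

# Usage, in place of a hard-coded value:
# API_KEY = load_api_key()
# gmaps.configure(api_key=API_KEY)
```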

Then we set the center and zoom level of our `gmaps.figure`.

In [9]:
UCB_LAT_LONG = (37.8719, -122.2585)
fig = gmaps.figure(center=UCB_LAT_LONG, zoom_level=14)

Then we put our parsed DataFrame into a `gmaps.heatmap_layer` and turn it on.  

In [10]:
heatmap_layer = gmaps.heatmap_layer(df)
fig.add_layer(heatmap_layer)

Final aesthetic stuff and voilà...

In [11]:
heatmap_layer.gradient = [
    (255, 255, 0, 0),  # RGBA: fully transparent yellow at the low end
    'red',
    'red'
]
heatmap_layer.point_radius = 30
fig

Good luck!