# Berkeley Vehicle-related Crime Heatmap
By <a href="https://ayip.io">Angus Yip</a>  
### Introduction
---
The purpose of this notebook is to find out where the safer areas for parking are in Berkeley.  
The Berkeley crime dataset found <a href="https://data.cityofberkeley.info/Public-Safety/Berkeley-PD-Calls-for-Service/k2nh-s5h5">here</a> contains a list of calls for service to the Berkeley Police Department.  
I extracted the vehicle-related calls using tools from the `pandas` library, then put the resulting DataFrame through the heatmap feature of `gmaps`.  
The resulting heatmap gives a general indication of where it's safer to park in Berkeley.  
I'm in no way, shape, or form responsible if you follow this guide for parking and still get broken into; data is powerful but it ain't Jesus.

### Imports
---

In [1]:
import pandas as pd
from sodapy import Socrata

Standard stuff.  
`sodapy` provides Python bindings for the Socrata Open Data API.  
The docs can be found <a href="https://dev.socrata.com/foundry/data.cityofberkeley.info/s24d-wsnp">here</a>.  

### Getting the dataset into a pd.DataFrame
---
The following snippet is just a minor modification of a code example in the docs.  

In [2]:
client = Socrata("data.cityofberkeley.info", None)
results = client.get("s24d-wsnp", limit=10000)

df = pd.DataFrame.from_records(results)
df.sample(10)



Unnamed: 0,blkaddr,block_location,block_location_address,block_location_city,block_location_state,caseno,city,cvdow,cvlegend,eventdt,eventtm,indbdate,offense,state
320,BANCROFT WAY & FULTON ST,"{'type': 'Point', 'coordinates': [-122.265858,...",BANCROFT WAY & FULTON ST,Berkeley,CA,17030603,Berkeley,1,BURGLARY - VEHICLE,2017-05-29T00:00:00.000,12:00,2017-11-14T03:30:12.000,BURGLARY AUTO,CA
1225,800 COLUSA AVE,"{'type': 'Point', 'coordinates': [-122.281105,...",800 COLUSA AVE,Berkeley,CA,17046923,Berkeley,4,ALL OTHER OFFENSES,2017-08-10T00:00:00.000,19:44,2017-11-14T03:30:14.000,MUNICIPAL CODE,CA
2128,2300 DWIGHT WAY,"{'type': 'Point', 'coordinates': [-122.262993,...",2300 DWIGHT WAY,Berkeley,CA,17040762,Berkeley,4,BURGLARY - RESIDENTIAL,2017-07-13T00:00:00.000,19:00,2017-11-14T03:30:13.000,BURGLARY RESIDENTIAL,CA
2498,2000 CENTER ST,"{'type': 'Point', 'coordinates': [-122.270557,...",2000 CENTER ST,Berkeley,CA,17039877,Berkeley,1,ASSAULT,2017-07-10T00:00:00.000,00:41,2017-11-14T03:30:13.000,ASSAULT/BATTERY FEL.,CA
4477,CHANNING WAY & DANA ST,"{'type': 'Point', 'coordinates': [-122.26108, ...",CHANNING WAY & DANA ST,Berkeley,CA,17034993,Berkeley,0,DISORDERLY CONDUCT,2017-06-18T00:00:00.000,01:03,2017-11-14T03:30:12.000,DISTURBANCE,CA
4456,2500 DWIGHT WAY,"{'type': 'Point', 'coordinates': [-122.258331,...",2500 DWIGHT WAY,Berkeley,CA,17061044,Berkeley,6,BURGLARY - VEHICLE,2017-10-07T00:00:00.000,18:00,2017-11-14T03:30:19.000,BURGLARY AUTO,CA
3142,2600 ASHBY AVE,"{'type': 'Point', 'coordinates': [-122.255248,...",2600 ASHBY AVE,Berkeley,CA,17063230,Berkeley,3,ASSAULT,2017-10-18T00:00:00.000,17:00,2017-11-14T03:30:19.000,ASSAULT/BATTERY MISD.,CA
4072,2400 MCGEE AVE,"{'type': 'Point', 'coordinates': [-122.276835,...",2400 MCGEE AVE,Berkeley,CA,17091521,Berkeley,3,BURGLARY - VEHICLE,2017-08-02T00:00:00.000,19:45,2017-11-14T03:30:14.000,BURGLARY AUTO,CA
5204,1900 ALLSTON WAY,"{'type': 'Point', 'coordinates': [-122.272701,...",1900 ALLSTON WAY,Berkeley,CA,17051310,Berkeley,2,DRUG VIOLATION,2017-08-29T00:00:00.000,12:00,2017-11-14T03:30:16.000,NARCOTICS,CA
3529,1400 GRANT ST,"{'type': 'Point', 'coordinates': [-122.27627, ...",1400 GRANT ST,Berkeley,CA,17091993,Berkeley,2,LARCENY,2017-10-03T00:00:00.000,20:00,2017-11-14T03:30:19.000,THEFT MISD. (UNDER $950),CA


It's not the cleanest-looking df, and for the heatmap in `gmaps` to work we need a df that has only `['latitude', 'longitude']` as columns.  

In [3]:
"Rows: %d" % len(df.index) # let's also find out how many rows there are

'Rows: 5644'

### Parsing
---
Here we can see that the `cvlegend` column contains a brief description of the crime.  
Upon further inspection, we find that every crime that could happen to our parked car conveniently contains the string `"VEHICLE"`.  

In [4]:
df[['block_location', 'cvlegend']][df['cvlegend'].str.contains("VEHICLE")].head()

Unnamed: 0,block_location,cvlegend
1,,MOTOR VEHICLE THEFT
4,,MOTOR VEHICLE THEFT
6,"{'type': 'Point', 'coordinates': [-122.269974,...",BURGLARY - VEHICLE
11,,BURGLARY - VEHICLE
15,,BURGLARY - VEHICLE
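
The filter above depends on `str.contains` returning a clean boolean mask. Here's a minimal sketch on a made-up toy frame (the rows are illustrative, not pulled from the dataset) showing how `na=False` treats a missing `cvlegend` as a non-match, and how `.loc` picks rows and columns in one step. The names `toy`, `is_vehicle`, and `vehicle_calls` are mine, not part of the notebook.

```python
import pandas as pd

# Toy data for illustration only; these rows are not taken from the Berkeley dataset.
toy = pd.DataFrame({
    "cvlegend": ["BURGLARY - VEHICLE", "ASSAULT", None, "MOTOR VEHICLE THEFT"],
    "block_location": [{"type": "Point"}, {"type": "Point"}, None, None],
})

# na=False makes rows with a missing description count as "no match"
# instead of leaving NaN in the mask.
is_vehicle = toy["cvlegend"].str.contains("VEHICLE", na=False)

# .loc selects the matching rows and the columns of interest together.
vehicle_calls = toy.loc[is_vehicle, ["block_location", "cvlegend"]]
print(vehicle_calls)
```

Whether the real `cvlegend` column ever has missing values is something I haven't checked, so treat `na=False` as a precaution rather than a required fix.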


Now we keep only the `block_location` column and drop the `NaN` rows.

In [5]:
df = df[['block_location']][df['cvlegend'].str.contains("VEHICLE")].dropna()
df.head()

Unnamed: 0,block_location
6,"{'type': 'Point', 'coordinates': [-122.269974,..."
27,"{'type': 'Point', 'coordinates': [-122.297995,..."
38,"{'type': 'Point', 'coordinates': [-122.269762,..."
40,"{'type': 'Point', 'coordinates': [-122.251978,..."
43,"{'type': 'Point', 'coordinates': [-122.250875,..."


Almost there. Now we just need to convert the `dict` inside every row into two separate columns.  
The coordinates are nested two levels deep, so we need to `.apply(pd.Series)` twice.  

In [6]:
df = df['block_location'].apply(pd.Series)['coordinates'].apply(pd.Series)
df.head()

Unnamed: 0,0,1
6,-122.269974,37.881525
27,-122.297995,37.879708
38,-122.269762,37.866426
40,-122.251978,37.883202
43,-122.250875,37.888241
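
Calling `.apply(pd.Series)` twice works, but it tends to get slow on bigger frames. One possible alternative, sketched here on two toy points shaped like the `block_location` values above, is to pull the coordinate pairs out with a list comprehension and build the DataFrame in one go. The `points` and `coords` names and the column labels are my own choices for illustration.

```python
import pandas as pd

# Toy GeoJSON-style points shaped like the `block_location` values shown above.
points = pd.Series(
    [
        {"type": "Point", "coordinates": [-122.269974, 37.881525]},
        {"type": "Point", "coordinates": [-122.297995, 37.879708]},
    ],
    index=[6, 27],
)

# Each point stores [longitude, latitude]; collect the pairs and
# build the frame directly, keeping the original row index.
coords = pd.DataFrame(
    [p["coordinates"] for p in points],
    columns=["longitude", "latitude"],
    index=points.index,
)
print(coords)
```

Naming the columns up front also makes it harder to forget which one is which later on.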


`gmaps` expects (latitude, longitude) pairs, so it isn't gonna be happy with long lat.

In [7]:
df = df[[1, 0]] # swap the columns: (long, lat) -> (lat, long)

Perfection.
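
If you'd rather end up with named columns, here's a small sketch of the same swap plus a rename and a range check. The toy frame below just mimics the shape of the previous cell's output, and `coords` and `locations` are hypothetical names of mine.

```python
import pandas as pd

# Toy frame mimicking the previous output: column 0 is longitude, column 1 is latitude.
coords = pd.DataFrame(
    {0: [-122.269974, -122.297995], 1: [37.881525, 37.879708]},
    index=[6, 27],
)

# Reorder to (latitude, longitude) and give the columns readable names.
locations = coords[[1, 0]].rename(columns={1: "latitude", 0: "longitude"})

# Range check: a latitude must lie in [-90, 90], so an accidental swap
# (with Berkeley's longitude near -122) would fail loudly here.
assert locations["latitude"].between(-90, 90).all()
assert locations["longitude"].between(-180, 180).all()
print(locations)
```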

### One last import
---
Here are the `gmaps` <a href="https://github.com/pbugnion/gmaps#installing-gmaps-with-pip">installation guide</a> and <a href="http://jupyter-gmaps.readthedocs.io/en/latest/tutorial.html">docs</a>.  
You'll need a <a href="https://developers.google.com/maps/documentation/javascript/">Google Maps JavaScript API</a> key for this to work.  

In [9]:
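import gmaps

# API_KEY must hold your Google Maps API key; it is assumed to be defined in an earlier cell not shown here.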
gmaps.configure(api_key=API_KEY)
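
One way to get `API_KEY` without pasting the key into the notebook is to read it from an environment variable. This is only a sketch: the `load_api_key` helper and the `GOOGLE_MAPS_API_KEY` variable name are my own assumptions, not something `gmaps` requires.

```python
import os


def load_api_key(var_name="GOOGLE_MAPS_API_KEY"):
    """Return the key stored in the given environment variable, or raise a clear error.

    Both this helper and the default variable name are hypothetical.
    """
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


# Usage (left commented out so the sketch runs without a real key):
# API_KEY = load_api_key()
# gmaps.configure(api_key=API_KEY)
```

This keeps the key out of the repository, which matters if the notebook ends up on GitHub.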

Then we set the center and zoom level of our `gmaps.figure`.

In [10]:
UCB_LAT_LONG = (37.8719, -122.2585)
fig = gmaps.figure(center=UCB_LAT_LONG, zoom_level=14)

Then we put our parsed DataFrame into a `gmaps.heatmap_layer` and turn it on.  

In [11]:
heatmap_layer = gmaps.heatmap_layer(df)
fig.add_layer(heatmap_layer)

Final aesthetic stuff and alas...

In [12]:
heatmap_layer.gradient = [
    (255,255,0,0),
    'red',
    'red'
]
heatmap_layer.point_radius = 30
fig

### Uh oh GitHub can't display a gmaps widget yet :(
---
It should work fine, though, if you clone this repository, follow the installation guide for `gmaps`, and run the cells again.  
Here's an attached image to show what our heatmap would look like.

<img src="./images/ucb_parking_map.png" />

Thanks for reading and good luck fam