Import dataframe and do minor preprocessing

In [1]:
import folium
from folium.plugins import HeatMap
import requests
import pandas as pd
import re
from math import exp
import branca.colormap
from collections import defaultdict

arrest_table = pd.read_csv("https://cmsc320.github.io/files/BPD_Arrests.csv")

# drop rows without coordinates; .copy() lets us add columns without a SettingWithCopyWarning
arrest_table = arrest_table[pd.notnull(arrest_table["Location 1"])].copy()

# "Location 1" looks like "(39.2814026274, -76.6483635135)"; split it into two columns
lat_long = arrest_table["Location 1"].str.split(",", expand=True)
arrest_table["lat"] = lat_long[0].str.replace("(", "", regex=False).astype(float)
arrest_table["long"] = lat_long[1].str.replace(")", "", regex=False).astype(float)

arrest_table


Unnamed: 0,arrest,age,race,sex,arrestDate,arrestTime,arrestLocation,incidentOffense,incidentLocation,charge,chargeDescription,district,post,neighborhood,Location 1,lat,long
1,11127013.0,37,B,M,01/01/2011,00:01:00,2000 Wilkens Ave,79-Other,Wilkens Av & S Payson St,1 1425,Reckless Endangerment || Hand Gun Violation,SOUTHERN,934.0,Carrollton Ridge,"(39.2814026274, -76.6483635135)",39.281403,-76.648364
2,11126887.0,46,B,M,01/01/2011,00:01:00,2800 Mayfield Ave,Unknown Offense,,,Unknown Charge,NORTHEASTERN,415.0,Belair-Edison,"(39.3227699160, -76.5735750473)",39.322770,-76.573575
3,11126873.0,50,B,M,01/01/2011,00:04:00,2100 Ashburton St,79-Other,2100 Ashburton St,1 1106,Reg Firearm:Illegal Possession || Hgv,WESTERN,735.0,Panway/Braddish Avenue,"(39.3117196723, -76.6623546313)",39.311720,-76.662355
4,11126968.0,33,B,M,01/01/2011,00:05:00,4000 Wilsby Ave,Unknown Offense,1700 Aliceanna St,,Unknown Charge,NORTHERN,525.0,Pen Lucy,"(39.3382885254, -76.6045667070)",39.338289,-76.604567
5,11127041.0,41,B,M,01/01/2011,00:05:00,2900 Spellman Rd,81-Recovered Property,2900 Spelman Rd,1 1425,Reckless Endangerment || Handgun Violation,SOUTHERN,924.0,Cherry Hill,"(39.2449886230, -76.6273582432)",39.244989,-76.627358
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
104522,13610388.0,32,B,M,12/31/2012,23:00:00,3300 Pulaski St,79-Other,3300 Pulaski Hw,1 5212,Handgun On Person || Handgun Violation,SOUTHEASTERN,223.0,Ellwood Park/Monument,"(39.2958396988, -76.5712467336)",39.295840,-76.571247
104523,13610389.0,27,B,M,12/31/2012,23:10:00,400 W Baltimore St,Unknown Offense,,2 0055,Fail Obey Renble/Lawfl || Disorderly/Fto,CENTRAL,113.0,Downtown,"(39.2893323126, -76.6210021717)",39.289332,-76.621002
104524,13610377.0,34,B,M,12/31/2012,23:30:00,3800 Belair Rd,4E-Common Assault,3600 Belair Rd,1 1415,Asslt-Sec Degree || Assault,NORTHEASTERN,415.0,Belair-Edison,"(39.3252613570, -76.5689030474)",39.325261,-76.568903
104525,13610383.0,38,B,M,12/31/2012,23:40:00,2000 Mckean Ave,87-Narcotics,1700 Mckean Av,2A0696,Att-Cds Manuf/Dist-Narc || Distribution Cds,WESTERN,733.0,Mondawmin,"(39.3116837460, -76.6475011849)",39.311684,-76.647501


Load map centered over Baltimore

In [2]:
map_osm = folium.Map(location=[39.29, -76.61], zoom_start=11)
map_osm

Let's find the best and worst places to live if you want to avoid gun crime. To do this, we'll look at the subset of the data where "gun" or a similar word appears in the charge description. We aren't picky about which violation it is: if it involves a gun, we want to look at it. We then collect the location of each of these arrests to plot later.

In [3]:
arrest_table = arrest_table[arrest_table['chargeDescription'].notna()]

# Looked through the link below to select all the words used to refer to guns
# MD Charging Language:
# https://mdcourts.gov/sites/default/files/import/district/charginglanguage.pdf
gun_words = ['gun','firearm','rifle']

# regex used in str.contains to find any of the gun words above
pat = '|'.join(r"\b{}\b".format(gun_word) for gun_word in gun_words)

gun_crimes = arrest_table[arrest_table['chargeDescription'].str.contains(pat,flags=re.IGNORECASE)]

# keep only the coordinates, renaming "long" to "lon"
locations = gun_crimes[['lat', 'long']].rename(columns={'long': 'lon'})

locations

Unnamed: 0,lat,lon
1,39.281403,-76.648364
3,39.311720,-76.662355
12,39.317729,-76.648958
16,39.299259,-76.640105
17,39.299259,-76.640105
...,...,...
103819,39.285664,-76.672270
104119,39.337401,-76.677576
104145,39.304277,-76.635679
104148,39.285669,-76.636667
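One thing worth checking in the pattern above is the effect of the `\b` word boundaries. The sketch below runs the same pattern over four charge descriptions copied from the arrest table shown earlier. It suggests that "Hand Gun" and "Firearm" match while the single word "Handgun" does not, which is consistent with row 5 being absent from `locations`. The looser pattern `loose_pat` is only an illustrative alternative, not part of the original analysis, and it would also match unrelated words that merely contain "gun".

```python
import re

gun_words = ['gun', 'firearm', 'rifle']

# same pattern as in the cell above: each word must stand alone
pat = '|'.join(r"\b{}\b".format(w) for w in gun_words)

# hypothetical looser alternative: match the words anywhere, even inside "Handgun"
loose_pat = '|'.join(gun_words)

# charge descriptions taken from the table output earlier in this notebook
samples = [
    "Reckless Endangerment || Hand Gun Violation",
    "Reckless Endangerment || Handgun Violation",
    "Reg Firearm:Illegal Possession || Hgv",
    "Handgun On Person || Handgun Violation",
]

# str.contains uses re.search semantics, so re.search reproduces its behavior
matches = {s: bool(re.search(pat, s, flags=re.IGNORECASE)) for s in samples}
loose_matches = {s: bool(re.search(loose_pat, s, flags=re.IGNORECASE)) for s in samples}

for s in samples:
    print(matches[s], loose_matches[s], s)
```

Whether to loosen the pattern is a judgment call: it changes which arrests count as gun crimes and therefore changes every map that follows.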


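Now that we have the locations, let's plot them and draw a circle with a 5-block radius around each one, so we can see how many gun crimes fall within 5 blocks of any point. We'll plot the circle markers together with a heat map that has almost no blur, to check that zoom_m_scale converts the radius correctly at each zoom level.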

In [4]:
# adds a circle of radius radius_m meters around one arrest location (one block is ~80 m)
def plotDot(row,radius_m,map_name):
    folium.Circle(location=[row.lat, row.lon], radius=radius_m, weight=1).add_to(map_name)

def plotHeatMapCircles(zoom,radius_blocks) :
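    # empirical meters-to-pixels factor: HeatMap's radius is in pixels, and the
    # map scale doubles with each zoom level (0.69 is roughly ln 2)
    zoom_m_scale = 0.00000871*exp(0.69*zoom)
    radius_m = radius_blocks*80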

    map_name = folium.Map(location=[39.29, -76.61],
                             zoom_start=zoom,
                             zoom_control=False,
                             scrollWheelZoom=False,)

    locations.apply(lambda row : plotDot(row,radius_m,map_name),axis = 1)

    HeatMap(locations,
            radius = radius_m*zoom_m_scale,
            blur = 1).add_to(map_name)

    display(map_name)
    
radius_blocks = 5
plotHeatMapCircles(11,radius_blocks)
plotHeatMapCircles(13,radius_blocks)
plotHeatMapCircles(15,radius_blocks)

We see that our zoom scaling correction term zoom_m_scale is working: at each zoom level the heat map spots cover the same area as the circles of the specified radius. Since we now know the radius used in the computation is accurate and meaningful, we will tweak the previous function to drop the circles and increase the blur, which makes the heat map smoother. The heat map also renders strangely at zoom 15, so we will not use that zoom level for analysis.
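As a cross-check on the constants in zoom_m_scale, the sketch below compares them with the standard Web Mercator ground-resolution formula, assuming 256-pixel tiles and the latitude of the map center (39.29). The function names `pixels_per_meter` and `empirical_scale` are made up for this illustration; only the constants 0.00000871 and 0.69 come from the notebook.

```python
from math import cos, exp, pi, radians

EARTH_RADIUS_M = 6378137.0   # WGS84 equatorial radius used by Web Mercator
TILE_SIZE_PX = 256           # assumed tile width in pixels

def pixels_per_meter(lat_deg, zoom):
    """Theoretical screen pixels per ground meter at a given latitude and zoom."""
    meters_per_pixel = (2 * pi * EARTH_RADIUS_M * cos(radians(lat_deg))
                        / (TILE_SIZE_PX * 2 ** zoom))
    return 1.0 / meters_per_pixel

def empirical_scale(zoom):
    """The notebook's hand-fitted factor, copied from plotHeatMapCircles."""
    return 0.00000871 * exp(0.69 * zoom)

# print zoom, theoretical factor, and the notebook's factor side by side
for zoom in (11, 13, 15):
    print(zoom, pixels_per_meter(39.29, zoom), empirical_scale(zoom))
```

Under these assumptions the two factors stay within a few percent of each other at the zoom levels used here, which supports treating zoom_m_scale as a meters-to-pixels conversion for Baltimore's latitude.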

In [5]:
def plotHeatMap(zoom,radius_blocks) :
    zoom_m_scale = 0.00000871*exp(0.69*zoom)
    radius_m = radius_blocks*80

    map_name = folium.Map(location=[39.29, -76.61],
                             zoom_start=zoom,
                             zoom_control=False,
                             scrollWheelZoom=False,)
    
    HeatMap(locations,
            radius = radius_m*zoom_m_scale,
            blur = 15).add_to(map_name)

    display(map_name)

radius_blocks = 5
plotHeatMap(11,radius_blocks)
plotHeatMap(13,radius_blocks)

We can now pick out good places to live without gun crimes nearby. In the middle of the northernmost region of the map, there are very few gun crimes. It would be interesting to compute the actual number of gun crimes within 5 blocks of each point and add a colorbar, but from my research this is not trivial. It is also unnecessary: we are only looking for the best and worst places to live with respect to gun crime, so we only care about relative numbers.
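For a single address, the count itself can be sketched directly; it is the map-wide colorbar that takes more work. The example below is a minimal sketch, not part of the original analysis: `haversine_m` and `count_within` are hypothetical helper names, the distance uses the haversine formula with a mean Earth radius, and the default radius of 400 m is the notebook's 5 blocks at about 80 m each. The sample coordinates are three rows from the `locations` output above.

```python
import numpy as np

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (an approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters; inputs in degrees, NumPy-broadcastable."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))

def count_within(lat, lon, point_lats, point_lons, radius_m=400):
    """Number of points within radius_m meters of (lat, lon)."""
    d = haversine_m(lat, lon, np.asarray(point_lats), np.asarray(point_lons))
    return int((d <= radius_m).sum())

# three rows from the locations table (indices 16, 17, and 1)
lats = [39.299259, 39.299259, 39.281403]
lons = [-76.640105, -76.640105, -76.648364]

# how many of these arrests fall within 5 blocks of the first location?
nearby = count_within(39.299259, -76.640105, lats, lons)
print(nearby)
```

With the full table, one would pass `locations['lat']` and `locations['lon']` instead of the short lists. Evaluating this on a grid of points fine enough to color a map is where the cost and the plotting effort grow.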