In [1]:
!pip install folium



In [2]:
import folium
import requests
import pandas
arrest_table = pandas.read_csv("http://www.hcbravo.org/IntroDataSci/misc/BPD_Arrests.csv")
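# Drop rows without coordinates; copy so new columns are added to an independent frame
arrest_table = arrest_table[pandas.notnull(arrest_table["Location 1"])].copy()
# "Location 1" looks like "(39.28, -76.64)": split on the comma into two text columns
coords = arrest_table["Location 1"].str.split(",", expand=True)
# Strip the literal parentheses (regex=False) and convert to numbers
arrest_table["lat"] = coords[0].str.replace("(", "", regex=False).astype(float)
arrest_table["long"] = coords[1].str.replace(")", "", regex=False).astype(float)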
arrest_table.head()

Unnamed: 0,arrest,age,race,sex,arrestDate,arrestTime,arrestLocation,incidentOffense,incidentLocation,charge,chargeDescription,district,post,neighborhood,Location 1,lat,long
1,11127013.0,37,B,M,01/01/2011,00:01:00,2000 Wilkens Ave,79-Other,Wilkens Av & S Payson St,1 1425,Reckless Endangerment || Hand Gun Violation,SOUTHERN,934.0,Carrollton Ridge,"(39.2814026274, -76.6483635135)",39.281403,-76.648364
2,11126887.0,46,B,M,01/01/2011,00:01:00,2800 Mayfield Ave,Unknown Offense,,,Unknown Charge,NORTHEASTERN,415.0,Belair-Edison,"(39.3227699160, -76.5735750473)",39.32277,-76.573575
3,11126873.0,50,B,M,01/01/2011,00:04:00,2100 Ashburton St,79-Other,2100 Ashburton St,1 1106,Reg Firearm:Illegal Possession || Hgv,WESTERN,735.0,Panway/Braddish Avenue,"(39.3117196723, -76.6623546313)",39.31172,-76.662355
4,11126968.0,33,B,M,01/01/2011,00:05:00,4000 Wilsby Ave,Unknown Offense,1700 Aliceanna St,,Unknown Charge,NORTHERN,525.0,Pen Lucy,"(39.3382885254, -76.6045667070)",39.338289,-76.604567
5,11127041.0,41,B,M,01/01/2011,00:05:00,2900 Spellman Rd,81-Recovered Property,2900 Spelman Rd,1 1425,Reckless Endangerment || Handgun Violation,SOUTHERN,924.0,Cherry Hill,"(39.2449886230, -76.6273582432)",39.244989,-76.627358
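
As an alternative to splitting and stripping, the coordinates can be pulled out of the `Location 1` string with a single regular expression. This is only a sketch on a small hypothetical sample (`sample` and `pattern` are names made up for illustration), assuming every location follows the `(lat, long)` format shown in the output above.

```python
import pandas as pd

# Hypothetical sample mirroring the "Location 1" format shown above.
sample = pd.DataFrame({
    "Location 1": [
        "(39.2814026274, -76.6483635135)",
        "(39.3227699160, -76.5735750473)",
        None,  # rows without a location are dropped below
    ]
})

# Keep only rows with a location; copy so later assignments are safe.
sample = sample[sample["Location 1"].notna()].copy()

# Capture the two signed decimal numbers inside the parentheses.
pattern = r"\(\s*(?P<lat>-?\d+(?:\.\d+)?)\s*,\s*(?P<long>-?\d+(?:\.\d+)?)\s*\)"
coords = sample["Location 1"].str.extract(pattern).astype(float)

# Attach the new "lat" and "long" columns by index.
sample = sample.join(coords)
print(sample)
```

A string that does not match the pattern yields `NaN` in both columns instead of raising an error, which makes malformed rows easy to spot with `isna()`.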


In [3]:
arrest_table_f = arrest_table.loc[arrest_table.age >= 65]

The map below marks arrests in Baltimore, Maryland, USA, for people age 65 and older, with each marker colored by the race of the person arrested. Age 65 is generally considered elderly, yet people in this group were still arrested. The map shows more markers for Black arrestees than for any other group, with white arrestees second.

In [4]:
def raceToColor(race):
  if race == 'B':
    return "#4272f5"
  elif race == 'W':
    return "#ed8a11"
  elif race == "A":
    return "#eb4034"
  elif race == "U":
    return "#07f0d1"
  elif race == "I":
    return "#c210e6"
  elif race == "H":
    return "#6ee802"
  else:
    return "#ffffff"

def raceToText(race):
  if race == 'B':
    return "Black"
  elif race == 'W':
    return "White"
  elif race == "A":
    return "Asian"
  elif race == "U":
    return "Unknown"
  elif race == "I":
    return "Indian"
  elif race == "H":
    return "Hispanic"
  else:
    return "None"

# Named map65 rather than map to avoid shadowing Python's built-in map()
map65 = folium.Map(location=[39.29, -76.61], zoom_start=11)

for index, row in arrest_table_f.iterrows():
  folium.CircleMarker(location=[row.lat, row.long], radius=6, popup=raceToText(row.race), fill_color=raceToColor(row.race), color=raceToColor(row.race), fill_opacity=0.7).add_to(map65)

map65
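
The visual impression from the markers can be cross-checked with a count per race code. The sketch below uses a small hypothetical frame (`sample`, with made-up rows) in place of the filtered arrest table, so the numbers it prints say nothing about the real data.

```python
import pandas as pd

# Hypothetical rows; in the notebook this would be the age-filtered arrest table.
sample = pd.DataFrame({
    "age": [66, 70, 68, 65, 72, 40],
    "race": ["B", "B", "W", "B", "A", "W"],
})

# Same filter as the map: age 65 and older.
older = sample.loc[sample["age"] >= 65]

# Absolute counts per race code, largest first.
race_counts = older["race"].value_counts()

# Share of each race code within the filtered group.
race_share = older["race"].value_counts(normalize=True).round(2)

print(race_counts)
print(race_share)
```

Counts like these make it possible to state how large the gap between groups is, rather than judging it from overlapping markers.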

The next map marks arrests in Baltimore, Maryland, USA, for people age 17 and under, again colored by the race of the person arrested. Age 17 and under is very young, yet people in this group were still arrested. As before, the map shows more markers for Black arrestees than for white arrestees or other groups.

In [5]:
arrest_table_l17 = arrest_table.loc[(arrest_table.age <= 17) ]
mapl17 = folium.Map(location=[39.29, -76.61], zoom_start=11)

for index, row in arrest_table_l17.iterrows():
  folium.CircleMarker(location=[row.lat, row.long], radius = 6, popup=raceToText(row.race),fill_color=raceToColor(row.race), color = raceToColor(row.race), fill_opacity=0.7).add_to(mapl17)

mapl17
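
The two `if`/`elif` chains used for marker colors and popup labels could also be expressed as one lookup table. This is a sketch, not the notebook's method: `RACE_STYLES`, `DEFAULT_STYLE`, and `race_style` are hypothetical names, the codes and colors are copied from the functions above, and the label for code `I` assumes that `"#Indian"` was meant to read `"Indian"`.

```python
# Lookup table: race code -> (popup label, marker color).
RACE_STYLES = {
    "B": ("Black", "#4272f5"),
    "W": ("White", "#ed8a11"),
    "A": ("Asian", "#eb4034"),
    "U": ("Unknown", "#07f0d1"),
    "I": ("Indian", "#c210e6"),  # assumption: "#Indian" was a typo
    "H": ("Hispanic", "#6ee802"),
}

# Fallback used for any code that is not in the table.
DEFAULT_STYLE = ("None", "#ffffff")


def race_style(race):
    """Return (label, color) for a race code, or the default pair."""
    return RACE_STYLES.get(race, DEFAULT_STYLE)


label, color = race_style("W")
print(label, color)
```

Keeping label and color in one place means a new race code needs a single new entry instead of a matching branch in two functions.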

# Deeper Analysis

The map is helpful for visualizing specific subsets of the collected data. However, the entire data set has too many entries to display neatly on a map, so I looked at the numbers directly for a deeper analysis. I wanted to see which age group accounted for the most arrests and which neighborhoods had the most arrests, using the entire data set rather than just a sample of it.


Arrests of people age 35 and under

In [6]:
arrest_table.loc[arrest_table.age <= 35]['age'].count()

39745

Arrests of adults between ages 36 and 60

In [7]:
arrest_table.loc[(arrest_table.age >= 36) & (arrest_table.age <= 60) ]['age'].count()

23320

Arrests of adults age 60 and older

In [8]:
arrest_table.loc[(arrest_table.age >= 60) ]['age'].count()

1059

Arrests of teens age 17 and under

In [9]:
arrest_table.loc[(arrest_table.age <= 17) ]['age'].count()

268
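
The four filters above overlap: age 60 falls in both the 36 to 60 group and the 60 and older group, and the 17 and under group is contained in the 35 and under group. One way to get non-overlapping groups in a single pass is `pandas.cut`. The sketch below uses a made-up `ages` series, and the bin edges and labels are my own assumption about where the groups should split.

```python
import pandas as pd

# Hypothetical ages; in the notebook this would be the age column of the arrest table.
ages = pd.Series([15, 17, 18, 35, 36, 60, 61, 80], name="age")

# Right-inclusive bins: (0, 17], (17, 35], (35, 60], (60, 120].
bins = [0, 17, 35, 60, 120]
labels = ["17 and under", "18-35", "36-60", "over 60"]

# Assign each age to exactly one group.
age_group = pd.cut(ages, bins=bins, labels=labels)

# Count per group, listed in the order of the labels.
group_counts = age_group.value_counts().reindex(labels)
print(group_counts)
```

Because every age lands in exactly one bin, the group counts sum to the number of rows with an age inside the covered range.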

Top five neighborhoods with the most arrests

In [10]:
arrest_table.groupby('neighborhood').neighborhood.count().sort_values(ascending=False)[:5]

neighborhood
Downtown                3221
Sandtown-Winchester     2705
Central Park Heights    1771
Broadway East           1617
Belair-Edison           1534
Name: neighborhood, dtype: int64
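
The same ranking can be written with `value_counts`, which sorts in descending order by default and skips missing values. The sketch below runs on a hypothetical `sample` frame with made-up rows, and also counts how many rows have no neighborhood, since those are left out of the ranking.

```python
import pandas as pd

# Hypothetical rows; None stands for an arrest with no recorded neighborhood.
sample = pd.DataFrame({
    "neighborhood": [
        "Downtown", "Downtown", "Cherry Hill", None,
        "Downtown", "Cherry Hill", "Pen Lucy",
    ]
})

# Descending counts per neighborhood; missing values are excluded by default.
top = sample["neighborhood"].value_counts().head(2)

# Number of rows the ranking silently leaves out.
missing = int(sample["neighborhood"].isna().sum())

print(top)
print("rows without a neighborhood:", missing)
```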

The analysis shows that people age 35 and under account for the most arrests, 39,745, compared with 23,320 for adults ages 36 to 60 and 1,059 for adults 60 and older. Both of the last two filters include age 60, so 60-year-olds are counted in each. Only 268 arrests involved teens age 17 and under, which surprised me because I had expected a larger number. Subtracting those 268 from the 39,745 leaves 39,477 arrests for ages 18 to 35, so that age group accounts for the most arrests.

The data also show that, among the arrests with a known neighborhood, Downtown had the most, with 3,221. This is expected, since Downtown is a more crowded area with more activity.

