# This notebook covers arrests made in the Pittsburgh area, broken down by neighborhood.
## The highest-rated neighborhood in this analysis is the one with the fewest arrests.
### Analysis by Justin Nguyen
Police Arrests Data Set: https://data.wprdc.org/dataset/pbp_arrest_data_2024_2025/resource/e419c20c-8df4-4729-830c-e49427a656e0
- data steward: Garrett Jeanes

Monthly Criminal Activity Data Set: https://data.wprdc.org/dataset/monthly-criminal-activity-dashboard/resource/bd41992a-987a-4cca-8798-fbe1cd946b07
- data steward: Garrett Jeanes

# Import Statements
This cell imports the libraries that the rest of the notebook needs.

In [20]:
import pandas as pd
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt

# Importing and Generating Police Arrest Data
This code loads the Pittsburgh police arrests dataset and displays its first five rows.

In [3]:
police = pd.read_csv("PittArrests.tsv", sep="\t")
police.head()

Unnamed: 0,_id,Template,Case_Number,ArrestPerson_ID,Arrest_Date,Arrest_Year,Arrest_Month,Arrest_Time,Type,ArrestPerson_Age,...,NIBRS_Crime_Against,NIBRS_Offense_Grouping,Violation,Zone,Tract,Neighborhood,ArrestCharge_Felony_Misdemeanor_Description,XCOORD,YCOORD,Block_Address
0,1,Adult Arrest-On View,PGHP25000026,b09e1066-c355-cc1b-218c-08dd2a4354ea,2025-01-01,2025.0,Jan,04:04,On-View Arrest (apprehension without a warrant...,21.0,...,Group B,B,18 5503 A1* Disorderly Conduct-Fighting/Threat...,Zone 3,1702.0,South Side Flats,MISDEMEANOR - M3,-79.9824,40.4288,"1600 Block of E Carson Street Pittsburgh, PA"
1,2,Adult Arrest-On View,PGHP25000110,e97f5c84-ddf3-c039-b576-08dd2a41a058,2025-01-01,2025.0,Jan,03:52,On-View Arrest (apprehension without a warrant...,23.0,...,Society,A,18 6106 A1 Firearms Act-Carrying Firearm W/O L...,Zone 3,1702.0,South Side Flats,FELONY - F3,-79.982,40.4298,Intersection of SIDNEY ST./ S. 17TH ST. Pittsb...
2,3,Adult Arrest-On View,PGHP25000110,e97f5c84-ddf3-c039-b576-08dd2a41a058,2025-01-01,2025.0,Jan,03:52,On-View Arrest (apprehension without a warrant...,23.0,...,Group B,B,75 3353 A2II Illegal Park W/I 15 Feet of Fire ...,Zone 3,1702.0,South Side Flats,INFRACTION - S,-79.982,40.4298,Intersection of SIDNEY ST./ S. 17TH ST. Pittsb...
3,4,Adult Arrest-On View,PGHP25000110,e97f5c84-ddf3-c039-b576-08dd2a41a058,2025-01-01,2025.0,Jan,03:52,On-View Arrest (apprehension without a warrant...,23.0,...,Group B,B,75 3353 A3II Illegal Park Where Official Signs...,Zone 3,1702.0,South Side Flats,INFRACTION - S,-79.982,40.4298,Intersection of SIDNEY ST./ S. 17TH ST. Pittsb...
4,5,Adult Arrest-On View,PGHP25000110,e97f5c84-ddf3-c039-b576-08dd2a41a058,2025-01-01,2025.0,Jan,03:52,On-View Arrest (apprehension without a warrant...,23.0,...,Group B,B,75 3353 A1III Illegal Park In Intersection,Zone 3,1702.0,South Side Flats,INFRACTION - S,-79.982,40.4298,Intersection of SIDNEY ST./ S. 17TH ST. Pittsb...


# Importing and Generating Criminal Activity Data
This code loads the monthly criminal activity dataset and displays its first five rows.

In [7]:
criminal = pd.read_csv("PittCriminalActivity.tsv", sep="\t")
criminal.head()


Unnamed: 0,_id,Report_Number,ReportedDate,ReportedTime,Hour,DayofWeek,ReportedMonth,NIBRS_Coded_Offense,NIBRS_Offense_Code,NIBRS_Offense_Category,NIBRS_Offense_Type,NIBRS_Crime_Against,NIBRS_Offense_Grouping,Violation,XCOORD,YCOORD,Zone,Tract,Neighborhood,Block_Address
0,1,PGHP24000024,2024-01-01,00:31,0,Monday,Jan,13A AGGRAVATED ASSAULT,13A,Assault Offenses,Aggravated Assault,Person,A,18 2718 A1 Strangulation Basic - Applying,-80.0268,40.3964,Zone 6,1919.0,Brookline,"2800 Block of FITZHUGH WAY Pittsburgh, PA"
1,2,PGHP24000024,2024-01-01,00:31,0,Monday,Jan,13C INTIMIDATION,13C,Assault Offenses,Intimidation,Person,A,18 2706 A1 Terroristic Threats-General,-80.0268,40.3964,Zone 6,1919.0,Brookline,"2800 Block of FITZHUGH WAY Pittsburgh, PA"
2,3,PGHP24000024,2024-01-01,00:31,0,Monday,Jan,90Z ALL OTHER OFFENSES,90Z,All other Offenses,All other Offenses,Group B,B,75 3733 A Fleeing or Attempting To Elude Polic...,-80.0268,40.3964,Zone 6,1919.0,Brookline,"2800 Block of FITZHUGH WAY Pittsburgh, PA"
3,4,PGHP24000024,2024-01-01,00:31,0,Monday,Jan,23H ALL OTHER LARCENY,23H,Larceny/Theft Offenses,All Other Larceny,Property,A,18 3921 A Theft by Unlawful Taking-Movable – L...,-80.0268,40.3964,Zone 6,1919.0,Brookline,"2800 Block of FITZHUGH WAY Pittsburgh, PA"
4,5,PGHP24000017,2024-01-01,00:21,0,Monday,Jan,9999 Vehicle Offense (Not NIBRS Reportable),999,Not NIBRS Reportable,Not NIBRS Reportable,Group B,B,LO 6 101 Discharge of Firearms Prohibited,-80.0243,40.4582,Zone 1,2107.0,Manchester,"1200 Block of COLUMBUS AVE Pittsburgh, PA"


# Generating and Sorting a Table Based On Police Arrest Frequency
This code builds a table of neighborhoods and the number of arrests made in each, sorted from the fewest arrests to the most.

In [5]:
# Count how many arrest records fall in each neighborhood.
dict1 = dict()
for index, row in police.iterrows():
    neighborhood = row["Neighborhood"]
    dict1[neighborhood] = dict1.get(neighborhood, 0) + 1

# Show every row, ordered from the fewest arrests to the most.
pd.options.display.max_rows = None
display(pd.DataFrame(sorted(dict1.items(), key=lambda item: item[1]), columns=['Neighborhood', 'Frequency']))

Unnamed: 0,Neighborhood,Frequency
0,Chartiers,14
1,Saint Clair,15
2,Ridgemont,16
3,Swisshelm Park,17
4,Regent Square,23
5,New Homestead,28
6,East Carnegie,29
7,Oakwood,31
8,Esplen,35
9,Mount Oliver,35
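
The same tally can also be produced without an explicit loop. The block below is a minimal sketch using pandas' `value_counts`; `toy_police` and its rows are made-up stand-ins for the real `police` DataFrame, and `arrest_counts` is a name introduced here only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the `police` DataFrame loaded from PittArrests.tsv.
toy_police = pd.DataFrame({
    "Neighborhood": ["Chartiers", "Esplen", "Chartiers", None, "Esplen", "Esplen"],
})

# value_counts() tallies each distinct neighborhood and, by default,
# leaves out rows where the neighborhood is missing.
arrest_counts = (
    toy_police["Neighborhood"]
    .value_counts()
    .sort_values()                      # ascending: fewest arrests first
    .rename_axis("Neighborhood")
    .reset_index(name="Frequency")
)
print(arrest_counts)
```

Note that this variant silently drops rows with no neighborhood recorded, so its totals may differ slightly from the loop-based tally if the dataset contains missing values.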


# Generating and Sorting a Table Based On Criminal Activity Frequency
This code builds a table of neighborhoods and the number of criminal activity reports in each, sorted from the fewest reports to the most.

In [8]:
# Count how many criminal activity reports fall in each neighborhood.
dict2 = dict()
for index, row in criminal.iterrows():
    neighborhood = row["Neighborhood"]
    dict2[neighborhood] = dict2.get(neighborhood, 0) + 1

# Show every row, ordered from the fewest reports to the most.
pd.options.display.max_rows = None
display(pd.DataFrame(sorted(dict2.items(), key=lambda item: item[1]), columns=['Neighborhood', 'Frequency']))

Unnamed: 0,Neighborhood,Frequency
0,Saint Clair,25
1,Ridgemont,32
2,Chartiers,38
3,New Homestead,51
4,Swisshelm Park,61
5,East Carnegie,75
6,Summer Hill,77
7,Esplen,79
8,Oakwood,84
9,Regent Square,95


# Generating and Sorting Neighborhoods Based On their Ratio of Crime to Arrests
This code combines the two tallies above into a table of neighborhoods and their ratio of reported criminal activity to arrests, sorted from the lowest ratio to the highest.

In [40]:
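# Divide reported incidents by arrests for each neighborhood present in both tallies.
dict3 = dict()
for neighborhood in dict2:
    if neighborhood in dict1:  # skip neighborhoods with no recorded arrests (avoids a KeyError)
        dict3[neighborhood] = dict2[neighborhood] / dict1[neighborhood]

# Show every row, ordered from the lowest ratio to the highest.
pd.options.display.max_rows = None
display(pd.DataFrame(sorted(dict3.items(), key=lambda item: item[1]), columns=['Neighborhood', 'Crime to Arrest Ratio']))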

Unnamed: 0,Neighborhood,Crime to Arrest Ratio
0,East Allegheny,1.299718
1,Northview Heights,1.382263
2,Allegheny Center,1.442359
3,Lincoln–Lemington–Belmar,1.574257
4,Central Business District,1.584223
5,Bluff,1.627208
6,Homewood North,1.644068
7,East Hills,1.655367
8,Saint Clair,1.666667
9,Homewood South,1.69382
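
As a check on the arithmetic, Saint Clair has 25 activity reports and 15 arrests in the frequency tables above, so its ratio is 25 / 15 ≈ 1.666667, which matches the output. The sketch below reproduces that calculation with pandas Series, which align on their index labels when divided. The counts for Chartiers, Saint Clair, and Ridgemont are copied from the tables above; "Example Park" is a hypothetical name added only to show what happens when a neighborhood has reports but no arrest count.

```python
import pandas as pd

# Arrest and report counts copied from the two frequency tables.
arrests = pd.Series({"Chartiers": 14, "Saint Clair": 15, "Ridgemont": 16})
reports = pd.Series({
    "Saint Clair": 25,
    "Ridgemont": 32,
    "Chartiers": 38,
    "Example Park": 10,  # hypothetical neighborhood with no arrest count
})

# Division aligns on the neighborhood labels; a label missing from either
# side yields NaN, which dropna() removes before sorting.
ratio = (reports / arrests).dropna().sort_values()

ratio_table = ratio.rename_axis("Neighborhood").reset_index(name="Crime to Arrest Ratio")
print(ratio_table)
```

Here Saint Clair sorts first at about 1.67, followed by Ridgemont at 2.0 and Chartiers at about 2.71, while "Example Park" is dropped because there is nothing to divide by.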


# Looking for the Big 4 Neighborhoods
This code filters the ratio table down to the top four neighborhoods from Landon's dataset.

In [53]:
# The four neighborhoods to look up in the ratio table.
big4 = {'Lincoln-Lemington-Belmar', 'Strip District', 'Highland Park', 'Central Business District'}

# Keep only the ratios for those four neighborhoods.
dict4 = {neighborhood: ratio for neighborhood, ratio in dict3.items() if neighborhood in big4}

pd.options.display.max_rows = None
display(pd.DataFrame(sorted(dict4.items(), key=lambda item: item[1]), columns=['Neighborhood', 'Crime to Arrest Ratio']))

Unnamed: 0,Neighborhood,Crime to Arrest Ratio
0,Central Business District,1.584223
1,Lincoln-Lemington-Belmar,1.811321
2,Highland Park,3.672414
3,Strip District,4.051282
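
The ratio outputs list Lincoln-Lemington-Belmar under two spellings: one with en dashes (ratio 1.574257) and one with plain hyphens (ratio 1.811321). This suggests the source files may spell the name inconsistently, which would split that neighborhood's counts across two keys. If both spellings do refer to the same place (an assumption worth checking against the raw data), the names could be normalized before counting. The sketch below shows one way to do that; `normalize_name` is a hypothetical helper and the counts are made up for illustration.

```python
from collections import Counter


def normalize_name(name: str) -> str:
    """Replace en and em dashes with ASCII hyphens and trim whitespace."""
    return name.replace("\u2013", "-").replace("\u2014", "-").strip()


# Made-up records: the same neighborhood spelled with en dashes and with hyphens.
names = [
    "Lincoln\u2013Lemington\u2013Belmar",
    "Lincoln-Lemington-Belmar",
    "Lincoln-Lemington-Belmar",
    " Highland Park ",
]

# Count after normalizing so both spellings land under one key.
merged_counts = Counter(normalize_name(name) for name in names)
print(merged_counts)
```

With the real data, the same helper could be applied to the `Neighborhood` column of both DataFrames before the tallies are built, so that the ratio is computed from the combined counts.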
