# Dataset: Police Incident Blotter (Archive)

Our group selected health and safety as the metric for determining the best neighborhoods in Allegheny County. My submetric is the number of reported police incidents per neighborhood. This matters for safety because a greater number of police incidents in a neighborhood suggests that the neighborhood may have a higher crime rate. Conversely, a neighborhood with fewer police incidents is likely a safer place to live.

In this notebook, I analyze the Police Incident Blotter records for the City of Pittsburgh.

Link to data set: https://data.wprdc.org/datastore/dump/044f2016-1dfd-4ab0-bc1e-065da05fca2e

In [1]:
import pandas as pd
import matplotlib.pyplot as plt

In [2]:
#import and print data
url= "https://data.wprdc.org/datastore/dump/044f2016-1dfd-4ab0-bc1e-065da05fca2e"
police= pd.read_csv(url)
police.head()

Unnamed: 0,_id,PK,CCR,HIERARCHY,INCIDENTTIME,INCIDENTLOCATION,CLEAREDFLAG,INCIDENTNEIGHBORHOOD,INCIDENTZONE,INCIDENTHIERARCHYDESC,OFFENSES,INCIDENTTRACT,COUNCIL_DISTRICT,PUBLIC_WORKS_DIVISION,X,Y
0,1,2802309,16000001.0,10,2016-01-01T00:00:00,"400 Block North Shore DR Pittsburgh, PA 15212",Y,North Shore,1,HARRASSMENT/THREAT/ATTEMPT/PHY,2702 Aggravated Assault. / 2709(a) Harassment....,2205.0,1.0,6.0,-80.012337,40.446263
1,2,2803174,16004547.0,11,2016-01-01T00:01:00,"5400 Block Carnegie ST Pittsburgh, PA 15201",N,Upper Lawrenceville,2,THEFT BY DECEPTION,3922 Theft by Deception.,1011.0,7.0,2.0,-79.950295,40.48229
2,3,2801809,16000367.0,4,2016-01-01T00:10:00,"500 Block Mt Pleasant RD Pittsburgh, PA 15214",N,Northview Heights,1,DISCHARGE OF FIREARM INTO OCC.STRUCTURE,2707.1 Discharge of a Firearm into Occupied St...,2609.0,1.0,1.0,-80.000966,40.478651
3,4,2802315,16000035.0,10,2016-01-01T00:15:00,"300 Block Wood ST Pittsburgh, PA 15222",Y,Golden Triangle/Civic Arena,2,HARRASSMENT/THREAT/ATTEMPT/PHY,2709(a)(3) Harassment No Legitimate Purpose,201.0,6.0,6.0,-80.001251,40.438918
4,5,2802312,16000024.0,4,2016-01-01T00:16:00,"500 Block Mt Pleasant RD Pittsburgh, PA 15214",N,Northview Heights,1,PROP MISSILE INTO OCC VEHICLE/OR ROADWAY,2705 Recklessy Endangering Another Person. / 3...,2609.0,1.0,1.0,-80.000966,40.478651


To make the data easier to navigate and analyze, I am removing the unnecessary columns. I keep the hierarchy of the crime, which in this data set runs from 1 to 99 (1 being a major crime and 99 a minor one), the neighborhood where the crime was committed, and the 'CLEAREDFLAG' column. I am also removing every row whose 'CLEAREDFLAG' value is 'N', because an 'N' means the case is open, unresolved, or under investigation. To get the most accurate results, I keep only incidents marked 'Y', meaning the case was cleared.

In [4]:
# Keep only cleared incidents: drop every row whose CLEAREDFLAG is 'N'
police = police[police['CLEAREDFLAG'] != 'N']
# Remove unnecessary columns (errors='ignore' skips any name that is missing)
police_col_remove = police.drop(columns=['_id', 'PK', 'CCR', 'INCIDENTTIME', 'INCIDENTLOCATION', 'INCIDENTZONE', 'INCIDENTHIERARCHYDESC', 'OFFENSES', 'INCIDENTTRACT', 'COUNCIL_DISTRICT', 'PUBLIC_WORKS_DIVISION', 'X', 'Y'], errors='ignore')
police_col_remove.head()

Unnamed: 0,HIERARCHY,CLEAREDFLAG,INCIDENTNEIGHBORHOOD
0,10,Y,North Shore
3,10,Y,Golden Triangle/Civic Arena
5,23,Y,South Side Flats
6,4,Y,Elliott
7,23,Y,South Side Flats


I am now going to sort the data to show which neighborhoods have the most and the fewest police incidents, based on the hierarchy of the crime (1-99).

Tables 1-2: The 10 neighborhoods with the **highest** number of police incidents with a hierarchy of 50-99 (Table 1), then the 10 neighborhoods with the **lowest** number of police incidents with a hierarchy of 50-99 (Table 2).


Tables 3-4: The 10 neighborhoods with the **highest** number of police incidents with a hierarchy of 1-49 (Table 3), then the 10 neighborhoods with the **lowest** number of police incidents with a hierarchy of 1-49 (Table 4).


I am creating tables for both hierarchy ranges to draw the most accurate conclusion about the best neighborhood. A neighborhood with few minor crimes may still have many major crimes, which would make it less safe. To count as one of the safest neighborhoods in Pittsburgh, a neighborhood must therefore appear in both the **10 neighborhoods with the fewest police incidents (1-49)** and the **10 neighborhoods with the fewest police incidents (50-99)**.
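
The two hierarchy ranges described above can also be assigned with `pd.cut`. This is only a minimal sketch on made-up values, not the notebook's own method: the `toy` frame and its hierarchy codes are hypothetical, and it assumes the codes are whole numbers.

```python
import pandas as pd

# Toy hierarchy codes (made up for illustration, not taken from the data set)
toy = pd.DataFrame({"HIERARCHY": [1, 10, 49, 50, 75, 99, 0]})

# Bins are right-inclusive: (0, 49] -> High, (49, 99] -> Low, anything else -> NaN
toy["crime_level"] = pd.cut(
    toy["HIERARCHY"],
    bins=[0, 49, 99],
    labels=["High (1-49)", "Low (50-99)"],
)

# Codes outside 1-99 come back as NaN, so drop them
toy = toy.dropna(subset=["crime_level"])
print(toy)
```

With whole-number codes, 49 lands in the high range and 50 in the low range, which matches the labels.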

In [14]:
import pandas as pd
# Group HIERARCHY into high or low severity
def rank_hierarchy(x):
    # Bounds are inclusive and match the labels: 1-49 is high, 50-99 is low
    if 1 <= x <= 49:
        return 'High (1-49)'
    elif 50 <= x <= 99:
        return 'Low (50-99)'
    else:
        return 'Other'

police['crime_level'] = police['HIERARCHY'].apply(rank_hierarchy)

#Remove other values
police= police[police['crime_level'] !='Other']
police_group=police.groupby(['INCIDENTNEIGHBORHOOD', 'crime_level']).size().reset_index(name='total_incidents')

# Group the crime levels
high_crime= police_group[police_group['crime_level'] == 'High (1-49)']
low_crime = police_group[police_group['crime_level'] == 'Low (50-99)']

# Create the 4 separate tables
table_one= ( low_crime.sort_values(by='total_incidents', ascending=False).head(10).reset_index(drop=True))
table_two= ( low_crime.sort_values(by='total_incidents', ascending=True).head(10).reset_index(drop=True))
table_three= ( high_crime.sort_values(by='total_incidents', ascending=False).head(10).reset_index(drop=True))
table_four= ( high_crime.sort_values(by='total_incidents', ascending=True).head(10).reset_index(drop=True))

# Print out the tables
print("top 10 towns with the highest number of police incidents with a hierarchy of 50-99")
display(table_one)
print("top 10 towns with the lowest number of police incidents with a hierarchy of 50-99")
display(table_two)
print("top 10 towns with the highest number of police incidents with a hierarchy of 1-49")
display(table_three)
print("top 10 towns with the lowest number of police incidents with a hierarchy of 1-49")
display(table_four)

top 10 towns with the highest number of police incidents with a hierarchy of 50-99


Unnamed: 0,INCIDENTNEIGHBORHOOD,crime_level,total_incidents
0,Central Business District,Low (50-99),4762
1,South Side Flats,Low (50-99),2585
2,Carrick,Low (50-99),2234
3,Brookline,Low (50-99),1923
4,Squirrel Hill South,Low (50-99),1642
5,Mount Washington,Low (50-99),1542
6,East Liberty,Low (50-99),1451
7,Sheraden,Low (50-99),1395
8,Beechview,Low (50-99),1376
9,Shadyside,Low (50-99),1370


top 10 towns with the lowest number of police incidents with a hierarchy of 50-99


Unnamed: 0,INCIDENTNEIGHBORHOOD,crime_level,total_incidents
0,Mt. Oliver Neighborhood,Low (50-99),9
1,Mt. Oliver Boro,Low (50-99),16
2,Troy Hill-Herrs Island,Low (50-99),43
3,Regent Square,Low (50-99),66
4,Ridgemont,Low (50-99),70
5,Mount Oliver,Low (50-99),76
6,Arlington Heights,Low (50-99),77
7,St. Clair,Low (50-99),78
8,Central Northside,Low (50-99),81
9,New Homestead,Low (50-99),82


top 10 towns with the highest number of police incidents with a hierarchy of 1-49


Unnamed: 0,INCIDENTNEIGHBORHOOD,crime_level,total_incidents
0,South Side Flats,High (1-49),6565
1,Central Business District,High (1-49),5613
2,Carrick,High (1-49),3145
3,Homewood South,High (1-49),2370
4,East Allegheny,High (1-49),2249
5,Homewood North,High (1-49),2155
6,East Liberty,High (1-49),1968
7,Lincoln-Lemington-Belmar,High (1-49),1808
8,Knoxville,High (1-49),1749
9,Sheraden,High (1-49),1730


top 10 towns with the lowest number of police incidents with a hierarchy of 1-49


Unnamed: 0,INCIDENTNEIGHBORHOOD,crime_level,total_incidents
0,Mt. Oliver Boro,High (1-49),24
1,Mt. Oliver Neighborhood,High (1-49),34
2,Troy Hill-Herrs Island,High (1-49),55
3,Ridgemont,High (1-49),63
4,Swisshelm Park,High (1-49),70
5,Outside County,High (1-49),74
6,Regent Square,High (1-49),74
7,New Homestead,High (1-49),76
8,East Carnegie,High (1-49),81
9,Chartiers City,High (1-49),85


## Conclusion: 
Based on the data in the tables above, we can conclude that, based on **police incidents**, the best neighborhoods in Pittsburgh are Mt. Oliver Boro, Mt. Oliver Neighborhood, Troy Hill-Herrs Island, and Ridgemont. I reached this conclusion using all four tables derived from my data. These neighborhoods appear in both of the lowest-incident tables, "High (1-49)" and "Low (50-99)", so we can assume they have the lowest crime rates, making them safer neighborhoods to live in. Regent Square and New Homestead also appear in both tables, but their combined incident counts are higher than those of the four neighborhoods named above.
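
The overlap between the two lowest-incident tables can be checked mechanically. The sketch below copies the counts from those two tables. The column names `minor_incidents`, `major_incidents`, and `combined` are illustrative, and ranking by the combined count is an assumption made here for the example.

```python
import pandas as pd

# Counts copied from the lowest-incident table for hierarchy 50-99
lowest_minor = pd.DataFrame({
    "INCIDENTNEIGHBORHOOD": [
        "Mt. Oliver Neighborhood", "Mt. Oliver Boro", "Troy Hill-Herrs Island",
        "Regent Square", "Ridgemont", "Mount Oliver", "Arlington Heights",
        "St. Clair", "Central Northside", "New Homestead",
    ],
    "minor_incidents": [9, 16, 43, 66, 70, 76, 77, 78, 81, 82],
})

# Counts copied from the lowest-incident table for hierarchy 1-49
lowest_major = pd.DataFrame({
    "INCIDENTNEIGHBORHOOD": [
        "Mt. Oliver Boro", "Mt. Oliver Neighborhood", "Troy Hill-Herrs Island",
        "Ridgemont", "Swisshelm Park", "Outside County", "Regent Square",
        "New Homestead", "East Carnegie", "Chartiers City",
    ],
    "major_incidents": [24, 34, 55, 63, 70, 74, 74, 76, 81, 85],
})

# An inner merge keeps only the neighborhoods present in both tables
both = lowest_minor.merge(lowest_major, on="INCIDENTNEIGHBORHOOD", how="inner")

# Rank the overlap by the combined count (assumption: both ranges weigh equally)
both["combined"] = both["minor_incidents"] + both["major_incidents"]
both = both.sort_values("combined").reset_index(drop=True)
print(both)
```

On these numbers, six neighborhoods appear in both tables. Sorted by combined count, the first four are Mt. Oliver Boro, Mt. Oliver Neighborhood, Troy Hill-Herrs Island, and Ridgemont.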

I also concluded that the least safe neighborhoods are South Side Flats, Central Business District, and Carrick, which are the top three in both of the highest-incident tables.

The Police Blotter submetric will be combined with 911 dispatches and COVID cases in Pittsburgh's neighborhoods.