# Submetric 1: **Safety**

## The importance of safety

Crime itself is a problem nobody wants to deal with, but *fear of crime* is often overlooked and can be just as much of a problem.

Fear of crime can be very stressful and mentally taxing. Nobody should have to worry about their kids playing outside or the safety of their belongings when at home.

---

## Pittsburgh Police Arrest Dataset
Includes data from **August 2016** to **March 2022**.

We will be using `INCIDENTNEIGHBORHOOD` and querying `OFFENSES` from this dataset to determine which neighborhoods are the safest.

In [20]:
import pandas as pd
dset = pd.read_csv('e03a89dd-134a-4ee8-a2bd-62c40aeebc6f.csv')
dset.sample(3)

|  | PK | CCR | AGE | GENDER | RACE | ARRESTTIME | ARRESTLOCATION | OFFENSES | INCIDENTLOCATION | INCIDENTNEIGHBORHOOD | INCIDENTZONE | INCIDENTTRACT | COUNCIL_DISTRICT | PUBLIC_WORKS_DIVISION | X | Y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 54643 | 2046653 | 22041462 | 50.0 | M | W | 2022-03-20T17:36:00 | 300 Block Cedar AV Pittsburgh, PA 15212 | 3929 Retail Theft. / 5104 Resisting Arrest or ... | 300 Block Cedar AV Pittsburgh, PA 15212 | East Allegheny | 1 | 2304.0 | 1.0 | 1.0 | -80.000908 | 40.450894 |
| 14633 | 1992962 | 17191995 | 32.0 | M | W | 2017-11-09T11:10:00 | 600 Block Ist AV Pittsburgh, PA 15219 | 3921 Theft by Unlawful Taking or Disposition. | 2100 Block E Carson ST Pittsburgh, PA 15203 | South Side Flats | 3 | 1609.0 | 3.0 | 3.0 | -79.976069 | 40.428307 |
| 23351 | 2004195 | 18155176 | 23.0 | F | B | 2018-08-14T14:00:00 | 600 Block 1st AV Pittsburgh, PA 15219 | 2702 Aggravated Assault. | N St Clair ST & Broad ST Pittsburgh, PA 15206 | East Liberty | 5 | 1115.0 |  |  | 0.0 | 0.0 |


When we sort neighborhoods by the `count` of arrest records, the Central Business District (Downtown) comes out well ahead of the rest, with 3,327 records against 2,861 for second-place South Side Flats.

In [41]:
df = dset.groupby('INCIDENTNEIGHBORHOOD')['OFFENSES'].describe().sort_values(by='count', ascending=False)
df.head()

| INCIDENTNEIGHBORHOOD | count | unique | top | freq |
|---|---|---|---|---|
| Central Business District | 3327 | 1335 | 9501 Bench Warrant | 238 |
| South Side Flats | 2861 | 1412 | 9015 Failure To Appear/Arrest on Attachment Order | 137 |
| Carrick | 1969 | 872 | 2701 Simple Assault. | 170 |
| East Allegheny | 1782 | 689 | 9015 Failure To Appear/Arrest on Attachment Order | 170 |
| Homewood South | 1746 | 804 | 9015 Failure To Appear/Arrest on Attachment Order | 129 |


But this statistic can be misleading. When we sort neighborhoods by **driving-related offenses**, South Side Flats comes out on top, while the Central Business District doesn't break the top 5.

In [53]:
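# Boolean mask: rows whose OFFENSES text mentions "Driving" (missing values count as no match)
query = dset['OFFENSES'].str.contains('Driving', na=False)
# Summarize the matching arrests per neighborhood, largest count first
aslt = dset[query].groupby('INCIDENTNEIGHBORHOOD')['OFFENSES'].describe().sort_values(by='count', ascending=False)
aslt.head()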

| INCIDENTNEIGHBORHOOD | count | unique | top | freq |
|---|---|---|---|---|
| South Side Flats | 241 | 208 | 3736 Reckless Driving / 3802(a)(1) DUI - Gener... | 9 |
| Homewood South | 231 | 189 | 1543 Driving While Operating Privilege is Susp... | 9 |
| Carrick | 174 | 158 | 1543 Driving While Operating Privilege is Susp... | 3 |
| Homewood North | 166 | 148 | 1543 Driving While Operating Privilege is Susp... | 4 |
| Mount Washington | 157 | 136 | 3714 Careless Driving / 3802(a)(1) DUI - Gener... | 7 |
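
The keyword filter used for driving offenses could be wrapped in a small helper so the same comparison is easy to repeat for other offense types. The sketch below is one possible way to do that, not part of the original analysis: `count_by_keyword` is a hypothetical function name, the rows in `toy` are made up for illustration, and it uses `value_counts` rather than `describe` on the assumption that only the per-neighborhood count is needed.

```python
import pandas as pd


def count_by_keyword(arrests: pd.DataFrame, keyword: str) -> pd.Series:
    """Count rows per neighborhood whose OFFENSES text contains `keyword`.

    Hypothetical helper for illustration. The match is a plain,
    case-sensitive substring test, and rows with a missing OFFENSES
    value are treated as non-matching.
    """
    mask = arrests['OFFENSES'].str.contains(keyword, na=False, regex=False)
    # value_counts sorts from most to fewest and skips missing neighborhoods
    return arrests.loc[mask, 'INCIDENTNEIGHBORHOOD'].value_counts()


# Made-up rows that only mimic the shape of the real dataset.
toy = pd.DataFrame({
    'INCIDENTNEIGHBORHOOD': [
        'Neighborhood A', 'Neighborhood A',
        'Neighborhood B', 'Neighborhood B', 'Neighborhood B',
    ],
    'OFFENSES': [
        '3736 Reckless Driving',
        '3929 Retail Theft.',
        '901 Criminal Attempt / 2702 Aggravated Assault. / 3736 Reckless Driving',
        '3736 Reckless Driving',
        None,
    ],
})

driving_counts = count_by_keyword(toy, 'Driving')
print(driving_counts)
```

On the real data the call would presumably look like `count_by_keyword(dset, 'Driving').head()`, and swapping the keyword (for example `'Assault'`) would rank neighborhoods by a different offense type.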


In [42]:
dset['OFFENSES'].unique()

array(['3929 Retail Theft.',
       '13(a)(16) Possession of Controlled Substance',
       '5503 Disorderly Conduct. / 5505 Public Drunkenness', ...,
       '903 Criminal Conspiracy. / 907 Possessing Instruments of Crime. / 2702 Aggravated Assault. / 2706 Terroristic Threats. / 2902 Unlawful Restraint. / 5104 Resisting Arrest or Other Law Enforcement. / 6106 Firearms not to be Carried without a License.',
       '4106 Access Device Fraud / 13(a)(32) Paraphernalia - Use or Possession / 9501 Bench Warrant / 4120 Identity Theft / 4914(A) False Identification to Law Enforcement',
       '901 Criminal Attempt / 2702 Aggravated Assault. / 3736 Reckless Driving / 3732.1 Aggravated Assault by Vehicle'],
      dtype=object)
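
As the output above shows, a single `OFFENSES` entry can list several charges separated by ` / `. One way to count charges individually, sketched here under the assumption that ` / ` never appears inside a single charge name, is to split each entry and give every charge its own row. The `toy` frame and the `OFFENSE` column name are illustrative and not part of the dataset.

```python
import pandas as pd

# Made-up rows; the offense strings follow the format seen in the dataset.
toy = pd.DataFrame({
    'INCIDENTNEIGHBORHOOD': ['Neighborhood A', 'Neighborhood B'],
    'OFFENSES': [
        '5503 Disorderly Conduct. / 5505 Public Drunkenness',
        '3929 Retail Theft.',
    ],
})

# Split each entry on the ' / ' separator, then explode the resulting
# lists so that every individual charge gets its own row.
charges = (
    toy.assign(OFFENSE=toy['OFFENSES'].str.split(' / '))
       .explode('OFFENSE')
)
# Remove any stray whitespace left around a charge name.
charges['OFFENSE'] = charges['OFFENSE'].str.strip()

# Number of times each individual charge appears.
charge_counts = charges['OFFENSE'].value_counts()
print(charges[['INCIDENTNEIGHBORHOOD', 'OFFENSE']])
```

With one charge per row, grouping by `INCIDENTNEIGHBORHOOD` counts charges rather than arrests, which may or may not be the better measure depending on how safety is defined.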