# Initial Exploration of Metropolitan Police Use of Force Data

#### Data Sources

The results presented here are drawn from the Metropolitan Police Service (London) use of force database available at the [London Datastore](https://data.london.gov.uk/dataset/use-of-force). Many columns were copied directly from the source table, while others (such as location and subject behavior) were derived by combining several columns of the original database. Borough-level data on perceived well-being and recorded crime rates was also downloaded from the Datastore and merged into the main data frame.

Data processing was done in R (see Proposal.R); the resulting tables were saved as CSV files and passed to Python.

*(Final size: 75 MB, covering over 55,000 incidents from April 2017 to February 2018.)*


#### Objectives

Both the US and the UK have recently seen large-scale protests in response to incidents of police brutality, particularly those in which black citizens were the targets. After the first release of the London Met database was published, it was widely reported that a disproportionate number of these use of force incidents involved black citizens. While making up only 13% of the city's population, [black citizens were the subject of 36% of the incidents](https://www.theguardian.com/uk-news/2017/aug/01/met-police-using-force-against-disproportionately-large-number-of-black-people).  
The creation and release of these incident databases has been, in part, an attempt to determine whether such incidents are outliers or the result of systematic bias. This analysis aims to identify which factors matter most in the conscious and unconscious decision to use escalating levels of force. The observed decision-making process can then be compared between officers in the UK, who do not commonly carry firearms, and those in the US, who do.
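
The scale of the reported disproportion can be made concrete with a quick calculation. The sketch below uses only the two percentages quoted above; the variable names are illustrative.

```python
# Shares quoted in the text above (population share vs. share of incidents)
population_share = 0.13
incident_share = 0.36

# Ratio of incident share to population share; 1.0 would mean proportional representation
disparity_ratio = incident_share / population_share

print(f"Disparity ratio: {disparity_ratio:.2f}")
```

Under these figures the incident share is roughly 2.8 times the population share.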

#### Package Imports

In [1]:
import numpy as np
import pandas as pd
from pandas import get_dummies
import sklearn as skl
from sklearn.ensemble import RandomForestClassifier

## Exploratory Analysis

In [6]:
#Load in data set of top 5 final effective tactics used
raw1=pd.read_csv("Data/01_FinalTact_Resp.csv")
raw1[0:5]

Unnamed: 0,Final_Effective_Tactic,Staff...medical.assistance.provided,Subject...medical.assistance.provided,Outcome...arrested,Month,Hour,Location,Subject_Behavior,Impact.factor..possession.of.a.weapon,Impact.factor..alcohol,...,Num_Tactics_Tried,Subject.s.perceived.age,Subject.s.perceived.gender,Subject.s.perceived.ethnicity,Subject.physically.disabled..officer.perceived.,Subject.mentally.disabled..officer.perceived.,All_Offenses,Unemployment.rate,Deliberate.Fires,Subjective.well.being.average.score
1,Unarmed Skills,,Yes,Yes,4,23,dwelling,Aggressive resistance,No,No,...,1,18-34,Male,White,No,Yes,70.9,-212.8,-207.8,-15.6
3,Compliant handcuffing,,,Yes,12,7,dwelling,Compliant,No,No,...,1,11-17,Male,Black (or Black British),No,No,70.9,-212.8,-207.8,-15.6
4,Compliant handcuffing,,,Yes,12,23,dwelling,Verbal resistance/gestures,No,Yes,...,1,35-49,Male,Asian (or Asian British),No,No,70.9,-212.8,-207.8,-15.6
5,Non-compliant handcuffing,,,Yes,12,10,dwelling,Serious or aggravated resistance,No,No,...,2,35-49,Male,Black (or Black British),No,No,70.9,-212.8,-207.8,-15.6
6,Non-compliant handcuffing,,,No,7,3,street.highway,Aggressive resistance,No,Yes,...,2,11-17,Female,Black (or Black British),No,No,70.9,-212.8,-207.8,-15.6


In [7]:
#Replace NaN in medical assistance columns with No
med_cols=["Staff...medical.assistance.provided","Subject...medical.assistance.provided"]
raw1[med_cols]=raw1[med_cols].replace(np.nan,"No")
#Recode every column after the first: remaining NaN and No become 0, Yes becomes 1
raw1.iloc[:,1:]=raw1.iloc[:,1:].replace(np.nan,0)
raw1.iloc[:,1:]=raw1.iloc[:,1:].replace("No",0)
raw1.iloc[:,1:]=raw1.iloc[:,1:].replace("Yes",1)

raw1[:5]

Unnamed: 0,Final_Effective_Tactic,Staff...medical.assistance.provided,Subject...medical.assistance.provided,Outcome...arrested,Month,Hour,Location,Subject_Behavior,Impact.factor..possession.of.a.weapon,Impact.factor..alcohol,...,Num_Tactics_Tried,Subject.s.perceived.age,Subject.s.perceived.gender,Subject.s.perceived.ethnicity,Subject.physically.disabled..officer.perceived.,Subject.mentally.disabled..officer.perceived.,All_Offenses,Unemployment.rate,Deliberate.Fires,Subjective.well.being.average.score
1,Unarmed Skills,0,1,1,4,23,dwelling,Aggressive resistance,0,0,...,1,18-34,Male,White,0,1,70.9,-212.8,-207.8,-15.6
3,Compliant handcuffing,0,0,1,12,7,dwelling,Compliant,0,0,...,1,11-17,Male,Black (or Black British),0,0,70.9,-212.8,-207.8,-15.6
4,Compliant handcuffing,0,0,1,12,23,dwelling,Verbal resistance/gestures,0,1,...,1,35-49,Male,Asian (or Asian British),0,0,70.9,-212.8,-207.8,-15.6
5,Non-compliant handcuffing,0,0,1,12,10,dwelling,Serious or aggravated resistance,0,0,...,2,35-49,Male,Black (or Black British),0,0,70.9,-212.8,-207.8,-15.6
6,Non-compliant handcuffing,0,0,0,7,3,street.highway,Aggressive resistance,0,1,...,2,11-17,Female,Black (or Black British),0,0,70.9,-212.8,-207.8,-15.6
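
The Yes/No recoding shown in the cell above can also be written as an explicit mapping limited to named columns, which avoids touching columns where 0 would not be a sensible fill value. This is a minimal sketch on a toy frame, not the notebook's own method; the column names are borrowed from the data set, and the values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for two of the Yes/No columns (values are illustrative only)
toy = pd.DataFrame({
    "Outcome...arrested": ["Yes", "No", np.nan, "Yes"],
    "Impact.factor..alcohol": ["No", np.nan, "Yes", "No"],
})

# Map Yes/No to 1/0, treat missing entries as 0, and force an integer dtype
flags = toy.apply(lambda col: col.map({"Yes": 1, "No": 0})).fillna(0).astype("int64")

print(flags)
```

Any value other than Yes or No would also become 0 here, so it is worth checking the distinct values of each column before applying the mapping to real data.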


In [9]:
raw1.dtypes

Final_Effective_Tactic                              object
Staff...medical.assistance.provided                  int64
Subject...medical.assistance.provided                int64
Outcome...arrested                                   int64
Month                                                int64
Hour                                                 int64
Location                                            object
Subject_Behavior                                    object
Impact.factor..possession.of.a.weapon                int64
Impact.factor..alcohol                               int64
Impact.factor..drugs                                 int64
Impact.factor..mental.health                         int64
Impact.factor..prior.knowledge                       int64
Impact.factor..size.gender.build                     int64
Impact.factor..acute.behavioural.disorder            int64
Impact.factor..crowd                                 int64
Impact.factor..other                                 int

In [11]:
#Convert the columns at the listed positions to the category dtype
raw1.iloc[:,[0,6,7,30,35,36,37]]=raw1.iloc[:,[0,6,7,30,35,36,37]].astype('category')
raw1.dtypes

Final_Effective_Tactic                             category
Staff...medical.assistance.provided                   int64
Subject...medical.assistance.provided                 int64
Outcome...arrested                                    int64
Month                                                 int64
Hour                                                  int64
Location                                           category
Subject_Behavior                                   category
Impact.factor..possession.of.a.weapon                 int64
Impact.factor..alcohol                                int64
Impact.factor..drugs                                  int64
Impact.factor..mental.health                          int64
Impact.factor..prior.knowledge                        int64
Impact.factor..size.gender.build                      int64
Impact.factor..acute.behavioural.disorder             int64
Impact.factor..crowd                                  int64
Impact.factor..other                    
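
Positional indices such as those in the cell above are easy to get wrong if the column order changes. As a hedged alternative, the conversion can be keyed on column names. The sketch below uses a toy frame with illustrative values; only two of the notebook's categorical columns are included.

```python
import pandas as pd

# Toy frame; column names are borrowed from the data set, values are illustrative
toy = pd.DataFrame({
    "Final_Effective_Tactic": ["Unarmed Skills", "Compliant handcuffing", "Compliant handcuffing"],
    "Location": ["dwelling", "dwelling", "street.highway"],
    "Hour": [23, 7, 3],
})

# Convert the named columns in one call; astype returns a new frame
cat_cols = ["Final_Effective_Tactic", "Location"]
toy = toy.astype({col: "category" for col in cat_cols})

print(toy.dtypes)
```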

In [13]:
#Change categorical data into its one hot representation
raw1_dum=pd.get_dummies(data=raw1,columns=["Location","Subject_Behavior"])
#raw1_dum[:5]
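
To see what the one hot step produces, here is a small sketch on a toy frame (values are illustrative). Each distinct value of the encoded column becomes its own indicator column, and the original column is dropped.

```python
import pandas as pd

toy = pd.DataFrame({
    "Hour": [23, 7, 3],
    "Location": ["dwelling", "dwelling", "street.highway"],
})

# Each distinct Location value becomes its own indicator column
toy_dum = pd.get_dummies(data=toy, columns=["Location"])

print(list(toy_dum.columns))
# ['Hour', 'Location_dwelling', 'Location_street.highway']
```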

In [None]:
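#Break up into response data and predictors
resp1=raw1.Final_Effective_Tactic
#One hot encode every remaining categorical predictor so the model receives numeric input
preds1=pd.get_dummies(raw1.iloc[:,4:])
preds1[:5]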

In [None]:
#Run Model Fitting
mod1=RandomForestClassifier()
mod1.fit(X=preds1,y=resp1)
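
Since the stated objective is to find which factors matter most, one natural follow-up is to read the fitted forest's `feature_importances_` attribute. The sketch below runs on synthetic stand-in data rather than the real incident table. The labelling rule is hypothetical and exists only so that one predictor is informative and the other is noise.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in data: one informative predictor and one pure-noise predictor
n = 200
toy_preds = pd.DataFrame({
    "Num_Tactics_Tried": rng.integers(1, 4, size=n),
    "Hour": rng.integers(0, 24, size=n),
})
# Hypothetical rule used only to generate labels for this example
toy_resp = np.where(toy_preds["Num_Tactics_Tried"] > 1,
                    "Non-compliant handcuffing", "Compliant handcuffing")

toy_mod = RandomForestClassifier(n_estimators=100, random_state=0)
toy_mod.fit(X=toy_preds, y=toy_resp)

# Importances sum to 1 and are ordered like the predictor columns
importances = pd.Series(toy_mod.feature_importances_, index=toy_preds.columns)
print(importances.sort_values(ascending=False))
```

Impurity-based importances tend to favor predictors with many distinct values, so on the real data they are best treated as a first look rather than a final ranking.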

In [None]:
#Select several column ranges at once (np.r_ joins the slices into a single index array)
raw1.iloc[:,np.r_[1:3,6:9,11:15]]

In [None]:
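#Reference snippet for one hot encoding a single column (data and 'City' are placeholders, not part of this data set)
data = pd.concat([data, pd.get_dummies(data['City'], prefix='City')], axis=1)
data[['City', 'City_London', 'City_New Delhi', 'City_New York']]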