# Abstract
As the Black Lives Matter movement took root across the United States in the wake of George Floyd's death at the hands of Minneapolis police officers, a rallying cry gained momentum in response to police brutality: defund the police. The movement does not aim to abolish law enforcement, but rather to redistribute the bulk of its funding to other areas.

Is such a notion even plausible? This notebook examines whether police brutality correlates with the overfunding and militarization of police in areas with high rates of poverty, where other departments such as education and public health suffer from a lack of funds.

## Imports

In [1]:
# Data manipulation
import pandas as pd
import numpy as np

# Options for pandas
pd.options.display.max_columns = 50
pd.options.display.max_rows = 30

# Display all cell outputs
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = 'all'

from IPython import get_ipython
ipython = get_ipython()

# Autoreload extension
if 'autoreload' not in ipython.extension_manager.loaded:
    %load_ext autoreload

%autoreload 2

# Visualizations
import plotly.express as px
import plotly.graph_objects as go
import plotly.figure_factory as ff
from plotly.colors import n_colors
from plotly.subplots import make_subplots
from plotly.offline import iplot, init_notebook_mode
init_notebook_mode(connected=True)

import cufflinks as cf
cf.go_offline(connected=True)
cf.set_config_file(theme='white')

## Data

In [27]:
#Data import
pk = pd.read_csv('./data/police_killings.csv')
cd = pd.read_csv('./data/2018_census_5YE.csv')

#Data cleaning
us_state_abbrev = {
    'Alabama': 'AL',
    'Alaska': 'AK',
    'American Samoa': 'AS',
    'Arizona': 'AZ',
    'Arkansas': 'AR',
    'California': 'CA',
    'Colorado': 'CO',
    'Connecticut': 'CT',
    'Delaware': 'DE',
    'District of Columbia': 'DC',
    'Florida': 'FL',
    'Georgia': 'GA',
    'Guam': 'GU',
    'Hawaii': 'HI',
    'Idaho': 'ID',
    'Illinois': 'IL',
    'Indiana': 'IN',
    'Iowa': 'IA',
    'Kansas': 'KS',
    'Kentucky': 'KY',
    'Louisiana': 'LA',
    'Maine': 'ME',
    'Maryland': 'MD',
    'Massachusetts': 'MA',
    'Michigan': 'MI',
    'Minnesota': 'MN',
    'Mississippi': 'MS',
    'Missouri': 'MO',
    'Montana': 'MT',
    'Nebraska': 'NE',
    'Nevada': 'NV',
    'New Hampshire': 'NH',
    'New Jersey': 'NJ',
    'New Mexico': 'NM',
    'New York': 'NY',
    'North Carolina': 'NC',
    'North Dakota': 'ND',
    'Northern Mariana Islands':'MP',
    'Ohio': 'OH',
    'Oklahoma': 'OK',
    'Oregon': 'OR',
    'Pennsylvania': 'PA',
    'Puerto Rico': 'PR',
    'Rhode Island': 'RI',
    'South Carolina': 'SC',
    'South Dakota': 'SD',
    'Tennessee': 'TN',
    'Texas': 'TX',
    'Utah': 'UT',
    'Vermont': 'VT',
    'Virgin Islands': 'VI',
    'Virginia': 'VA',
    'Washington': 'WA',
    'West Virginia': 'WV',
    'Wisconsin': 'WI',
    'Wyoming': 'WY'
}

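# Normalize the race label so it matches the census column name 'Unknown Race'
pk.replace(to_replace=r'^Unknown race$', value='Unknown Race', regex=True, inplace=True)
# Convert full state names to two-letter codes so both tables share the same State key
cd.replace({'State': us_state_abbrev}, inplace=True)
cd = cd.set_index('State').sort_index()

# Test
pk.head()
cd.head()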

Unnamed: 0,Victim's name,Victim's age,Victim's gender,Victim's race,URL of image of victim,Date of Incident (month/day/year),Street Address of Incident,City,State,Zipcode,County,Agency responsible for death,Cause of death,A brief description of the circumstances surrounding the death,Official disposition of death (justified or other),Criminal Charges?,Link to news article or photo of official document,Symptoms of mental illness?,Unarmed,Alleged Weapon (Source: WaPo),Alleged Threat Level (Source: WaPo),Fleeing (Source: WaPo),Body Camera (Source: WaPo),WaPo ID (If included in WaPo database),Off-Duty Killing?,...,Unnamed: 41,Unnamed: 42,Unnamed: 43,Unnamed: 44,Unnamed: 45,Unnamed: 46,Unnamed: 47,Unnamed: 48,Unnamed: 49,Unnamed: 50,Unnamed: 51,Unnamed: 52,Unnamed: 53,Unnamed: 54,Unnamed: 55,Unnamed: 56,Unnamed: 57,Unnamed: 58,Unnamed: 59,Unnamed: 60,Unnamed: 61,Unnamed: 62,Unnamed: 63,Unnamed: 64,Unnamed: 65
0,Eric M. Tellez,28.0,Male,White,https://fatalencounters.org/wp-content/uploads...,31/12/2019,Broad St.,Globe,AZ,85501.0,Gila,Globe Police Department,Gunshot,"After midnight, a patrol officer was on routin...",Pending investigation,No known charges,https://www.azfamily.com/news/phoenix-man-arme...,No,Allegedly Armed,knife,other,not fleeing,no,5332.0,,...,,,,,,,,,,,,,,,,,,,,,,,,,
1,Name withheld by police,,Male,Unknown Race,,31/12/2019,7239-7411 I-40,Memphis,AR,38103.0,Crittenden,"Memphis Police Department, Arkansas State Police",Gunshot,"Police began a chase regarding a kidnapping, e...",Pending investigation,No known charges,https://www.fox16.com/local-news-2/kidnapping-...,No,Unclear,unclear,other,,,,,...,,,,,,,,,,,,,,,,,,,,,,,,,
2,Terry Hudson,57.0,Male,Black,,31/12/2019,3600 N 24th St,Omaha,NE,68110.0,Douglas,Omaha Police Department,Gunshot,Police responded to a domestic incident on the...,Pending investigation,No known charges,https://www.ketv.com/article/omaha-police-offi...,No,Allegedly Armed,gun,attack,not fleeing,no,5359.0,,...,,,,,,,,,,,,,,,,,,,,,,,,,
3,Malik Williams,23.0,Male,Black,,31/12/2019,30800 14th Avenue South,Federal Way,WA,98003.0,King,Federal Way Police Department,Gunshot,Police responded to a domestic dispute. Police...,Pending investigation,No known charges,https://www.king5.com/article/news/local/2-fed...,No,Allegedly Armed,gun,attack,not fleeing,no,5358.0,,...,,,,,,,,,,,,,,,,,,,,,,,,,
4,Frederick Perkins,37.0,Male,Black,,31/12/2019,17057 N Outer 40 Rd,Chesterfield,MO,63005.0,St. Louis,Chesterfield Police Department,Gunshot,Police went to Chesterfield Outlets about 1 p....,Pending investigation,No known charges,https://www.stltoday.com/news/local/crime-and-...,No,Vehicle,vehicle,attack,car,no,5333.0,,...,,,,,,,,,,,,,,,,,,,,,,,,,


State,Total Population,Male,Female,Hispanic,White,Black,Native American,Asian,Pacific Islander,Unknown Race
AK,738516,385579,352937,51186,450754,22817,103506,45617,8544,1459
AL,4864680,2355799,2508881,203146,3196730,1285737,23243,63936,1521,7503
AR,2990671,1468412,1522259,219052,2173849,458536,17342,43441,7877,4641
AZ,6946685,3453439,3493246,2163312,3825886,286614,271946,222477,12523,9177
CA,39148760,19453769,19694991,15221577,14695836,2164519,138427,5525439,138911,97763


## Feature Engineering

In [33]:
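# Functions
def race_data(data, race):
    """Count police killings of victims of the given race in each state."""
    new_data = data[data["Victim's race"] == race]
    sort_data = new_data[["Victim's name", "State"]]
    # nunique() counts each distinct name once per state, so repeated
    # placeholders such as 'Name withheld by police' are counted only once
    data_grouped = sort_data.groupby('State')["Victim's name"].nunique()
    data_df = data_grouped.to_frame()
    data_df = data_df.rename(columns={"Victim's name": race + ' Police Killings'})
    return data_df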

#Police killings total by race per state
hispanic_df = race_data(pk, 'Hispanic')
black_df = race_data(pk, 'Black')
white_df = race_data(pk, 'White')
asian_df = race_data(pk, 'Asian')
native_df = race_data(pk, 'Native American')
pacific_df = race_data(pk, 'Pacific Islander')
unknown_df = race_data(pk, 'Unknown Race')

#Police killings total across all races per state
pk_total = pk[["Victim's name", 'State']]
state_total = pk_total.groupby('State')["Victim's name"].nunique()
state_total_df = state_total.to_frame()
state_total_df = state_total_df.rename(columns = {"Victim's name": 'Total Police Killings'})

# Combining data into single df
race_frames = {
    'Hispanic': hispanic_df,
    'Black': black_df,
    'White': white_df,
    'Asian': asian_df,
    'Native American': native_df,
    'Pacific Islander': pacific_df,
    'Unknown Race': unknown_df,
}
for race, race_df in race_frames.items():
    # Index alignment on State attaches the matching totals and census figures
    race_df['Total Police Killings'] = state_total_df['Total Police Killings']
    race_df[[race + ' Population', 'Total State Population']] = cd[[race, 'Total Population']]

#Test
hispanic_df.head()
black_df.head()
white_df.head()
asian_df.head()
native_df.head()
pacific_df.head()
unknown_df.head()
print(pk_total)
print(state_total)
state_total_df.head()

State,Hispanic Police Killings,Total Police Killings,Hispanic Population,Total State Population
AK,1,41,51186,738516
AR,3,103,219052,2990671
AZ,114,338,2163312,6946685
CA,469,1123,15221577,39148760
CO,67,218,1184794,5531141


State,Black Police Killings,Total Police Killings,Black Population,Total State Population
AK,5,41,22817,738516
AL,51,136,1285737,4864680
AR,28,103,458536,2990671
AZ,31,338,286614,6946685
CA,184,1123,2164519,39148760


State,White Police Killings,Total Police Killings,White Population,Total State Population
AK,15,41,450754,738516
AL,73,136,3196730,4864680
AR,59,103,2173849,2990671
AZ,141,338,3825886,6946685
CA,331,1123,14695836,39148760


State,Asian Police Killings,Total Police Killings,Asian Population,Total State Population
AL,1,136,63936,4864680
AR,1,103,43441,2990671
CA,44,1123,5525439,39148760
CO,4,218,169556,5531141
CT,1,36,157406,3581504


State,Native American Police Killings,Total Police Killings,Native American Population,Total State Population
AK,12,41,103506,738516
AZ,14,338,271946,6946685
CA,7,1123,138427,39148760
CO,5,218,30131,5531141
ID,2,48,18775,1687809


State,Pacific Islander Police Killings,Total Police Killings,Pacific Islander Population,Total State Population
CA,9,1123,138911,39148760
HI,22,37,132583,1422029
MI,1,114,2464,9957488
MO,2,194,6037,6090062
NC,1,204,5677,10155624


State,Unknown Race Police Killings,Total Police Killings,Unknown Race Population,Total State Population
AK,8,41,1459,738516
AL,12,136,7503,4864680
AR,12,103,4641,2990671
AZ,39,338,9177,6946685
CA,83,1123,97763,39148760


                Victim's name State
0              Eric M. Tellez    AZ
1     Name withheld by police    AR
2                Terry Hudson    NE
3              Malik Williams    WA
4           Frederick Perkins    MO
...                       ...   ...
7902                      NaN   NaN
7903                      NaN   NaN
7904                      NaN   NaN
7905                      NaN   NaN
7906                      NaN   NaN

[7907 rows x 2 columns]
State
AK      41
AL     136
AR     103
AZ     338
CA    1123
      ... 
VT      12
WA     210
WI     111
WV      72
WY      19
Name: Victim's name, Length: 51, dtype: int64


State,Total Police Killings
AK,41
AL,136
AR,103
AZ,338
CA,1123
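
A caveat on the counting used above: the per-state totals come from `nunique()` on `Victim's name`, and the raw data contains placeholder names such as "Name withheld by police". The toy example below is only a sketch with made-up records (not rows from `police_killings.csv`) showing how `nunique()`, `count()`, and `size()` can disagree when names repeat or are missing.

```python
import pandas as pd

# Hypothetical toy records: two withheld names, one named victim, one missing name.
toy = pd.DataFrame({
    "Victim's name": ['Name withheld by police', 'Name withheld by police', 'Jane Doe', None],
    'State': ['AR', 'AR', 'AR', 'AR'],
})

by_state = toy.groupby('State')["Victim's name"]

unique_names = by_state.nunique()  # distinct non-null names
named_rows = by_state.count()      # non-null names, duplicates included
all_rows = by_state.size()         # every row, missing names included

print(unique_names['AR'], named_rows['AR'], all_rows['AR'])  # prints: 2 3 4
```

Which of the three is appropriate depends on how the source file encodes unknown victims and empty rows. If every row is one incident, `size()` on a frame with fully empty rows dropped may be the closer match, but that is an assumption to verify against the data.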


# Analysis/Modeling
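The cell below charts each state's police killings by victim's race as a pie chart, shown here for New Jersey, New York, and California.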

In [5]:
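# count() tallies non-null values per column, so the "Victim's name" column
# gives the number of recorded victims for each state and race
state_sum = pk.groupby(['State', "Victim's race"]).count()

# Functions
def create_pie_chart(input_data, state):
    """Plot the share of police killings by victim's race for one state."""
    counts = input_data.loc[state]["Victim's name"]
    trace = go.Pie(labels=counts.index, values=counts)
    fig = go.Figure(data=[trace])
    fig.update_layout(title_text='Police killings by race: ' + state)
    iplot(fig)

create_pie_chart(state_sum, 'NJ')
create_pie_chart(state_sum, 'NY')
create_pie_chart(state_sum, 'CA')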
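
Raw shares in a pie chart do not account for how large each group is in the state. One possible way to put the killings and census columns from the Feature Engineering tables on a common footing is sketched below, using the CA and AL figures for Black and White residents copied from those tables. The column names `Per Million` and `Representation Ratio` are illustrative choices, not something defined elsewhere in this notebook. The killings cover the whole period in the dataset while the population is a single census estimate, so the rates are cumulative rather than annual.

```python
import pandas as pd

# Figures copied from the Black and White tables above (CA and AL rows).
rows = pd.DataFrame({
    'State': ['CA', 'CA', 'AL', 'AL'],
    'Race': ['Black', 'White', 'Black', 'White'],
    'Killings': [184, 331, 51, 73],
    'Total Killings': [1123, 1123, 136, 136],
    'Population': [2164519, 14695836, 1285737, 3196730],
    'Total Population': [39148760, 39148760, 4864680, 4864680],
})

# Killings per million residents of each group.
rows['Per Million'] = rows['Killings'] / rows['Population'] * 1_000_000

# Share of the state's killings divided by share of the state's population;
# values above 1 mean the group is over-represented among victims.
rows['Representation Ratio'] = (
    (rows['Killings'] / rows['Total Killings'])
    / (rows['Population'] / rows['Total Population'])
)

print(rows[['State', 'Race', 'Per Million', 'Representation Ratio']].round(2))
```

The same two expressions could be applied directly to `black_df`, `white_df`, and the other per-race frames, since they already carry the four input columns under race-specific names.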

# Results
Show graphs and stats here

# Conclusions and Next Steps
Summarize findings here