# Nashville Metro Police Department Discipline Data

## What is this? 
This is an analysis of a cleaned file containing discipline incidents from the Metro Nashville Police Department. We joined officer race/gender from the official MNPD staff roster to the list of discipline incidents provided to reporter Samantha Max as a part of a series of public records requests. 

APM Reports and WPLN requested more than a decade's worth of discipline data from the Metro Nashville Police Department. Reporters used a staff roster and a combination of manual cleaning and [fuzzy string matching](https://en.wikipedia.org/wiki/Approximate_string_matching) to join race and gender data of the officers to the discipline data. This lets us examine the police department's otherwise opaque disciplinary process. 
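
The matching code itself is not part of this notebook, so the following is only a minimal sketch of the general idea. It uses Python's standard-library `difflib` rather than whatever tool the reporters actually used. The names, the `match_name` helper, and the 0.85 cutoff are all hypothetical.

```python
from difflib import get_close_matches

# Hypothetical cleaned roster names (illustrative only, not real data).
roster_names = ["smith john a", "jones mary", "garcia luis"]

def match_name(discipline_name, roster, cutoff=0.85):
    """Return the closest roster name, or 'no match' if nothing is similar enough."""
    cleaned = discipline_name.lower().strip()
    candidates = get_close_matches(cleaned, roster, n=1, cutoff=cutoff)
    return candidates[0] if candidates else "no match"

print(match_name("Smith John", roster_names))  # close to a roster entry
print(match_name("Nguyen Anh", roster_names))  # nothing similar in the roster
```

A higher cutoff produces fewer false matches but leaves more rows unmatched, which is why a manual review pass is usually paired with this kind of matching.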


## Table of Contents

* [Imports](#Imports)
* [What are we leaving out?](#What-are-we-leaving-out?)
* [Data Overview](#Data-Overview)
* [Department Statistics](#Department-Statistics)
* [Tenure](#Tenure)
* [Officer Race and Gender Compared to Discipline](#Officer-Race-and-Gender-compared-to-Discipline)
* [Race and Gender a Different Way](#Race-and-Gender,-broken-down-a-different-way)
* [Percentage of Officers Disciplined](#What-percentage-of-officers-are-disciplined,-on-average)
* [Severe Discipline](#Percentage-of-each-group's-discipline-that-counts-as-severe)
* [Significance Testing](#Significance-Testing)


## Imports

In [1]:
import os
import pandas as pd
import altair as alt

cwd = os.getcwd()  # assumes the notebook is run from the directory that contains data/
data_dir = os.path.join(cwd, 'data')
processed_dir = os.path.join(data_dir, 'processed')
discipline_csv = os.path.join(processed_dir, 'cleaned_discipline_final.csv')
staff_roster_csv = os.path.join(processed_dir, 'staff_roster_cleaned.csv')

In [2]:
# variables for importing the data

columns = {
    'CONTROL #': 'control_number',
    'FINAL DISP DATE': 'final_disposition_date',
    'FINAL DISPOSITION': 'final_disposition',
    'FINAL # DAYS': 'final_number_of_days',
    'EMPLOYEE LAST NAME': 'last_name',
    'EMPLOYEE FIRST NAME': 'first_name',
    'ALLEGATION': 'allegation',
    'COMP SEX': 'complaintant_gender',
    'COMP RACE': 'complaintant_race',
    'full_name': 'dirty_full_name',
    'clean_name_x': 'clean_discipline_name',
    'roster_name_match': 'clean_roster_name',
    'gender': 'officer_gender',
    'clean_race_ethnicity': 'officer_race',
}

discipline_df = pd.read_csv(
    discipline_csv,
    parse_dates = ['FINAL DISP DATE'],
    dtype = {'CONTROL #': 'object'}
)
discipline_df = discipline_df.rename(columns=columns)

# keep only a selection of the columns to make things easier to work with
discipline_df = discipline_df[[
    'control_number',
    'final_disposition_date',
    'final_disposition',
    'final_number_of_days',
    'dirty_full_name',
    'clean_roster_name',
    'officer_gender',
    'officer_race',
    'allegation'
]].copy()

discipline_df['year'] = discipline_df.apply(
    lambda x: x['final_disposition_date'].year,
    axis=1
)

discipline_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12896 entries, 0 to 12895
Data columns (total 10 columns):
 #   Column                  Non-Null Count  Dtype         
---  ------                  --------------  -----         
 0   control_number          12895 non-null  object        
 1   final_disposition_date  12896 non-null  datetime64[ns]
 2   final_disposition       12896 non-null  object        
 3   final_number_of_days    2669 non-null   float64       
 4   dirty_full_name         12854 non-null  object        
 5   clean_roster_name       12896 non-null  object        
 6   officer_gender          12576 non-null  object        
 7   officer_race            12575 non-null  object        
 8   allegation              12888 non-null  object        
 9   year                    12896 non-null  int64         
dtypes: datetime64[ns](1), float64(1), int64(1), object(7)
memory usage: 1007.6+ KB


In [3]:
staff_roster_df = pd.read_csv(
    staff_roster_csv,
    parse_dates=['date_started']
)


staff_roster_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 23645 entries, 0 to 23644
Data columns (total 9 columns):
 #   Column                Non-Null Count  Dtype         
---  ------                --------------  -----         
 0   name                  23645 non-null  object        
 1   job_desc              23645 non-null  object        
 2   dept_desc             23645 non-null  object        
 3   gender                23645 non-null  object        
 4   race_ethnicity        23643 non-null  object        
 5   date_started          23644 non-null  datetime64[ns]
 6   year                  23645 non-null  int64         
 7   clean_name            23645 non-null  object        
 8   clean_race_ethnicity  23643 non-null  object        
dtypes: datetime64[ns](1), int64(1), object(7)
memory usage: 1.6+ MB


## What are we leaving out? 

Each row represents a discipline incident.

There were __12,896__ rows in the data. The original data provided by the MNPD did not have the race or gender of the officers being disciplined. We were able to fairly successfully join that data to the discipline incidents using a staff roster from the department. 

In total, we had to drop about 3% of the total disciplinary actions because there was either no match in the roster or the officer's race was unknown. 

In [4]:
total = len(discipline_df)
total

12896

We were not able to match __320__ rows to anyone from the staff roster, or about __2.5%__ of the rows in the discipline data.

In [5]:
no_match_len = len(discipline_df[discipline_df.clean_roster_name=='no match'])

no_match_len

320

In [6]:
no_match_len/total

0.02481389578163772

There are __73__ rows where the officer's race is listed as unknown, __1__ row where the race is blank despite a roster match, and __0__ rows where the officer's gender is unknown. 

In [7]:
len(discipline_df[
    discipline_df.officer_race=='unknown'
])

73

In [8]:
len(discipline_df[
    (discipline_df.officer_race.isna()) & (discipline_df.clean_roster_name != 'no match' )
])

1

In [9]:
len(discipline_df[
    discipline_df.officer_gender=='unknown'
])

0

In total, there are __394__ rows, or about __3%__ of the data, that we have to exclude from the analysis: the 320 unmatched rows, the 73 rows with an unknown race, and the 1 matched row where the officer's race is null. 

In [10]:
discipline_df[
   (discipline_df.officer_race.isna()) | (discipline_df.officer_race=='unknown') | (discipline_df.clean_roster_name == 'no match')
    
]

Unnamed: 0,control_number,final_disposition_date,final_disposition,final_number_of_days,dirty_full_name,clean_roster_name,officer_gender,officer_race,allegation,year
0,018210,2010-02-01,MATTER OF RECORD,,,no match,,,RESPONSIBILITY,2010
1,09729,2010-03-03,MATTER OF RECORD,,,no match,,,INTIMIDATION,2010
2,018410,2010-03-05,EXONERATED,,,no match,,,RE: HOW OFFICERS ARE DISPATCHED,2010
3,030S10,2010-03-31,UNFOUNDED,,,no match,,,ADHERENCE TO LAW (PROBABLE 10-35 COMPLAINANT),2010
4,031O10,2010-05-19,MATTER OF RECORD,,,no match,,,SELF-IDENTIFICATION,2010
...,...,...,...,...,...,...,...,...,...,...
12406,SV2019-00128,2019-06-10,EXONERATED,,WESLEY TILLEY,c tilley wesley,M,unknown,WARRANTLESS SEARCHES,2019
12407,SV2017-00014,2019-08-08,WRITTEN,,WESLEY TILLEY,c tilley wesley,M,unknown,ABSENT WITHOUT LEAVE (SPECIAL EVENTS ASSIGNMENT),2019
12408,CC2019-00096,2019-11-26,ORAL,,WESLEY TILLEY,c tilley wesley,M,unknown,ACTING IN CIVIL MATTERS,2019
12474,120410,2011-01-05,SUSPENDED,1.0,WILLIAM JEFFERS,no match,,,CARE OF GOVERNMENT PROPERTY; TASER MANNER OF C...,2011


In [11]:
394/total

0.03055210918114144
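
The exclusion totals can be cross-checked with plain arithmetic. This is just a sanity check built from the counts reported in the cells above. The variable names are introduced here for illustration and are not part of the original analysis.

```python
# Counts reported in the cells above.
total_rows = 12896
no_match = 320       # rows with no roster match
race_unknown = 73    # rows where race is listed as 'unknown'
race_blank = 1       # matched rows where race is null

excluded = no_match + race_unknown + race_blank
kept = total_rows - excluded
share_excluded = excluded / total_rows

print(excluded, kept, round(100 * share_excluded, 1))
```

The rows that remain should equal the sum of the male and female counts reported later in the notebook (11,403 + 1,099).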

In [12]:
discipline_final = discipline_df[
    (discipline_df.officer_race!='unknown') & (discipline_df.clean_roster_name!='no match') & ~(discipline_df.officer_race.isna())
].copy()

## Data Overview

Discipline has generally decreased at the MNPD over the decade, though not in every single year. 2019, the last normal year of operations in our data, had 813 disciplinary actions taken against officers, which is slightly less than half of the number of disciplinary actions in 2010. 


In [15]:
disposition_date = discipline_final.final_disposition_date.copy()

disposition_date_data = disposition_date.apply(
    lambda x: x.year
).value_counts().reset_index().rename(columns={'index':'year'})

disposition_date_data

Unnamed: 0,year,final_disposition_date
0,2010,1628
1,2012,1401
2,2011,1390
3,2013,1359
4,2015,1243
5,2014,1215
6,2016,1091
7,2017,917
8,2018,826
9,2019,813


The number of discipline actions taken against officers has fallen substantially since 2010. Each year from 2018 to 2020 saw roughly half as many discipline actions as 2010. 

In [16]:

alt.Chart(disposition_date_data).mark_bar().encode(
    x='year:O',
    y='final_disposition_date:Q'
).properties(title='Disciplinary Actions Per Year')

## Officer Stats

There are significantly fewer female officers in the discipline data compared to male officers. Less than __10%__ of the discipline handed out over the last decade has been against female officers. 

The vast majority of discipline (__91%__) is against male officers. 

In [17]:
discipline_final.officer_gender.value_counts()

M    11403
F     1099
Name: officer_gender, dtype: int64

In [18]:
discipline_by_gender = discipline_final.groupby(
    'officer_gender'
).agg({'clean_roster_name':'count'}).rename(
    columns={'clean_roster_name': 'perc_of_discipline'}
).groupby(level=0).apply(lambda x: 100 * x / len(discipline_final)).reset_index()

discipline_by_gender

Unnamed: 0,officer_gender,perc_of_discipline
0,F,8.790594
1,M,91.209406


There are 1,722 white officers in the discipline data and 306 black officers. The department and the disciplinary actions are very white. 

White officers accounted for __80.0%__ of the discipline actions, while black officers accounted for __13.7%__.

In [19]:
discipline_final.groupby('officer_race').agg(
    {'clean_roster_name':'nunique'}
)

Unnamed: 0_level_0,clean_roster_name
officer_race,Unnamed: 1_level_1
asian/pacific islander,24
black,306
hispanic,47
multiracial,59
native american,3
white,1722


In [20]:
discipline_by_race = discipline_final.groupby(
    ['officer_race']
).agg(
    {'clean_roster_name': 'count'} # count returns a simple count, we can choose any column here 
).rename(
    columns={'clean_roster_name': 'perc_of_discipline'}
).groupby(level=0).apply(lambda x: 100 * x / len(discipline_final)).reset_index()

discipline_by_race.sort_values(by='perc_of_discipline').style.set_caption(
    'Officer Race as a percentage of all discipline actions'
)

Unnamed: 0,officer_race,perc_of_discipline
4,native american,0.06399
0,asian/pacific islander,1.415773
2,hispanic,1.887698
3,multiracial,2.991521
1,black,13.677812
5,white,79.963206


The pattern holds steady year-over-year. White officers remain around 80% of the discipline. 

In [21]:
# percent of that year's discipline broken down by race
perc_by_year = discipline_final.groupby(
    ['year','officer_race']
).agg(
    {'clean_roster_name': 'count'} # count returns a simple count, we can choose any column here 
).rename(
    columns={'clean_roster_name': 'perc_of_group'}
).groupby(level=0).apply(lambda x: 100 * x / float(x.sum())).reset_index()

alt.Chart(perc_by_year[perc_by_year.officer_race.isin(['white','black'])]).mark_bar().encode(
    x='year:O',
    y='perc_of_group',
    color='officer_race'
).properties(title='Percent of Yearly Discipline by Race')

As we drill into the intersection of race and gender, white and male officers make up a smaller proportion of the total discipline, but they still make up most of the discipline data. 

__74%__ of the disciplinary actions were taken against white men. They are also the largest group in the department, so that makes sense. 

__12%__ of the discipline was taken against black men, the second most represented group in the data. 

White women were __6.3%__ of the discipline over the last decade and black women were __1.9%__.

In [22]:
race_gender_all_discipline = discipline_final.groupby(
    ['officer_race','officer_gender']
).agg(
    {'clean_roster_name': 'count'} # count returns a simple count, we can choose any column here 
).rename(
    columns={'clean_roster_name': 'perc_of_discipline'}
).groupby(level=0).apply(lambda x: 100 * x / len(discipline_final)).reset_index() # divide by the ENTIRE discipline file to get the total percentage 

race_gender_all_discipline.sort_values(
    by='perc_of_discipline',
    ascending=False
).style.set_caption("Percentage of all discipline")

Unnamed: 0,officer_race,officer_gender,perc_of_discipline
11,white,M,73.652216
3,black,M,11.798112
10,white,F,6.31099
7,multiracial,M,2.703567
2,black,F,1.879699
5,hispanic,M,1.62374
1,asian/pacific islander,M,1.383779
6,multiracial,F,0.287954
4,hispanic,F,0.263958
9,native american,M,0.047992


The table below is within-group only. For example, 92% of the discipline against white officers was against white men. 

In [23]:
discipline_final.groupby(
    ['officer_race','officer_gender']
).agg(
    {'clean_roster_name': 'count'} # count returns a simple count, we can choose any column here 
).rename(
    columns={'clean_roster_name': 'perc_of_group'}
).groupby(level=0).apply(lambda x: 100 * x / float(x.sum()))

Unnamed: 0_level_0,Unnamed: 1_level_0,perc_of_group
officer_race,officer_gender,Unnamed: 2_level_1
asian/pacific islander,F,2.259887
asian/pacific islander,M,97.740113
black,F,13.74269
black,M,86.25731
hispanic,F,13.983051
hispanic,M,86.016949
multiracial,F,9.625668
multiracial,M,90.374332
native american,F,25.0
native american,M,75.0


## Department Statistics

The Metro Nashville Police Department is very white and very male. The department averages around 2000 officers a year and has a very steady 3-1 male to female ratio. 

In [24]:
staff_roster_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 23645 entries, 0 to 23644
Data columns (total 9 columns):
 #   Column                Non-Null Count  Dtype         
---  ------                --------------  -----         
 0   name                  23645 non-null  object        
 1   job_desc              23645 non-null  object        
 2   dept_desc             23645 non-null  object        
 3   gender                23645 non-null  object        
 4   race_ethnicity        23643 non-null  object        
 5   date_started          23644 non-null  datetime64[ns]
 6   year                  23645 non-null  int64         
 7   clean_name            23645 non-null  object        
 8   clean_race_ethnicity  23643 non-null  object        
dtypes: datetime64[ns](1), int64(1), object(7)
memory usage: 1.6+ MB


To put the numbers we see in the discipline data in context, we have to understand the department statistics.

The department averages around __1,970__ officers a year.

In [25]:
staff_roster_df.groupby(['year']).size().reset_index().rename(columns={0:'officer_count'})

Unnamed: 0,year,officer_count
0,2010,1902
1,2011,1816
2,2012,2029
3,2013,1933
4,2014,1930
5,2015,1954
6,2016,2014
7,2017,2024
8,2018,2105
9,2019,1974


In [26]:
avg_dept_size = staff_roster_df.groupby(['year']).size().mean()

avg_dept_size

1970.4166666666667

If we look at every officer that worked in the department over the last decade, it's 69%-31% Men-Women. 

In [28]:
# flat_roster is a roster of every officer to appear in the staff roster. The staff roster is per year, so this is any officer that appears at least once. 
flat_roster = staff_roster_df.groupby(
    ['clean_name', 'gender', 'clean_race_ethnicity']
).size().reset_index()[
    ['clean_name', 'gender', 'clean_race_ethnicity']
]


flat_roster.groupby(['gender']).size()/len(flat_roster)

gender
F    0.30526
M    0.69474
dtype: float64

Year over year, the department is a consistent 75-25 Men-Women split. That seems to suggest that women have a much shorter tenure: women make up a larger share of everyone who worked at the department over the decade than they do in any given year. 

In [29]:
# what is the average male-female split? 
dept_gender_breakdown = staff_roster_df.groupby(
    ['year',  'gender']
).agg({'name':'count'}).rename(
    columns={'name': 'perc_of_dept'}
).groupby(level=0).apply(lambda x: 100 * x / float(x.sum())).reset_index().groupby(
    'gender'
)['perc_of_dept'].mean().reset_index()

dept_gender_breakdown

Unnamed: 0,gender,perc_of_dept
0,F,24.582339
1,M,75.417661
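
To see why higher turnover raises a group's share of the decade-wide roster without changing its yearly share, consider this toy example. The department and every name in it are made up for illustration.

```python
# Hypothetical department: 3 men who stay all 4 years, plus 1 position
# held by a different woman each year.
years = [2010, 2011, 2012, 2013]
roster = []  # (year, name, gender)
for i, year in enumerate(years):
    roster += [(year, "man_a", "M"), (year, "man_b", "M"), (year, "man_c", "M")]
    roster.append((year, f"woman_{i}", "F"))  # a new woman fills the position

# Yearly share of women: always 1 of 4 officers.
yearly_share = [
    sum(1 for y, _, g in roster if y == year and g == "F")
    / sum(1 for y, _, _ in roster if y == year)
    for year in years
]

# Decade-wide share: count each distinct officer once.
unique_officers = {(name, gender) for _, name, gender in roster}
overall_share = sum(1 for _, g in unique_officers if g == "F") / len(unique_officers)

print(yearly_share)
print(overall_share)
```

Women are a quarter of the toy department every year but more than half of everyone who ever worked there, the same direction of gap as the 25% versus 31% seen in the real roster.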


In [30]:
avg_racial_makeup = staff_roster_df.groupby(
    ['year',  'clean_race_ethnicity']
).agg({'name':'count'}).rename(
    columns={'name': 'avg_perc_of_dept'}
).groupby(level=0).apply(lambda x: 100 * x / float(x.sum())).reset_index().groupby(
    'clean_race_ethnicity'
)['avg_perc_of_dept'].mean().reset_index() 

avg_racial_makeup.sort_values(
    by='avg_perc_of_dept',
    ascending=False
).style.set_caption("Average racial makeup of Department")

Unnamed: 0,clean_race_ethnicity,avg_perc_of_dept
6,white,77.187467
1,black,18.57173
3,multiracial,3.195554
2,hispanic,1.78288
0,asian/pacific islander,1.202652
4,native american,0.126457
5,unknown,0.095444


In [31]:
avg_race_gender_roster = staff_roster_df.groupby(
    ['year',  'clean_race_ethnicity','gender']
).agg({'name':'count'}).rename(
    columns={'name': 'avg_perc_of_dept'}
).groupby(level=0).apply(lambda x: 100 * x / float(x.sum())).reset_index().groupby(
    ['clean_race_ethnicity', 'gender']
)['avg_perc_of_dept'].mean().reset_index()

avg_race_gender_roster.sort_values(
    by='avg_perc_of_dept',
    ascending=False
).style.set_caption("Average racial/gender makeup of Department") 

Unnamed: 0,clean_race_ethnicity,gender,avg_perc_of_dept
12,white,M,61.952189
11,white,F,15.235278
3,black,M,10.137213
2,black,F,8.434516
7,multiracial,M,2.573711
5,hispanic,M,1.219323
1,asian/pacific islander,M,1.109463
6,multiracial,F,0.621843
4,hispanic,F,0.563557
10,unknown,M,0.095444


In [32]:
avg_race_gender_roster['race_gender'] = avg_race_gender_roster['clean_race_ethnicity'] + '_' + avg_race_gender_roster['gender']

alt.Chart(avg_race_gender_roster).mark_bar().encode(
    x='avg_perc_of_dept:Q',
    y='race_gender',
    order=alt.Order('avg_perc_of_dept', sort='descending')
)

## Officer Race and Gender compared to Discipline

Black officers are underrepresented in the discipline data while white officers are over-represented in the discipline data, but only by small amounts. 

In [33]:
race_discipline_perc_comparison = discipline_by_race.merge(
    avg_racial_makeup,
    how='left',
    left_on='officer_race',
    right_on='clean_race_ethnicity'
)[['officer_race', 'avg_perc_of_dept', 'perc_of_discipline']]

race_discipline_perc_comparison.sort_values(
    by='avg_perc_of_dept',
    ascending=False
).style.set_caption('Makeup of the Department compared to the Discipline Data')

Unnamed: 0,officer_race,avg_perc_of_dept,perc_of_discipline
5,white,77.187467,79.963206
1,black,18.57173,13.677812
3,multiracial,3.195554,2.991521
2,hispanic,1.78288,1.887698
0,asian/pacific islander,1.202652,1.415773
4,native american,0.126457,0.06399


Female officers are 25% of the department but only 9% of the discipline data, a significant difference. 

In [34]:
dept_gender_breakdown.merge(
    discipline_by_gender,
    how='left',
    left_on='gender',
    right_on='officer_gender'
)[['gender', 'perc_of_dept','perc_of_discipline']].sort_values(
    by='perc_of_dept', ascending=False
)

Unnamed: 0,gender,perc_of_dept,perc_of_discipline
1,M,75.417661,91.209406
0,F,24.582339,8.790594


White men are over-represented in the discipline data compared to the average make-up of the department. 

In [35]:
race_gender_all_discipline.merge(
    avg_race_gender_roster,
    how='left',
    left_on=['officer_race','officer_gender'],
    right_on=['clean_race_ethnicity', 'gender']
)[[
    'officer_race', 'officer_gender', 'avg_perc_of_dept','perc_of_discipline'
]].sort_values(
    by='avg_perc_of_dept',
    ascending=False
)

Unnamed: 0,officer_race,officer_gender,avg_perc_of_dept,perc_of_discipline
11,white,M,61.952189,73.652216
10,white,F,15.235278,6.31099
3,black,M,10.137213,11.798112
2,black,F,8.434516,1.879699
7,multiracial,M,2.573711,2.703567
5,hispanic,M,1.219323,1.62374
1,asian/pacific islander,M,1.109463,1.383779
6,multiracial,F,0.621843,0.287954
4,hispanic,F,0.563557,0.263958
0,asian/pacific islander,F,0.093189,0.031995


## Tenure 

Discipline is not the be-all and end-all of difficulties officers face in the course of their duty. One indirect way of measuring the difficulty of being a gender and racial minority in a department is by measuring how long officers are willing to stay in their jobs. 

Women make up about 30% of all officers who appear in the roster over the decade, compared to a 25% yearly average. That suggests women cycle through the department more frequently, and the data seems to bear that out. The staff roster gave us the start date for each employee, so we analyzed the tenure of the employees who were with the department in 2021. 

Women at the department have a shorter average tenure compared to men by about 2 years. 

In [36]:
roster_2021  = staff_roster_df[staff_roster_df.year==2021].copy()

date_data_received = pd.to_datetime('06/28/2021')

roster_2021['tenure_days'] = roster_2021.apply(
    lambda x: (date_data_received - x.date_started).days,
    axis=1
)

In 2021, women did have a shorter average tenure than men, by about 2 years. The gap in median tenure is larger, at nearly 4 years.

In [37]:
roster_2021.groupby('gender')['tenure_days'].mean()/365

gender
F    10.349463
M    12.455165
Name: tenure_days, dtype: float64

In [38]:
roster_2021.groupby('gender')['tenure_days'].median()/365

gender
F     6.269863
M    10.126027
Name: tenure_days, dtype: float64

In [39]:
alt.Chart(roster_2021).mark_bar().encode(
    x=alt.X('tenure_days', bin=alt.Bin(maxbins=50)),
    y=alt.Y('count()',stack=None),
    color=alt.Color('gender')
)

The difference is not as big when looking at race alone.

In [40]:
roster_2021.groupby(['clean_race_ethnicity'])['tenure_days'].mean()/365

clean_race_ethnicity
asian/pacific islander    11.368493
black                     11.881397
hispanic                   7.448080
multiracial               11.121661
native american            8.875342
white                     12.149930
Name: tenure_days, dtype: float64

Black women, in 2021, had an average tenure of about 10 years. Hispanic officers had very short tenures compared to officers of other races and ethnicities. There is a single Native American woman, which produces a large outlier. 

In [42]:
roster_2021.groupby(
    ['clean_race_ethnicity', 'gender']
).agg({'tenure_days': lambda x: x.mean()/365}).reset_index().rename(
    columns={'tenure_days':'avg_tenure_years'}
).sort_values(
    by='avg_tenure_years', ascending=False
)

Unnamed: 0,clean_race_ethnicity,gender,avg_tenure_years
8,native american,F,19.726027
3,black,M,13.402666
11,white,M,12.469129
1,asian/pacific islander,M,12.050311
7,multiracial,M,11.995595
10,white,F,10.974607
2,black,F,9.830991
5,hispanic,M,8.449787
6,multiracial,F,7.693151
4,hispanic,F,6.127646


In [43]:
roster_2021.groupby(['clean_race_ethnicity','gender']).size()

clean_race_ethnicity    gender
asian/pacific islander  F            2
                        M           22
black                   F          138
                        M          186
hispanic                F           22
                        M           29
multiracial             F           13
                        M           51
native american         F            1
                        M            3
white                   F          324
                        M         1194
dtype: int64

## Race and Gender, broken down a different way 

Women make up a very small percentage of the discipline, but could that be because there are some men with so much discipline that it is throwing off the numbers? Let's aggregate the discipline by name and see if the share of individual women in the data is different.

In [44]:
discipline_by_officers = discipline_final.groupby(
    ['clean_roster_name', 'officer_gender', 'officer_race']
).size().reset_index().rename(columns={0:'count_occurances'})

discipline_by_officers.head()

Unnamed: 0,clean_roster_name,officer_gender,officer_race,count_occurances
0,a aldea lopez luis,M,hispanic,7
1,a almose thompson,M,black,4
2,a anderson carlos,M,black,2
3,a anderson david,M,white,13
4,a andrew kooshian,M,white,2


The percentage of men vs. women in the data isn't vastly different whether we look at officers disciplined or at all disciplinary actions. Men drop by about 6 percentage points, but women are still under-represented in the discipline data compared to the makeup of the department. 

In [45]:
discipline_by_officers.officer_gender.value_counts()/len(discipline_by_officers)

M    0.852383
F    0.147617
Name: officer_gender, dtype: float64

In [46]:
discipline_by_officers.officer_race.value_counts()/len(discipline_by_officers)

white                     0.796853
black                     0.141601
multiracial               0.027302
hispanic                  0.021749
asian/pacific islander    0.011106
native american           0.001388
Name: officer_race, dtype: float64

The discipline makeup doesn't change that much year over year, though the percentage of women disciplined increased in 2019 and 2020. 

In [47]:
discipline_by_officers_by_year = discipline_final.groupby(
    ['year','clean_roster_name', 'officer_gender', 'officer_race']
).size().reset_index().rename(columns={0:'count_occurances'})

discipline_gender_per_year = discipline_by_officers_by_year.groupby(
    ['year', 'officer_gender']
).agg(
    {'clean_roster_name': 'count'}
).groupby(level=0).apply(lambda x: 100 * x/float(x.sum()))

alt.Chart(discipline_gender_per_year.reset_index()).mark_line().encode(
    x='year:O',
    y='clean_roster_name',
    color='officer_gender'
)

In [48]:
disposition_by_race = discipline_final.groupby(
    ['officer_race', 'final_disposition']
).size().to_frame(name='amount').reset_index()

disposition_by_race[
    disposition_by_race.final_disposition.str.contains('SUSPENDED')
].groupby('officer_race').agg({'amount':'sum'})

Unnamed: 0_level_0,amount
officer_race,Unnamed: 1_level_1
asian/pacific islander,36
black,510
hispanic,58
multiracial,77
native american,1
white,1983


## What percentage of officers are disciplined, on average

We wanted to know what percentage of officers were disciplined, on average, per year. We took the average number of officers of each race for both the staff roster and the discipline data and found that a lower percentage of black officers have allegations lodged against them when compared to white officers. 

34% of white officers, on average, have an allegation made against them and enter the disciplinary process, while only 23% of black officers have an allegation made against them. It is once the disciplinary process begins that we observe black officers facing harsher punishment.

When we look only at severe discipline, we find that roughly 10% of both black and white officers are severely disciplined per year on average. We are defining severe discipline as suspension, demotion, and termination. 


In [50]:
# The groupbys below look complicated, but they are actually quite simple. 
# We start by getting the unique set of officers per year for each dataset, then we count the number of people of different races per year
# then finally we take the average number of officers of each race 
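
The three steps described in the comment above can be sketched on toy data as follows. The incidents and officer names are hypothetical, and `drop_duplicates` stands in for the first `groupby(...).size()` in the notebook's version.

```python
import pandas as pd

# Hypothetical incidents: one row per disciplinary action (names are made up).
incidents = pd.DataFrame({
    "year": [2010, 2010, 2010, 2011, 2011],
    "clean_roster_name": ["officer a", "officer a", "officer b", "officer a", "officer c"],
    "officer_race": ["white", "white", "black", "white", "white"],
})

# Step 1: collapse to one row per officer per year.
officers_per_year = incidents.drop_duplicates(["year", "clean_roster_name", "officer_race"])

# Step 2: count distinct officers of each race in each year.
counts = officers_per_year.groupby(["officer_race", "year"]).size()

# Step 3: average those yearly counts for each race.
avg_per_year = counts.groupby(level="officer_race").mean()

print(avg_per_year)
```

With this approach, a year in which a group has no disciplined officers is missing from the counts rather than recorded as zero, so averages for very small groups are taken over fewer years and can run high.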

In [51]:
avg_race_discipline = discipline_final.groupby(
    ['clean_roster_name', 'officer_race','year']
).size().to_frame('count').reset_index().groupby(
    ['officer_race','year']
).size().to_frame('count').reset_index().groupby(
    'officer_race'
).agg({'count':'mean'}).reset_index().rename(columns={'count':'avg_number_per_year_discipline'})

In [52]:
avg_race_roster = staff_roster_df.groupby(
    ['year',  'clean_race_ethnicity']
).agg({'name':'count'}).reset_index().groupby(
    ['clean_race_ethnicity']
).agg({'name':'mean'}).reset_index().rename(columns={'name':'avg_number_per_year_roster'})

In [53]:
num_suspended_per_year = discipline_final[
    discipline_final.final_disposition.str.contains('SUSPENDED|DEMOT|TERMIN')  # suspension, demotion, or termination
].groupby(
    ['year','clean_roster_name','officer_race']
).size().to_frame('count').reset_index().groupby(
    ['year', 'officer_race']
).size().to_frame('count').reset_index().groupby(
    ['officer_race']
).agg({'count':'mean'}).reset_index().rename(columns={'count':'avg_num_severe_discipline_per_year'})

num_suspended_per_year

Unnamed: 0,officer_race,avg_num_severe_discipline_per_year
0,asian/pacific islander,3.0
1,black,36.818182
2,hispanic,4.6
3,multiracial,5.636364
4,native american,1.0
5,white,151.727273


In [54]:
race_roster_discipline_comparison = avg_race_discipline.merge(
    avg_race_roster,
    how='left',
    left_on='officer_race',
    right_on='clean_race_ethnicity'
).merge(
    num_suspended_per_year,
    how='left',
    left_on = 'officer_race',
    right_on = 'officer_race'
)

race_roster_discipline_comparison['perc_disciplined'] = 100 *  race_roster_discipline_comparison['avg_number_per_year_discipline']/race_roster_discipline_comparison['avg_number_per_year_roster']
race_roster_discipline_comparison['perc_severe_discipline'] = 100 *  race_roster_discipline_comparison['avg_num_severe_discipline_per_year']/race_roster_discipline_comparison['avg_number_per_year_roster']

In [55]:
race_roster_discipline_comparison[
    ['officer_race',
     'avg_number_per_year_discipline',
     'avg_number_per_year_roster',
     'avg_num_severe_discipline_per_year',
     'perc_disciplined',
     'perc_severe_discipline']
]

Unnamed: 0,officer_race,avg_number_per_year_discipline,avg_number_per_year_roster,avg_num_severe_discipline_per_year,perc_disciplined,perc_severe_discipline
0,asian/pacific islander,7.818182,23.666667,3.0,33.034571,12.676056
1,black,84.363636,365.333333,36.818182,23.092236,10.077969
2,hispanic,11.0,35.25,4.6,31.205674,13.049645
3,multiracial,18.727273,64.25,5.636364,29.147506,8.77255
4,native american,1.166667,2.5,1.0,46.666667,40.0
5,white,512.272727,1520.833333,151.727273,33.683686,9.976588


About 4% of black women in the department, on average, face severe discipline per year. 

In [134]:
avg_per_year_discipline = discipline_final.groupby(
    ['clean_roster_name', 'officer_race','officer_gender','year']
).size().to_frame('count').reset_index().groupby(
    ['officer_race','officer_gender','year']
).size().to_frame('count').reset_index().groupby(
    ['officer_race','officer_gender']
).agg({'count':'mean'}).reset_index().rename(columns={'count':'avg_per_year_discipline'})

avg_per_year_discipline

Unnamed: 0,officer_race,officer_gender,avg_per_year_discipline
0,asian/pacific islander,F,1.333333
1,asian/pacific islander,M,7.454545
2,black,F,13.727273
3,black,M,70.636364
4,hispanic,F,2.75
5,hispanic,M,9.0
6,multiracial,F,2.1
7,multiracial,M,16.818182
8,native american,F,1.0
9,native american,M,1.0


In [135]:
avg_race_gender_roster = staff_roster_df.groupby(
    ['year',  'clean_race_ethnicity', 'gender']
).agg({'name':'count'}).reset_index().groupby(
    ['clean_race_ethnicity', 'gender']
).agg({'name':'mean'}).reset_index().rename(columns={'name':'avg_per_year_roster'})

avg_race_gender_roster

Unnamed: 0,clean_race_ethnicity,gender,avg_per_year_roster
0,asian/pacific islander,F,1.833333
1,asian/pacific islander,M,21.833333
2,black,F,165.916667
3,black,M,199.416667
4,hispanic,F,11.166667
5,hispanic,M,24.083333
6,multiracial,F,12.5
7,multiracial,M,51.75
8,native american,F,1.2
9,native american,M,1.5


In [136]:
num_severe_per_year = discipline_final[
    discipline_final.final_disposition.str.contains('SUSPENDED|DEMOT|TERMIN')  # suspension, demotion, or termination
].groupby(
    ['year','clean_roster_name','officer_race', 'officer_gender']
).size().to_frame('count').reset_index().groupby(
    ['year', 'officer_race', 'officer_gender']
).size().to_frame('count').reset_index().groupby(
    ['officer_race', 'officer_gender']
).agg({'count':'mean'}).reset_index().rename(columns={'count':'avg_num_severe_discipline_per_year'})

num_severe_per_year

Unnamed: 0,officer_race,officer_gender,avg_num_severe_discipline_per_year
0,asian/pacific islander,F,1.5
1,asian/pacific islander,M,2.7
2,black,F,6.727273
3,black,M,30.090909
4,hispanic,F,1.2
5,hispanic,M,4.444444
6,multiracial,F,1.25
7,multiracial,M,5.181818
8,native american,M,1.0
9,white,F,12.090909


In [137]:
roster_discipline = avg_per_year_discipline.merge(
    avg_race_gender_roster,
    how='left',
    left_on=['officer_race', 'officer_gender'],
    right_on=['clean_race_ethnicity', 'gender']
)[['officer_race','officer_gender','avg_per_year_discipline','avg_per_year_roster']].copy()

race_gender_roster_discipline = roster_discipline.merge(
    num_severe_per_year,
    how='left',
    on=['officer_race','officer_gender']
)
race_gender_roster_discipline

Unnamed: 0,officer_race,officer_gender,avg_per_year_discipline,avg_per_year_roster,avg_num_severe_discipline_per_year
0,asian/pacific islander,F,1.333333,1.833333,1.5
1,asian/pacific islander,M,7.454545,21.833333,2.7
2,black,F,13.727273,165.916667,6.727273
3,black,M,70.636364,199.416667,30.090909
4,hispanic,F,2.75,11.166667,1.2
5,hispanic,M,9.0,24.083333,4.444444
6,multiracial,F,2.1,12.5,1.25
7,multiracial,M,16.818182,51.75,5.181818
8,native american,F,1.0,1.2,
9,native american,M,1.0,1.5,1.0


In [138]:
race_gender_roster_discipline['perc_disciplined'] = 100 *  race_gender_roster_discipline['avg_per_year_discipline']/race_gender_roster_discipline['avg_per_year_roster']
race_gender_roster_discipline['perc_severe_discipline'] = 100 *  race_gender_roster_discipline['avg_num_severe_discipline_per_year']/race_gender_roster_discipline['avg_per_year_roster']

In [139]:
race_gender_roster_discipline.sort_values(by='perc_severe_discipline',ascending=False)

Unnamed: 0,officer_race,officer_gender,avg_per_year_discipline,avg_per_year_roster,avg_num_severe_discipline_per_year,perc_disciplined,perc_severe_discipline
0,asian/pacific islander,F,1.333333,1.833333,1.5,72.727273,81.818182
9,native american,M,1.0,1.5,1.0,66.666667,66.666667
5,hispanic,M,9.0,24.083333,4.444444,37.370242,18.454441
3,black,M,70.636364,199.416667,30.090909,35.421495,15.089465
1,asian/pacific islander,M,7.454545,21.833333,2.7,34.142956,12.366412
11,white,M,466.272727,1220.25,139.636364,38.211246,11.443259
4,hispanic,F,2.75,11.166667,1.2,24.626866,10.746269
7,multiracial,M,16.818182,51.75,5.181818,32.498902,10.013175
6,multiracial,F,2.1,12.5,1.25,16.8,10.0
2,black,F,13.727273,165.916667,6.727273,8.273595,4.054609
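
One caution when reading this table: the chained `groupby(...).agg({'count': 'mean'})` calls average only over the years in which a group appears at all, so years with no disciplined officers are dropped rather than counted as zero. That may explain why asian/pacific islander women show a higher severe-discipline percentage (81.8%) than overall discipline percentage (72.7%). Below is a minimal sketch of the difference in plain Python. The yearly counts, `counts_by_year`, and `all_years` are made up for illustration and are not values from the data.

```python
from statistics import mean

# Hypothetical counts of disciplined officers per year for one small group.
# Years with no incidents are simply absent, as they would be after a groupby.
counts_by_year = {2011: 1, 2015: 2, 2018: 1}
all_years = range(2010, 2021)  # assumed 11-year study window

# Averages only over the years that appear (what a plain groupby mean does).
mean_present_years = mean(counts_by_year.values())

# Treats missing years as zero before averaging.
mean_all_years = mean(counts_by_year.get(year, 0) for year in all_years)

print(round(mean_present_years, 3), round(mean_all_years, 3))
```

In pandas, one possible approach is to reshape with `unstack(fill_value=0)` or `reindex(..., fill_value=0)` so that missing years become zeros before averaging.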


## Percentage of each group's discipline that counts as severe 

We know the average percentage of each group that faces severe discipline per year at the department, but the disparities look starker when we analyze the outcomes of discipline once the process has started.

A small percentage of black women are disciplined at the department per year, but once the discipline process starts, they are severely disciplined at much higher rates than white officers with allegations made against them.
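
The severe-discipline filter is repeated throughout this notebook as three chained `str.contains` checks. As a sketch, the same rule can be written as a single regular expression. `SEVERE_PATTERN` and `is_severe` are names invented for this illustration, and the severe sample strings are made up rather than confirmed values from the data.

```python
import re

# Hypothetical helper: one pattern for the three substrings the notebook
# treats as severe (suspension, demotion, termination).
SEVERE_PATTERN = re.compile(r"SUSPENDED|DEMOT|TERMIN")


def is_severe(final_disposition):
    """Return True if the disposition text contains any severe keyword."""
    return bool(SEVERE_PATTERN.search(final_disposition))


# 'UNFOUNDED', 'WRITTEN' and 'EXONERATED' appear in the data; the last three
# strings are illustrative only.
samples = ["UNFOUNDED", "WRITTEN", "EXONERATED", "SUSPENDED", "DEMOTION", "TERMINATED"]
flags = {text: is_severe(text) for text in samples}
print(flags)
```

pandas `str.contains` treats its pattern as a regular expression by default, so `discipline_final.final_disposition.str.contains('SUSPENDED|DEMOT|TERMIN')` should select the same rows as the chained version.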

In [56]:
severe = discipline_final[
    (discipline_final.final_disposition.str.contains('SUSPENDED'))|(discipline_final.final_disposition.str.contains('DEMOT'))|(discipline_final.final_disposition.str.contains('TERMIN'))
].groupby(
    ['officer_race']
).size().to_frame('count_severe').reset_index()

all_discipline = discipline_final.groupby(
    ['officer_race']
).size().to_frame('count_all').reset_index()

discipline_severe_vs_all_race = severe.merge(
    all_discipline,
    how='left',
    on='officer_race'
)

discipline_severe_vs_all_race['perc_severe'] = 100 * discipline_severe_vs_all_race['count_severe'] / discipline_severe_vs_all_race['count_all']
discipline_severe_vs_all_race['perc_not_severe'] = 100 - discipline_severe_vs_all_race.perc_severe

discipline_severe_vs_all_race.sort_values(by='perc_severe', ascending=False)

Unnamed: 0,officer_race,count_severe,count_all,perc_severe,perc_not_severe
1,black,522,1710,30.526316,69.473684
2,hispanic,58,236,24.576271,75.423729
3,multiracial,78,374,20.855615,79.144385
0,asian/pacific islander,36,177,20.338983,79.661017
5,white,2006,9997,20.06602,79.93398
4,native american,1,8,12.5,87.5


In [57]:
severe = discipline_severe_vs_all_race[['officer_race','perc_severe']].copy()
severe['severity'] = 'severe'

not_severe = discipline_severe_vs_all_race[['officer_race','perc_not_severe']].copy()
not_severe['severity'] = 'not severe'

severity_data = pd.concat(
    [severe.rename(columns={'perc_severe': 'percent'}),
    not_severe.rename(columns={'perc_not_severe': 'percent'})]
)

alt.Chart(severity_data).mark_bar().encode(
    x='percent',
    y='officer_race',
    color='severity'
)

Below is the same analysis, this time broken down by both race and gender.

The most striking result: of the allegations against black women in the department over the last decade, 41% ended in severe discipline. 

In [58]:
severe = discipline_final[
    (discipline_final.final_disposition.str.contains('SUSPENDED'))|(discipline_final.final_disposition.str.contains('DEMOT'))|(discipline_final.final_disposition.str.contains('TERMIN'))
].groupby(
    ['officer_race', 'officer_gender']
).size().to_frame('count_severe').reset_index()

all_discipline = discipline_final.groupby(
    ['officer_race', 'officer_gender']
).size().to_frame('count_all').reset_index()

discipline_severe_vs_all_race_gender = severe.merge(
    all_discipline,
    how='left',
    on=['officer_race', 'officer_gender']
)

discipline_severe_vs_all_race_gender['perc_severe'] = 100 * discipline_severe_vs_all_race_gender['count_severe'] / discipline_severe_vs_all_race_gender['count_all']
discipline_severe_vs_all_race_gender['perc_not_severe'] = 100 - discipline_severe_vs_all_race_gender['perc_severe']
discipline_severe_vs_all_race_gender.sort_values(by='perc_severe', ascending=False)

Unnamed: 0,officer_race,officer_gender,count_severe,count_all,perc_severe,perc_not_severe
0,asian/pacific islander,F,3,4,75.0,25.0
2,black,F,96,235,40.851064,59.148936
3,black,M,426,1475,28.881356,71.118644
5,hispanic,M,50,203,24.630542,75.369458
4,hispanic,F,8,33,24.242424,75.757576
7,multiracial,M,72,338,21.301775,78.698225
9,white,F,161,789,20.405577,79.594423
10,white,M,1845,9208,20.036924,79.963076
1,asian/pacific islander,M,33,173,19.075145,80.924855
6,multiracial,F,6,36,16.666667,83.333333


In [59]:
severe = discipline_severe_vs_all_race_gender[['officer_race', 'officer_gender','perc_severe']].copy()
severe['severity'] = 'severe'

not_severe = discipline_severe_vs_all_race_gender[['officer_race','officer_gender', 'perc_not_severe']].copy()
not_severe['severity'] = 'not severe'

granular_severity_data = pd.concat(
    [severe.rename(columns={'perc_severe': 'percent'}),
    not_severe.rename(columns={'perc_not_severe': 'percent'})]
)

granular_severity_data['race_gender'] = granular_severity_data['officer_race'] + '_' + granular_severity_data['officer_gender']

alt.Chart(granular_severity_data).mark_bar().encode(
    x='percent',
    y='race_gender',
    color='severity'
)

## Average Length of Suspension
Here is the average length of suspension for officers facing suspension. Among groups with more than a few suspensions, hispanic men and black women faced the longest suspensions on average.
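
The gap between the mean and the median in the table below suggests that a few long suspensions pull the averages up. A minimal sketch with made-up suspension lengths (`suspension_days` is hypothetical) shows how one long suspension moves the mean but not the median:

```python
from statistics import mean, median

# Hypothetical suspension lengths in days: mostly short suspensions
# plus a single long one.
suspension_days = [1, 1, 1, 1, 2, 30]

print("mean:", mean(suspension_days))      # pulled up by the long suspension
print("median:", median(suspension_days))  # stays at the typical value
```

For that reason it may be safer to read the mean and median columns together, especially for groups with small counts.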

In [60]:
discipline_final[
    (discipline_final.final_disposition.str.contains('SUSPEN')) 
].groupby(
    ['officer_race', 'officer_gender']
).agg(
    {'final_number_of_days': ['mean','median','count']}
).reset_index().sort_values(by=('final_number_of_days','mean'), ascending=False)

Unnamed: 0_level_0,officer_race,officer_gender,final_number_of_days,final_number_of_days,final_number_of_days
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,mean,median,count
0,asian/pacific islander,F,5.0,5.0,3
5,hispanic,M,4.795918,1.0,49
2,black,F,4.11828,2.0,93
3,black,M,3.518428,1.0,407
1,asian/pacific islander,M,3.151515,1.0,33
10,white,M,3.145017,1.0,1786
8,native american,M,3.0,3.0,1
9,white,F,2.915033,1.0,153
7,multiracial,M,2.318841,2.0,69
4,hispanic,F,2.125,1.0,8


## Significance Testing

Let's test the significance of our findings.

If we assume that the proportion of severe discipline should be equal across our sub-populations, then we can use the total discipline to calculate the expected proportion of severe discipline for each group, compare it to the observed proportions, and run a chi-squared test for independence.
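
To make the expected-count step concrete, here is a sketch in plain Python that computes the expected counts and the chi-squared statistic by hand. The counts for the first ten groups are copied from the `sig_test_df` output shown below. The two white rows are truncated in that output, so they are derived here from the race-gender severity table above (161 of 789 and 1,845 of 9,208), which is an assumption about how the missing rows read.

```python
# (severe, non_severe) counts per race/gender group.
observed = {
    ("asian/pacific islander", "F"): (3, 1),
    ("asian/pacific islander", "M"): (33, 140),
    ("black", "F"): (96, 139),
    ("black", "M"): (426, 1049),
    ("hispanic", "F"): (8, 25),
    ("hispanic", "M"): (50, 153),
    ("multiracial", "F"): (6, 30),
    ("multiracial", "M"): (72, 266),
    ("native american", "F"): (0, 2),
    ("native american", "M"): (1, 5),
    ("white", "F"): (161, 628),    # assumed: 789 total minus 161 severe
    ("white", "M"): (1845, 7363),  # assumed: 9208 total minus 1845 severe
}

total_severe = sum(severe for severe, _ in observed.values())
total_all = sum(severe + non_severe for severe, non_severe in observed.values())
overall_rate = total_severe / total_all  # share of all allegations ending in severe discipline

chi_sq = 0.0
expected = {}
for group, (severe, non_severe) in observed.items():
    n = severe + non_severe
    # Under the null hypothesis every group shares the overall severe rate.
    exp_severe = n * overall_rate
    exp_non_severe = n - exp_severe
    expected[group] = (exp_severe, exp_non_severe)
    chi_sq += (severe - exp_severe) ** 2 / exp_severe
    chi_sq += (non_severe - exp_non_severe) ** 2 / exp_non_severe

# Degrees of freedom: (rows - 1) * (columns - 1)
dof = (len(observed) - 1) * (2 - 1)

print(f"overall severe rate: {overall_rate:.4f}")
print(f"expected severe, black women: {expected[('black', 'F')][0]:.2f}")
print(f"chi-squared: {chi_sq:.3f}, dof: {dof}")
```

Under these assumptions the statistic should agree with the `chi2_contingency` result reported further down.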

In [61]:
discipline_sig_testing = discipline_final.copy()

# Flag dispositions containing a suspension, demotion or termination as severe
discipline_sig_testing['severe_flag'] = discipline_sig_testing.apply(
    lambda x: 1 if ('SUSPENDED' in x.final_disposition) or ('DEMOT' in x.final_disposition) or ('TERMIN' in x.final_disposition) else 0,
    axis=1
)

discipline_sig_testing.head()

Unnamed: 0,control_number,final_disposition_date,final_disposition,final_number_of_days,dirty_full_name,clean_roster_name,officer_gender,officer_race,allegation,year,severe_flag
31,064410,2010-08-23,UNFOUNDED,,DANIEL BOWLING,bowling daniel j,M,white,OBSTRUCTION OF RIGHTS-OPA REFERRAL,2010,0
42,SV2020-00111,2020-03-23,WRITTEN,,KEVIN BREEDING,breeding kevin l,M,white,CARE OF GOVERNMENT PROPERTY,2020,0
44,011310,2010-03-26,EXONERATED,,AARON JONES,aaron jones l,M,black,INABILITY TO APPEAR - CALLED IN 12/21/2009,2010,0
45,12939,2010-03-29,EXONERATED,,AARON JONES,aaron jones l,M,black,INABILITY TO APPEAR - CALLED IN 12/2/2009,2010,0
46,045810,2010-08-05,WRITTEN,,AARON JONES,aaron jones l,M,black,ADHERENCE TO LAW - SPEEDING/RECKLESS DRIVING,2010,0


Roughly 22% of the allegations overall end in severe discipline.

In [62]:
discipline_sig_testing.severe_flag.sum() / len(discipline_sig_testing)

0.21604543273076307

In [63]:
sig_test_df = discipline_sig_testing.groupby(
    ['officer_race', 'officer_gender']
).agg(
    {
        'severe_flag':'sum',
        'final_disposition': 'count'
    }
).reset_index().copy().rename(columns={'severe_flag': 'severe_count'})

sig_test_df['non_severe_count'] = sig_test_df['final_disposition'] - sig_test_df['severe_count']
sig_test_df.set_index(['officer_race','officer_gender'])[['severe_count', 'non_severe_count']]

Unnamed: 0_level_0,Unnamed: 1_level_0,severe_count,non_severe_count
officer_race,officer_gender,Unnamed: 2_level_1,Unnamed: 3_level_1
asian/pacific islander,F,3,1
asian/pacific islander,M,33,140
black,F,96,139
black,M,426,1049
hispanic,F,8,25
hispanic,M,50,153
multiracial,F,6,30
multiracial,M,72,266
native american,F,0,2
native american,M,1,5



In [65]:
from scipy.stats import chi2_contingency
from scipy.stats import chi2
chi2_contingency(sig_test_df.set_index(['officer_race','officer_gender'])[['severe_count', 'non_severe_count']])

(121.33523514801928,
 9.765900275932065e-21,
 11,
 array([[8.64181731e-01, 3.13581827e+00],
        [3.73758599e+01, 1.35624140e+02],
        [5.07706767e+01, 1.84229323e+02],
        [3.18667013e+02, 1.15633299e+03],
        [7.12949928e+00, 2.58705007e+01],
        [4.38572228e+01, 1.59142777e+02],
        [7.77763558e+00, 2.82223644e+01],
        [7.30233563e+01, 2.64976644e+02],
        [4.32090865e-01, 1.56790913e+00],
        [1.29627260e+00, 4.70372740e+00],
        [1.70459846e+02, 6.18540154e+02],
        [1.98934634e+03, 7.21865366e+03]]))

In [66]:
chi, pval, dof, exp = chi2_contingency(sig_test_df.set_index(['officer_race','officer_gender'])[['severe_count', 'non_severe_count']])
print('p-value is: ', pval)
significance = 0.005
p = 1 - significance
# Critical value of the chi-squared distribution at the chosen significance level
critical_value = chi2.ppf(p, dof)
print('chi=%.6f, critical value=%.6f\n' % (chi, critical_value))

# Use %.3f so that 0.005 is not rounded to 0.01 when printed
if chi > critical_value:
    print("""At %.3f level of significance, we reject the null hypothesis and accept H1. They are not independent.""" % (significance))
else:
    print("""At %.3f level of significance, we fail to reject the null hypothesis. There is no evidence that they are not independent.""" % (significance))



p-value is:  9.765900275932065e-21
chi=121.335235, critical value=26.756849

At 0.005 level of significance, we reject the null hypothesis and accept H1. They are not independent.
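
A chi-squared test leans on the expected counts being reasonably large. A common rule of thumb asks that every expected count be at least 1 and that no more than 20% of them fall below 5. The sketch below checks the expected array returned by `chi2_contingency` above against that rule. The values are copied by hand and rounded, and `expected_counts` and `LOW_COUNT` are names made up for this example.

```python
# Expected (severe, non_severe) counts, rounded from the chi2_contingency
# output above, in the same row order as sig_test_df.
expected_counts = [
    (0.864, 3.136),
    (37.376, 135.624),
    (50.771, 184.229),
    (318.667, 1156.333),
    (7.129, 25.871),
    (43.857, 159.143),
    (7.778, 28.222),
    (73.023, 264.977),
    (0.432, 1.568),
    (1.296, 4.704),
    (170.460, 618.540),
    (1989.346, 7218.654),
]

LOW_COUNT = 5  # rule-of-thumb threshold, not a hard cutoff

cells = [value for row in expected_counts for value in row]
low_cells = [value for value in cells if value < LOW_COUNT]
share_low = len(low_cells) / len(cells)

print(f"{len(low_cells)} of {len(cells)} expected counts are below {LOW_COUNT} ({share_low:.0%})")
print("smallest expected count:", min(cells))
```

The low counts all come from the smallest groups, so it may be worth rerunning the test with those groups combined or dropped to see whether the conclusion holds.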


We have to check the assumptions we are making about the patterns we are seeing in the data. 

We know that black women in the department account for a small number of the allegations, but about 41% of the allegations against them end in severe discipline. This could be because they are treated more harshly in the discipline process. But it could also be that they are aware they might be under more scrutiny, so they behave better and only have allegations lodged against them for the most egregious actions.

We need to do a bit more digging on the allegations. Are black women punished at a higher rate for the same allegations?

The rate of severe discipline levied for allegations of misconduct stays fairly consistent over the years, between roughly 18% and 24% in the years shown below.

In [67]:
severe_by_year = discipline_sig_testing.groupby('year').agg({'severe_flag': 'sum', 'final_disposition':'count'})

severe_by_year['%'] = 100 * severe_by_year.severe_flag/severe_by_year.final_disposition

severe_by_year

Unnamed: 0_level_0,severe_flag,final_disposition,%
year,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
2010,290,1628,17.813268
2011,317,1390,22.805755
2012,333,1401,23.768737
2013,318,1359,23.399558
2014,297,1215,24.444444
2015,289,1243,23.250201
2016,214,1091,19.615032
2017,189,917,20.610687
2018,182,826,22.033898
2019,161,813,19.803198


We looked at a subset of the data limited to the 10 most common allegations made against officers, with some lenient 'contains' matching to catch different phrasing. The allegation data was too dirty to clean in a reasonable timeframe, so we tested with both strict and loose allegation matching. There are three masks, and it's easy to switch between them.
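
To illustrate the difference between loose and strict matching, here is a small sketch in plain Python. The allegation strings are taken from the sample rows shown earlier. `loose_match` and `strict_match` are hypothetical helpers that mirror what `str.contains` and `isin` do in the masks below.

```python
# A few labels from the list of common allegations.
COMMON_ALLEGATIONS = [
    "CARE OF GOVERNMENT PROPERTY",
    "OBSTRUCTION OF RIGHTS",
    "ADHERENCE TO LAW",
]


def loose_match(allegation):
    """True if any common label appears anywhere in the text (like str.contains)."""
    return any(label in allegation for label in COMMON_ALLEGATIONS)


def strict_match(allegation):
    """True only if the text equals a common label exactly (like isin)."""
    return allegation in COMMON_ALLEGATIONS


# Allegation text as it appears in sample rows of the data.
examples = [
    "CARE OF GOVERNMENT PROPERTY",
    "OBSTRUCTION OF RIGHTS-OPA REFERRAL",
    "ADHERENCE TO LAW - SPEEDING/RECKLESS DRIVING",
    "INABILITY TO APPEAR - CALLED IN 12/21/2009",
]

for text in examples:
    print(f"{text!r}: loose={loose_match(text)}, strict={strict_match(text)}")
```

Strict matching keeps the compared allegations uniform but drops any row with extra detail appended, while loose matching keeps those rows at the cost of mixing somewhat different allegations.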

In [69]:
mask1 = (discipline_sig_testing.allegation.str.contains('CARE OF GOVERNMENT PROPERTY')) |\
(discipline_sig_testing.allegation.str.contains('COURTESY')) |\
(discipline_sig_testing.allegation.str.contains('PERFORMANCE OF DUTIES')) |\
(discipline_sig_testing.allegation.str.contains('OBSTRUCTION OF RIGHTS')) |\
(discipline_sig_testing.allegation.str.contains('ABUSIVE')) |\
(discipline_sig_testing.allegation.str.contains('ADHERENCE TO LAW')) |\
(discipline_sig_testing.allegation.str.contains('CONDUCT UNBECOMING')) |\
(discipline_sig_testing.allegation.str.contains('BIASED BASED')) |\
(discipline_sig_testing.allegation.str.contains('OPA'))

# Similar to mask1, but checks for exact equality with the wording of the following allegations
mask2 = discipline_sig_testing.allegation.isin(
    ['CARE OF GOVERNMENT PROPERTY',
    'COURTESY',
    'DEFICIENT OR INEFFICIENT PERFORMANCE OF DUTIES',
    'OBSTRUCTION OF RIGHTS',
    'DEFICIENT PERFORMANCE OF DUTIES',
    'ABUSIVE TREATMENT',
    'ADHERENCE TO LAW',
    'CONDUCT UNBECOMING',
    'BIASED BASED POLICING',
    'OPA COMPLAINT',]
)

# mask3 looks at a single allegation
mask3 = discipline_sig_testing.allegation == 'DEFICIENT OR INEFFICIENT PERFORMANCE OF DUTIES'

subset = discipline_sig_testing[
    mask2
].copy()

print(f'length of subset: {len(subset)}\nlength of total: {len(discipline_sig_testing)}')

length of subset: 3914
length of total: 12502


A smaller share of the subset's allegations end in severe discipline than in the whole dataset (about 15.5% versus 21.6%).

In [70]:
subset.severe_flag.sum()/len(subset)

0.15508431272355647

When looking at the same set of common allegations, black women are disciplined severely at a higher rate than other groups. 

In [71]:
officer_race_groupby = subset.groupby(
    ['officer_race','officer_gender']
).agg(
    {'severe_flag':'sum', 'final_disposition': 'count'}
)

officer_race_groupby['%_severe'] = officer_race_groupby.severe_flag/officer_race_groupby.final_disposition * 100
officer_race_groupby['non_severe'] = officer_race_groupby.final_disposition - officer_race_groupby.severe_flag
officer_race_groupby.sort_values(by='%_severe', ascending=False)

Unnamed: 0_level_0,Unnamed: 1_level_0,severe_flag,final_disposition,%_severe,non_severe
officer_race,officer_gender,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
black,F,20,57,35.087719,37
multiracial,F,4,14,28.571429,10
black,M,87,444,19.594595,357
hispanic,F,2,11,18.181818,9
white,F,46,256,17.96875,210
multiracial,M,16,100,16.0,84
white,M,420,2924,14.363885,2504
hispanic,M,7,56,12.5,49
asian/pacific islander,M,5,49,10.204082,44
native american,F,0,1,0.0,1


There is a statistically significant association between race/gender and the severity of the discipline, even though the allegations are a smaller, more consistent selection.

In [72]:
chi, pval, dof, exp = chi2_contingency(officer_race_groupby[['severe_flag', 'non_severe']])
print('p-value is: ', pval)
significance = 0.005
p = 1 - significance
# Critical value of the chi-squared distribution at the chosen significance level
critical_value = chi2.ppf(p, dof)
print('chi=%.6f, critical value=%.6f\n' % (chi, critical_value))

# Use %.3f so that 0.005 is not rounded to 0.01 when printed
if chi > critical_value:
    print("""At %.3f level of significance, we reject the null hypothesis and accept H1. They are not independent.""" % (significance))
else:
    print("""At %.3f level of significance, we fail to reject the null hypothesis. There is no evidence that they are not independent.""" % (significance))



p-value is:  0.0007562231234999273
chi=30.330514, critical value=25.188180

At 0.005 level of significance, we reject the null hypothesis and accept H1. They are not independent.


Given the same allegations, black women are still severely disciplined at a higher rate. The table below shows each group's share of the subset. Black women make up a similar share of the subset as they do of the total discipline data (1.46% of the subset versus 1.88% of all discipline).

In [73]:
officer_race_groupby['final_disposition'] / officer_race_groupby['final_disposition'].sum()

officer_race            officer_gender
asian/pacific islander  M                 0.012519
black                   F                 0.014563
                        M                 0.113439
hispanic                F                 0.002810
                        M                 0.014308
multiracial             F                 0.003577
                        M                 0.025549
native american         F                 0.000255
                        M                 0.000511
white                   F                 0.065406
                        M                 0.747062
Name: final_disposition, dtype: float64