# Fatal Police Encounters in the US
****

## Topic 

This project examines the lives lost to police brutality in the US. The issue is not only
a matter of public health and excessive force; it also involves racism and injustice.
People of color, especially Black people, face a higher risk of being killed by the police
than White people. The rate of deaths caused by police in the US is alarmingly high, and
such deaths occur much more frequently than in other countries.

This can no longer be overlooked; it must be stopped. For too long, Black people in the US
have been treated unfairly by the criminal justice system, and police brutality has played
a significant role in this.

> ### Why this topic?
People, including myself, need to be more aware of and better educated about this social issue. Black lives matter, and society won’t change without effort.


## Research Question

What is the year-by-year trend in deaths from police violence from 2015 to 2020?

In addition, were there any changes after the George Floyd protests that began on
May 26, 2020?

> ### How is this relevant to the topic? 
It will provide data and insights on how racism and injustice appear in police violence.
The yearly trend will show the extent to which police in the US use excessive force
against people of each race.

## Data Sources
The primary data source for the analysis is [The Washington Post](https://raw.githubusercontent.com/washingtonpost/data-police-shootings/master/fatal-police-shootings-data.csv).
It is a `.csv` file that records victims killed by the police from the beginning of 2015
through June 2020.

It provides the ID, name of the victim, manner of death, whether they
were armed or not, age, gender, race, and more.
****


### Install and import

In [14]:
!pip install wordcloud



In [50]:
import os
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from wordcloud import WordCloud
from datetime import datetime


In [51]:
pwd

'/Users/pamelalee/Desktop/police-brutality/data'

In [52]:
ls

FATAL ENC.xlsx    MPV.xlsx          fatal-police.csv


In [53]:
victims = pd.read_csv("fatal-police.csv")
victims

Unnamed: 0,id,name,date,manner_of_death,armed,age,gender,race,city,state,signs_of_mental_illness,threat_level,flee,body_camera
0,3,Tim Elliot,2015-01-02,shot,gun,53.0,M,A,Shelton,WA,True,attack,Not fleeing,False
1,4,Lewis Lee Lembke,2015-01-02,shot,gun,47.0,M,W,Aloha,OR,False,attack,Not fleeing,False
2,5,John Paul Quintero,2015-01-03,shot and Tasered,unarmed,23.0,M,H,Wichita,KS,False,other,Not fleeing,False
3,8,Matthew Hoffman,2015-01-04,shot,toy weapon,32.0,M,W,San Francisco,CA,True,attack,Not fleeing,False
4,9,Michael Rodriguez,2015-01-04,shot,nail gun,39.0,M,H,Evans,CO,False,attack,Not fleeing,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5424,5942,TK TK,2020-06-24,shot,gun,,F,A,Lake Forest,CA,False,other,Not fleeing,False
5425,5946,TK TK,2020-06-25,shot,knife,,F,,Plano,TX,False,attack,Not fleeing,False
5426,5948,Robert D'Lon Harris,2020-06-25,shot,undetermined,34.0,M,B,Vinita,OK,False,undetermined,Not fleeing,False
5427,5949,Martin Humberto Sanchez Fregoso,2020-06-25,shot,gun,37.0,M,H,Atlanta,GA,False,attack,Other,False
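
As an alternative to converting the `date` column after loading, `pd.read_csv` can parse it at read time. The sketch below is only illustrative: the inline CSV text reuses the first three rows shown above (with a reduced set of columns) and stands in for `fatal-police.csv`, and `load_victims` is a hypothetical helper name, not part of the original notebook.

```python
import io

import pandas as pd


def load_victims(source):
    """Read a shootings CSV and parse `date` as a datetime column (hypothetical helper)."""
    return pd.read_csv(source, parse_dates=["date"])


# Stand-in for fatal-police.csv: the first three rows from the output above,
# trimmed to a few columns for brevity.
sample_csv = io.StringIO(
    "id,name,date,manner_of_death,armed,age,gender,race\n"
    "3,Tim Elliot,2015-01-02,shot,gun,53.0,M,A\n"
    "4,Lewis Lee Lembke,2015-01-02,shot,gun,47.0,M,W\n"
    "5,John Paul Quintero,2015-01-03,shot and Tasered,unarmed,23.0,M,H\n"
)

sample = load_victims(sample_csv)
print(sample.dtypes)
```

Passing a file path such as `"fatal-police.csv"` instead of the `StringIO` object should work the same way.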


In [54]:
victims.describe()

Unnamed: 0,id,age
count,5429.0,5189.0
mean,3018.568429,37.106379
std,1701.408851,13.114945
min,3.0,6.0
25%,1549.0,27.0
50%,3016.0,35.0
75%,4498.0,46.0
max,5949.0,91.0
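
The `count` row shows 5,189 ages against 5,429 ids, so 240 records have no age. A minimal sketch of counting missing values per column follows. The small frame reuses a few values from the rows shown earlier, with `None` marking the blank cells, so its counts are illustrative and not the real totals.

```python
import pandas as pd

# Illustrative frame: ids, ages, and races taken from rows shown earlier,
# with None where the table cell was blank.
sample = pd.DataFrame(
    {
        "id": [3, 4, 5942, 5946],
        "age": [53.0, 47.0, None, None],
        "race": ["A", "W", "A", None],
    }
)

# Number of missing entries in each column
missing_per_column = sample.isna().sum()
print(missing_per_column)

# Missing ages implied by the describe() counts above
missing_ages = 5429 - 5189
print(missing_ages)
```

Running `victims.isna().sum()` on the full frame would give the same kind of summary for every column.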


In [55]:
victims["date"]

0       2015-01-02
1       2015-01-02
2       2015-01-03
3       2015-01-04
4       2015-01-04
           ...    
5424    2020-06-24
5425    2020-06-25
5426    2020-06-25
5427    2020-06-25
5428    2020-06-26
Name: date, Length: 5429, dtype: object

In [57]:
# Parse each 'YYYY-MM-DD' string into a date object
victims["date"] = [datetime.strptime(x, '%Y-%m-%d').date() for x in victims["date"]]

In [58]:
victims['date'] = pd.to_datetime(victims['date'])
victims = victims.set_index(victims['date'])
victims = victims.sort_index()

In [59]:
victims['date']


date
2015-01-02   2015-01-02
2015-01-02   2015-01-02
2015-01-03   2015-01-03
2015-01-04   2015-01-04
2015-01-04   2015-01-04
                ...    
2020-06-24   2020-06-24
2020-06-25   2020-06-25
2020-06-25   2020-06-25
2020-06-25   2020-06-25
2020-06-26   2020-06-26
Name: date, Length: 5429, dtype: datetime64[ns]
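
With a datetime index in place, the two parts of the research question can be sketched as a count per calendar year and a split at the start of the protests. The frame below holds only the ten dates visible in the output above, so the numbers are illustrative; `protest_start` is an assumed name for the May 26, 2020 cutoff.

```python
import pandas as pd

# The ten dates visible in the output above (not the full dataset)
dates = pd.to_datetime(
    [
        "2015-01-02", "2015-01-02", "2015-01-03", "2015-01-04", "2015-01-04",
        "2020-06-24", "2020-06-25", "2020-06-25", "2020-06-25", "2020-06-26",
    ]
)
sample = pd.DataFrame({"date": dates}).set_index("date", drop=False).sort_index()

# Deaths per calendar year
per_year = sample.groupby(sample.index.year).size()
print(per_year)

# Split the records at the start of the George Floyd protests
protest_start = pd.Timestamp("2020-05-26")
before = sample[sample.index < protest_start]
after = sample[sample.index >= protest_start]
print(len(before), len(after))
```

Because the periods before and after the cutoff differ greatly in length, a per-day or per-month rate is likely a fairer comparison than raw totals.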

In [62]:
split_date = pd.date_range(2016,12,20)
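
Note that `pd.date_range` takes `start`, `end`, and `periods` as its first three positional arguments, so integer arguments such as `pd.date_range(2016, 12, 20)` are read as 20 evenly spaced timestamps between 2016 and 12 nanoseconds after the Unix epoch, not as a calendar date. A minimal sketch of two alternatives, assuming the intent was either a single split date of December 20, 2016 or a daily range ending on that day (the range start of December 1 is an arbitrary choice for illustration):

```python
import pandas as pd

# A single calendar date: year, month, day
split_date = pd.Timestamp(year=2016, month=12, day=20)
print(split_date)

# A daily range between two calendar dates (start date chosen for illustration)
daily = pd.date_range(start="2016-12-01", end="2016-12-20", freq="D")
print(len(daily))
```

A `pd.Timestamp` can be compared directly against a `DatetimeIndex`, for example `victims[victims.index < split_date]`.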

In [63]:
split_date

DatetimeIndex(['1970-01-01 00:00:00.000002016',
               '1970-01-01 00:00:00.000001911',
               '1970-01-01 00:00:00.000001806',
               '1970-01-01 00:00:00.000001700',
               '1970-01-01 00:00:00.000001595',
               '1970-01-01 00:00:00.000001489',
               '1970-01-01 00:00:00.000001384',
               '1970-01-01 00:00:00.000001278',
               '1970-01-01 00:00:00.000001173',
               '1970-01-01 00:00:00.000001067',
               '1970-01-01 00:00:00.000000962',
               '1970-01-01 00:00:00.000000856',
               '1970-01-01 00:00:00.000000751',
               '1970-01-01 00:00:00.000000645',
               '1970-01-01 00:00:00.000000540',
               '1970-01-01 00:00:00.000000434',
               '1970-01-01 00:00:00.000000329',
               '1970-01-01 00:00:00.000000223',
               '1970-01-01 00:00:00.000000118',
               '1970-01-01 00:00:00.000000012'],
              dtype='datetime64[ns]', f