<h2>About this survey</h2>

We will try to find the reasons behind police shootings, using plots that show the main factors that lead police to shoot citizens. The dataset is from Kaggle: <a href="https://www.kaggle.com/mrmorj/data-police-shootings">https://www.kaggle.com/mrmorj/data-police-shootings</a>

<h2>About the dataset</h2>

The FBI and the Centers for Disease Control and Prevention log fatal shootings by police, but officials acknowledge that their data is incomplete. In 2015, The Post documented more than twice as many fatal shootings by police as had been recorded by the FBI. Last year, the FBI announced plans to overhaul how it tracks fatal police encounters. The dataset contains 5416 records from the period 2015-2020.

In [None]:
#importing libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
from geopy.exc import GeocoderTimedOut 
from geopy.geocoders import Nominatim

In [None]:
#importing the dataset
df = pd.read_csv(r'../input/data-police-shootings/fatal-police-shootings-data.csv')
df.drop('name', axis=1,inplace=True)
df

<h2>Victims over time</h2>

In [None]:
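# Count the deaths per month and per year
# Note: the loop compares each record with the previous one, so it assumes the rows are sorted by date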
x_months = [0]
y_months = [0]
x_year = [0]
y_year = [0]
temp = 0
for i in df.date: 
    #finding the deaths per month
    temp = i[:7]
    if x_months[-1] == temp:
        y_months[-1] += 1
    else:
        x_months.append(temp)
        y_months.append(1)
    #finding the deaths per year
    temp = i[:4]
    if x_year[-1] == temp:
        y_year[-1] += 1
    else:
        x_year.append(temp)
        y_year.append(1)
        
x_months.pop(0)
y_months.pop(0)
x_year.pop(0)
y_year.pop(0)
f, ax = plt.subplots(1,2,figsize=(20,8))
ax[0].plot(x_months,y_months)
ax[0].axhline(y=np.mean(y_months), color = 'orange', label = 'Average')
ax[0].legend(fontsize=12)
ax[0].set_ylabel('Number of Deaths', fontsize=15)
ax[0].set_xlabel('Months', fontsize=15)
ax[0].set_title('Number of Police Shooting per Month', fontsize=20)
ax[0].xaxis.set_major_locator(ticker.MaxNLocator(10))
ax[1].bar(x_year,y_year)
ax[1].set_xlabel('Years', fontsize=15)
ax[1].set_ylabel('Number of Deaths', fontsize=15)
ax[1].set_title('Number of Police Shooting per Year', fontsize=20)
# np.mean(y_year) averages over every year in the data, so the partial year 2020 is included
print('Average Deaths per month: {}\nStandard deviation Deaths per month: {}\nAverage Deaths per Year (including the partial year 2020): {}\nStandard deviation Deaths per year: {}'.format(np.mean(y_months),np.std(y_months),np.mean(y_year),np.std(y_year)))

<h3>Plotting Observations</h3>

The plots show that police in the United States shoot and kill around 75-90 people every month and about 902 per year on average. Shootings therefore look like a steady pattern rather than isolated random events. Tragically, we can even predict that 2020, which already has 481 recorded cases, will probably see 400 or more additional police shootings.
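
The counting loop above compares each record with the previous one, so it only gives correct totals when the rows are sorted by date. As a minimal sketch, the same monthly and yearly counts can be obtained with pandas period grouping, which does not depend on row order. The helper name `counts_by_period` and the sample dates are illustrative and not part of the original notebook.

```python
import pandas as pd

def counts_by_period(dates, freq):
    """Count records per calendar period ('M' = month, 'Y' = year)."""
    periods = pd.to_datetime(dates).dt.to_period(freq)
    # sort_index() puts the periods in chronological order
    return periods.value_counts().sort_index()

# Made-up dates, deliberately not sorted
sample = pd.Series(['2015-01-02', '2015-02-10', '2015-01-30', '2016-03-01'])

monthly = counts_by_period(sample, 'M')
yearly = counts_by_period(sample, 'Y')
print(monthly)
print(yearly)
```

With the real data, `counts_by_period(df.date, 'M')` would play the role of `x_months` and `y_months` (index and values respectively).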

<h2>Gender, Race and Age Factors</h2>

In [None]:
# Let's look at the gender, race and age factors
males = len(df.loc[df.gender == 'M'])
wom = len(df.loc[df.gender == 'F'])
blacks = len(df.loc[df.race == 'B'])
white = len(df.loc[df.race == 'W'])
ages = df.age
f, ax = plt.subplots(1,2,figsize=(15,7))
ax[0].set_title('Gender & Race Factor')
ax[0].bar([1,2], [males,wom])
ax[0].bar([4,5], [white,blacks])
ax[0].set_xticks([1,2,4,5])
ax[0].set_xticklabels(['males','females','white', 'blacks'], fontsize = 12)
ax[1].hist(ages,bins=np.arange(0,100,5))
ax[1].set_title('Police Shooting by age')
print('Number of males: {}\nNumber of females: {}\nNumber of White: {}\nNumber of Blacks: {}\nAverage age: {}\nStandard deviation of age: {}'.format(males,wom,white, blacks, np.mean(ages), np.var(ages)**(1/2)))

<h3>Plotting Observations</h3>

First, far more men than women are shot by police. One possible reading is that men are more often involved in violent or unlawful encounters. In the race comparison, more white people than Black people are killed in absolute numbers. This could mean that police shootings are unrelated to racist beliefs, or that white people are more often violent, although raw counts that are not adjusted for each group's share of the population cannot settle the question. Finally, the age histogram shows the number of victims rising steeply from age 18 to a peak around 37 and then falling steeply up to age 90. Overall, we could say that police mostly shoot middle-aged people.
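
Raw counts like the ones compared above depend on how large each group is in the population. A minimal sketch of a per-capita comparison follows. The group names and all numbers are invented placeholders, not real statistics, and `rate_per_million` is a hypothetical helper rather than anything from the notebook.

```python
def rate_per_million(counts, populations):
    """Return victims per one million residents for each group."""
    return {group: counts[group] * 1_000_000 / populations[group]
            for group in counts}

# Hypothetical numbers for two made-up groups, NOT real statistics
counts = {'group_a': 300, 'group_b': 100}
populations = {'group_a': 6_000_000, 'group_b': 1_000_000}

rates = rate_per_million(counts, populations)
for group, rate in rates.items():
    print(f'{group}: {rate:.1f} per million')
```

In this invented example the group with fewer victims in absolute terms has the higher rate, which is why a per-capita view can change the interpretation.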


<h2>Crime Scene</h2>

In [None]:
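# Check the circumstances at the crime scenes
# Note: the two lines below rely on the order in which unique() returns the
# manner_of_death values, matching the 'shot' and 'taser & shot' labels used later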
shot = len(df.loc[df.manner_of_death == df.manner_of_death.unique()[0]])
taser = len(df.loc[df.manner_of_death == df.manner_of_death.unique()[1]])
gun = len(df[df.armed == 'gun'])
knife = len(df[df.armed == 'knife'])
unarmed = len(df[df.armed == 'unarmed'])
not_fleeing = len(df[df.flee == 'Not fleeing'])
car = len(df[df.flee == 'Car'])
foot = len(df[df.flee == 'Foot'])
#plotting
f, ax = plt.subplots(1,1,figsize=(15,8))
ax.bar([1,2], [shot,taser], label = 'Manner of Death')
ax.bar([4,5,6], [gun,knife,unarmed], label = 'armed')
ax.bar([8,9,10], [not_fleeing,car,foot], label = 'flee')
ax.set_xticks([1,2,4,5,6,8,9,10])
ax.set_xticklabels(['shot','taser & shot','gun','knife','unarmed','not_fleeing','car','foot'])
ax.legend(fontsize=12)
print('Manner of Death: Shot({}) ,taser & shot({})\nArmed: gun({}), knife({}), unarmed({})\nFlee: not fleeing({}), car({}), foot({})'.format(shot,taser,gun,knife,unarmed,not_fleeing,car,foot))

<h3>Plotting Observations</h3>

We notice that at crime scenes police mainly shoot the offenders (rather than using a taser and then shooting), and that most offenders are armed with a gun or a knife. So we could assume that police may have acted in self-defence. On the other hand, a significant number of victims were unarmed, or were trying to flee by car or on foot, when they were shot dead by police, which could be considered an abuse of power.
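
Whether a count is "significant" is easier to judge as a share of all cases. The following is a small sketch on made-up rows, assuming only the `armed` column name used in the notebook; the sample values are invented.

```python
import pandas as pd

# Tiny made-up sample; 'armed' mirrors the column used in the notebook
sample = pd.DataFrame({'armed': ['gun', 'gun', 'knife', 'unarmed', 'gun',
                                 'unarmed', 'knife', 'gun']})

# normalize=True returns fractions that sum to 1 instead of raw counts
shares = sample['armed'].value_counts(normalize=True) * 100
print(shares.round(1))
```

The same call on `df.armed` or `df.flee` would give the percentage behind each bar in the plot.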

<h2>Deeper Gender Relationship with Crimes</h2>

In [None]:
# Deeper relationships with gender
# Note: NaN != 'Not fleeing' evaluates to True, so rows with a missing flee value
# are excluded explicitly to avoid counting them as fleeing
u_males = len(df.loc[(df.armed == 'unarmed') & (df.gender == 'M')])
threat_males = len(df.loc[(df.threat_level == 'attack') & (df.gender == 'M')])
flee_males = len(df.loc[df.flee.notna() & (df.flee != 'Not fleeing') & (df.gender == 'M')])
u_females = len(df.loc[(df.armed == 'unarmed') & (df.gender == 'F')])
threat_females = len(df.loc[(df.threat_level == 'attack') & (df.gender == 'F')])
flee_females = len(df.loc[df.flee.notna() & (df.flee != 'Not fleeing') & (df.gender == 'F')])
pltlist = [u_males, u_females,threat_males,threat_females,flee_males,flee_females]
f, ax = plt.subplots(2,1,figsize=(15,16))
for i in range(3):
    ax[0].bar(3*i+1, pltlist[2*i+1], color = ['red'], label='female')
    ax[0].bar(3*i, pltlist[2*i], color = ['blue'],label='male')
ax[0].set_xticks([0.5,3.5,6.5]) 
ax[0].set_xticklabels(['unarmed', 'attack', 'flee'])
# The female bar is drawn first in the loop, so it must come first in the legend
ax[0].legend(['females','males'])
ax[0].set_title('Compare Actual Number of male and female victims in different cases', fontsize=15)
for i in range(3):
    ax[1].bar(3*i+1, pltlist[2*i+1]/len(df.loc[(df.gender == 'F')]) *100, color = ['red'], label='female')
    ax[1].bar(3*i, pltlist[2*i]/len(df.loc[(df.gender == 'M')]) *100, color = ['blue'],label='male')
ax[1].set_xticks([0.5,3.5,6.5]) 
ax[1].set_xticklabels(['unarmed', 'attack', 'flee'])
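# The female bar is drawn first in the loop, so it must come first in the legend
ax[1].legend(['females','males'])
ax[1].set_title('Compare Percentages of male and female victims in different cases', fontsize=15)
ax[1].set_ylabel('Percentage of victims of that gender (%)')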

<h3>Plotting Observations</h3>

We can notice that more men than women are shot while attacking or trying to flee. The percentage comparison shows that a relatively larger share of women than men are unarmed, but overall both women and men are mostly armed when they are shot.
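
The per-gender percentages above can also be produced in one step with a cross-tabulation. This is only a sketch on invented rows; the `gender` and `armed` column names come from the notebook, while the `status` label is an assumption made for the example.

```python
import pandas as pd

# Made-up sample rows
sample = pd.DataFrame({
    'gender': ['M', 'M', 'M', 'M', 'F', 'F'],
    'armed':  ['gun', 'unarmed', 'knife', 'gun', 'unarmed', 'gun'],
})

# Label each row as 'unarmed' or 'armed'
# (a missing value would be labelled 'armed' here, which may not be desired)
status = (sample['armed'].eq('unarmed')
          .map({True: 'unarmed', False: 'armed'})
          .rename('status'))

# normalize='index' makes each row sum to 1, i.e. shares within each gender
table = pd.crosstab(sample['gender'], status, normalize='index') * 100
print(table.round(1))
```

Each row of `table` then reads as "of all victims of this gender, what percentage was armed or unarmed".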

<h2>Area Factor</h2>

<h3>Tableau Plots for Cities</h3>
<img src="https://storage.googleapis.com/kagglesdsdata/datasets%2F783953%2F1347330%2Fcities.png?GoogleAccessId=databundle-worker-v2@kaggle-161607.iam.gserviceaccount.com&Expires=1595196820&Signature=Gx8gs3DBHL%2BrYWk%2FazdQOgfOap%2FsXF%2FDdsUCQNPRgKFVMr%2B6z3QmbNe%2FOq%2Bv0Ic2ZaO19dgBqbDU%2FPXEcVWTGSekbI1ORX7sn3Z8bUhXrrSEpxI97YbFQvYtS47kCj6gbCxdNBhsmnFdd%2B2M8DwdrRcr05tKT70kx%2FVPAI5dHPAf%2FfKBBXeEFMP2aYdB3voyyAjxusXZuzq7W0G6PPCV8Hf2AvlSmgLJJyhWKNmKKJ1JjWLQN83KiLBGeT2Yv%2FIWt%2Fpq509qD4pIH8F%2BJTWkdWrlUQzsu5gJveyDrSOY1nA%2BLtVg97V1Woo%2FBqv9fxQQizOBIV8JKr3kuCxpokT95w%3D%3D" alt="Cities" width="300" height="100">

<h3>Plotting Observation</h3>

We can notice that the cases are spread across the whole US, so we could assume that the city factor is weak for police shootings.
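
One way to check how evenly the cases are spread is to measure what share of all records falls in the most frequent locations. The sketch below uses invented state codes. The column name `state` and the helper `top_share` are assumptions for illustration and do not appear in the notebook.

```python
import pandas as pd

def top_share(locations, n):
    """Percentage of all records that fall in the n most frequent locations."""
    counts = locations.value_counts()
    return counts.head(n).sum() / counts.sum() * 100

# Made-up sample of state codes; the column name 'state' is an assumption
sample = pd.DataFrame({'state': ['CA', 'CA', 'TX', 'CA', 'FL', 'TX', 'NY', 'CA']})

share = top_share(sample['state'], 2)
print(f'Top 2 locations hold {share:.1f}% of the records')
```

A low share for the top few locations would support the reading that the cases are widely spread, while a high share would point to concentration.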

<h2>Underage Victims</h2>

In [None]:
youngs = df.loc[(df.age<18)]
num_of_youngs = len(youngs)
lowest_age = min(youngs.age.unique())
unarmed = len(youngs.loc[youngs.armed == 'unarmed'])
women = len(youngs.loc[youngs.gender == 'F'])
print('Number of under aged children: {}\nLowest age: {}'.format(num_of_youngs, lowest_age))
# A bare expression in the middle of a cell is not displayed, so print it explicitly
print(youngs.loc[youngs.age == 6])
f, ax = plt.subplots(1,2, figsize=(15,6))
ax[0].bar([1,2],[unarmed, num_of_youngs-unarmed])
ax[0].bar([4,5],[women, num_of_youngs-women])
ax[0].set_xticks([1,2,4,5])
ax[0].set_xticklabels(['unarmed','armed','Females','Males'])
ax[0].set_title('Under aged Victims',fontsize=15)
# Bin edges 1..18 give ages 16 and 17 their own bins (with range(1,18) they share the last bin)
ax[1].hist(youngs.age, bins = range(1,19))
ax[1].set_title('Age Distribution for Under aged Victims',fontsize=15)

<h3>Plotting Observations</h3>

We can notice that there are many victims under the age of 18. Most of them are armed, male and around 17 years old. Most tragic of all, there are 2 cases of 6-year-old victims.

<h2>Mental Illness Factor</h2>

In [None]:
# Count cases with and without recorded signs of mental illness
illness = len(df.loc[df.signs_of_mental_illness == True])
no_illness = len(df.loc[df.signs_of_mental_illness == False])
# Among cases with signs of mental illness, split armed vs. unarmed
illness_armed = len(df.loc[(df.signs_of_mental_illness == True) & (df.armed != 'unarmed')])
illness_unarmed = len(df.loc[(df.signs_of_mental_illness == True) & (df.armed == 'unarmed')])
plt.bar(['Mental Illness', 'No Mental Illness'], [illness, no_illness], width=0.5)
plt.title('Mental Illness vs No Mental Illness')
plt.show()
plt.bar(['armed', 'unarmed'], [illness_armed, illness_unarmed], width=0.5)
plt.title('Mental Illness armed vs unarmed')
print('Mental Illness: {} ({:.1f}%)\nNo Mental Illness: {} ({:.1f}%)'.format(illness, illness/(illness+no_illness)*100, no_illness, no_illness/(illness+no_illness)*100))

<h3>Plotting Observations</h3>

We can see that almost 1 out of 4 cases involves a person with signs of mental illness. However, we should keep in mind that the dataset records only signs of mental illness, so there is always a probability that a record is mistaken. Still, we assume that police tried to protect citizens from mentally unstable persons, because most of them were armed.
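
The "armed" count in the cell above uses `!= 'unarmed'`, and in pandas a missing value compared this way evaluates to True, so cases with an unknown weapon status are counted as armed. The sketch below shows the difference on invented rows; the column names follow the notebook, and the sample values are made up.

```python
import numpy as np
import pandas as pd

# Made-up sample with one missing 'armed' value
sample = pd.DataFrame({
    'signs_of_mental_illness': [True, True, True, True, False],
    'armed': ['gun', 'unarmed', np.nan, 'knife', 'gun'],
})
ill = sample[sample['signs_of_mental_illness']]

# NaN != 'unarmed' evaluates to True, so unknown cases are counted as armed
naive_armed = (ill['armed'] != 'unarmed').sum()

# Requiring a known value keeps unknown cases out of the "armed" count
known_armed = (ill['armed'].notna() & (ill['armed'] != 'unarmed')).sum()
unknown = ill['armed'].isna().sum()

print(naive_armed, known_armed, unknown)
```

Reporting the unknown cases separately makes it clearer how much of the "armed" share rests on confirmed information.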


<h2>Some conclusions</h2>

We see that the biggest factor in police shootings is gender, because mostly men are shot, while the influence of time, race and age is low or nonexistent. In addition, area seems to have little influence on police shooting reports, as the cases are spread across almost the whole US. Most victims were armed, mostly with a gun, so we can reasonably assume that police shot in order to protect citizens and themselves. Also in defence of the police, many victims were extremely dangerous, as they showed signs of mental illness and were also armed.

However, we can say that in many cases police could probably have avoided shooting: some victims were children who were lightly armed or not armed at all, and others were unarmed people trying to flee, so they were probably not a direct danger to citizens.

It is important to remember that police protect citizens even if they need to use weapons and violence, but always in moderation, without a trace of abuse or personal animosity, and aware that they are defending the law while respecting and trying to save the lives of all people, even offenders if possible.