# Exploring gun-related deaths in the US from 2012 to 2014

In this project I explore a dataset of gun deaths in the US from 2012 to 2014. Each row in the dataset represents a single fatality, and the columns contain demographic and other information about the victim. Primarily, this is another opportunity for me to practice my Python analysis skills.
The dataset was obtained from _FiveThirtyEight_.

In [1]:
# Preparing the dataset for analysis

import csv

# reading every row into a list; the with block closes the file afterwards
with open('guns.csv', 'r') as f:
    data = list(csv.reader(f))

# data preview
print(data[:5])




[['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education'], ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4']]


In [2]:
# removing header row from data

headers = data[0]
data = data[1:]

print(headers)    #Header Row
print(data[:5])   #First Five Rows of Data without Header


['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education']
[['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'], ['5', '2012', '02', 'Suicide', '0', 'M', '31', 'White', '100', 'Other specified', '2']]
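As an alternative to separating the header row by hand, the standard library's `csv.DictReader` uses the first row as keys, so each row can be indexed by column name. The sketch below is only an illustration: it reads the header and the first two rows shown in the preview from an in-memory string (`sample` is a stand-in for `guns.csv`) rather than the real file.

```python
import csv
import io

# Stand-in for guns.csv: the header plus the first two rows from the preview above.
sample = (
    ',year,month,intent,police,sex,age,race,hispanic,place,education\n'
    '1,2012,01,Suicide,0,M,34,Asian/Pacific Islander,100,Home,4\n'
    '2,2012,01,Suicide,0,F,21,White,100,Street,3\n'
)

# DictReader consumes the header row and maps each value to its column name.
rows = list(csv.DictReader(io.StringIO(sample)))

print(rows[0]['intent'], rows[0]['race'])
print(rows[1]['sex'], rows[1]['age'])
```

All values are still strings, so numeric columns such as `age` would need converting with `int()` before doing arithmetic.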


### Gun-Related Deaths By Year

Now that the data has been prepped, we will look at how many deaths took place in each of the years we are analyzing. We will store the counts in a dictionary keyed by year.


In [3]:
def demographic_dictionary(demographic):
    #collecting the values of the requested column from every row
    demo_name = []
    demo_index = headers.index(demographic)
    for row in data:
        demo_name.append(row[demo_index])

    #creating and populating the demographic counts dictionary

    demo_counts = {}
    for demo in demo_name:
        if demo in demo_counts:
            demo_counts[demo] += 1
        else:
            demo_counts[demo] = 1
    return demo_counts

year_counts = demographic_dictionary('year')
print(year_counts)

{'2012': 33563, '2013': 33636, '2014': 33599}


As can be seen above, the gun-related deaths in 2012, 2013 and 2014 were 33,563, 33,636 and 33,599 respectively, which doesn't suggest an increase or decrease in gun-related deaths over the period. [Click here](https://public.tableau.com/views/GunDeaths2012-2014/YearlyGun-RelatedDeathTotals?:embed=y&:display_count=yes) to visualize the above data in a pie chart.
Let's see if gun deaths in the US change by month and year.
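
The counting loop in `demographic_dictionary` can also be written with `collections.Counter` from the standard library. This is a sketch of an equivalent approach, not the code used above; `count_column` is a hypothetical helper, and the three rows are a made-up miniature of the dataset so the example runs without `guns.csv`.

```python
from collections import Counter

def count_column(rows, headers, column):
    """Count how often each value appears in the named column."""
    index = headers.index(column)
    return Counter(row[index] for row in rows)

# Made-up miniature dataset (illustration only, not real records).
sample_headers = ['', 'year', 'month', 'intent']
sample_rows = [
    ['1', '2012', '01', 'Suicide'],
    ['2', '2012', '02', 'Homicide'],
    ['3', '2013', '01', 'Suicide'],
]

sample_year_counts = count_column(sample_rows, sample_headers, 'year')
print(dict(sample_year_counts))
```

A `Counter` behaves like a dictionary, so it could be passed to the later per-100,000 calculations unchanged.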

### Gun-Related Deaths By Month

In [4]:
"""
Creating the dates dataset. Since we do not have a specific day,
we will assign the first of each month to every date. Unfortunately
we will not be able to use the demographic_dictionary function on dates.
"""

import datetime
dates = []

for row in data:
    date = datetime.datetime(year=int(row[1]), month=int(row[2]), day=1)
    dates.append(date)

#uncomment print statement to view dates    
#print(dates[:5])

#creating and populating the date counts dictionary  

date_counts = {}
for date in dates:
    if date in date_counts:
        date_counts[date] += 1
    else:
        date_counts[date] = 1
         
print(date_counts)
    
    

{datetime.datetime(2012, 1, 1, 0, 0): 2758, datetime.datetime(2012, 2, 1, 0, 0): 2357, datetime.datetime(2012, 3, 1, 0, 0): 2743, datetime.datetime(2012, 4, 1, 0, 0): 2795, datetime.datetime(2012, 5, 1, 0, 0): 2999, datetime.datetime(2012, 6, 1, 0, 0): 2826, datetime.datetime(2012, 7, 1, 0, 0): 3026, datetime.datetime(2012, 8, 1, 0, 0): 2954, datetime.datetime(2012, 9, 1, 0, 0): 2852, datetime.datetime(2012, 10, 1, 0, 0): 2733, datetime.datetime(2012, 11, 1, 0, 0): 2729, datetime.datetime(2012, 12, 1, 0, 0): 2791, datetime.datetime(2013, 1, 1, 0, 0): 2864, datetime.datetime(2013, 2, 1, 0, 0): 2375, datetime.datetime(2013, 3, 1, 0, 0): 2862, datetime.datetime(2013, 4, 1, 0, 0): 2798, datetime.datetime(2013, 5, 1, 0, 0): 2806, datetime.datetime(2013, 6, 1, 0, 0): 2920, datetime.datetime(2013, 7, 1, 0, 0): 3079, datetime.datetime(2013, 8, 1, 0, 0): 2859, datetime.datetime(2013, 9, 1, 0, 0): 2742, datetime.datetime(2013, 10, 1, 0, 0): 2808, datetime.datetime(2013, 11, 1, 0, 0): 2758, datet

Over the 3-year span there appears to be a seasonal trend, with gun-related deaths peaking in the summer months. It is unclear why this might be the case. Also, if you look at this [chart](https://public.tableau.com/views/GunDeaths2012-2014/MonthlyGun-RelatedDeathTotals?:embed=y&:display_count=yes) you will notice that there is a sharp decrease in gun-related deaths in February each year. If we assume that gun-related deaths are uniformly distributed across days, then part of the drop could be explained by February having fewer days than every other month, although admittedly this does not seem to capture the whole drop.
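
One way to check the February explanation is to divide each monthly total by the number of days in that month. The sketch below does this for the first three months of 2012 and 2013, with the counts copied from the output above; `calendar.monthrange` supplies the day counts (2012 is a leap year, so its February has 29 days). `monthly_counts` and `deaths_per_day` are names introduced here for illustration.

```python
import calendar

# Monthly totals copied from the date_counts output above.
monthly_counts = {
    (2012, 1): 2758, (2012, 2): 2357, (2012, 3): 2743,
    (2013, 1): 2864, (2013, 2): 2375, (2013, 3): 2862,
}

# monthrange returns (weekday of the first day, number of days in the month).
deaths_per_day = {
    (year, month): count / calendar.monthrange(year, month)[1]
    for (year, month), count in monthly_counts.items()
}

for (year, month), rate in sorted(deaths_per_day.items()):
    print(year, month, round(rate, 1))
```

Even per day, February comes out lower than January and March in both years (roughly 81 versus 88 to 89 in 2012, and roughly 85 versus 92 in 2013), which is consistent with the shorter month not explaining the whole drop.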


### Sex and Race

In [5]:
sex_counts = demographic_dictionary('sex')
print(sex_counts)
race_counts = demographic_dictionary('race')
print(race_counts)



{'M': 86349, 'F': 14449}
{'Asian/Pacific Islander': 1326, 'White': 66237, 'Native American/Native Alaskan': 917, 'Black': 23296, 'Hispanic': 9022}


###### Sex

Now we begin to see some disparities in gun-related deaths. Approximately 6 times as many males as females died from gun-related causes, which can be visualized [here](https://public.tableau.com/shared/KCFX8T678?:display_count=yes).

###### Race
There is also a significant disparity by race, which can be seen [here](https://public.tableau.com/views/GunDeaths2012-2014/Race?:embed=y&:display_count=yes). However, raw counts do not account for the size of each group, so further analysis is required:

We will now look at some census data so as to ascertain the rate of gun deaths per 100,000 people of each race.

In [6]:
# Importing the Census dataset to aid in the analysis

import csv

# the with block closes the file once the rows have been read
with open('census.csv', 'r') as f:
    census = list(csv.reader(f))

#Census Data set
print(census)

[['Id', 'Year', 'Id', 'Sex', 'Id', 'Hispanic Origin', 'Id', 'Id2', 'Geography', 'Total', 'Race Alone - White', 'Race Alone - Hispanic', 'Race Alone - Black or African American', 'Race Alone - American Indian and Alaska Native', 'Race Alone - Asian', 'Race Alone - Native Hawaiian and Other Pacific Islander', 'Two or More Races'], ['cen42010', 'April 1, 2010 Census', 'totsex', 'Both Sexes', 'tothisp', 'Total', '0100000US', '', 'United States', '308745538', '197318956', '44618105', '40250635', '3739506', '15159516', '674625', '6984195']]


In [7]:

## mapping the census data to our race demographics
mapping = {
    'Asian/Pacific Islander': int(census[1][14]) + int(census[1][15]),
    'Black': int(census[1][12]),
    'Hispanic': int(census[1][11]),
    'Native American/Native Alaskan': int(census[1][13]),
    'White': int(census[1][10])
}

## creating the race per hundred thousand dictionary

race_per_hundredk = {}

for key in race_counts:
    race_per_hundredk[key] = race_counts[key] / mapping[key] *100000

race_per_hundredk


{'Asian/Pacific Islander': 8.374309664161762,
 'Black': 57.8773477735196,
 'Hispanic': 20.220491210910907,
 'Native American/Native Alaskan': 24.521955573811088,
 'White': 33.56849303419181}

Having adjusted for population size, we can now compare gun-related death rates per 100,000 people of each race. Black people have the highest rate at nearly 58 per 100,000, despite being the 3rd largest demographic and having significantly fewer people than the White or Hispanic groups. Note that these figures divide three years of deaths (2012 to 2014) by the 2010 census population, so they are cumulative rates for the period rather than annual rates.
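
The rate calculation can be wrapped in a small helper. The sketch below reproduces the figure for Black people from the death count and census value printed above, and also shows an annualized version. Dividing by three years is an assumption made here for illustration (the notebook itself reports totals for the whole 2012 to 2014 period), and `rate_per_100k` is a hypothetical name.

```python
def rate_per_100k(count, population, years=1):
    """Deaths per 100,000 people, optionally averaged over a number of years."""
    return count / population * 100000 / years

# Values copied from the outputs above: 23,296 deaths and a
# 2010 census population of 40,250,635.
black_deaths = 23296
black_population = 40250635

period_rate = rate_per_100k(black_deaths, black_population)           # whole period
annual_rate = rate_per_100k(black_deaths, black_population, years=3)  # average per year

print(round(period_rate, 2), round(annual_rate, 2))
```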

##### Filtering for intent
We can further filter these numbers to see how many of these deaths were homicides:

In [8]:
def list_comp(demographic):
    
    demo_name = []
    demo_index = headers.index(demographic)
    for row in data:
        demo_name.append(row[demo_index])
    return demo_name

intents = list_comp('intent')
races = list_comp('race')

homicide_race_counts = {}

for i, race in enumerate(races):
    if intents[i] == 'Homicide':
        if race in homicide_race_counts:
            homicide_race_counts[race] += 1
        else:
            homicide_race_counts[race] = 1
#uncomment print statement to view the raw homicide counts
#print(homicide_race_counts)

homicide_race_per_hundredk = {}

for key in homicide_race_counts:
    homicide_race_per_hundredk[key] = homicide_race_counts[key] / mapping[key] *100000

homicide_race_per_hundredk


{'Asian/Pacific Islander': 3.530346230970155,
 'Black': 48.471284987180944,
 'Hispanic': 12.627161104219914,
 'Native American/Native Alaskan': 8.717729026240365,
 'White': 4.6356417981453335}
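
The same filter-then-count pattern applies to the other intents and demographics. The following is a minimal sketch assuming rows shaped like the gun data; `filtered_counts` is a hypothetical helper and the sample rows are invented for illustration.

```python
from collections import Counter

def filtered_counts(rows, headers, group_col, filter_col, filter_value):
    """Count values of group_col among rows where filter_col equals filter_value."""
    group_index = headers.index(group_col)
    filter_index = headers.index(filter_col)
    return Counter(
        row[group_index] for row in rows if row[filter_index] == filter_value
    )

# Invented sample rows (illustration only, not real records).
demo_headers = ['intent', 'sex', 'race']
demo_rows = [
    ['Homicide', 'M', 'Black'],
    ['Suicide', 'M', 'White'],
    ['Homicide', 'F', 'White'],
    ['Homicide', 'M', 'Black'],
]

homicides_by_race = filtered_counts(demo_rows, demo_headers, 'race', 'intent', 'Homicide')
homicides_by_sex = filtered_counts(demo_rows, demo_headers, 'sex', 'intent', 'Homicide')
print(dict(homicides_by_race))
print(dict(homicides_by_sex))
```

Passing the column names as arguments avoids building the full `intents` and `races` lists first and keeps each row's values aligned without relying on list positions.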

## Conclusion

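It appears that gun-related homicides in the US disproportionately affect people in the Black and Hispanic racial categories, with roughly 48 and 13 homicides per 100,000 people over the period compared with under 5 for White people.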
Some areas to investigate further:

- The link between month and homicide rate.
- Homicide rate by gender.
- The rates of other intents by gender and race.
- Gun death rates by location and education.