# Analyzing Gun Deaths in the US 

In this project, I explored a dataset from FiveThirtyEight on gun deaths in the United States from 2012 to 2014, read in as "guns.csv" and available in the GitHub repository. Each row in the dataset represents a single fatality. The columns contain demographic and other information about the victim, as well as the intent behind the gun death.

In [26]:
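import csv

# Read every row of the CSV into a list of lists; the context manager closes the file.
with open("guns.csv", 'r') as f:
    csvreader = csv.reader(f)
    data = list(csvreader)

# Preview the header row plus the first four records.
print(data[:5])

# Number of columns in the header row.
len(data[0])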


[['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education'], ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4']]


11

#### Here I am removing the header from the dataset and preparing it for analysis 

In [27]:
headers = data[0]
data = data[1:]
print(headers) 
print(data[:5])

['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education']
[['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'], ['5', '2012', '02', 'Suicide', '0', 'M', '31', 'White', '100', 'Other specified', '2']]
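
As an aside, the header row could also be used to look up columns by name rather than by position. The following is a minimal sketch, not the approach taken in this notebook: it uses `csv.DictReader` on a small in-memory sample (the header and the first two rows shown above) instead of the real file, and the names `sample_csv` and `rows` are assumptions introduced for illustration.

```python
import csv
import io

# Two rows copied from the preview above, held in memory so the sketch runs without guns.csv.
sample_csv = (
    ",year,month,intent,police,sex,age,race,hispanic,place,education\n"
    "1,2012,01,Suicide,0,M,34,Asian/Pacific Islander,100,Home,4\n"
    "2,2012,01,Suicide,0,F,21,White,100,Street,3\n"
)

# DictReader takes the first line as the header and yields one dict per record.
rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Columns can now be read by name instead of by index.
print(rows[0]['intent'], rows[0]['race'])
```

With this layout, `row['race']` would stand in for `row[7]` in the cells that follow.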


In [28]:
years = [i[1] for i in data] 

len(years)

100798

## *Counting gun deaths per year*

In [29]:
year_counts = {} 

for year in years: 
    if year in year_counts: 
        year_counts[year] += 1 
    else: 
        year_counts[year] = 1
    
print(year_counts)
        

{'2012': 33563, '2013': 33636, '2014': 33599}
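
The same tallying pattern recurs throughout this notebook. As a hedged alternative, the standard library's `collections.Counter` does the counting in one step. The sketch below uses a short made-up list (`sample_years`) rather than the full `years` column.

```python
from collections import Counter

# A few illustrative values; in the notebook this would be the full `years` list.
sample_years = ['2012', '2012', '2013', '2014', '2014', '2014']

# Counter tallies each distinct value in a single pass.
sample_year_counts = Counter(sample_years)

print(dict(sample_year_counts))
```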


## *Exploring gun deaths by month and year*

In [30]:
import datetime

dates = [datetime.datetime(year = int(i[1]), month = int(i[2]), day = 1) for i in data]

dates[:5]

[datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0)]

In [31]:
date_counts = {} 

for date in dates: 
    if date in date_counts: 
        date_counts[date] += 1
    else: 
        date_counts[date] = 1 
        
date_counts
    

{datetime.datetime(2012, 1, 1, 0, 0): 2758,
 datetime.datetime(2012, 2, 1, 0, 0): 2357,
 datetime.datetime(2012, 3, 1, 0, 0): 2743,
 datetime.datetime(2012, 4, 1, 0, 0): 2795,
 datetime.datetime(2012, 5, 1, 0, 0): 2999,
 datetime.datetime(2012, 6, 1, 0, 0): 2826,
 datetime.datetime(2012, 7, 1, 0, 0): 3026,
 datetime.datetime(2012, 8, 1, 0, 0): 2954,
 datetime.datetime(2012, 9, 1, 0, 0): 2852,
 datetime.datetime(2012, 10, 1, 0, 0): 2733,
 datetime.datetime(2012, 11, 1, 0, 0): 2729,
 datetime.datetime(2012, 12, 1, 0, 0): 2791,
 datetime.datetime(2013, 1, 1, 0, 0): 2864,
 datetime.datetime(2013, 2, 1, 0, 0): 2375,
 datetime.datetime(2013, 3, 1, 0, 0): 2862,
 datetime.datetime(2013, 4, 1, 0, 0): 2798,
 datetime.datetime(2013, 5, 1, 0, 0): 2806,
 datetime.datetime(2013, 6, 1, 0, 0): 2920,
 datetime.datetime(2013, 7, 1, 0, 0): 3079,
 datetime.datetime(2013, 8, 1, 0, 0): 2859,
 datetime.datetime(2013, 9, 1, 0, 0): 2742,
 datetime.datetime(2013, 10, 1, 0, 0): 2808,
 datetime.datetime(2013, 11,

There appears to be a minor seasonal correlation, with gun deaths peaking in the summer and declining in the winter. It might be useful to filter by intent, to see if different categories of intent have different correlations with season, race, or gender.

## *Breaking down the data by sex*

In [32]:
sexes = [i[5] for i in data]

sexes[:5]

['M', 'F', 'M', 'M', 'M']

In [33]:
sex_counts = {} 

for sex in sexes:
    if sex in sex_counts: 
        sex_counts[sex] += 1 
    else: 
        sex_counts[sex] = 1
        
sex_counts
    


{'F': 14449, 'M': 86349}

In [34]:
sex_ratio = (sex_counts['M']/ sex_counts['F'])

print(sex_ratio)

5.976122915080628


As we can see, gun deaths disproportionately affect men. Specifically, from 2012 to 2014 almost six times as many men as women died by gun.
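
The ratio can also be expressed as each sex's share of all gun deaths. This is a small sketch using the counts printed above; the names `total` and `sex_share` are introduced here for illustration.

```python
# Counts taken from the sex_counts output above.
sex_counts = {'F': 14449, 'M': 86349}

total = sum(sex_counts.values())

# Percentage of all gun deaths accounted for by each sex.
sex_share = {sex: count / total * 100 for sex, count in sex_counts.items()}

for sex, share in sex_share.items():
    print(f"{sex}: {share:.1f}%")
```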

## *Data Breakdown by Race*

In [35]:
races = [i[7] for i in data]

races[:5]

['Asian/Pacific Islander', 'White', 'White', 'White', 'White']

In [36]:
race_counts = {} 

for race in races: 
    if race in race_counts:
        race_counts[race] += 1 
    else: 
        race_counts[race] = 1 
        
race_counts 

{'Asian/Pacific Islander': 1326,
 'Black': 23296,
 'Hispanic': 9022,
 'Native American/Native Alaskan': 917,
 'White': 66237}

Gun deaths also seem to disproportionately affect minorities, although we need data on each race's share of the overall US population to confirm this.

Therefore, we will now read in a population file from the US Census Bureau so that we can determine the rate of gun deaths per 100,000 people of each race.

### Reading in a second file from the US Census Bureau, on population demographics for background comparison

In [37]:
# Read the census file into a list of rows; the context manager closes the file.
with open("census.csv", 'r') as p:
    csvreader = csv.reader(p)
    census = list(csvreader)

census


[['Id',
  'Year',
  'Id',
  'Sex',
  'Id',
  'Hispanic Origin',
  'Id',
  'Id2',
  'Geography',
  'Total',
  'Race Alone - White',
  'Race Alone - Hispanic',
  'Race Alone - Black or African American',
  'Race Alone - American Indian and Alaska Native',
  'Race Alone - Asian',
  'Race Alone - Native Hawaiian and Other Pacific Islander',
  'Two or More Races'],
 ['cen42010',
  'April 1, 2010 Census',
  'totsex',
  'Both Sexes',
  'tothisp',
  'Total',
  '0100000US',
  '',
  'United States',
  '308745538',
  '197318956',
  '44618105',
  '40250635',
  '3739506',
  '15159516',
  '674625',
  '6984195']]

#### The census data names its racial categories differently from our original dataset, so we must map the census categories onto the ones used there.

In [38]:
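# Census population for each race label used in guns.csv (Asian and Pacific Islander are summed).
mapping = {
    'Asian/Pacific Islander': 15159516 + 674625,
    'Black': 40250635,
    'Native American/Native Alaskan': 3739506,
    'Hispanic': 44618105,
    'White': 197318956,
}

mapping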

{'Asian/Pacific Islander': 15834141,
 'Black': 40250635,
 'Hispanic': 44618105,
 'Native American/Native Alaskan': 3739506,
 'White': 197318956}
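
The hard-coded numbers above could instead be looked up from the census rows by column name. The sketch below is one possible way to do that, assuming the header and data row shown in the census output; `category_sources` and `mapping_from_census` are hypothetical names, and the pairing of census columns to race labels mirrors the choices made in the cell above.

```python
# Header and data row copied from the census output above.
census_header = [
    'Id', 'Year', 'Id', 'Sex', 'Id', 'Hispanic Origin', 'Id', 'Id2', 'Geography', 'Total',
    'Race Alone - White', 'Race Alone - Hispanic',
    'Race Alone - Black or African American',
    'Race Alone - American Indian and Alaska Native',
    'Race Alone - Asian',
    'Race Alone - Native Hawaiian and Other Pacific Islander',
    'Two or More Races',
]
census_row = [
    'cen42010', 'April 1, 2010 Census', 'totsex', 'Both Sexes', 'tothisp', 'Total',
    '0100000US', '', 'United States', '308745538', '197318956', '44618105',
    '40250635', '3739506', '15159516', '674625', '6984195',
]

# Keep only the race columns; their names are unique even though 'Id' repeats.
population = {
    name: int(value)
    for name, value in zip(census_header, census_row)
    if name.startswith('Race Alone')
}

# Which census column(s) feed each race label used in guns.csv.
category_sources = {
    'Asian/Pacific Islander': [
        'Race Alone - Asian',
        'Race Alone - Native Hawaiian and Other Pacific Islander',
    ],
    'Black': ['Race Alone - Black or African American'],
    'Native American/Native Alaskan': ['Race Alone - American Indian and Alaska Native'],
    'Hispanic': ['Race Alone - Hispanic'],
    'White': ['Race Alone - White'],
}

mapping_from_census = {
    race: sum(population[column] for column in columns)
    for race, columns in category_sources.items()
}

print(mapping_from_census)
```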

In [39]:
race_per_hundredk = {}

for i in race_counts: 
    ratio = race_counts[i]/ mapping[i]
    fixed_ratio = ratio * 100000
    race_per_hundredk[i] = fixed_ratio 
    
race_per_hundredk

{'Asian/Pacific Islander': 8.374309664161762,
 'Black': 57.8773477735196,
 'Hispanic': 20.220491210910907,
 'Native American/Native Alaskan': 24.521955573811088,
 'White': 33.56849303419181}

We can see that, per 100,000 people, there are about 1.7 times as many black victims of gun death as white victims.
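
To make the comparison explicit, the rate calculation can be wrapped in a small helper and the two rates divided. This is a sketch using the counts and populations shown earlier; `rate_per_100k` is a hypothetical helper name, not something defined elsewhere in this notebook.

```python
def rate_per_100k(count, population):
    """Deaths per 100,000 people for the given count and population."""
    return count / population * 100000

# Counts from race_counts and populations from mapping, as printed above.
black_rate = rate_per_100k(23296, 40250635)
white_rate = rate_per_100k(66237, 197318956)

# How many times higher the black rate is than the white rate.
print(black_rate / white_rate)
```

Note that these rates combine three years of deaths (2012 to 2014) with a single 2010 population figure, so they are cumulative rather than annual rates.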

In [40]:
intents = [i[3] for i in data] 

intents[:5]

['Suicide', 'Suicide', 'Suicide', 'Suicide', 'Suicide']

In [41]:
homicide_race_counts = {}

for i,race in enumerate(races):
    if race not in homicide_race_counts:
        homicide_race_counts[race] = 0
    if intents[i] == "Homicide":
        homicide_race_counts[race] += 1
            
homicide_race_counts



{'Asian/Pacific Islander': 559,
 'Black': 19510,
 'Hispanic': 5634,
 'Native American/Native Alaskan': 326,
 'White': 9147}

In [42]:
# Convert the homicide counts into rates per 100,000 people of each race.
homicide_race_counts_perhundredk = {}

for i in homicide_race_counts:
    ratio = homicide_race_counts[i] / mapping[i]
    fixed_ratio = ratio * 100000
    homicide_race_counts_perhundredk[i] = fixed_ratio

homicide_race_counts_perhundredk

{'Asian/Pacific Islander': 3.530346230970155,
 'Black': 48.471284987180944,
 'Hispanic': 12.627161104219914,
 'Native American/Native Alaskan': 8.717729026240365,
 'White': 4.6356417981453335}

##### When we limit the data to gun deaths by homicide, we see that black people are over 10 times more likely than white people to be victims of gun homicide.


In [43]:
intents_count = {} 

for intent in intents: 
    if intent in intents_count:
        intents_count[intent] += 1
    else: 
        intents_count[intent] = 1 
        
intents_count

{'Accidental': 1639,
 'Homicide': 35176,
 'NA': 1,
 'Suicide': 63175,
 'Undetermined': 807}

#### When we count the frequency of each intent category, we can clearly see that the majority of gun deaths during this period were suicides. 

In [44]:
suicide_race_counts = {} 

for i,race in enumerate(races):
    if race not in suicide_race_counts:
        suicide_race_counts[race] = 0
    if intents[i] == "Suicide":
        suicide_race_counts[race] += 1
            
suicide_race_counts

{'Asian/Pacific Islander': 745,
 'Black': 3332,
 'Hispanic': 3171,
 'Native American/Native Alaskan': 555,
 'White': 55372}

In [45]:
suicide_race_counts_perhundredk = {} 

for i in suicide_race_counts: 
    ratio = suicide_race_counts[i]/ mapping[i]
    fixed_ratio = ratio * 100000
    suicide_race_counts_perhundredk[i] = fixed_ratio 
    
    
suicide_race_counts_perhundredk

{'Asian/Pacific Islander': 4.705023152187416,
 'Black': 8.278130270491385,
 'Hispanic': 7.106980451097149,
 'Native American/Native Alaskan': 14.841532544673013,
 'White': 28.06217969245692}

#### We can see from this calculation that most victims of gun suicide are white, and that the white rate per 100,000 people is nearly double that of Native American/Native Alaskans and over triple that of black people.
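
The homicide and suicide cells repeat the same loop with a different intent. One way to avoid the duplication is a helper that takes the intent as a parameter. The sketch below assumes the column layout of guns.csv (intent at index 3, race at index 7); `count_by_race` is a hypothetical name, and the last two sample rows are made up for illustration.

```python
def count_by_race(rows, intent, intent_col=3, race_col=7):
    """Count rows per race whose intent matches; races with no match get 0."""
    counts = {}
    for row in rows:
        race = row[race_col]
        counts.setdefault(race, 0)
        if row[intent_col] == intent:
            counts[race] += 1
    return counts

# First two rows are from the dataset preview; the last two are hypothetical.
sample_rows = [
    ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'],
    ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'],
    ['3', '2012', '03', 'Homicide', '0', 'M', '27', 'Black', '100', 'Street', '2'],
    ['4', '2012', '03', 'Homicide', '0', 'M', '45', 'White', '100', 'Home', '3'],
]

print(count_by_race(sample_rows, 'Suicide'))
print(count_by_race(sample_rows, 'Homicide'))
```

Calling it on the full `data` list with `'Homicide'` or `'Suicide'` would be expected to reproduce the per-race counts computed in the cells above.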

### Seasonal Correlations


In [57]:
# Sum the per-month totals across all three years, keyed by calendar month (1-12).
months_count = {}

for date, count in date_counts.items():
    month = date.month
    months_count[month] = months_count.get(month, 0) + count

months_count
        

{1: 8273,
 2: 7093,
 3: 8289,
 4: 8455,
 5: 8669,
 6: 8677,
 7: 8989,
 8: 8783,
 9: 8508,
 10: 8406,
 11: 8243,
 12: 8413}

As we can see, gun deaths rise toward summer and fall afterward, peaking in July when the data from 2012 to 2014 are combined. This may be because people are more prone to violence and impulsive acts of aggression at hotter temperatures, a phenomenon well studied by social and behavioral scientists.
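
Because months differ in length (February in particular), raw monthly totals are not directly comparable. A possible refinement, sketched here with the totals printed above, is to divide each total by the number of days that month had across 2012, 2013, and 2014. The names `days_in_month` and `deaths_per_day` are introduced for this sketch.

```python
import calendar

# Monthly totals copied from the months_count output above.
months_count = {1: 8273, 2: 7093, 3: 8289, 4: 8455, 5: 8669, 6: 8677,
                7: 8989, 8: 8783, 9: 8508, 10: 8406, 11: 8243, 12: 8413}

# Total days each calendar month had across the three years (2012 was a leap year).
days_in_month = {
    month: sum(calendar.monthrange(year, month)[1] for year in (2012, 2013, 2014))
    for month in months_count
}

# Average gun deaths per day for each calendar month.
deaths_per_day = {
    month: months_count[month] / days_in_month[month] for month in months_count
}

peak_month = max(deaths_per_day, key=deaths_per_day.get)
low_month = min(deaths_per_day, key=deaths_per_day.get)
print(peak_month, low_month)
```

On a per-day basis July remains the peak and February the low, although the gap between June and July narrows considerably.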

##### Possible Next Steps: 

- Break down seasonal correlations for homicide and suicide
- Find out if gun death rates correlate with location and/or education
- Homicide rate by gender
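
As a starting point for the first item, deaths could be tallied by intent and month together. The sketch below is only an outline under stated assumptions: the rows in `sample_rows` are hypothetical, laid out like guns.csv (month at index 2, intent at index 3), and `intent_month_counts` is a name introduced here.

```python
from collections import Counter

# Hypothetical rows in the same column layout as guns.csv.
sample_rows = [
    ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'],
    ['2', '2012', '07', 'Homicide', '0', 'M', '25', 'Black', '100', 'Street', '2'],
    ['3', '2013', '07', 'Homicide', '0', 'F', '40', 'White', '100', 'Home', '3'],
    ['4', '2014', '07', 'Suicide', '0', 'M', '58', 'White', '100', 'Home', '4'],
]

# Tally deaths for each (intent, month) pair.
intent_month_counts = Counter((row[3], int(row[2])) for row in sample_rows)

# Pull out the monthly series for a single intent.
homicide_by_month = {
    month: count
    for (intent, month), count in intent_month_counts.items()
    if intent == 'Homicide'
}

print(homicide_by_month)
```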