## Gun Deaths in the US, 2012-2014

This notebook analyzes gun deaths in the US from 2012 through 2014. The data comes from FiveThirtyEight's [guns-data repository](https://github.com/fivethirtyeight/guns-data).

In [2]:
import csv

# Read the whole file into a list of rows; the first row holds the column names.
with open("guns.csv", "r") as f:
    data = list(csv.reader(f))
print(data[:5])

[['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education'], ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4']]


In [3]:
headers = data[0]
data = data[1:]
print(headers)
print(data[:5])

['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education']
[['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'], ['5', '2012', '02', 'Suicide', '0', 'M', '31', 'White', '100', 'Other specified', '2']]


In [4]:
import datetime

# The data records only year and month, so every date is pinned to the first of the month.
dates = [datetime.datetime(year=int(d[1]), month=int(d[2]), day=1) for d in data]
print(dates[:5])

date_counts = {}
for d in dates:
    if d in date_counts:
        date_counts[d] += 1
    else:
        date_counts[d] = 1
print(date_counts)

[datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 2, 1, 0, 0), datetime.datetime(2012, 2, 1, 0, 0)]
{datetime.datetime(2012, 3, 1, 0, 0): 2743, datetime.datetime(2014, 8, 1, 0, 0): 2970, datetime.datetime(2014, 2, 1, 0, 0): 2361, datetime.datetime(2014, 7, 1, 0, 0): 2884, datetime.datetime(2014, 4, 1, 0, 0): 2862, datetime.datetime(2014, 6, 1, 0, 0): 2931, datetime.datetime(2012, 6, 1, 0, 0): 2826, datetime.datetime(2012, 11, 1, 0, 0): 2729, datetime.datetime(2014, 9, 1, 0, 0): 2914, datetime.datetime(2014, 3, 1, 0, 0): 2684, datetime.datetime(2014, 1, 1, 0, 0): 2651, datetime.datetime(2013, 10, 1, 0, 0): 2808, datetime.datetime(2014, 5, 1, 0, 0): 2864, datetime.datetime(2012, 1, 1, 0, 0): 2758, datetime.datetime(2012, 10, 1, 0, 0): 2733, datetime.datetime(2014, 10, 1, 0, 0): 2865, datetime.datetime(2013, 5, 1, 0, 0): 2806, datetime.datetime(2013, 3, 1, 0, 0): 2862, datetime.datetime(2014, 11, 1, 0, 0

In [5]:
sex_counts = {}
for d in data:
    if d[5] in sex_counts:
        sex_counts[d[5]] += 1
    else:
        sex_counts[d[5]] = 1

race_counts = {}
for d in data:
    if d[7] in race_counts:
        race_counts[d[7]] += 1
    else:
        race_counts[d[7]] = 1

print(sex_counts)
print(race_counts)

{'F': 14449, 'M': 86349}
{'Native American/Native Alaskan': 917, 'Hispanic': 9022, 'White': 66237, 'Asian/Pacific Islander': 1326, 'Black': 23296}


## Patterns so far
* By raw count, most gun deaths are of white people (race = White) and of men (sex = M). Whether the two are correlated remains to be checked.
* Exploring age, education level, and police involvement may reveal further patterns.
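
One way to start checking whether race and sex interact is to count each (race, sex) pair instead of each column on its own. The sketch below assumes the same column positions as `data` (sex at index 5, race at index 7). The rows in `sample_rows` are made up for illustration and are not real records, and `count_race_sex` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical rows in the same column layout as guns.csv (not real records).
sample_rows = [
    ["1", "2012", "01", "Suicide", "0", "M", "34", "White", "100", "Home", "4"],
    ["2", "2012", "01", "Suicide", "0", "F", "21", "White", "100", "Street", "3"],
    ["3", "2012", "02", "Homicide", "0", "M", "25", "Black", "100", "Home", "2"],
    ["4", "2012", "02", "Suicide", "0", "M", "60", "White", "100", "Home", "4"],
]


def count_race_sex(rows):
    """Count rows for each (race, sex) combination."""
    # Index 7 is race and index 5 is sex, matching the header printed earlier.
    return Counter((row[7], row[5]) for row in rows)


race_sex_counts = count_race_sex(sample_rows)
for (race, sex), n in sorted(race_sex_counts.items()):
    print(race, sex, n)
```

With the real file, calling `count_race_sex(data)` would give the joint counts needed to compare the male share within each race.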

In [7]:
# Load the census table; the with-block closes the file once it has been read.
with open("census.csv", "r") as f2:
    census = list(csv.reader(f2))
census

[['Id',
  'Year',
  'Id',
  'Sex',
  'Id',
  'Hispanic Origin',
  'Id',
  'Id2',
  'Geography',
  'Total',
  'Race Alone - White',
  'Race Alone - Hispanic',
  'Race Alone - Black or African American',
  'Race Alone - American Indian and Alaska Native',
  'Race Alone - Asian',
  'Race Alone - Native Hawaiian and Other Pacific Islander',
  'Two or More Races'],
 ['cen42010',
  'April 1, 2010 Census',
  'totsex',
  'Both Sexes',
  'tothisp',
  'Total',
  '0100000US',
  '',
  'United States',
  '308745538',
  '197318956',
  '44618105',
  '40250635',
  '3739506',
  '15159516',
  '674625',
  '6984195']]

In [8]:
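# Population by race, taken from the 2010 census row above. The gun data's
# "Asian/Pacific Islander" category spans two census columns, so its population
# is 15159516 (Asian) + 674625 (Native Hawaiian and Other Pacific Islander).
mapping = {"Asian/Pacific Islander": 15834141,
           "Black": 40250635,
           "Native American/Native Alaskan": 3739506,
           "Hispanic": 44618105,
           "White": 197318956}
# Gun deaths per 100,000 people for each race. The counts cover 2012-2014,
# so these are three-year totals per 100,000, not annual rates.
race_per_hundredk = {}
for r, v in race_counts.items():
    race_per_hundredk[r] = (v / mapping[r]) * 100000
print(race_per_hundredk)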

{'Native American/Native Alaskan': 24.521955573811088, 'Hispanic': 20.220491210910907, 'White': 33.56849303419181, 'Asian/Pacific Islander': 8.374309664161762, 'Black': 57.8773477735196}
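
The populations in `mapping` can also be derived from the census row instead of being typed in by hand. The sketch below is one possible way to do that. It copies the race columns and their values from the census output shown earlier; `column_groups` and `build_mapping` are hypothetical names introduced here for illustration.

```python
# Race columns and values copied from the census output shown earlier.
census_header = [
    "Race Alone - White",
    "Race Alone - Hispanic",
    "Race Alone - Black or African American",
    "Race Alone - American Indian and Alaska Native",
    "Race Alone - Asian",
    "Race Alone - Native Hawaiian and Other Pacific Islander",
]
census_values = ["197318956", "44618105", "40250635", "3739506", "15159516", "674625"]

# Which census columns make up each race category used in guns.csv (an assumption
# based on the category names; the census and gun data do not share labels).
column_groups = {
    "White": ["Race Alone - White"],
    "Hispanic": ["Race Alone - Hispanic"],
    "Black": ["Race Alone - Black or African American"],
    "Native American/Native Alaskan": ["Race Alone - American Indian and Alaska Native"],
    "Asian/Pacific Islander": [
        "Race Alone - Asian",
        "Race Alone - Native Hawaiian and Other Pacific Islander",
    ],
}


def build_mapping(header, values, groups):
    """Sum the census columns that belong to each race category."""
    population = dict(zip(header, (int(v) for v in values)))
    return {race: sum(population[col] for col in cols) for race, cols in groups.items()}


derived_mapping = build_mapping(census_header, census_values, column_groups)
print(derived_mapping)
```

Deriving the numbers this way makes the Asian/Pacific Islander sum explicit and avoids transcription errors.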


In [12]:
intent = [d[3] for d in data]
races = [d[7] for d in data]
# Count gun suicides per race: only rows whose intent is "Suicide" are included.
suicide_race_counts = {}
for i, race in enumerate(races):
    if intent[i] == "Suicide":
        if race in suicide_race_counts:
            suicide_race_counts[race] += 1
        else:
            suicide_race_counts[race] = 1
# Convert the counts to suicides per 100,000 people of each race.
suicide_race_per_hundredk = {}
for r, v in suicide_race_counts.items():
    suicide_race_per_hundredk[r] = (v / mapping[r]) * 100000
print(suicide_race_per_hundredk)

{'Native American/Native Alaskan': 14.841532544673013, 'Hispanic': 7.106980451097149, 'White': 28.06217969245692, 'Asian/Pacific Islander': 4.705023152187416, 'Black': 8.278130270491385}
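
The same calculation can be wrapped in a function so that any intent (for example "Homicide") can be compared without copying the loop. This is a minimal sketch: `rates_by_intent` is a hypothetical helper, and the rows and populations below are made up so the arithmetic is easy to follow. With the real data you would pass `data` and `mapping` instead.

```python
def rates_by_intent(rows, intent, populations, per=100000):
    """Return deaths per `per` people for each race, restricted to one intent."""
    counts = {}
    for row in rows:
        if row[3] == intent:  # index 3 is the intent column
            race = row[7]     # index 7 is the race column
            counts[race] = counts.get(race, 0) + 1
    return {race: n / populations[race] * per for race, n in counts.items()}


# Hypothetical rows and populations, chosen only to keep the numbers simple.
toy_rows = [
    ["1", "2012", "01", "Suicide", "0", "M", "34", "White", "100", "Home", "4"],
    ["2", "2012", "01", "Homicide", "0", "M", "21", "Black", "100", "Street", "3"],
    ["3", "2012", "02", "Homicide", "0", "F", "40", "Black", "100", "Home", "2"],
    ["4", "2012", "02", "Homicide", "0", "M", "52", "White", "100", "Home", "4"],
]
toy_populations = {"White": 400000, "Black": 100000}

homicide_rates = rates_by_intent(toy_rows, "Homicide", toy_populations)
suicide_rates_toy = rates_by_intent(toy_rows, "Suicide", toy_populations)
print(homicide_rates)
print(suicide_rates_toy)
```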


## More findings
* Among homicides, Black people have by far the highest gun death rate per 100,000: over 10 times the rate for white people and about 4 times the rate for Hispanic people.
* Among suicides, white people have the highest gun death rate per 100,000: about 2 times the rate for Native Americans and over 3 times the rate for Black or Hispanic people.
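
The suicide ratios can be checked against the per-100,000 rates printed by the last code cell. The snippet below copies those printed values and divides the White rate by each other group's rate; `suicide_rates` and `ratio_to_white` are names chosen here for illustration.

```python
# Suicide rates per 100,000, copied from the output of the last code cell.
suicide_rates = {
    "Native American/Native Alaskan": 14.841532544673013,
    "Hispanic": 7.106980451097149,
    "White": 28.06217969245692,
    "Asian/Pacific Islander": 4.705023152187416,
    "Black": 8.278130270491385,
}

white_rate = suicide_rates["White"]
# How many times larger the White rate is than each other group's rate.
ratio_to_white = {
    race: white_rate / rate
    for race, rate in suicide_rates.items()
    if race != "White"
}
for race, ratio in sorted(ratio_to_white.items()):
    print(f"White rate is {ratio:.2f}x the {race} rate")
```

The White rate comes out a little under 2 times the Native American rate and between 3 and 4 times the Black and Hispanic rates.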

## Next steps
* Investigate whether and how age and education relate to gun deaths.
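
As a possible starting point for the age question, deaths could be grouped into ten-year age bands. The sketch below uses the first five rows printed earlier plus one made-up row with a non-numeric age, on the assumption that some age values may be missing. `age_bucket` and `count_by_age_bucket` are hypothetical helpers, not part of the analysis above.

```python
from collections import Counter


def age_bucket(age_text, width=10):
    """Map an age string such as '34' to a band such as '30-39'; None if not numeric."""
    if not age_text.isdigit():
        return None
    low = (int(age_text) // width) * width
    return f"{low}-{low + width - 1}"


def count_by_age_bucket(rows, age_index=6):
    """Count rows per age band, skipping rows whose age is not a plain integer."""
    buckets = (age_bucket(row[age_index]) for row in rows)
    return Counter(b for b in buckets if b is not None)


# First five rows from the output above, plus one hypothetical row with a missing age.
first_rows = [
    ["1", "2012", "01", "Suicide", "0", "M", "34", "Asian/Pacific Islander", "100", "Home", "4"],
    ["2", "2012", "01", "Suicide", "0", "F", "21", "White", "100", "Street", "3"],
    ["3", "2012", "01", "Suicide", "0", "M", "60", "White", "100", "Other specified", "4"],
    ["4", "2012", "02", "Suicide", "0", "M", "64", "White", "100", "Home", "4"],
    ["5", "2012", "02", "Suicide", "0", "M", "31", "White", "100", "Other specified", "2"],
    ["x", "2012", "02", "Suicide", "0", "M", "NA", "White", "100", "Home", "2"],  # made up
]

age_counts = count_by_age_bucket(first_rows)
print(sorted(age_counts.items()))
```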