# Studying US Gun Deaths

## Introducing the dataset

The dataset is stored in the file **guns.csv** and can be found __[here](https://github.com/fivethirtyeight/guns-data)__.

It contains information on gun deaths in the US from 2012 to 2014. Each row in the dataset represents a fatality. The columns contain demographic and other information about the victim.

In [16]:
import csv

# Read the whole file into a list of rows; the first row holds the column names.
with open('guns.csv', 'r') as f:
    data = list(csv.reader(f))
print(data[0:4])

[['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education'], ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4']]


In [17]:
headers = data[0]
data = data[1:]
print(headers)
print(data[0:4])

['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education']
[['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4']]


In [18]:
years = [lst[1] for lst in data]

year_counts = {}
for each_year in years:
    if each_year in year_counts:
        year_counts[each_year] = year_counts[each_year] + 1
    else:
        year_counts[each_year] = 1
year_counts

{'2012': 33563, '2013': 33636, '2014': 33599}

In [19]:
import datetime

# The dataset records only year and month, so every date is pinned to day 1.
dates = [datetime.datetime(year=int(row[1]), month=int(row[2]), day=1) for row in data]

dates[0:4]

[datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0)]

In [20]:
date_counts = {}
for date in dates:
    if date in date_counts:
        date_counts[date] = date_counts[date] + 1
    else:
        date_counts[date] = 1
date_counts

{datetime.datetime(2012, 1, 1, 0, 0): 2758,
 datetime.datetime(2012, 2, 1, 0, 0): 2357,
 datetime.datetime(2012, 3, 1, 0, 0): 2743,
 datetime.datetime(2012, 4, 1, 0, 0): 2795,
 datetime.datetime(2012, 5, 1, 0, 0): 2999,
 datetime.datetime(2012, 6, 1, 0, 0): 2826,
 datetime.datetime(2012, 7, 1, 0, 0): 3026,
 datetime.datetime(2012, 8, 1, 0, 0): 2954,
 datetime.datetime(2012, 9, 1, 0, 0): 2852,
 datetime.datetime(2012, 10, 1, 0, 0): 2733,
 datetime.datetime(2012, 11, 1, 0, 0): 2729,
 datetime.datetime(2012, 12, 1, 0, 0): 2791,
 datetime.datetime(2013, 1, 1, 0, 0): 2864,
 datetime.datetime(2013, 2, 1, 0, 0): 2375,
 datetime.datetime(2013, 3, 1, 0, 0): 2862,
 datetime.datetime(2013, 4, 1, 0, 0): 2798,
 datetime.datetime(2013, 5, 1, 0, 0): 2806,
 datetime.datetime(2013, 6, 1, 0, 0): 2920,
 datetime.datetime(2013, 7, 1, 0, 0): 3079,
 datetime.datetime(2013, 8, 1, 0, 0): 2859,
 datetime.datetime(2013, 9, 1, 0, 0): 2742,
 datetime.datetime(2013, 10, 1, 0, 0): 2808,
 datetime.datetime(2013, 11,

In [21]:
sexes = [lst[5] for lst in data]

sex_counts = {}
for each_sex in sexes:
    if each_sex in sex_counts:
        sex_counts[each_sex] = sex_counts[each_sex] + 1
    else:
        sex_counts[each_sex] = 1
sex_counts

{'F': 14449, 'M': 86349}

In [22]:
races = [lst[7] for lst in data]

race_counts = {}
for each_race in races:
    if each_race in race_counts:
        race_counts[each_race] = race_counts[each_race] + 1
    else:
        race_counts[each_race] = 1
race_counts

{'Asian/Pacific Islander': 1326,
 'Black': 23296,
 'Hispanic': 9022,
 'Native American/Native Alaskan': 917,
 'White': 66237}

1. The number of gun deaths per year stayed roughly constant over the 3 years, at about 33,600 per year.
2. In 2012, monthly gun deaths were similar from month to month, with the highest count in July. The same holds for 2013. In both years the July peak exceeded 3,000 deaths. 2014 had a somewhat constant number of deaths per month.
3. Over these 3 years, about six times as many men as women died from guns (86,349 versus 14,449).
4. White Americans had the highest number of deaths (66,237), followed by Black Americans (23,296). Native Americans had the fewest gun deaths (917).
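The ratio in point 3 and the July peak in point 2 can be checked directly from the counts printed above. The sketch below hard-codes those printed values instead of re-reading guns.csv, so the names `sex_counts_printed` and `monthly_2012` are illustrative only.

```python
# Values copied from the outputs printed earlier in this notebook.
sex_counts_printed = {'F': 14449, 'M': 86349}
monthly_2012 = {
    1: 2758, 2: 2357, 3: 2743, 4: 2795, 5: 2999, 6: 2826,
    7: 3026, 8: 2954, 9: 2852, 10: 2733, 11: 2729, 12: 2791,
}

# Ratio of male to female deaths over the three years.
male_to_female = sex_counts_printed['M'] / sex_counts_printed['F']

# Month of 2012 with the most deaths (dict keys are month numbers).
peak_month_2012 = max(monthly_2012, key=monthly_2012.get)

print(round(male_to_female, 2))
print(peak_month_2012, monthly_2012[peak_month_2012])
```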

Point 4 can be misleading, because White Americans make up the majority of the population. Instead of comparing the raw number of deaths by race, the deaths for each race should be studied in proportion to that race's population.
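One way to make that comparison is a rate per 100,000 people: divide a group's death count by that group's population and multiply by 100,000. A minimal sketch, where `rate_per_100k` is a hypothetical helper name and the two inputs are the White death count and the 2010 census population that appear elsewhere in this notebook:

```python
def rate_per_100k(deaths, population):
    """Return deaths per 100,000 people in a group."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths / population * 100000

# 66,237 deaths recorded as White over 2012-2014;
# 197,318,956 people counted as White in the 2010 census.
white_rate = rate_per_100k(66237, 197318956)
print(round(white_rate, 2))
```

Because the deaths span three years while the population is a single 2010 snapshot, the result is a three-year rate rather than an annual one.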

In [23]:
import csv

# Load the census table; row 0 is the header and row 1 holds the 2010 population counts.
with open('census.csv', 'r') as f2:
    census = list(csv.reader(f2))
census

[['Id',
  'Year',
  'Id',
  'Sex',
  'Id',
  'Hispanic Origin',
  'Id',
  'Id2',
  'Geography',
  'Total',
  'Race Alone - White',
  'Race Alone - Hispanic',
  'Race Alone - Black or African American',
  'Race Alone - American Indian and Alaska Native',
  'Race Alone - Asian',
  'Race Alone - Native Hawaiian and Other Pacific Islander',
  'Two or More Races'],
 ['cen42010',
  'April 1, 2010 Census',
  'totsex',
  'Both Sexes',
  'tothisp',
  'Total',
  '0100000US',
  '',
  'United States',
  '308745538',
  '197318956',
  '44618105',
  '40250635',
  '3739506',
  '15159516',
  '674625',
  '6984195']]

In [27]:
# Map each race label used in guns.csv to its population in the census row.
# The census splits Asian and Pacific Islander into two columns, so they are summed.
mapping = {
    "Asian/Pacific Islander": int(census[1][14]) + int(census[1][15]),
    "Black": int(census[1][12]),
    "Native American/Native Alaskan": int(census[1][13]),
    "Hispanic": int(census[1][11]),
    "White": int(census[1][10]),
}

# Gun deaths per 100,000 people of each race, summed over 2012-2014.
race_per_hundredk = {}
for key, value in race_counts.items():
    race_per_hundredk[key] = (value / mapping[key]) * 100000

race_per_hundredk

{'Asian/Pacific Islander': 8.374309664161762,
 'Black': 57.8773477735196,
 'Hispanic': 20.220491210910907,
 'Native American/Native Alaskan': 24.521955573811088,
 'White': 33.56849303419181}

In [34]:
intent = [row[3] for row in data]
races = [row[7] for row in data]

# Count only the deaths whose intent is "Homicide", grouped by race.
homicide_race_counts = {}
for i, race in enumerate(races):
    if race not in homicide_race_counts:
        homicide_race_counts[race] = 0
    if intent[i] == "Homicide":
        homicide_race_counts[race] += 1

# Homicides per 100,000 people of each race (this overwrites the all-intent rates).
race_per_hundredk = {}
for key, value in homicide_race_counts.items():
    race_per_hundredk[key] = (value / mapping[key]) * 100000

race_per_hundredk

{'Asian/Pacific Islander': 3.530346230970155,
 'Black': 48.471284987180944,
 'Hispanic': 12.627161104219914,
 'Native American/Native Alaskan': 8.717729026240365,
 'White': 4.6356417981453335}