## US Gun Deaths Data Set

[Original article by FiveThirtyEight about Guns](http://fivethirtyeight.com/features/gun-deaths/)

The data set contains cleaned gun-death data from the CDC for 2012-2014.

### Assignment

- Import the `csv` module
- Read `guns.csv` into a list of lists
- Preview the first 5 entries

In [1]:
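import csv

def csv2list(file):
    """Read a CSV file into a list of rows, converting all-digit fields to int."""
    list_of_lists = []
    # newline="" is the setting the csv module documentation recommends for file objects
    with open(file, newline="") as f:
        for row in csv.reader(f):
            # Fields that are not purely digits (e.g. 'Suicide') stay as strings
            list_of_lists.append([int(v) if v.isdigit() else v for v in row])
    return list_of_lists

lol = csv2list("guns.csv")
lol[:5]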

[['',
  'year',
  'month',
  'intent',
  'police',
  'sex',
  'age',
  'race',
  'hispanic',
  'place',
  'education'],
 [1, 2012, 1, 'Suicide', 0, 'M', 34, 'Asian/Pacific Islander', 100, 'Home', 4],
 [2, 2012, 1, 'Suicide', 0, 'F', 21, 'White', 100, 'Street', 3],
 [3, 2012, 1, 'Suicide', 0, 'M', 60, 'White', 100, 'Other specified', 4],
 [4, 2012, 2, 'Suicide', 0, 'M', 64, 'White', 100, 'Home', 4]]
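
The conversion step can be checked without the real file. The sketch below runs the same digit-to-int logic over a few rows copied from the preview above, using `io.StringIO` in place of a file on disk. The helper name `parse_rows` and the inline `SAMPLE` text are assumptions made for illustration; they are not part of the original notebook.

```python
import csv
import io

# Hypothetical in-memory sample; rows copied from the preview above.
SAMPLE = """,year,month,intent,police,sex,age,race,hispanic,place,education
1,2012,1,Suicide,0,M,34,Asian/Pacific Islander,100,Home,4
2,2012,1,Suicide,0,F,21,White,100,Street,3
"""

def parse_rows(text):
    """Parse CSV text into a list of rows, turning all-digit fields into ints."""
    reader = csv.reader(io.StringIO(text))
    return [[int(v) if v.isdigit() else v for v in row] for row in reader]

rows = parse_rows(SAMPLE)
print(rows[0])  # header row, still all strings
print(rows[1])  # first data row, numeric fields converted
```

Note that `str.isdigit()` is false for negative numbers and decimals, so fields like those would be left as strings by this approach.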

### Assignment

- Remove the header row from the list of lists
- Save it to a separate list

In [2]:
header_list = lol[0]
header_list

['',
 'year',
 'month',
 'intent',
 'police',
 'sex',
 'age',
 'race',
 'hispanic',
 'place',
 'education']

In [3]:
del lol[0]
lol[:5]

[[1, 2012, 1, 'Suicide', 0, 'M', 34, 'Asian/Pacific Islander', 100, 'Home', 4],
 [2, 2012, 1, 'Suicide', 0, 'F', 21, 'White', 100, 'Street', 3],
 [3, 2012, 1, 'Suicide', 0, 'M', 60, 'White', 100, 'Other specified', 4],
 [4, 2012, 2, 'Suicide', 0, 'M', 64, 'White', 100, 'Home', 4],
 [5, 2012, 2, 'Suicide', 0, 'M', 31, 'White', 100, 'Other specified', 2]]
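
One thing to watch with `del lol[0]`: running that cell a second time removes the first data row as well. A possible alternative, sketched here on a hypothetical miniature list named `data`, is star-unpacking, which separates the header from the records without mutating the original list.

```python
# Hypothetical miniature of the list of lists built earlier.
data = [
    ['', 'year', 'month', 'intent'],
    [1, 2012, 1, 'Suicide'],
    [2, 2012, 1, 'Suicide'],
]

# Star-unpacking splits the first row from the rest in one statement
# and leaves `data` itself untouched, so re-running it is harmless.
header, *records = data

print(header)
print(len(records))
```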

### Assignment

- Count the number of gun deaths by year
    - It may help to do a list comprehension to get the years
    - Iterate over the years with a dictionary to keep count
    
    

In [4]:
year_list = set([x[1] for x in lol])
deaths_by_year = {}

for year in year_list:
    deaths_by_year[year] = 0
   
for each in lol:
    deaths_by_year[each[1]] += 1

deaths_by_year

{2012: 33563, 2013: 33636, 2014: 33599}
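
The initialise-then-increment pattern above can also be written with `collections.Counter` from the standard library. This is a minimal sketch on hypothetical sample rows (the name `sample` and its contents are made up for illustration); on the real list you would pass `lol` instead.

```python
from collections import Counter

# Hypothetical sample rows; index 1 holds the year, as in the data set.
sample = [
    [1, 2012, 1, 'Suicide'],
    [2, 2012, 2, 'Homicide'],
    [3, 2013, 1, 'Suicide'],
]

# Counter tallies each distinct year as it iterates, so no
# separate pass is needed to set the counts to zero first.
year_counts = Counter(row[1] for row in sample)
print(dict(year_counts))
```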

### Assignment

- Import the datetime library
- Create a new list called "dates" with values from the data (set all the day values to 1)    
- Count the number of gun deaths by month and year



In [57]:
from datetime import datetime

# Index 1 is the year and index 2 is the month; the day is fixed at 1
dates = [datetime(year=each[1], month=each[2], day=1) for each in lol]

print(dates[0])

2012-01-01 00:00:00


In [62]:
date_counts = {}

for date in dates:
    if date not in date_counts:
        date_counts[date] = 0
    
    date_counts[date] += 1   

date_counts

{datetime.datetime(2012, 1, 1, 0, 0): 2758,
 datetime.datetime(2012, 2, 1, 0, 0): 2357,
 datetime.datetime(2012, 3, 1, 0, 0): 2743,
 datetime.datetime(2012, 4, 1, 0, 0): 2795,
 datetime.datetime(2012, 5, 1, 0, 0): 2999,
 datetime.datetime(2012, 6, 1, 0, 0): 2826,
 datetime.datetime(2012, 7, 1, 0, 0): 3026,
 datetime.datetime(2012, 8, 1, 0, 0): 2954,
 datetime.datetime(2012, 9, 1, 0, 0): 2852,
 datetime.datetime(2012, 10, 1, 0, 0): 2733,
 datetime.datetime(2012, 11, 1, 0, 0): 2729,
 datetime.datetime(2012, 12, 1, 0, 0): 2791,
 datetime.datetime(2013, 1, 1, 0, 0): 2864,
 datetime.datetime(2013, 2, 1, 0, 0): 2375,
 datetime.datetime(2013, 3, 1, 0, 0): 2862,
 datetime.datetime(2013, 4, 1, 0, 0): 2798,
 datetime.datetime(2013, 5, 1, 0, 0): 2806,
 datetime.datetime(2013, 6, 1, 0, 0): 2920,
 datetime.datetime(2013, 7, 1, 0, 0): 3079,
 datetime.datetime(2013, 8, 1, 0, 0): 2859,
 datetime.datetime(2013, 9, 1, 0, 0): 2742,
 datetime.datetime(2013, 10, 1, 0, 0): 2808,
 datetime.datetime(2013, 11,
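
A dictionary keyed by `datetime` objects is awkward to read in raw form. As a rough sketch, the counts can be tallied with `collections.Counter` and printed in chronological order with a compact `YYYY-MM` label. The `sample` rows below are hypothetical stand-ins for the real data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows: [id, year, month].
sample = [[1, 2012, 1], [2, 2012, 1], [3, 2012, 2], [4, 2013, 1]]

# One datetime per row, with the day fixed at 1.
sample_dates = [datetime(year=row[1], month=row[2], day=1) for row in sample]
monthly = Counter(sample_dates)

# datetime objects sort chronologically, so sorted() gives a timeline.
for date in sorted(monthly):
    print(date.strftime("%Y-%m"), monthly[date])
```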

### Assignment

- Find the number of gun deaths by Sex
- Find the number of gun deaths by Race
- How does this compare to the overall population in the US?

In [37]:
mapping = { "Asian/Pacific Islander": 15159516 + 674625, 
           "Native American/Native Alaskan": 3739506, 
           "Black": 40250635, "Hispanic": 44618105, "White": 197318956 }

In [40]:
deaths_by_sex = {}
deaths_by_sex['M'] = 0
deaths_by_sex['F'] = 0
   
for each in lol:
    deaths_by_sex[each[5]] += 1

print(deaths_by_sex)

deaths_by_race = {}
race_list = set([x[7] for x in lol])

for race in race_list:
    deaths_by_race[race] = 0
   
for each in lol:
    deaths_by_race[each[7]] += 1
    
print(deaths_by_race)

total_pop = sum(mapping.values())

print('Percentage of gun deaths by race of total population:')
for z in deaths_by_race:
    percent = deaths_by_race[z]*100/total_pop
    print(z + ' : ' + str(percent))

{'M': 86349, 'F': 14449}
{'Native American/Native Alaskan': 917, 'Black': 23296, 'White': 66237, 'Hispanic': 9022, 'Asian/Pacific Islander': 1326}
Percentage of gun deaths by race of total population:
Native American/Native Alaskan : 0.0003038825287836819
Black : 0.007720008059481628
White : 0.021950127654356312
Hispanic : 0.002989779906964425
Asian/Pacific Islander : 0.00043942010159995874


### Assignment

- Reuse the data structure counting deaths by race
- Use the dictionary below that has the actual population of each race
- Compute the rates of gun deaths per race per 100,000 people

mapping = {
    "Asian/Pacific Islander": 15159516 + 674625,
    "Native American/Native Alaskan": 3739506,
    "Black": 40250635,
    "Hispanic": 44618105,
    "White": 197318956
}

In [45]:
print('Gun deaths by race per 100,000 people:')
for race in deaths_by_race:
    # Divide by that race's own population, not the combined total
    rate = deaths_by_race[race] * 100000 / mapping[race]
    print(race + ' : ' + str(round(rate, 2)))

Gun deaths by race per 100,000 people:
Native American/Native Alaskan : 24.52
Black : 57.88
White : 33.57
Hispanic : 20.22
Asian/Pacific Islander : 8.37
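
A per-group rate divides each race's death count by that race's own population in `mapping`, then scales to 100,000 people. One way to make that reusable is a small helper; the function name `rate_per_100k` is an assumption for this sketch, and the input values are copied from the cells above.

```python
def rate_per_100k(counts, populations):
    """Return {group: events per 100,000 people in that group}."""
    return {
        group: counts[group] * 100000 / populations[group]
        for group in counts
    }

# Values copied from the cells above.
deaths_by_race = {
    "Native American/Native Alaskan": 917,
    "Black": 23296,
    "White": 66237,
    "Hispanic": 9022,
    "Asian/Pacific Islander": 1326,
}
mapping = {
    "Asian/Pacific Islander": 15159516 + 674625,
    "Native American/Native Alaskan": 3739506,
    "Black": 40250635,
    "Hispanic": 44618105,
    "White": 197318956,
}

rates = rate_per_100k(deaths_by_race, mapping)
for race, rate in rates.items():
    print(f"{race} : {rate:.2f}")
```

The same helper can be called with any other count dictionary that shares these keys, such as a homicide-only tally.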


### Assignment

You may not know this, but over half of all gun deaths are suicides.

- Redo the computation of rates of gun deaths per race per 100,000 people
- This time only count those whose intent is "Homicide"
- How do these rates differ from the previous calculation?


In [47]:
# Keep only the rows whose intent (index 3) is 'Homicide'
homicides = [each for each in lol if each[3] == 'Homicide']

homicides_by_race = {}
for race in race_list:
    homicides_by_race[race] = 0

for each in homicides:
    homicides_by_race[each[7]] += 1

print('Homicide gun deaths by race per 100,000 people:')
for race in homicides_by_race:
    # Divide by that race's own population, not the combined total
    rate = homicides_by_race[race] * 100000 / mapping[race]
    print(race + ' : ' + str(round(rate, 2)))

Homicide gun deaths by race per 100,000 people:
Native American/Native Alaskan : 8.72
Black : 48.47
White : 4.64
Hispanic : 12.63
Asian/Pacific Islander : 3.53
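
The filter and the count can also be combined into a single pass. The following is a minimal sketch on hypothetical sample rows (the `sample` list is invented for illustration), using a generator expression with a condition inside `collections.Counter`.

```python
from collections import Counter

# Hypothetical sample rows: index 3 is intent, index 7 is race.
sample = [
    [1, 2012, 1, 'Suicide', 0, 'M', 34, 'Asian/Pacific Islander'],
    [2, 2012, 1, 'Homicide', 0, 'F', 21, 'White'],
    [3, 2012, 1, 'Homicide', 0, 'M', 60, 'Black'],
    [4, 2012, 2, 'Homicide', 0, 'M', 64, 'Black'],
]

# Only rows with intent 'Homicide' reach the counter.
homicide_counts = Counter(row[7] for row in sample if row[3] == 'Homicide')
print(dict(homicide_counts))
```

A `Counter` returns 0 when asked for a missing key, but a race with no homicides will not appear when iterating over it, so the explicit zero-initialisation used above is still useful when every race must be listed.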
