## US Gun Deaths Data Set

[Original article by FiveThirtyEight about Guns](http://fivethirtyeight.com/features/gun-deaths/)

The data set contains cleaned gun-death data from the CDC for 2012-2014.

### Assignment

- Import the csv
- Read it into a list
- Preview the first 5 entries

In [20]:
import csv

def read_input(csvfilename):
    with open(csvfilename, newline='') as csvfile:
        guns_data_raw = csv.DictReader(csvfile)
        headers_raw = guns_data_raw.fieldnames
        headers_list = ['id' if h == '' else h for h in headers_raw]
        headers = {}
        for i, header in enumerate(headers_list):
            headers[header] = i
        guns_data = list()
        for each_row_dict in guns_data_raw:
            each_row = {}
            for key, value in each_row_dict.items():
                if key == '':
                    key = 'id'
                each_row[key] = value
            guns_data.append(each_row)
        for i, data in enumerate(guns_data):
            if i == 10:
                break
            print("Row[{}] = {}".format(i, data))
        return headers, guns_data

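# read_input returns a (headers, rows) tuple: guns_data[0] is the header map, guns_data[1] the rows
guns_data = read_input('guns.csv')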

Row[0] = {'education': '4', 'hispanic': '100', 'month': '01', 'sex': 'M', 'intent': 'Suicide', 'id': '1', 'year': '2012', 'police': '0', 'place': 'Home', 'age': '34', 'race': 'Asian/Pacific Islander'}
Row[1] = {'education': '3', 'hispanic': '100', 'month': '01', 'sex': 'F', 'intent': 'Suicide', 'id': '2', 'year': '2012', 'police': '0', 'place': 'Street', 'age': '21', 'race': 'White'}
Row[2] = {'education': '4', 'hispanic': '100', 'month': '01', 'sex': 'M', 'intent': 'Suicide', 'id': '3', 'year': '2012', 'police': '0', 'place': 'Other specified', 'age': '60', 'race': 'White'}
Row[3] = {'education': '4', 'hispanic': '100', 'month': '02', 'sex': 'M', 'intent': 'Suicide', 'id': '4', 'year': '2012', 'police': '0', 'place': 'Home', 'age': '64', 'race': 'White'}
Row[4] = {'education': '2', 'hispanic': '100', 'month': '02', 'sex': 'M', 'intent': 'Suicide', 'id': '5', 'year': '2012', 'police': '0', 'place': 'Other specified', 'age': '31', 'race': 'White'}
Row[5] = {'education': '1', 'hispanic':

### Assignment

- Remove the header row from the list of lists
- Save it to a separate list

In [21]:
headers = guns_data[0]  # header name -> column index, kept separately from the rows
print(headers)

{'education': 10, 'sex': 5, 'police': 4, 'intent': 3, 'hispanic': 8, 'id': 0, 'year': 1, 'month': 2, 'place': 9, 'age': 6, 'race': 7}
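The cell above keeps the header names as a dictionary because the rows were read with `csv.DictReader`. The assignment is worded in terms of a list of lists, so here is a minimal sketch of that variant using `csv.reader`. The three-line sample is hypothetical and stands in for `guns.csv`; it mimics the file's unnamed first column.

```python
import csv
import io

# Hypothetical sample standing in for guns.csv; the first column has no header name.
sample = io.StringIO(
    ',year,month,intent\n'
    '1,2012,01,Suicide\n'
    '2,2012,01,Suicide\n'
)

rows = list(csv.reader(sample))     # list of lists, header row included
headers, data = rows[0], rows[1:]   # split the header row from the data rows
headers[0] = headers[0] or 'id'     # give the unnamed first column a name

print(headers)
print(data[:2])
```

Slicing with `rows[1:]` leaves `rows` untouched, so the header row is still available if it is needed later.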


### Assignment

- Count the number of gun deaths by year
    - It may help to do a list comprehension to get the years
    - Iterate over the years with a dictionary to keep count
    
    

In [47]:
def gun_deaths_by_key(hkey, guns_data_all, pred=None):
    gun_deaths = {}
    guns_data = guns_data_all[1]
    for each_row in guns_data:
        if pred is None or pred(each_row):
            value = each_row[hkey]
            # dict.get supplies 0 the first time a value is seen
            gun_deaths[value] = gun_deaths.get(value, 0) + 1
            
    for key, value in gun_deaths.items():
        print("Input key:{} [{}]:{}".format(hkey, key, value))
    return gun_deaths    

gun_deaths_by_year = gun_deaths_by_key('year', guns_data)

Input key:year [2014]:33599
Input key:year [2012]:33563
Input key:year [2013]:33636
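The same tally can be written with `collections.Counter` from the standard library, which is the approach the list-comprehension hint points toward. This is a sketch on hypothetical sample rows, not the notebook's data; `count_by_key` is an illustrative name.

```python
from collections import Counter

# Hypothetical rows in the same dict-per-row shape the notebook uses.
rows = [
    {'year': '2012', 'intent': 'Suicide'},
    {'year': '2012', 'intent': 'Homicide'},
    {'year': '2013', 'intent': 'Suicide'},
]

def count_by_key(key, rows, pred=None):
    """Count rows per value of `key`, keeping only rows where pred(row) is true."""
    return Counter(row[key] for row in rows if pred is None or pred(row))

by_year = count_by_key('year', rows)
suicides_by_year = count_by_key('year', rows, lambda row: row['intent'] == 'Suicide')

print(sorted(by_year.items()))
print(sorted(suicides_by_year.items()))
```

A `Counter` is a `dict` subclass, so the result can be passed anywhere the plain dictionary was used.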


### Assignment

- Import the datetime library
- Create a new list called "dates" with values from the data (set all the day values to 1)    
- Count the number of gun deaths by month and year



In [36]:
from datetime import date

def gun_deaths_by_date(guns_data_all):
    gun_deaths = {}
    guns_data = guns_data_all[1]
    month_key = 'month'
    year_key = 'year'
    day = 1
    for each_row in guns_data:
        curr_date = date(int(each_row[year_key]), int(each_row[month_key]), day)
        # dict.get supplies 0 the first time a month is seen
        gun_deaths[curr_date] = gun_deaths.get(curr_date, 0) + 1
            
    for key, value in gun_deaths.items():
        print("[{}]:{}".format(key, value))
    return gun_deaths 

gun_deaths_by_month = gun_deaths_by_date(guns_data)

[2012-02-01]:2357
[2014-08-01]:2970
[2013-10-01]:2808
[2014-11-01]:2756
[2013-08-01]:2859
[2013-11-01]:2758
[2014-05-01]:2864
[2014-10-01]:2865
[2012-05-01]:2999
[2012-09-01]:2852
[2012-08-01]:2954
[2012-12-01]:2791
[2014-02-01]:2361
[2013-12-01]:2765
[2014-04-01]:2862
[2013-02-01]:2375
[2013-03-01]:2862
[2013-07-01]:3079
[2013-06-01]:2920
[2012-10-01]:2733
[2014-03-01]:2684
[2012-01-01]:2758
[2014-06-01]:2931
[2012-03-01]:2743
[2014-12-01]:2857
[2012-11-01]:2729
[2012-07-01]:3026
[2014-09-01]:2914
[2014-01-01]:2651
[2013-05-01]:2806
[2012-06-01]:2826
[2014-07-01]:2884
[2013-04-01]:2798
[2013-09-01]:2742
[2013-01-01]:2864
[2012-04-01]:2795
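The monthly totals print in no particular order, which makes trends hard to read. Since `date` objects compare chronologically, sorting the keys is enough. The sketch below uses three of the totals from the output above.

```python
from datetime import date

# Three of the monthly totals printed above, in the order they appeared.
gun_deaths = {
    date(2012, 2, 1): 2357,
    date(2014, 8, 1): 2970,
    date(2013, 10, 1): 2808,
}

# sorted() on (date, count) pairs orders them by date first.
ordered = sorted(gun_deaths.items())

for month_start, count in ordered:
    print("[{}]:{}".format(month_start.strftime("%Y-%m"), count))
```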


### Assignment

- Find the number of gun deaths by Sex
- Find the number of gun deaths by Race
- How does this compare to the overall population in the US?

In [30]:
gun_deaths_by_sex = gun_deaths_by_key('sex', guns_data)
gun_deaths_by_race = gun_deaths_by_key('race', guns_data)

Input key:sex [M]:86349
Input key:sex [F]:14449
Input key:race [Black]:23296
Input key:race [Native American/Native Alaskan]:917
Input key:race [White]:66237
Input key:race [Hispanic]:9022
Input key:race [Asian/Pacific Islander]:1326
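One way to approach the population question is to compare each group's share of gun deaths with its share of the population. The sketch below copies the counts from the output above and borrows the population figures that the notebook supplies in the next assignment. Population shares are therefore relative to the five listed groups only, which is an assumption of this sketch. The notebook has no population figures by sex, so only the shares of deaths are computed for sex.

```python
# Counts copied from the output above.
deaths_by_sex = {'M': 86349, 'F': 14449}
deaths_by_race = {
    'Black': 23296,
    'Native American/Native Alaskan': 917,
    'White': 66237,
    'Hispanic': 9022,
    'Asian/Pacific Islander': 1326,
}
# Population figures as given by the notebook in the next assignment.
population = {
    'Asian/Pacific Islander': 15159516 + 674625,
    'Native American/Native Alaskan': 3739506,
    'Black': 40250635,
    'Hispanic': 44618105,
    'White': 197318956,
}

def shares(counts):
    """Return each key's percentage of the total of all values."""
    total = sum(counts.values())
    return {key: 100.0 * value / total for key, value in counts.items()}

sex_share = shares(deaths_by_sex)
death_share = shares(deaths_by_race)
pop_share = shares(population)

for sex, share in sex_share.items():
    print("{}: {:.1f}% of deaths".format(sex, share))
for race in deaths_by_race:
    print("{}: {:.1f}% of deaths, {:.1f}% of population".format(
        race, death_share[race], pop_share[race]))
```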


### Assignment

- Reuse the data structure counting deaths by race
- Use the dictionary below that has the actual population of each race
- Compute the rates of gun deaths per race per 100,000 people

mapping = {
    "Asian/Pacific Islander": 15159516 + 674625,
    "Native American/Native Alaskan": 3739506,
    "Black": 40250635,
    "Hispanic": 44618105,
    "White": 197318956
}

In [45]:
mapping = { "Asian/Pacific Islander": 15159516 + 674625, 
            "Native American/Native Alaskan": 3739506, 
            "Black": 40250635, 
            "Hispanic": 44618105, 
            "White": 197318956 
           }
gun_deaths_by_race = gun_deaths_by_key('race', guns_data)
gun_deaths_by_race_percent = {}
for key, value in gun_deaths_by_race.items():
    deaths_per_100k = int((float(value)/mapping[key])*1e5)
    gun_deaths_by_race_percent[key] = deaths_per_100k
    print("[{}] = {}".format(key, deaths_per_100k))
    

Input key:race [Black]:23296
Input key:race [Native American/Native Alaskan]:917
Input key:race [White]:66237
Input key:race [Hispanic]:9022
Input key:race [Asian/Pacific Islander]:1326
[Black] = 57
[Native American/Native Alaskan] = 24
[White] = 33
[Hispanic] = 20
[Asian/Pacific Islander] = 8
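Two details of the rates above are worth spelling out. `int()` truncates rather than rounds, and the death counts span all three years of the data set while the population figures are single snapshots. The sketch below keeps one decimal place and also shows an annualized rate. Dividing by three assumes the data covers exactly 2012, 2013 and 2014, as the data set description states, and that the population was roughly constant over that period.

```python
# Counts and population figures copied from the notebook.
deaths_by_race = {
    'Black': 23296,
    'Native American/Native Alaskan': 917,
    'White': 66237,
    'Hispanic': 9022,
    'Asian/Pacific Islander': 1326,
}
population = {
    'Asian/Pacific Islander': 15159516 + 674625,
    'Native American/Native Alaskan': 3739506,
    'Black': 40250635,
    'Hispanic': 44618105,
    'White': 197318956,
}
YEARS_COVERED = 3  # assumption: 2012, 2013 and 2014

# Deaths per 100,000 people over the whole period, without truncation.
rates = {race: deaths / population[race] * 1e5
         for race, deaths in deaths_by_race.items()}
# Average deaths per 100,000 people per year.
annual_rates = {race: rate / YEARS_COVERED for race, rate in rates.items()}

for race in rates:
    print("[{}] = {:.1f} over the period, {:.1f} per year".format(
        race, rates[race], annual_rates[race]))
```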


### Assignment

You may not know this, but over half of all gun deaths are suicides.

- Redo the computation of rates of gun deaths per race per 100,000 people
- This time, count only the deaths whose intent is "Homicide"
- How do these rates differ from the previous calculation?


In [58]:
def filter_by_homicide(each_row):
    # Rows without an 'intent' field count as non-homicides
    return each_row.get('intent', '').strip() == 'Homicide'
    
gun_deaths_by_race = gun_deaths_by_key('race', guns_data, filter_by_homicide)
gun_deaths_by_race_percent = {}
for key, value in gun_deaths_by_race.items():
    deaths_per_100k = int((float(value)/mapping[key])*1e5)
    gun_deaths_by_race_percent[key] = deaths_per_100k
    print("[{}]-Homicide = {}".format(key, deaths_per_100k))
        

Input key:race [Black]:19510
Input key:race [Native American/Native Alaskan]:326
Input key:race [White]:9147
Input key:race [Hispanic]:5634
Input key:race [Asian/Pacific Islander]:559
[Black]-Homicide = 48
[Native American/Native Alaskan]-Homicide = 8
[White]-Homicide = 4
[Hispanic]-Homicide = 12
[Asian/Pacific Islander]-Homicide = 3
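To answer the comparison question directly, it may help to compute what fraction of each group's gun deaths were homicides. The sketch below uses only the counts printed in the two outputs above; the variable names are illustrative.

```python
# Counts copied from the two outputs above.
all_deaths = {
    'Black': 23296,
    'Native American/Native Alaskan': 917,
    'White': 66237,
    'Hispanic': 9022,
    'Asian/Pacific Islander': 1326,
}
homicides = {
    'Black': 19510,
    'Native American/Native Alaskan': 326,
    'White': 9147,
    'Hispanic': 5634,
    'Asian/Pacific Islander': 559,
}

# Fraction of each group's gun deaths that were homicides.
homicide_share = {race: homicides[race] / all_deaths[race] for race in all_deaths}
overall_share = sum(homicides.values()) / sum(all_deaths.values())

# Print groups from the highest homicide share to the lowest.
for race, share in sorted(homicide_share.items(), key=lambda item: item[1], reverse=True):
    print("[{}] homicide share of gun deaths = {:.1%}".format(race, share))
print("Overall homicide share = {:.1%}".format(overall_share))
```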
