## US Gun Deaths Data Set

[Original article by FiveThirtyEight about Guns](http://fivethirtyeight.com/features/gun-deaths/)

The data set contains cleaned gun-death data from the CDC for 2012-2014.

### Assignment

- Import the csv module
- Read guns.csv into a list of lists
- Preview the first 5 entries

In [13]:
import csv

# Read every row of the CSV (header included) into a list of lists.
with open('guns.csv', 'r') as f:
    reader = csv.reader(f)
    data = list(reader)

print(data[:5])

[['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education'], ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4']]


### Assignment

- Remove the header row from the list of lists
- Save it to a separate list

In [14]:
headers = data[:1]
data = data[1:]

print(headers)

[['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education']]
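
As an alternative to slicing, the header can be pulled off while the file is being read. The sketch below assumes the file looks like the preview above: `sample` is a hand-typed stand-in for guns.csv (the real file may quote its fields differently), and `header_row` and `rows` are illustrative names that do not appear in the notebook.

```python
import csv
import io

# Hand-typed stand-in for guns.csv, based on the preview above (assumption).
sample = (
    ",year,month,intent,police,sex,age,race,hispanic,place,education\n"
    "1,2012,01,Suicide,0,M,34,Asian/Pacific Islander,100,Home,4\n"
    "2,2012,01,Suicide,0,F,21,White,100,Street,3\n"
)

with io.StringIO(sample) as f:
    reader = csv.reader(f)
    header_row = next(reader)  # consumes only the first line
    rows = list(reader)        # everything after the header

print(header_row)
print(len(rows))
```

Note that `data[:1]` produces a list that contains the header row, while `next(reader)` (or `data[0]`) returns the row itself, so column names can be looked up directly with `header_row.index('race')`.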


### Assignment

- Count the number of gun deaths by year
    - It may help to do a list comprehension to get the years
    - Iterate over the years with a dictionary to keep count
    
    

In [17]:
years = [row[1] for row in data]

year_counts = {}
for year in years:
    if year not in year_counts:
        year_counts[year] = 0
    year_counts[year] += 1

year_counts
    

{'2012': 33563, '2013': 33636, '2014': 33599}
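
The same tally can be written more compactly with `collections.Counter` from the standard library. This is a minimal sketch on a made-up list; `sample_years` and `sample_year_counts` are hypothetical names, and the values are not taken from the data set.

```python
from collections import Counter

# Made-up year strings shaped like row[1] (assumption, not real data).
sample_years = ['2012', '2012', '2013', '2014', '2013', '2012']

# Counter builds the {value: count} dictionary in a single call.
sample_year_counts = Counter(sample_years)

print(dict(sample_year_counts))
```

A `Counter` is a `dict` subclass, so it can be indexed and iterated exactly like `year_counts`.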

### Assignment

- Import the datetime library
- Create a new list called "dates" with values from the data (set all the day values to 1)    
- Count the number of gun deaths by month and year



In [22]:
import datetime

dates = [datetime.datetime(year=int(row[1]), month=int(row[2]),day=1) for row in data]
dates[:5]

date_counts = {}

for date in dates:
    if date not in date_counts:
        date_counts[date] = 0
    # Add one death to this month's tally.
    date_counts[date] += 1

# date_counts maps the first day of each month to the number of deaths in that month.
date_counts
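
A monthly tally is easier to read when printed in calendar order. The sketch below assumes rows that carry year and month as strings, as in the data set; `sample_rows` and `month_counts` are hypothetical names and the values are made up.

```python
import datetime

# Made-up (year, month) strings shaped like row[1] and row[2] (assumption).
sample_rows = [('2013', '02'), ('2012', '01'), ('2012', '01'), ('2012', '11')]

month_counts = {}
for year, month in sample_rows:
    # Pin the day to 1 so every death in a month shares one key.
    key = datetime.datetime(year=int(year), month=int(month), day=1)
    month_counts[key] = month_counts.get(key, 0) + 1

# datetime objects compare chronologically, so sorted() gives calendar order.
for key in sorted(month_counts):
    print(key.strftime('%Y-%m'), month_counts[key])
```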

### Assignment

- Find the number of gun deaths by Sex
- Find the number of gun deaths by Race
- How does this compare to the overall population in the US?

In [32]:
sex_counts = {}
race_counts = {}

sexes = [row[5] for row in data]
races = [row[7] for row in data]

for sex in sexes:
    if sex not in sex_counts:
        sex_counts[sex] = 0
    sex_counts[sex] += 1

for race in races:
    if race not in race_counts:
        race_counts[race] = 0
    race_counts[race] += 1
race_counts
    

{'Asian/Pacific Islander': 1326,
 'Black': 23296,
 'Hispanic': 9022,
 'Native American/Native Alaskan': 917,
 'White': 66237}
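
One way to approach the comparison question is to set each group's share of deaths next to its share of the population. The sketch below copies `race_counts` from the output above and takes the population figures from the `mapping` dictionary supplied in the next assignment. The population total here covers only these five categories, so the shares are relative to that total rather than to an official census figure. `death_share` and `population_share` are illustrative names.

```python
# Counts copied from the race_counts output above.
race_counts = {
    'Asian/Pacific Islander': 1326,
    'Black': 23296,
    'Hispanic': 9022,
    'Native American/Native Alaskan': 917,
    'White': 66237,
}

# Population figures from the mapping given in the next assignment.
mapping = {
    'Asian/Pacific Islander': 15159516 + 674625,
    'Native American/Native Alaskan': 3739506,
    'Black': 40250635,
    'Hispanic': 44618105,
    'White': 197318956,
}

total_deaths = sum(race_counts.values())
total_population = sum(mapping.values())

# Percentage of all deaths and of the summed population for each group.
death_share = {r: c / total_deaths * 100 for r, c in race_counts.items()}
population_share = {r: p / total_population * 100 for r, p in mapping.items()}

for race in race_counts:
    print(f"{race}: {death_share[race]:.1f}% of deaths, "
          f"{population_share[race]:.1f}% of population")
```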

### Assignment

- Reuse the data structure counting deaths by race
- Use the dictionary below that has the actual population of each race
- Compute the rates of gun deaths per race per 100,000 people

mapping = {
    "Asian/Pacific Islander": 15159516 + 674625,
    "Native American/Native Alaskan": 3739506,
    "Black": 40250635,
    "Hispanic": 44618105,
    "White": 197318956
}

In [43]:
mapping = {
    "Asian/Pacific Islander": 15159516 + 674625,
    "Native American/Native Alaskan": 3739506,
    "Black": 40250635,
    "Hispanic": 44618105,
    "White": 197318956
}

race_per_hundredk = {}

for k, v in race_counts.items():
    race_per_hundredk[k] = (v / mapping[k] * 100000)
    
race_per_hundredk


{'Asian/Pacific Islander': 8.374309664161762,
 'Black': 57.8773477735196,
 'Hispanic': 20.220491210910907,
 'Native American/Native Alaskan': 24.521955573811088,
 'White': 33.56849303419181}
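
The rate calculation is the same for every group: divide the count by the population and scale to 100,000. A small helper makes that explicit; `rate_per_100k` is a hypothetical name, and the zero-population guard is an added assumption rather than something the notebook requires.

```python
def rate_per_100k(count, population):
    """Return the number of events per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * 100000

# Values taken from the race_counts output and the mapping above ("Black").
print(rate_per_100k(23296, 40250635))
```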

### Assignment

You may not know this, but over half of all gun deaths are suicides.

- Redo the computation of rates of gun deaths per race per 100,000 people
- This time only count those that are "Homicide"
- How do these rates differ from the previous calculation?


In [44]:
intents = [row[3] for row in data]

homicide_race_counts = {}

for i, race in enumerate(races):
    if race not in homicide_race_counts:
        homicide_race_counts[race] = 0
    if intents[i] == "Homicide":
        homicide_race_counts[race] += 1
        
homicide_race_per_hundredk = {}

for k, v in homicide_race_counts.items():
    homicide_race_per_hundredk[k] = (v / mapping[k] * 100000)
    
homicide_race_per_hundredk

{'Asian/Pacific Islander': 3.530346230970155,
 'Black': 48.471284987180944,
 'Hispanic': 12.627161104219914,
 'Native American/Native Alaskan': 8.717729026240365,
 'White': 4.6356417981453335}
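
One way to approach the comparison question: both rates for a group divide by the same population, so their ratio is the fraction of that group's gun deaths recorded as homicides. The sketch below uses the two outputs above, rounded to three decimals; `overall_rates`, `homicide_rates`, and `homicide_fraction` are illustrative names.

```python
# Rates per 100,000 copied from the two outputs above, rounded to 3 decimals.
overall_rates = {
    'Asian/Pacific Islander': 8.374,
    'Black': 57.877,
    'Hispanic': 20.220,
    'Native American/Native Alaskan': 24.522,
    'White': 33.568,
}
homicide_rates = {
    'Asian/Pacific Islander': 3.530,
    'Black': 48.471,
    'Hispanic': 12.627,
    'Native American/Native Alaskan': 8.718,
    'White': 4.636,
}

# The shared population cancels out, leaving homicides / all gun deaths.
homicide_fraction = {
    race: homicide_rates[race] / overall_rates[race] for race in overall_rates
}

for race, fraction in homicide_fraction.items():
    print(f"{race}: {fraction:.2f}")
```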