## US Gun Deaths Data Set

[Original article by FiveThirtyEight about Guns](http://fivethirtyeight.com/features/gun-deaths/)

The data set contains cleaned gun-death data from the CDC for 2012-2014.

### Assignment

- Import the csv
- Read it into a list
- Preview the first 5 entries

In [1]:
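with open("guns.csv") as f:
    # The first line is the header row; strip the quotes and trailing newline.
    columns = [column.strip('"\n') for column in f.readline().split(',')]
    # Each remaining line is one record. A plain split(',') assumes that
    # no field contains an embedded comma.
    data = []
    for line in f:
        data.append([item.strip('"\n') for item in line.split(',')])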
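
The manual `split(',')` above breaks as soon as a quoted field contains a comma. A minimal sketch of the same import step using the standard library `csv` module is shown here; the helper name `read_rows` and the two-line sample are illustrative assumptions, not part of the original notebook.

```python
import csv
import io


def read_rows(handle):
    """Return (header, rows) from an open CSV file handle."""
    reader = csv.reader(handle)
    header = next(reader)  # the first row holds the column names
    rows = list(reader)    # every remaining row is one record
    return header, rows


# Made-up two-line sample laid out like the first columns of guns.csv.
sample = io.StringIO(
    '"","year","month","intent"\n'
    '"1",2012,"01","Suicide"\n'
)
columns, data = read_rows(sample)
print(columns)
print(data[:5])
```

With the real file on Python 3, the same helper would be called as `with open("guns.csv", newline="") as f: columns, data = read_rows(f)`.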

In [2]:
columns

['',
 'year',
 'month',
 'intent',
 'police',
 'sex',
 'age',
 'race',
 'hispanic',
 'place',
 'education']

In [3]:
data[:5]

[['1',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '34',
  'Asian/Pacific Islander',
  '100',
  'Home',
  '4'],
 ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'],
 ['3',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '60',
  'White',
  '100',
  'Other specified',
  '4'],
 ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'],
 ['5',
  '2012',
  '02',
  'Suicide',
  '0',
  'M',
  '31',
  'White',
  '100',
  'Other specified',
  '2']]

### Assignment

- Remove the header row from the list of lists
- Save it to a separate list

#### Done: the header row was already saved to `columns` when the file was read

### Assignment

- Count the number of gun deaths by year
    - It may help to do a list comprehension to get the years
    - Iterate over the years with a dictionary to keep count
    
    

In [4]:
from collections import Counter

In [5]:
years = [row[1] for row in data]

In [6]:
gun_deaths = Counter(years)

In [7]:
gun_deaths

Counter({'2012': 33563, '2013': 33636, '2014': 33599})

### Assignment

- Import the datetime library
- Create a new list called "dates" with values from the data (set all the day values to 1)
- Count the number of gun deaths by month and year



In [8]:
import datetime

In [19]:
years_int = [int(year) for year in years]

In [20]:
years_int[0]

2012

In [22]:
months = [row[2] for row in data]

In [23]:
months_int = [int(month) for month in months]

In [24]:
days = [1] * len(years)

In [38]:
dates = zip(years_int, months_int, days)

In [41]:
dates = [datetime.datetime(year, month, day) for (year, month, day) in dates]

In [42]:
dates[:10]

[datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0),
 datetime.datetime(2012, 3, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0)]

In [44]:
Counter(dates)

Counter({datetime.datetime(2012, 1, 1, 0, 0): 2758,
         datetime.datetime(2012, 2, 1, 0, 0): 2357,
         datetime.datetime(2012, 3, 1, 0, 0): 2743,
         datetime.datetime(2012, 4, 1, 0, 0): 2795,
         datetime.datetime(2012, 5, 1, 0, 0): 2999,
         datetime.datetime(2012, 6, 1, 0, 0): 2826,
         datetime.datetime(2012, 7, 1, 0, 0): 3026,
         datetime.datetime(2012, 8, 1, 0, 0): 2954,
         datetime.datetime(2012, 9, 1, 0, 0): 2852,
         datetime.datetime(2012, 10, 1, 0, 0): 2733,
         datetime.datetime(2012, 11, 1, 0, 0): 2729,
         datetime.datetime(2012, 12, 1, 0, 0): 2791,
         datetime.datetime(2013, 1, 1, 0, 0): 2864,
         datetime.datetime(2013, 2, 1, 0, 0): 2375,
         datetime.datetime(2013, 3, 1, 0, 0): 2862,
         datetime.datetime(2013, 4, 1, 0, 0): 2798,
         datetime.datetime(2013, 5, 1, 0, 0): 2806,
         datetime.datetime(2013, 6, 1, 0, 0): 2920,
         datetime.datetime(2013, 7, 1, 0, 0): 3079,
         ...})
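
The separate `years_int`, `months_int` and `days` lists can be folded into a single comprehension, and sorting the counter keys gives a chronological listing. This is a sketch under the assumption that the year sits at index 1 and the month at index 2, as in `columns`; the sample rows are made up.

```python
import datetime
from collections import Counter

# Made-up sample rows: only the year (index 1) and month (index 2) are used.
rows = [
    ["1", "2012", "01"],
    ["2", "2012", "01"],
    ["3", "2012", "02"],
    ["4", "2013", "07"],
]

# Build one datetime per row, with the day fixed to 1.
dates = [datetime.datetime(int(row[1]), int(row[2]), 1) for row in rows]
deaths_by_month = Counter(dates)

# datetime objects sort chronologically, so sorted() orders the months.
for date in sorted(deaths_by_month):
    print(date.strftime("%Y-%m"), deaths_by_month[date])
```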

### Assignment

- Find the number of gun deaths by Sex
- Find the number of gun deaths by Race
- How does this compare to the overall population in the US?

In [45]:
columns

['',
 'year',
 'month',
 'intent',
 'police',
 'sex',
 'age',
 'race',
 'hispanic',
 'place',
 'education']

In [47]:
deaths_by_sex = [row[5] for row in data]

In [48]:
deaths_by_sex[:10]

['M', 'F', 'M', 'M', 'M', 'M', 'M', 'M', 'M', 'M']

In [49]:
Counter(deaths_by_sex)

Counter({'F': 14449, 'M': 86349})
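
To compare these counts with the overall population, it helps to turn them into shares. The sketch below reuses the two counts printed above; the names `total` and `share_by_sex` are illustrative. Since men and women each make up roughly half of the US population, a large gap between the two shares would point to a strong imbalance.

```python
from collections import Counter

# Counts copied from the Counter output above.
deaths_by_sex = Counter({"F": 14449, "M": 86349})

total = sum(deaths_by_sex.values())
# Percentage of all recorded gun deaths for each sex.
share_by_sex = {sex: 100.0 * count / total for sex, count in deaths_by_sex.items()}

for sex, share in sorted(share_by_sex.items()):
    print("{}: {:.1f}%".format(sex, share))
```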

### Assignment

- Reuse the data structure counting deaths by race
- Use the dictionary below that has the actual population of each race
- Compute the rates of gun deaths per race per 100,000 people

mapping = {
    "Asian/Pacific Islander": 15159516 + 674625,
    "Native American/Native Alaskan": 3739506,
    "Black": 40250635,
    "Hispanic": 44618105,
    "White": 197318956
}

In [50]:
deaths_by_race = [row[7] for row in data]

In [51]:
deaths_by_race[:5]

['Asian/Pacific Islander', 'White', 'White', 'White', 'White']

In [52]:
mapping = {
    "Asian/Pacific Islander": 15159516 + 674625,
    "Native American/Native Alaskan": 3739506,
    "Black": 40250635,
    "Hispanic": 44618105,
    "White": 197318956
}

In [54]:
number_deaths_by_race = Counter(deaths_by_race)
number_deaths_by_race

Counter({'Asian/Pacific Islander': 1326,
         'Black': 23296,
         'Hispanic': 9022,
         'Native American/Native Alaskan': 917,
         'White': 66237})

In [58]:
ndr = number_deaths_by_race

In [59]:
ndr['Asian/Pacific Islander']

1326

In [61]:
# Pair each count with its population by key rather than by dictionary order.
r = [(ndr[race], mapping[race]) for race in mapping]

In [62]:
r

[(917, 3739506),
 (23296, 40250635),
 (66237, 197318956),
 (1326, 15834141),
 (9022, 44618105)]

In [67]:
ratio = [float(deaths) / population * 100000 for deaths, population in r]
ratio

[24.521955573811088,
 57.8773477735196,
 33.56849303419181,
 8.374309664161762,
 20.220491210910907]

In [68]:
ratio = {}
for key in ndr.keys():
    # Python 2 integer division: the rates below are truncated to whole numbers.
    ratio[key] = ndr[key] * 100000 / mapping[key]

In [69]:
ratio

{'Asian/Pacific Islander': 8,
 'Black': 57,
 'Hispanic': 20,
 'Native American/Native Alaskan': 24,
 'White': 33}
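
The rate for each race is `deaths / population * 100000`. A minimal sketch that keeps the decimals and looks up both values by key is shown below; the counts and populations are copied from the cells above, while the names `deaths`, `population` and `rates` are illustrative.

```python
# Death counts and populations copied from the notebook cells above.
deaths = {
    "Asian/Pacific Islander": 1326,
    "Black": 23296,
    "Hispanic": 9022,
    "Native American/Native Alaskan": 917,
    "White": 66237,
}
population = {
    "Asian/Pacific Islander": 15159516 + 674625,
    "Native American/Native Alaskan": 3739506,
    "Black": 40250635,
    "Hispanic": 44618105,
    "White": 197318956,
}

# Multiplying by 100000.0 forces float division on Python 2 as well.
rates = {
    race: round(100000.0 * deaths[race] / population[race], 2)
    for race in deaths
}

for race in sorted(rates):
    print(race, rates[race])
```

Looking up by key means the result does not depend on the order in which either dictionary happens to store its entries.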

### Assignment

You may not know this, but over half of all gun deaths are suicides.

- Redo the computation of rates of gun deaths per race per 100,000 people
- This time only count those that are "Homicide"
- How do these rates differ from the previous calculation?
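
One way to approach this is to filter the rows on the `intent` column before counting. The sketch below assumes `intent` is at index 3 and `race` at index 7, as in `columns`; the function `rates_per_100k`, the sample rows and the sample populations are made up for illustration and are not results from the data set.

```python
from collections import Counter

INTENT, RACE = 3, 7  # column positions taken from the `columns` list


def rates_per_100k(rows, population, intent=None):
    """Deaths per 100,000 people by race, optionally for a single intent."""
    counts = Counter(
        row[RACE] for row in rows
        if intent is None or row[INTENT] == intent
    )
    return {
        race: 100000.0 * counts[race] / population[race]
        for race in population
    }


# Made-up rows in the same layout as `data`, with made-up populations.
rows = [
    ["1", "2012", "01", "Suicide", "0", "M", "34", "White", "100", "Home", "4"],
    ["2", "2012", "01", "Homicide", "0", "M", "21", "Black", "100", "Street", "3"],
    ["3", "2012", "02", "Homicide", "0", "F", "40", "White", "100", "Home", "2"],
    ["4", "2012", "02", "Suicide", "0", "M", "64", "White", "100", "Home", "4"],
]
population = {"White": 200000, "Black": 100000}

print(rates_per_100k(rows, population))
print(rates_per_100k(rows, population, intent="Homicide"))
```

Calling the function once without `intent` and once with `intent="Homicide"` on the real `data` and `mapping` would give the two sets of rates to compare.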
