## US Gun Deaths Data Set

[Original article by FiveThirtyEight about Guns](http://fivethirtyeight.com/features/gun-deaths/)

The data set contains cleaned gun-death data from the CDC for 2012-2014.

### Exercise 1 - What do the first 5 rows of the CSV file contain?

In [78]:
def csv_list(filename):
    # Read the whole file; the context manager closes it when done
    with open(filename) as f:
        data_list = f.read().split('\n')

    final_list = []

    for row in data_list:
        fields = row.split(',')
        int_fields = []

        for item in fields:
            # Convert numeric fields to int; leave everything else as a string
            try:
                int_fields.append(int(item))
            except ValueError:
                int_fields.append(item)

        final_list.append(int_fields)

    return final_list

data = csv_list('guns.csv')
data[:5]

[['',
  'year',
  'month',
  'intent',
  'police',
  'sex',
  'age',
  'race',
  'hispanic',
  'place',
  'education'],
 [1, 2012, 1, 'Suicide', 0, 'M', 34, 'Asian/Pacific Islander', 100, 'Home', 4],
 [2, 2012, 1, 'Suicide', 0, 'F', 21, 'White', 100, 'Street', 3],
 [3, 2012, 1, 'Suicide', 0, 'M', 60, 'White', 100, 'Other specified', 4],
 [4, 2012, 2, 'Suicide', 0, 'M', 64, 'White', 100, 'Home', 4]]
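
Splitting each line on commas works for the rows shown above, but it would break if a field ever contained a comma inside quotes. A minimal sketch of the same parsing with Python's standard `csv` module follows. `parse_csv` and the `sample` text are hypothetical, added only for illustration: the sample mirrors the first rows printed above instead of reading `guns.csv`, so treat this as one possible approach and not the notebook's own method.

```python
import csv
import io

def parse_csv(text):
    """Parse CSV text into rows, converting integer-like fields to int."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        converted = []
        for field in row:
            try:
                converted.append(int(field))
            except ValueError:
                # Non-numeric fields (including the empty header cell) stay strings
                converted.append(field)
        rows.append(converted)
    return rows

# Hypothetical sample text, reconstructed from the output printed above
sample = (
    ',year,month,intent,police,sex,age,race,hispanic,place,education\n'
    '1,2012,1,Suicide,0,M,34,Asian/Pacific Islander,100,Home,4\n'
    '2,2012,1,Suicide,0,F,21,White,100,Street,3\n'
)

rows = parse_csv(sample)
print(rows[:2])
```

Unlike `split('\n')`, `csv.reader` does not produce an extra empty row when the text ends with a newline.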

### Exercise 2 - Separate the header row and the data set into two separate lists.

In [79]:
header = data[:1]   
gun_data = data[1:]

print('header = ', header)
gun_data[:5]


header =  [['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education']]


[[1, 2012, 1, 'Suicide', 0, 'M', 34, 'Asian/Pacific Islander', 100, 'Home', 4],
 [2, 2012, 1, 'Suicide', 0, 'F', 21, 'White', 100, 'Street', 3],
 [3, 2012, 1, 'Suicide', 0, 'M', 60, 'White', 100, 'Other specified', 4],
 [4, 2012, 2, 'Suicide', 0, 'M', 64, 'White', 100, 'Home', 4],
 [5, 2012, 2, 'Suicide', 0, 'M', 31, 'White', 100, 'Other specified', 2]]

### Exercise 3 - How many gun deaths are there by year? By month and year?

In [80]:
def gun_deaths(data, col_num):
    
    sum_dict = {}
    
    for item in data:
        
        category = item[col_num]
        
        if category in sum_dict:
            sum_dict[category] = sum_dict[category] + 1
        else:
            sum_dict[category] = 1
            
    return sum_dict        

In [81]:
deaths_per_year = gun_deaths(gun_data, 1)
deaths_per_year


{2012: 33563, 2013: 33636, 2014: 33599}

In [82]:
deaths_per_month = gun_deaths(gun_data, 2)
deaths_per_month

{1: 8273,
 2: 7093,
 3: 8289,
 4: 8455,
 5: 8669,
 6: 8677,
 7: 8989,
 8: 8783,
 9: 8508,
 10: 8406,
 11: 8243,
 12: 8413}
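
The loop in `gun_deaths` is a standard tally pattern. As a hedged alternative, the standard library's `collections.Counter` does the same bookkeeping in one expression. `count_by_column` and `sample_rows` are illustrative names that do not appear in the notebook, and the rows are the five records printed in Exercise 2, not the full data set.

```python
from collections import Counter

def count_by_column(rows, col_num):
    """Tally how often each value appears in the given column."""
    return Counter(row[col_num] for row in rows)

# The five sample records shown in Exercise 2 (not the full data set)
sample_rows = [
    [1, 2012, 1, 'Suicide', 0, 'M', 34, 'Asian/Pacific Islander', 100, 'Home', 4],
    [2, 2012, 1, 'Suicide', 0, 'F', 21, 'White', 100, 'Street', 3],
    [3, 2012, 1, 'Suicide', 0, 'M', 60, 'White', 100, 'Other specified', 4],
    [4, 2012, 2, 'Suicide', 0, 'M', 64, 'White', 100, 'Home', 4],
    [5, 2012, 2, 'Suicide', 0, 'M', 31, 'White', 100, 'Other specified', 2],
]

by_year = count_by_column(sample_rows, 1)
by_month = count_by_column(sample_rows, 2)
print(dict(by_year))
print(dict(by_month))
```

A `Counter` is a `dict` subclass, so it can be passed anywhere the plain dictionary from `gun_deaths` is expected.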

In [83]:
def gun_death_mo_yr(data):

    sum_dict = {}
    month_dict = {1: 'Jan', 2:'Feb', 3:'Mar', 4:'Apr', 5:'May', 6:'Jun', 7:'Jul', 8:'Aug', 9:'Sep', 10:'Oct', 11:'Nov', 12:'Dec'}
    
    for row in data:
        year = row[1]
        month = month_dict[row[2]]
        
        value = '{}, {}'.format(year, month)
        
        if value in sum_dict:
            sum_dict[value] +=1
        else:
            sum_dict[value] = 1
    return sum_dict

In [84]:
month_year = gun_death_mo_yr(gun_data)
month_year

{'2012, Apr': 2795,
 '2012, Aug': 2954,
 '2012, Dec': 2791,
 '2012, Feb': 2357,
 '2012, Jan': 2758,
 '2012, Jul': 3026,
 '2012, Jun': 2826,
 '2012, Mar': 2743,
 '2012, May': 2999,
 '2012, Nov': 2729,
 '2012, Oct': 2733,
 '2012, Sep': 2852,
 '2013, Apr': 2798,
 '2013, Aug': 2859,
 '2013, Dec': 2765,
 '2013, Feb': 2375,
 '2013, Jan': 2864,
 '2013, Jul': 3079,
 '2013, Jun': 2920,
 '2013, Mar': 2862,
 '2013, May': 2806,
 '2013, Nov': 2758,
 '2013, Oct': 2808,
 '2013, Sep': 2742,
 '2014, Apr': 2862,
 '2014, Aug': 2970,
 '2014, Dec': 2857,
 '2014, Feb': 2361,
 '2014, Jan': 2651,
 '2014, Jul': 2884,
 '2014, Jun': 2931,
 '2014, Mar': 2684,
 '2014, May': 2864,
 '2014, Nov': 2756,
 '2014, Oct': 2865,
 '2014, Sep': 2914}
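
The month-and-year dictionary above is keyed by strings, so it displays in alphabetical order, not calendar order. One possible variation, assuming columns 1 and 2 hold integer year and month as in the rows shown earlier, keys the tally by a `(year, month)` tuple so that sorting yields chronological order. `deaths_by_year_month`, `MONTH_NAMES`, and `sample_rows` are hypothetical names, and the rows are the five sample records from Exercise 2.

```python
from collections import Counter

MONTH_NAMES = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

def deaths_by_year_month(rows):
    """Return [((year, month), count), ...] sorted chronologically."""
    counts = Counter((row[1], row[2]) for row in rows)
    # Tuples sort by year first, then by month number
    return sorted(counts.items())

# The five sample records shown in Exercise 2 (not the full data set)
sample_rows = [
    [1, 2012, 1, 'Suicide', 0, 'M', 34, 'Asian/Pacific Islander', 100, 'Home', 4],
    [2, 2012, 1, 'Suicide', 0, 'F', 21, 'White', 100, 'Street', 3],
    [3, 2012, 1, 'Suicide', 0, 'M', 60, 'White', 100, 'Other specified', 4],
    [4, 2012, 2, 'Suicide', 0, 'M', 64, 'White', 100, 'Home', 4],
    [5, 2012, 2, 'Suicide', 0, 'M', 31, 'White', 100, 'Other specified', 2],
]

chronological = deaths_by_year_month(sample_rows)
for (year, month), count in chronological:
    print('{}, {}: {}'.format(year, MONTH_NAMES[month - 1], count))
```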

### Exercise 4 - How many gun deaths are there by sex? By race?

In [85]:
death_by_sex = gun_deaths(gun_data, 5)
death_by_sex

{'F': 14449, 'M': 86349}

In [86]:
death_by_race = gun_deaths(gun_data, 7)
death_by_race

{'Asian/Pacific Islander': 1326,
 'Black': 23296,
 'Hispanic': 9022,
 'Native American/Native Alaskan': 917,
 'White': 66237}

### Exercise 5 - Compute the rates of gun deaths per race per 100,000 people. Here is a [reference table of the US population](http://www.infoplease.com/ipa/A0762156.html). How do the rates of gun deaths compare to the racial breakdown of the United States?

In [87]:
us_pop_dict = {'Asian/Pacific Islander':14946700, 'Black':37685848, 'Hispanic':50477594, 'Native American/Native Alaskan':2247098, 'White':196817552} 

def death_per_100k(race_dict, pop_dict=us_pop_dict):
    
    rates = {}
    
    for race, num_deaths in race_dict.items():
        population = pop_dict[race]

        # int() truncates, so a rate of 8.9 per 100,000 is reported as 8
        rates[race] = int(num_deaths / population * 100000)

    return rates
        


In [88]:
gun_death_rates = death_per_100k(death_by_race)

for race, rate in gun_death_rates.items():
    print('{}: {}'.format(race, rate))

Asian/Pacific Islander: 8
White: 33
Native American/Native Alaskan: 40
Black: 61
Hispanic: 17


In [89]:
# Black Americans have the highest gun-death rate (61 per 100,000 over 2012-2014), followed by Native Americans (40) and Whites (33); Hispanics (17) and Asian/Pacific Islanders (8) have the lowest rates.
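
Because `int()` truncates and the counts span three years (2012-2014), the figures above are cumulative, rounded-down rates. The sketch below recomputes them as floats from the same counts and population figures and also derives a per-year average. `rate_per_100k`, `YEARS`, and the division by three are assumptions added for illustration and are not part of the original exercise.

```python
# Counts and populations copied from the cells above
death_by_race = {
    'Asian/Pacific Islander': 1326,
    'Black': 23296,
    'Hispanic': 9022,
    'Native American/Native Alaskan': 917,
    'White': 66237,
}
us_pop_dict = {
    'Asian/Pacific Islander': 14946700,
    'Black': 37685848,
    'Hispanic': 50477594,
    'Native American/Native Alaskan': 2247098,
    'White': 196817552,
}

YEARS = 3  # assumption: the data set covers 2012, 2013 and 2014

def rate_per_100k(deaths, population, years=1):
    """Deaths per 100,000 people, optionally averaged over several years."""
    return deaths / population * 100000 / years

total_rates = {race: rate_per_100k(n, us_pop_dict[race])
               for race, n in death_by_race.items()}
annual_rates = {race: rate_per_100k(n, us_pop_dict[race], YEARS)
                for race, n in death_by_race.items()}

for race in sorted(total_rates, key=total_rates.get, reverse=True):
    print('{}: {:.1f} total, {:.1f} per year'.format(
        race, total_rates[race], annual_rates[race]))
```

Keeping the values as floats and rounding only when printing avoids understating each rate by up to one death per 100,000.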

### Exercise 6 - More than half of all gun deaths are suicides. Redo the previous computation twice. The first time look at the rates of gun deaths marked as "Homicide", and the second time look at the rates of gun deaths marked as "Suicide". What potential questions does this bring up?

In [90]:
def death_race_type(data, col_num, intent_type):
    
    sum_dict = {}
    
    for row in data:
        race = row[col_num]
        intent = row[3]
        
        # Count only the rows whose intent matches the requested type
        if intent == intent_type:
            if race in sum_dict:
                sum_dict[race] += 1
            else:
                sum_dict[race] = 1

    return sum_dict

In [91]:
suicides = death_race_type(gun_data, 7, 'Suicide')
homicides = death_race_type(gun_data, 7, 'Homicide')

In [97]:
suicide_per_100k = death_per_100k(suicides)

for race, rate in suicide_per_100k.items():
    print('{}: {}'.format(race, rate))

Asian/Pacific Islander: 4
White: 28
Native American/Native Alaskan: 24
Black: 8
Hispanic: 6


In [96]:
homicide_per_100k = death_per_100k(homicides)

for race, rate in homicide_per_100k.items():
    print('{}: {}'.format(race, rate))

White: 4
Asian/Pacific Islander: 3
Black: 51
Native American/Native Alaskan: 14
Hispanic: 11
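
One way to start on the "potential questions" part of the exercise is to compare the two sets of rates directly. The sketch below divides each group's homicide rate by its suicide rate, using the truncated integer rates printed above, so the ratios are approximate. `homicide_to_suicide_ratio` is a hypothetical helper that is not part of the original notebook.

```python
# Rates per 100,000 copied from the two outputs above (truncated integers)
suicide_per_100k = {
    'Asian/Pacific Islander': 4,
    'White': 28,
    'Native American/Native Alaskan': 24,
    'Black': 8,
    'Hispanic': 6,
}
homicide_per_100k = {
    'White': 4,
    'Asian/Pacific Islander': 3,
    'Black': 51,
    'Native American/Native Alaskan': 14,
    'Hispanic': 11,
}

def homicide_to_suicide_ratio(homicide_rates, suicide_rates):
    """Ratio above 1 means homicides outnumber suicides for that group."""
    return {race: homicide_rates[race] / suicide_rates[race]
            for race in suicide_rates}

ratios = homicide_to_suicide_ratio(homicide_per_100k, suicide_per_100k)
for race, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print('{}: {:.2f}'.format(race, ratio))
```

A ratio far above 1 for one group and far below 1 for another suggests that the overall gun-death rate hides very different underlying causes, which is the kind of question this exercise is meant to raise.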
