This is an analysis of a dataset from FiveThirtyEight on gun deaths in the US from 2012 - 2014. 
Each row in the dataset represents a single gun death. 
Column values: 
    * year -- year in which fatality occured
    * month -- month in which fatality occured
    * intent -- the intent of the perpetrator of the crime (Suicide, Accidental, NA, Homicide, or Undetermined.)
    * police -- whether a police officer was involved in the shooting (0 - false or 1 - true)
    * sex -- gender of the victim (M or F)
    * age -- age of victim
    * race -- race of victim (Asian/Pacific Islander, Native American/Native Alaskan, Black, Hispanic, or White.)
    * hispanic -- a code indicating the Hispanic origin of the victim.
    * place -- where the shooting occurred. Has several categories:
        * Home
        * Street
        * Other specified
    * education -- educational status of the victim. Can be one of the following:
        * 1 -- Less than High School
        * 2 -- Graduated from High School or equivalent
        * 3 -- Some College
        * 4 -- At least graduated from College
        * 5 -- Not available

In [226]:
import csv
import datetime

In [227]:
with open("guns.csv", 'r') as f:
    csv_read = csv.reader(f)
    data = list(csv_read)
    #print(data[0:5])

In [228]:
    headers = data[0]
    data = data[1:]
    #print(headers)
    #print(data[0:5])

In [229]:
    # Function to count the number of occurences of each unique item in a list.
    # Returns dictionary where key is item, value is number of occurences
    def count(input_list):
        return_dict = {}
        for row in input_list:
            if row in return_dict:
                return_dict[row] += 1
            else:
                return_dict[row] = 1
        return return_dict

In [230]:
    # Calculate how many gun deaths happened each year: 
    years = []
    for row in data:
        years.append(row[1])
    year_counts = count(years)
    year_counts

{'2012': 33563, '2013': 33636, '2014': 33599}

Gun deaths appeared to remain relatively constant across years 2012 - 2014, so I examined by month and year:

In [231]:
    #Calculate gun deaths by month and year: 
    dates = []
    for row in data:
        date = datetime.datetime(year = int(row[1]), month = int(row[2]), day = 1)
        dates.append(date)
    date_counts = count(dates)
    date_counts

{datetime.datetime(2012, 1, 1, 0, 0): 2758,
 datetime.datetime(2012, 2, 1, 0, 0): 2357,
 datetime.datetime(2012, 3, 1, 0, 0): 2743,
 datetime.datetime(2012, 4, 1, 0, 0): 2795,
 datetime.datetime(2012, 5, 1, 0, 0): 2999,
 datetime.datetime(2012, 6, 1, 0, 0): 2826,
 datetime.datetime(2012, 7, 1, 0, 0): 3026,
 datetime.datetime(2012, 8, 1, 0, 0): 2954,
 datetime.datetime(2012, 9, 1, 0, 0): 2852,
 datetime.datetime(2012, 10, 1, 0, 0): 2733,
 datetime.datetime(2012, 11, 1, 0, 0): 2729,
 datetime.datetime(2012, 12, 1, 0, 0): 2791,
 datetime.datetime(2013, 1, 1, 0, 0): 2864,
 datetime.datetime(2013, 2, 1, 0, 0): 2375,
 datetime.datetime(2013, 3, 1, 0, 0): 2862,
 datetime.datetime(2013, 4, 1, 0, 0): 2798,
 datetime.datetime(2013, 5, 1, 0, 0): 2806,
 datetime.datetime(2013, 6, 1, 0, 0): 2920,
 datetime.datetime(2013, 7, 1, 0, 0): 3079,
 datetime.datetime(2013, 8, 1, 0, 0): 2859,
 datetime.datetime(2013, 9, 1, 0, 0): 2742,
 datetime.datetime(2013, 10, 1, 0, 0): 2808,
 datetime.datetime(2013, 11,

I noticed that higher numbers of gun deaths seemed to occur in the summer (5, 6, 7, 8). To better analyze this, I decided to find the average number of gun deaths per month, from 2012 - 2014: 

In [232]:
    def monthly_average(input_dict, month):
        total_num = 0
        total_sum = 0
        for key, value in input_dict.items(): 
            if key.month == month:
                total_sum += value
                total_num += 1
        avg = total_sum / total_num
        return avg

In [233]:
    monthly_averages = {}
    month_numbers = (list(range(1, 13)))
    for month in month_numbers: 
        monthly_averages[month] = monthly_average(date_counts, month)
    monthly_averages

{1: 2757.6666666666665,
 2: 2364.3333333333335,
 3: 2763.0,
 4: 2818.3333333333335,
 5: 2889.6666666666665,
 6: 2892.3333333333335,
 7: 2996.3333333333335,
 8: 2927.6666666666665,
 9: 2836.0,
 10: 2802.0,
 11: 2747.6666666666665,
 12: 2804.3333333333335}

It appears as the overall number of gun deaths does increase in months 5-8. I then decided to examine the data 
by types of gun deaths: 

In [234]:
    intents = [row[3] for row in data]
    intent_counts = count(intents)
    intent_counts

{'Suicide': 63175,
 'Undetermined': 807,
 'Accidental': 1639,
 'Homicide': 35176,
 'NA': 1}

I then examined gun deaths by race: 

In [235]:
    races = [row[7] for row in data]
    race_counts = count(races)
    race_counts

{'Asian/Pacific Islander': 1326,
 'White': 66237,
 'Native American/Native Alaskan': 917,
 'Black': 23296,
 'Hispanic': 9022}

In [236]:
    #Calculate gun deaths per 100,000 people in each race
    with open("census.csv", 'r') as d:
        census_csv = csv.reader(d)
        census = list(census_csv)
        mapping = {
            "Asian/Pacific Islander": 15159516 + 674625,
            "Native American/Native Alaskan": 3739506,
            "Black": 40250635,
            "Hispanic": 44618105,
            "White": 197318956
        }
        race_per_hundredk = {}
        for key, value in race_counts.items(): 
            race_per_hundredk[key] = (value / mapping[key]) * 100000
        race_per_hundredk

In [237]:
        # Calculate gun deaths / 100,000 by race for homicides only: 
        homicide_race_counts = {}
        for i, race in enumerate(races): 
            if intents[i] == "Homicide":
                if race not in homicide_race_counts:
                    homicide_race_counts[race] = 0
                else:
                    homicide_race_counts[race] += 1

In [238]:
        for key, value in homicide_race_counts.items():
            homicide_race_counts[key] = (value / mapping[key]) * 100000
        homicide_race_counts

{'White': 4.635135004464548,
 'Asian/Pacific Islander': 3.5240307636517825,
 'Black': 48.468800554326656,
 'Native American/Native Alaskan': 8.690987526159873,
 'Hispanic': 12.624919861567406}

Findings: 

Upon analysis of gun death data, I found that homicides due to gun violence disproportionately affect Black and Hispanic people, and black people were also disproportionately affected by gun violence in general. 

Further questions: 
    * I would like to explore the relationship between gun deaths due to other causes (Suicide, Accidental) and determine 
        which races and genders were most affected by that. 
    * Which races and genders are most likely to be labeled "Undetermined?

In [239]:
f.close()