This is an analysis of a dataset from FiveThirtyEight on gun deaths in the US from 2012 - 2014. 
Each row in the dataset represents a single gun death. 
Column values: 
    year -- year in which fatality occurred
    month -- month in which fatality occurred
    intent -- the intent of the perpetrator of the crime (Suicide, Accidental, NA, Homicide, or Undetermined.)
    police -- whether a police officer was involved in the shooting (0 - false or 1 - true)
    sex -- gender of the victim (M or F)
    age -- age of victim
    race -- race of victim (Asian/Pacific Islander, Native American/Native Alaskan, Black, Hispanic, or White.)
    hispanic -- a code indicating the Hispanic origin of the victim.
    place -- where the shooting occurred. Has several categories:
        - Home
        - Street
        - Other specified
    education -- educational status of the victim. Can be one of the following:
        1 -- Less than High School
        2 -- Graduated from High School or equivalent
        3 -- Some College
        4 -- At least graduated from College
        5 -- Not available

In [90]:
import csv
import datetime

In [91]:
with open("guns.csv", 'r') as f:
    csv_read = csv.reader(f)
    data = list(csv_read)
    #print(data[0:5])

In [92]:
    headers = data[0]
    data = data[1:]
    #print(headers)
    #print(data[0:5])

In [93]:
    # Calculate how many gun deaths happened in each year:
    years = []
    for row in data:
        years.append(row[1])
    year_counts = {}
    for year in years: 
        if year in year_counts:
            year_counts[year] += 1
        else:
            year_counts[year] = 1
    print(year_counts)

{'2014': 33599, '2013': 33636, '2012': 33563}
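
The counting loop above can also be written with collections.Counter from the standard library. The following is a minimal sketch, not the notebook's own code: the sample rows and the count_column helper are hypothetical, and the column order assumes that column 0 is a row index, as the row[1] lookup for year suggests.

```python
from collections import Counter

# Hypothetical sample rows in the assumed column order of guns.csv:
# index, year, month, intent, police, sex, age, race, hispanic, place, education
sample_rows = [
    ["1", "2012", "01", "Suicide", "0", "M", "34", "Asian/Pacific Islander", "100", "Home", "4"],
    ["2", "2012", "01", "Suicide", "0", "F", "21", "White", "100", "Street", "3"],
    ["3", "2013", "02", "Homicide", "0", "M", "60", "Black", "100", "Other specified", "2"],
]


def count_column(rows, index):
    """Count how often each value appears in the column at `index`."""
    return Counter(row[index] for row in rows)


sample_year_counts = count_column(sample_rows, 1)
print(dict(sample_year_counts))  # {'2012': 2, '2013': 1}
```

The same helper would cover the sex and race tallies by passing 5 or 7 as the index.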


In [94]:
    #Examine how gun deaths change by month and year:
    dates = []
    for row in data:
        date = datetime.datetime(year = int(row[1]), month = int(row[2]), day = 1)
        dates.append(date)
    print(dates[:10])

[datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 2, 1, 0, 0), datetime.datetime(2012, 2, 1, 0, 0), datetime.datetime(2012, 2, 1, 0, 0), datetime.datetime(2012, 2, 1, 0, 0), datetime.datetime(2012, 3, 1, 0, 0), datetime.datetime(2012, 2, 1, 0, 0), datetime.datetime(2012, 2, 1, 0, 0)]


In [95]:
    date_counts = {}
    for date in dates:
        if date in date_counts:
            date_counts[date] += 1
        else:
            date_counts[date] = 1
    #print(date_counts)
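
A dictionary keyed by datetime is easier to read as a trend once it is sorted by date. The sketch below shows one way to do that. The sample (year, month) pairs are hypothetical stand-ins for columns 1 and 2 of guns.csv, and the names sample_dates and monthly_series are illustrative only.

```python
import datetime
from collections import Counter

# Hypothetical (year, month) pairs standing in for columns 1 and 2 of guns.csv.
sample_year_months = [("2012", "02"), ("2012", "01"), ("2012", "02"), ("2013", "01")]

# As in the notebook, every death is pinned to the first day of its month.
sample_dates = [
    datetime.datetime(year=int(y), month=int(m), day=1)
    for y, m in sample_year_months
]
sample_date_counts = Counter(sample_dates)

# Sort by date so the counts read as a time series.
monthly_series = [
    (date.strftime("%Y-%m"), count)
    for date, count in sorted(sample_date_counts.items())
]
print(monthly_series)  # [('2012-01', 1), ('2012-02', 2), ('2013-01', 1)]
```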
    

In [97]:
    sex_list = [row[5] for row in data]
    sex_counts = {}
    for row in sex_list:
        if row in sex_counts:
            sex_counts[row] +=1
        else:
            sex_counts[row] = 1
    #print(sex_counts)

In [98]:
    race_list = [row[7] for row in data]
    race_counts = {}
    for row in race_list:
        if row in race_counts:
            race_counts[row] +=1
        else:
            race_counts[row] = 1
    #print(race_counts)

In [99]:
    #Calculate gun deaths per 100,000 people in each race
    with open("census.csv", 'r') as d:
        census_csv = csv.reader(d)
        census = list(census_csv)
        mapping = {
            "Asian/Pacific Islander": 15159516 + 674625,
            "Native American/Native Alaskan": 3739506,
            "Black": 40250635,
            "Hispanic": 44618105,
            "White": 197318956
        }
        race_per_hundredk = {}
        for key, value in race_counts.items(): 
            race_per_hundredk[key] = (value / mapping[key]) * 100000
        print(race_per_hundredk)

{'Asian/Pacific Islander': 8.374309664161762, 'White': 33.56849303419181, 'Black': 57.8773477735196, 'Hispanic': 20.220491210910907, 'Native American/Native Alaskan': 24.521955573811088}
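
The rate calculation is deaths divided by population, scaled to 100,000 people. Here is a small worked sketch of that arithmetic. The rate_per_100k helper and the group names, counts, and populations are all hypothetical, chosen only so the result is easy to check by hand. The helper multiplies before dividing, which is algebraically the same as the expression above.

```python
def rate_per_100k(deaths, population):
    """Return the number of deaths per 100,000 people."""
    return deaths * 100000 / population


# Hypothetical counts and populations, not taken from the dataset.
sample_counts = {"Group A": 50, "Group B": 300}
sample_population = {"Group A": 1000000, "Group B": 2000000}

sample_rates = {
    group: rate_per_100k(count, sample_population[group])
    for group, count in sample_counts.items()
}
print(sample_rates)  # {'Group A': 5.0, 'Group B': 15.0}
```

Because the death counts cover 2012 through 2014 while each population figure is a single number, the rates in this notebook are cumulative over three years. Dividing by 3 would give a rough annual average.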


In [100]:
        # Calculate gun deaths / 100,000 by race for homicides only: 
        intents = [row[3] for row in data]
        races = [row[7] for row in data]
        homicide_race_counts = {}
        for i, race in enumerate(races): 
            if intents[i] == "Homicide":
                # Start each race at 0, then count every homicide, including the first one seen.
                if race not in homicide_race_counts:
                    homicide_race_counts[race] = 0
                homicide_race_counts[race] += 1
        

In [101]:
        for key, value in homicide_race_counts.items():
            homicide_race_counts[key] = (value / mapping[key]) * 100000
        print(homicide_race_counts)

{'Native American/Native Alaskan': 8.717729026240365, 'White': 4.6356417981453335, 'Hispanic': 12.627161104219914, 'Black': 48.471284987180944, 'Asian/Pacific Islander': 3.530346230970155}


Findings: 

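Gun homicides disproportionately affect Black and Hispanic people: the homicide rates above are roughly 48.5 per 100,000 for Black victims and 12.6 for Hispanic victims, compared with 4.6 for White victims. Black people also had the highest overall gun death rate, at about 57.9 per 100,000. All of these rates are cumulative over the three years in the dataset.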

Further questions: 
    - I would like to explore gun deaths with other intents (Suicide, Accidental) and determine 
        which races and genders were most affected by each. 
    - Which races and genders are most likely to be labeled "Undetermined"?
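
One possible starting point for these questions is a helper that filters rows by intent and tallies any other column. This is a sketch under stated assumptions, not part of the original analysis: the counts_for_intent function, the index constants, and the sample rows are hypothetical, and the column positions follow the indexes used in the cells above (3 for intent, 5 for sex, 7 for race).

```python
from collections import Counter

# Column positions as used in the cells above.
INTENT, SEX, RACE = 3, 5, 7

# Hypothetical sample rows in the assumed column order of guns.csv.
sample_rows = [
    ["1", "2012", "01", "Suicide", "0", "M", "34", "White", "100", "Home", "4"],
    ["2", "2012", "03", "Suicide", "0", "F", "51", "White", "100", "Home", "3"],
    ["3", "2013", "02", "Accidental", "0", "M", "19", "Black", "100", "Street", "2"],
    ["4", "2014", "07", "Undetermined", "0", "M", "40", "Hispanic", "210", "Home", "1"],
    ["5", "2014", "09", "Suicide", "0", "M", "62", "Black", "100", "Home", "2"],
]


def counts_for_intent(rows, intent, group_index):
    """Tally the values in column `group_index` for rows matching `intent`."""
    return Counter(row[group_index] for row in rows if row[INTENT] == intent)


suicide_by_race = counts_for_intent(sample_rows, "Suicide", RACE)
suicide_by_sex = counts_for_intent(sample_rows, "Suicide", SEX)
undetermined_by_race = counts_for_intent(sample_rows, "Undetermined", RACE)

print(dict(suicide_by_race))       # {'White': 2, 'Black': 1}
print(dict(suicide_by_sex))        # {'M': 2, 'F': 1}
print(dict(undetermined_by_race))  # {'Hispanic': 1}
```

The race tallies could then be divided by the population totals in mapping to get rates per 100,000. The sex tallies would need separate population figures, which this notebook does not include.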

In [102]:
# guns.csv was already closed when its with block ended, so this call is a no-op.
f.close()