# Exploring Gun Deaths In The US

The dataset comes from FiveThirtyEight and can be found at https://github.com/fivethirtyeight/guns-data. It contains information on gun deaths in the US from 2012 to 2014. Each row represents a single fatality, and the columns hold demographic and other information about the victim.

## A First Look at the US Gun Deaths Data

In [5]:
import csv
with open("guns.csv","r") as file:
    data = list(csv.reader(file))

In [6]:
print(data[:5])

[['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education'], ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4']]


## Removing Headers From List of Lists

In [7]:
headers = data[0]
data = data[1:]

In [12]:
print(headers)

['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education']


In [14]:
print(data[:5])

[['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'], ['5', '2012', '02', 'Suicide', '0', 'M', '31', 'White', '100', 'Other specified', '2']]
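The cells that follow refer to columns by position (`row[1]`, `row[5]`, `row[7]`). A small lookup built from the header row can make those positions easier to read. This is only a sketch: the name `col` is hypothetical and does not appear in the original notebook, and the header and row are copied from the outputs above.

```python
# Header row exactly as printed above.
headers = ['', 'year', 'month', 'intent', 'police', 'sex', 'age',
           'race', 'hispanic', 'place', 'education']

# Map each column name to its position (hypothetical helper name: col).
col = {name: index for index, name in enumerate(headers)}

# First data row from the output above.
row = ['1', '2012', '01', 'Suicide', '0', 'M', '34',
       'Asian/Pacific Islander', '100', 'Home', '4']

# Look up values by column name instead of a bare index.
print(row[col['year']], row[col['intent']], row[col['race']])
```

With this lookup, `row[col['race']]` reads the same value as `row[7]`.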


## Counting Gun Deaths Based on Years

In [15]:
years = [row[1] for row in data]

year_counts = {}
for year in years:
    if year in year_counts:
        year_counts[year] += 1
    else:
        year_counts[year] = 1

In [16]:
year_counts

{'2012': 33563, '2013': 33636, '2014': 33599}
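The manual counting loop above can also be written with `collections.Counter` from the standard library. The sketch below is an alternative, not the notebook's own method, and it runs on only the five sample rows printed earlier, so its counts are not the full-dataset totals. The names `sample_rows` and `sample_year_counts` are hypothetical.

```python
from collections import Counter

# The five sample rows printed earlier (not the full dataset).
sample_rows = [
    ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'],
    ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'],
    ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'],
    ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'],
    ['5', '2012', '02', 'Suicide', '0', 'M', '31', 'White', '100', 'Other specified', '2'],
]

# Counter tallies each distinct year string (column 1) in one pass.
sample_year_counts = Counter(row[1] for row in sample_rows)
print(dict(sample_year_counts))
```

Applied to the full `data` list, the same expression should give the same totals as the loop.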

## Exploring Gun Deaths By Month and Year

In [22]:
import datetime

# Use the first day of each month so that every death in a month shares one key.
dates = [datetime.datetime(year=int(row[1]), month=int(row[2]), day=1) for row in data]

In [23]:
dates[:5]

[datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0)]

In [24]:
date_counts = {}
for row in dates:
    if row in date_counts:
        date_counts[row] += 1
    else:
        date_counts[row] = 1

In [25]:
date_counts

{datetime.datetime(2012, 1, 1, 0, 0): 2758,
 datetime.datetime(2012, 2, 1, 0, 0): 2357,
 datetime.datetime(2012, 3, 1, 0, 0): 2743,
 datetime.datetime(2012, 4, 1, 0, 0): 2795,
 datetime.datetime(2012, 5, 1, 0, 0): 2999,
 datetime.datetime(2012, 6, 1, 0, 0): 2826,
 datetime.datetime(2012, 7, 1, 0, 0): 3026,
 datetime.datetime(2012, 8, 1, 0, 0): 2954,
 datetime.datetime(2012, 9, 1, 0, 0): 2852,
 datetime.datetime(2012, 10, 1, 0, 0): 2733,
 datetime.datetime(2012, 11, 1, 0, 0): 2729,
 datetime.datetime(2012, 12, 1, 0, 0): 2791,
 datetime.datetime(2013, 1, 1, 0, 0): 2864,
 datetime.datetime(2013, 2, 1, 0, 0): 2375,
 datetime.datetime(2013, 3, 1, 0, 0): 2862,
 datetime.datetime(2013, 4, 1, 0, 0): 2798,
 datetime.datetime(2013, 5, 1, 0, 0): 2806,
 datetime.datetime(2013, 6, 1, 0, 0): 2920,
 datetime.datetime(2013, 7, 1, 0, 0): 3079,
 datetime.datetime(2013, 8, 1, 0, 0): 2859,
 datetime.datetime(2013, 9, 1, 0, 0): 2742,
 datetime.datetime(2013, 10, 1, 0, 0): 2808,
 datetime.datetime(2013, 11,

## Find The Peak and The Decline Of Gun Deaths Based On Years

In [45]:
year2012 = { key:value for key, value in date_counts.items() if key.year == 2012 }

In [46]:
year2012

{datetime.datetime(2012, 1, 1, 0, 0): 2758,
 datetime.datetime(2012, 2, 1, 0, 0): 2357,
 datetime.datetime(2012, 3, 1, 0, 0): 2743,
 datetime.datetime(2012, 4, 1, 0, 0): 2795,
 datetime.datetime(2012, 5, 1, 0, 0): 2999,
 datetime.datetime(2012, 6, 1, 0, 0): 2826,
 datetime.datetime(2012, 7, 1, 0, 0): 3026,
 datetime.datetime(2012, 8, 1, 0, 0): 2954,
 datetime.datetime(2012, 9, 1, 0, 0): 2852,
 datetime.datetime(2012, 10, 1, 0, 0): 2733,
 datetime.datetime(2012, 11, 1, 0, 0): 2729,
 datetime.datetime(2012, 12, 1, 0, 0): 2791}

In [65]:
maximum2012 = max(year2012, key=year2012.get) 

In [66]:
print("Peak month for 2012 is " + maximum2012.strftime("%B") + " with value " + str(year2012[maximum2012]))

Peak month for 2012 is July with value 3026


In [67]:
minimum2012 = min(year2012, key=year2012.get) 

In [68]:
print("Decline month for 2012 is " + minimum2012.strftime("%B") + " with value " + str(year2012[minimum2012]))

Decline month for 2012 is February with value 2357


In [48]:
year2013 = { key:value for key, value in date_counts.items() if key.year == 2013 }

In [49]:
year2013

{datetime.datetime(2013, 1, 1, 0, 0): 2864,
 datetime.datetime(2013, 2, 1, 0, 0): 2375,
 datetime.datetime(2013, 3, 1, 0, 0): 2862,
 datetime.datetime(2013, 4, 1, 0, 0): 2798,
 datetime.datetime(2013, 5, 1, 0, 0): 2806,
 datetime.datetime(2013, 6, 1, 0, 0): 2920,
 datetime.datetime(2013, 7, 1, 0, 0): 3079,
 datetime.datetime(2013, 8, 1, 0, 0): 2859,
 datetime.datetime(2013, 9, 1, 0, 0): 2742,
 datetime.datetime(2013, 10, 1, 0, 0): 2808,
 datetime.datetime(2013, 11, 1, 0, 0): 2758,
 datetime.datetime(2013, 12, 1, 0, 0): 2765}

In [69]:
maximum2013 = max(year2013, key=year2013.get) 

In [71]:
print("Peak month for 2013 is " + maximum2013.strftime("%B") + " with value " + str(year2013[maximum2013]))

Peak month for 2013 is July with value 3079


In [72]:
minimum2013 = min(year2013, key=year2013.get) 

In [73]:
print("Decline month for 2013 is " + minimum2013.strftime("%B") + " with value " + str(year2013[minimum2013]))

Decline month for 2013 is February with value 2375


In [75]:
year2014 = { key:value for key, value in date_counts.items() if key.year == 2014 }

In [76]:
year2014

{datetime.datetime(2014, 1, 1, 0, 0): 2651,
 datetime.datetime(2014, 2, 1, 0, 0): 2361,
 datetime.datetime(2014, 3, 1, 0, 0): 2684,
 datetime.datetime(2014, 4, 1, 0, 0): 2862,
 datetime.datetime(2014, 5, 1, 0, 0): 2864,
 datetime.datetime(2014, 6, 1, 0, 0): 2931,
 datetime.datetime(2014, 7, 1, 0, 0): 2884,
 datetime.datetime(2014, 8, 1, 0, 0): 2970,
 datetime.datetime(2014, 9, 1, 0, 0): 2914,
 datetime.datetime(2014, 10, 1, 0, 0): 2865,
 datetime.datetime(2014, 11, 1, 0, 0): 2756,
 datetime.datetime(2014, 12, 1, 0, 0): 2857}

In [77]:
maximum2014 = max(year2014, key=year2014.get) 

In [78]:
print("Peak month for 2014 is " + maximum2014.strftime("%B") + " with value " + str(year2014[maximum2014]))

Peak month for 2014 is August with value 2970


In [79]:
minimum2014 = min(year2014, key=year2014.get) 

In [80]:
print("Decline month for 2014 is " + minimum2014.strftime("%B") + " with value " + str(year2014[minimum2014]))

Decline month for 2014 is February with value 2361
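The three blocks above repeat the same steps for 2012, 2013, and 2014. One way to avoid the repetition is a small helper, sketched below. The function name `peak_and_low` is hypothetical, and the example runs only on the 2012 counts copied from the output above.

```python
import datetime

# Monthly counts for 2012, copied from the output above (January to December).
counts_2012 = [2758, 2357, 2743, 2795, 2999, 2826,
               3026, 2954, 2852, 2733, 2729, 2791]
monthly_counts = {
    datetime.datetime(2012, month, 1): count
    for month, count in enumerate(counts_2012, start=1)
}

def peak_and_low(counts, year):
    """Return ((peak_date, peak_count), (low_date, low_count)) for one year."""
    in_year = {key: value for key, value in counts.items() if key.year == year}
    peak = max(in_year, key=in_year.get)
    low = min(in_year, key=in_year.get)
    return (peak, in_year[peak]), (low, in_year[low])

(peak, peak_count), (low, low_count) = peak_and_low(monthly_counts, 2012)
print("2012 peak:", peak.strftime("%B"), peak_count)
print("2012 low:", low.strftime("%B"), low_count)
```

With the full `date_counts` dictionary, the helper could be called in a loop such as `for year in (2012, 2013, 2014)`.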


## Exploring Gun Deaths By Race And Sex

In [26]:
sex_counts = {}
for row in data:
    if row[5] in sex_counts:
        sex_counts[row[5]] = sex_counts[row[5]] + 1
    else:
        sex_counts[row[5]] = 1

In [32]:
sex_counts

{'F': 14449, 'M': 86349}

In [28]:
race_counts = {}
for row in data:
    if row[7] in race_counts:
        race_counts[row[7]] = race_counts[row[7]] + 1
    else:
        race_counts[row[7]] = 1

In [30]:
race_counts

{'Asian/Pacific Islander': 1326,
 'Black': 23296,
 'Hispanic': 9022,
 'Native American/Native Alaskan': 917,
 'White': 66237}

In [115]:
sorted_race= [(k, race_counts[k]) for k in sorted(race_counts, key=race_counts.get, reverse=True)]
sorted_race

[('White', 66237),
 ('Black', 23296),
 ('Hispanic', 9022),
 ('Asian/Pacific Islander', 1326),
 ('Native American/Native Alaskan', 917)]

In [86]:
maximumDeathInRace = max(race_counts, key=race_counts.get) 

In [89]:
print("The group with the most gun deaths is " + maximumDeathInRace + ", with " + str(race_counts[maximumDeathInRace]) + " deaths")

The group with the most gun deaths is White, with 66237 deaths


## Findings 1

1. From 2012 to 2014 the yearly number of gun deaths stayed nearly constant (33,563, 33,636, and 33,599), with no significant difference from year to year.
2. There is a seasonal peak in July or August (summer) and a low every February (winter).
3. Male victims far outnumber female victims (86,349 versus 14,449).
4. White people account for the most victims in absolute numbers, but the population of each racial group is needed before the groups can be compared fairly.
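The first and third findings can be put in numbers with a little arithmetic on the counts already shown. The sketch below copies `sex_counts` and `year_counts` from the outputs above; the derived names (`male_share`, `spread`, `spread_pct`) are hypothetical.

```python
# Counts copied from the outputs above.
sex_counts = {'F': 14449, 'M': 86349}
year_counts = {'2012': 33563, '2013': 33636, '2014': 33599}

# Share of victims who were male, as a percentage of all deaths.
total = sum(sex_counts.values())
male_share = sex_counts['M'] / total * 100

# Gap between the highest and lowest yearly totals, in deaths and
# as a percentage of the lowest year.
spread = max(year_counts.values()) - min(year_counts.values())
spread_pct = spread / min(year_counts.values()) * 100

print(f"Male share: {male_share:.1f}%")
print(f"Yearly spread: {spread} deaths ({spread_pct:.2f}%)")
```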

## Second Dataset
The second dataset, `census.csv`, contains the total population of the US as well as the population of each racial group, taken from the April 1, 2010 Census.

In [81]:
with open("census.csv","r") as file2:
    census = list(csv.reader(file2))

In [83]:
print(census)

[['Id', 'Year', 'Id', 'Sex', 'Id', 'Hispanic Origin', 'Id', 'Id2', 'Geography', 'Total', 'Race Alone - White', 'Race Alone - Hispanic', 'Race Alone - Black or African American', 'Race Alone - American Indian and Alaska Native', 'Race Alone - Asian', 'Race Alone - Native Hawaiian and Other Pacific Islander', 'Two or More Races'], ['cen42010', 'April 1, 2010 Census', 'totsex', 'Both Sexes', 'tothisp', 'Total', '0100000US', '', 'United States', '308745538', '197318956', '44618105', '40250635', '3739506', '15159516', '674625', '6984195']]
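Rather than retyping the population figures by hand, they can be read from the census row itself. The sketch below copies the header and row from the output above. The names `population` and `race_population` are hypothetical, and which census columns correspond to which categories in the gun deaths data is an assumption made here for illustration.

```python
# Header and single data row, copied from the output above.
census_header = ['Id', 'Year', 'Id', 'Sex', 'Id', 'Hispanic Origin', 'Id', 'Id2',
                 'Geography', 'Total', 'Race Alone - White', 'Race Alone - Hispanic',
                 'Race Alone - Black or African American',
                 'Race Alone - American Indian and Alaska Native',
                 'Race Alone - Asian',
                 'Race Alone - Native Hawaiian and Other Pacific Islander',
                 'Two or More Races']
census_row = ['cen42010', 'April 1, 2010 Census', 'totsex', 'Both Sexes', 'tothisp',
              'Total', '0100000US', '', 'United States', '308745538', '197318956',
              '44618105', '40250635', '3739506', '15159516', '674625', '6984195']

# Keep only the population columns (index 9 onward) as integers. The earlier
# columns are skipped because the repeated 'Id' names would collide as keys.
population = {name: int(value)
              for name, value in zip(census_header[9:], census_row[9:])}

# Assumed correspondence between gun-death race labels and census columns.
race_population = {
    "Asian/Pacific Islander": population['Race Alone - Asian']
    + population['Race Alone - Native Hawaiian and Other Pacific Islander'],
    "Native American/Native Alaskan":
        population['Race Alone - American Indian and Alaska Native'],
    "Black": population['Race Alone - Black or African American'],
    "Hispanic": population['Race Alone - Hispanic'],
    "White": population['Race Alone - White'],
}
print(race_population)
```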


## Rates Of Gun Deaths By Race

In [94]:
mapping = {
    "Asian/Pacific Islander": 15159516 + 674625,
    "Native American/Native Alaskan": 3739506,
    "Black": 40250635,
    "Hispanic": 44618105,
    "White": 197318956
}
race_per_hundredk = {}
for key, value in race_counts.items():
    race_per_hundredk[key] = value / mapping[key] * 100000    

In [99]:
race_per_hundredk

{'Asian/Pacific Islander': 8.374309664161762,
 'Black': 57.8773477735196,
 'Hispanic': 20.220491210910907,
 'Native American/Native Alaskan': 24.521955573811088,
 'White': 33.56849303419181}

In [113]:
sorted_race_per_hundredk = [(k, race_per_hundredk[k]) for k in sorted(race_per_hundredk, key=race_per_hundredk.get, reverse=True)]
sorted_race_per_hundredk

[('Black', 57.8773477735196),
 ('White', 33.56849303419181),
 ('Native American/Native Alaskan', 24.521955573811088),
 ('Hispanic', 20.220491210910907),
 ('Asian/Pacific Islander', 8.374309664161762)]

In [90]:
maximumDeathInRacePer100k = max(race_per_hundredk, key=race_per_hundredk.get) 

In [91]:
print("The group with the highest gun death rate is " + maximumDeathInRacePer100k + ", with " + str(race_per_hundredk[maximumDeathInRacePer100k]) + " deaths per 100000 people")

The group with the highest gun death rate is Black, with 57.8773477735196 deaths per 100000 people


## Filtering By Homicide Intent

In [116]:
intents = [row[3] for row in data]
races = [row[7] for row in data]

homicide_race_counts = {}
for i,race in enumerate(races):
    if intents[i] == "Homicide":
        if race in homicide_race_counts:
            homicide_race_counts[race] += 1
        else:
            homicide_race_counts[race] = 1

In [118]:
homicide_race_counts

{'Asian/Pacific Islander': 559,
 'Black': 19510,
 'Hispanic': 5634,
 'Native American/Native Alaskan': 326,
 'White': 9147}

In [119]:
sorted_race_based_on_homicide = [(k, homicide_race_counts[k]) for k in sorted(homicide_race_counts, key=homicide_race_counts.get, reverse=True)]
sorted_race_based_on_homicide

[('Black', 19510),
 ('White', 9147),
 ('Hispanic', 5634),
 ('Asian/Pacific Islander', 559),
 ('Native American/Native Alaskan', 326)]

In [120]:
race_homicide_per_hundredk = {}
for key, value in homicide_race_counts.items():
    race_homicide_per_hundredk[key] = (value / mapping[key]) * 100000

In [121]:
race_homicide_per_hundredk

{'Asian/Pacific Islander': 3.530346230970155,
 'Black': 48.471284987180944,
 'Hispanic': 12.627161104219914,
 'Native American/Native Alaskan': 8.717729026240365,
 'White': 4.6356417981453335}

In [122]:
sorted_race_homicide_per_hundredk = [(k, race_homicide_per_hundredk[k]) for k in sorted(race_homicide_per_hundredk, key=race_homicide_per_hundredk.get, reverse=True)]
sorted_race_homicide_per_hundredk

[('Black', 48.471284987180944),
 ('Hispanic', 12.627161104219914),
 ('Native American/Native Alaskan', 8.717729026240365),
 ('White', 4.6356417981453335),
 ('Asian/Pacific Islander', 3.530346230970155)]
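The rate calculation appears twice in this notebook with the same formula, so it could be factored into one function. The sketch below also takes an optional `years` argument. The death counts cover three years (2012 to 2014) while the population is a single 2010 figure, so the rates above are totals for the whole period, and dividing by 3 gives a rough per-year figure. The helper name `per_100k` and the `years` parameter are hypothetical additions.

```python
# Homicide counts and populations copied from the cells above.
homicide_race_counts = {
    'Asian/Pacific Islander': 559,
    'Black': 19510,
    'Hispanic': 5634,
    'Native American/Native Alaskan': 326,
    'White': 9147,
}
mapping = {
    "Asian/Pacific Islander": 15159516 + 674625,
    "Native American/Native Alaskan": 3739506,
    "Black": 40250635,
    "Hispanic": 44618105,
    "White": 197318956,
}

def per_100k(counts, population, years=1):
    """Deaths per 100,000 people; set years > 1 to average over a period."""
    return {group: count / population[group] * 100000 / years
            for group, count in counts.items()}

period_rates = per_100k(homicide_race_counts, mapping)           # whole 2012-2014 period
annual_rates = per_100k(homicide_race_counts, mapping, years=3)  # rough yearly average

for group in sorted(period_rates, key=period_rates.get, reverse=True):
    print(f"{group}: {period_rates[group]:.2f} total, {annual_rates[group]:.2f} per year")
```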

## Findings 2

Black people have the highest gun homicide rate of any racial group: about 48.47 gun homicide deaths per 100,000 Black people over 2012 to 2014. Hispanic people have the second-highest rate, about 12.63 per 100,000.
It appears that gun-related homicides in the US disproportionately affect racial minorities, Black and Hispanic people in particular.
Further investigations:

-  The link between month and homicide rate.
-  Homicide rate by gender.
-  The rates of other intents by gender and race.
-  Gun death rates by location and education.
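Several of these follow-ups come down to counting rows by one column, optionally restricted to a single intent. A possible starting point is sketched below. The function name `count_by` is hypothetical, and the example uses only the five sample rows printed at the top of the notebook (all of them suicides), so it shows the mechanics rather than any real result.

```python
from collections import Counter

# The five sample rows printed earlier (not the full dataset).
sample_rows = [
    ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'],
    ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'],
    ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'],
    ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'],
    ['5', '2012', '02', 'Suicide', '0', 'M', '31', 'White', '100', 'Other specified', '2'],
]

INTENT_COLUMN = 3  # position of 'intent' in the header row

def count_by(rows, column, intent=None):
    """Count rows by the value in `column`, keeping one intent if given."""
    return Counter(row[column] for row in rows
                   if intent is None or row[INTENT_COLUMN] == intent)

# Column 5 is 'sex' and column 2 is 'month' in the header row.
print(dict(count_by(sample_rows, 5, intent="Suicide")))
print(dict(count_by(sample_rows, 2)))
print(dict(count_by(sample_rows, 5, intent="Homicide")))
```

On the full `data` list, `count_by(data, 5, intent="Homicide")` would give homicide counts by sex, and `count_by(data, 2, intent="Homicide")` would give them by month.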