# Gun Deaths in the US

#### This data project is based on a dataset from [FiveThirtyEight](http://fivethirtyeight.com/). The dataset gathers gun deaths from 2012 to 2014 with specifics about _intent_, _police involvement_, _sex_, _age_, _race_, and _place_.

This project was initiated from a DataQuest course in April 2018. I used Python 3 (with some built-in modules) to open, read, and organize the data, then used it to calculate statistics with the goal of better understanding gun deaths in the USA.

created April 2018 by Steve Hanlon

In [49]:
import csv
import datetime

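# Read the whole CSV into a list of rows; the file is closed when the block ends
with open('guns.csv') as dataset:
    data = list(csv.reader(dataset))

headers = data[:1]  # first row holds the column names
data = data[1:]     # every remaining row is one death record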

years = [year[1] for year in data]
year_counts = {}
total_counts = []

for year in years:
    if year not in year_counts:
        year_counts[year] = 1
    else:
        year_counts[year] += 1

for year, count in year_counts.items():
    total_counts.append(count)

total_deaths = sum(total_counts)
year_avg = total_deaths / len(total_counts)

print(year_counts)
print(total_deaths)
print(int(year_avg))


{'2012': 33563, '2013': 33636, '2014': 33599}
100798
33599
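
The manual counting loop above can also be written with `collections.Counter` from the standard library. The following is a minimal sketch of that alternative, not the approach used in this notebook. `sample_rows` is a hypothetical stand-in for `data`: made-up rows that only mimic the position of the year column (index 1).

```python
from collections import Counter

# Hypothetical stand-in for `data`; only the year at index 1 matters here.
sample_rows = [
    ["1", "2012", "01"],
    ["2", "2012", "02"],
    ["3", "2013", "01"],
    ["4", "2014", "03"],
    ["5", "2014", "03"],
]

# Counter tallies each distinct year in a single pass.
year_counts = Counter(row[1] for row in sample_rows)

total_deaths = sum(year_counts.values())
year_avg = total_deaths / len(year_counts)  # average deaths per year

print(dict(year_counts))
print(total_deaths)
print(year_avg)
```

Swapping `sample_rows` for the real `data` list should reproduce the yearly totals printed above, assuming the year stays in column 1.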


In [20]:
# Build one datetime per row; the day is fixed to 1 so each value identifies a month
dates = [datetime.datetime(year=int(row[1]), month=int(row[2]), day=1) for row in data]
date_counts = {}

for date in dates:
    if date not in date_counts:
        date_counts[date] = 1
    else:
        date_counts[date] += 1
        
print(date_counts)        

{datetime.datetime(2013, 6, 1, 0, 0): 2920, datetime.datetime(2012, 12, 1, 0, 0): 2791, datetime.datetime(2012, 10, 1, 0, 0): 2733, datetime.datetime(2014, 10, 1, 0, 0): 2865, datetime.datetime(2014, 9, 1, 0, 0): 2914, datetime.datetime(2012, 8, 1, 0, 0): 2954, datetime.datetime(2014, 1, 1, 0, 0): 2651, datetime.datetime(2012, 5, 1, 0, 0): 2999, datetime.datetime(2014, 5, 1, 0, 0): 2864, datetime.datetime(2012, 9, 1, 0, 0): 2852, datetime.datetime(2012, 7, 1, 0, 0): 3026, datetime.datetime(2014, 11, 1, 0, 0): 2756, datetime.datetime(2014, 6, 1, 0, 0): 2931, datetime.datetime(2013, 5, 1, 0, 0): 2806, datetime.datetime(2012, 3, 1, 0, 0): 2743, datetime.datetime(2014, 7, 1, 0, 0): 2884, datetime.datetime(2012, 2, 1, 0, 0): 2357, datetime.datetime(2014, 8, 1, 0, 0): 2970, datetime.datetime(2014, 4, 1, 0, 0): 2862, datetime.datetime(2012, 1, 1, 0, 0): 2758, datetime.datetime(2013, 1, 1, 0, 0): 2864, datetime.datetime(2014, 2, 1, 0, 0): 2361, datetime.datetime(2012, 6, 1, 0, 0): 2826, dateti
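
The dictionary above prints in no particular order, which makes month-to-month comparison hard. As a rough sketch, sorting the keys gives a chronological listing. The four entries below are copied from the output above; the `monthly` name and the `YYYY-MM` label format are choices made for this example.

```python
import datetime

# A few of the monthly counts printed above, deliberately out of order.
date_counts = {
    datetime.datetime(2012, 3, 1): 2743,
    datetime.datetime(2012, 1, 1): 2758,
    datetime.datetime(2013, 1, 1): 2864,
    datetime.datetime(2012, 2, 1): 2357,
}

# datetime objects compare chronologically, so sorted() orders the months.
monthly = [(date.strftime("%Y-%m"), date_counts[date]) for date in sorted(date_counts)]

for label, count in monthly:
    print(label, count)
```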

In [42]:
sex_counts = {}

# the death count totals by sex
for row in data:
    if row[5] not in sex_counts:
        sex_counts[row[5]] = 1
    else:
        sex_counts[row[5]] += 1
            
        
print(sex_counts)    

{'M': 86349, 'F': 14449}


In [43]:
race_counts = {}

# the death count totals by race
for row in data:
    if row[7] not in race_counts:
        race_counts[row[7]] = 1
    else:
        race_counts[row[7]] += 1
        
print(race_counts)        

{'Black': 23296, 'Asian/Pacific Islander': 1326, 'White': 66237, 'Hispanic': 9022, 'Native American/Native Alaskan': 917}


#### Current Assessment

The total number of gun deaths between 2012 and 2014 is high, at over 100,000, but the count barely changes from year to year: each year saw about 33,600 deaths.

The biggest surprise is the gap between male and female deaths during that time, with male deaths accounting for roughly 86% of the total.
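
The percentage can be checked directly from the `sex_counts` output. This is a small sketch of that arithmetic; the `sex_share` name is introduced here for illustration.

```python
# Counts copied from the sex_counts output above.
sex_counts = {"M": 86349, "F": 14449}

total = sum(sex_counts.values())

# Share of all gun deaths for each sex, as a percentage rounded to one decimal.
sex_share = {sex: round(count / total * 100, 1) for sex, count in sex_counts.items()}

print(sex_share)
```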

#### Needing further examination

Among the races, whites accounted for the most deaths at about 66%, which closely matches the percentage of whites in the US, while blacks accounted for 23,296 deaths, or about 23%.

- What is that figure in comparison to the demographic percentages in the US? (e.g. whites account for about 73% of the population. What is the ratio of population percentage to gun deaths per year?)

- How many suicides per year?  Which race commits the most/least?  Where do they take place the most/least?  What age group is the most/least?

- How many homicides per year? Which race commits the most/least? Where do they take place the most/least?

- How many homicides by police per year?  Which race is killed the most/least?
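
One way to approach the first question is to set each group's share of gun deaths next to its share of the population. The sketch below assumes the 2010 census figures that this notebook loads from `census.csv`; the death counts are copied from the `race_counts` output. The `shares` dictionary is a name made up for this example.

```python
# Gun death counts by race, copied from the race_counts output.
race_counts = {
    "Black": 23296,
    "Asian/Pacific Islander": 1326,
    "White": 66237,
    "Hispanic": 9022,
    "Native American/Native Alaskan": 917,
}

# 2010 census populations for the same groups, taken from census.csv.
population = {
    "Black": 40250635,
    "Asian/Pacific Islander": 15159516 + 674625,
    "White": 197318956,
    "Hispanic": 44618105,
    "Native American/Native Alaskan": 3739506,
}
total_population = 308745538  # census total; it also covers people outside these five groups

total_deaths = sum(race_counts.values())

# Map each race to (percent of gun deaths, percent of population).
shares = {}
for race, deaths in race_counts.items():
    death_share = deaths / total_deaths * 100
    pop_share = population[race] / total_population * 100
    shares[race] = (death_share, pop_share)
    print(f"{race}: {death_share:.1f}% of deaths, {pop_share:.1f}% of population")
```

A group whose death share is well above its population share is overrepresented among gun deaths, and the reverse for a group whose death share is below it.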





In [23]:
# Read in census.csv, and convert to a list of lists. Assign the result to the census variable.
census_data = open('census.csv', 'r')
csvreader = csv.reader(census_data)
census = list(csvreader)
census = census[1:]
print(census)


[['cen42010', 'April 1, 2010 Census', 'totsex', 'Both Sexes', 'tothisp', 'Total', '0100000US', '', 'United States', '308745538', '197318956', '44618105', '40250635', '3739506', '15159516', '674625', '6984195']]


In [28]:
mapping = {}
race_per_hundredk = {}

# add race_counts keys and values to mapping dict
for key, value in race_counts.items():
    if key not in mapping:
        mapping[key] = value        

# Change mapping values to a 'death rate': gun deaths divided by population
for row in census:
    asian_pac_group = int(row[14]) + int(row[15])
    mapping["White"] /= int(row[10])
    mapping["Hispanic"] /= int(row[11])
    mapping["Black"] /= int(row[12])
    mapping["Native American/Native Alaskan"] /= int(row[13])
    mapping["Asian/Pacific Islander"] /= asian_pac_group

print(mapping)

# finalize rate per 100k by multiplying by 100k and add to race_per_hundredk dict
for key, value in mapping.items():
    race_per_hundredk[key] = round(value * 100000, 2)  # round to 2 decimal places
        
print(race_per_hundredk)
    

{'Native American/Native Alaskan': 0.0002452195557381109, 'Asian/Pacific Islander': 8.374309664161763e-05, 'White': 0.0003356849303419181, 'Hispanic': 0.00020220491210910907, 'Black': 0.000578773477735196}
{'Asian/Pacific Islander': 8.37, 'Black': 57.88, 'Native American/Native Alaskan': 24.52, 'White': 33.57, 'Hispanic': 20.22}
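
The same rate calculation can be wrapped in a small function that leaves its inputs untouched, which makes it reusable for other counts. This is only a sketch: `rate_per_100k` is a hypothetical helper name, and the numbers are copied from the `race_counts` output and the census row above.

```python
def rate_per_100k(counts, populations):
    """Return deaths per 100,000 people for every group present in both dicts."""
    return {
        group: round(counts[group] / populations[group] * 100000, 2)
        for group in counts
        if group in populations
    }


# Gun deaths by race (race_counts output) and 2010 census populations.
race_counts = {
    "Black": 23296,
    "Asian/Pacific Islander": 1326,
    "White": 66237,
    "Hispanic": 9022,
    "Native American/Native Alaskan": 917,
}
populations = {
    "Black": 40250635,
    "Asian/Pacific Islander": 15159516 + 674625,
    "White": 197318956,
    "Hispanic": 44618105,
    "Native American/Native Alaskan": 3739506,
}

race_per_hundredk = rate_per_100k(race_counts, populations)
print(race_per_hundredk)
```

Because the function returns a new dictionary, `race_counts` keeps its raw totals and can be reused in later cells.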


In [32]:
mapping_population = {}

# copy race_counts keys into mapping_population (the values are replaced below)
for key, value in race_counts.items():
    if key not in mapping_population:
        mapping_population[key] = value        

# Replace each value with that group's population from the census row
for row in census:
    mapping_population["White"] = int(row[10])
    mapping_population["Hispanic"] = int(row[11])
    mapping_population["Black"] = int(row[12])
    mapping_population["Native American/Native Alaskan"] = int(row[13])
    mapping_population["Asian/Pacific Islander"] = int(row[14]) + int(row[15])

print(mapping_population)       

{'Native American/Native Alaskan': 3739506, 'Asian/Pacific Islander': 15834141, 'White': 197318956, 'Hispanic': 44618105, 'Black': 40250635}


In [48]:
homicide_race_counts = {}   

# a list of the intent recorded for each death (e.g. Suicide, Homicide)
intents = [row[3] for row in data]
# print(intents)

# a list of the races in race column
races = [row[7] for row in data]
# print(races)

# Homicide death count by race (raw numbers)
for i, race in enumerate(races):
    if intents[i] == "Homicide":
        if race not in homicide_race_counts:
            homicide_race_counts[race] = 1
        else:
            homicide_race_counts[race] += 1
            
# Homicide rate by race (deaths per 100,000 people)
for key, value in mapping_population.items():
    # if race keys match
    if key in homicide_race_counts:
        # replace the raw count with the rate per 100k
        homicide_race_counts[key] = round((homicide_race_counts[key] / value) * 100000, 2)
    

print(homicide_race_counts) 

{'Native American/Native Alaskan': 8.72, 'Asian/Pacific Islander': 3.53, 'White': 4.64, 'Hispanic': 12.63, 'Black': 48.47}


#### Current Assessment of Homicide Deaths by Race
While the homicide count for whites is 9,147 deaths, that is only about 0.0046% of the total white population in the USA. Another way to say this is that about 5 whites per 100,000 died by homicide over this 3-year period of data.

In contrast, the homicide count for blacks is 19,510 deaths. Measured against the total black population in the USA, that is about 0.048%. That is to say, about 48 blacks per 100,000 died by homicide over this 3-year period of data.

The homicide count for Hispanics is 5,634 deaths. That number is about 0.013% of the total Hispanic population in the USA. From a different angle, it is about 13 Hispanic homicide deaths per 100,000 Hispanics over this 3-year period of data.


#### Further study

- How many suicides per year?  Which race commits the most/least?  Where do they take place the most/least?  What age group is the most/least?

- Based on homicides: Which race experiences the most/least? Where do they take place the most/least?

- How many homicides by police per year?  Which race is killed the most/least?

- Figure out the link, if any, between month and homicide rate.

- Explore the homicide rate by gender.
- Explore the rates of other intents, like Accidental, by gender and race.
- Find out if gun death rates correlate to location and education.
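
As a possible starting point for the gender questions above, intent and sex can be tallied together in one pass. This is a sketch only: `sample_rows` holds made-up rows, not records from `guns.csv`, and it assumes the column positions used earlier in this notebook (intent at index 3, sex at index 5).

```python
from collections import Counter

# Hypothetical rows shaped like guns.csv: intent at index 3, sex at index 5.
sample_rows = [
    ["1", "2012", "01", "Suicide", "0", "M"],
    ["2", "2012", "01", "Homicide", "0", "F"],
    ["3", "2013", "02", "Homicide", "0", "M"],
    ["4", "2013", "02", "Accidental", "0", "M"],
    ["5", "2014", "03", "Homicide", "0", "M"],
]

# Tally each (intent, sex) pair in a single pass over the rows.
intent_sex_counts = Counter((row[3], row[5]) for row in sample_rows)

# Keep only the homicide tallies, keyed by sex.
homicides_by_sex = {
    sex: count
    for (intent, sex), count in intent_sex_counts.items()
    if intent == "Homicide"
}
print(homicides_by_sex)
```

Turning these raw counts into rates would still need male and female population figures, which the census row used in this notebook does not break out.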
