# Exploring Gun Deaths in the US

Let's analyze some data that relates to gun deaths in the US.

In [2]:
import csv
import datetime

# import data file (the with-block closes the file automatically)
with open("guns.csv", 'r') as f:
    data = list(csv.reader(f))
# remove headers row
headers = data[0]
data = data[1:]
#print(data)

# Let's find out how many instances of gun deaths occur each year
year_counts = {}
years = [row[1] for row in data]
for year in years:
    if year in year_counts:
        year_counts[year] += 1
    else:
        year_counts[year] = 1

#print(year_counts)


# Not much variability from year to year, so let's look at this on a monthly basis
date_counts = {}

dates = list(datetime.datetime(year=int(row[1]), month=int(row[2]), day=1) for row in data)

for date in dates:
    # day is always 1, so each key identifies one month of one year
    key = date.strftime("%d-%m-%y")
    if key in date_counts:
        date_counts[key] += 1
    else:
        date_counts[key] = 1
#print(date_counts)
        
# Now let's do the same with sex and race

sex_counts = {}

sexes = [row[5] for row in data]
for sex in sexes:
    if sex in sex_counts:
        sex_counts[sex] += 1
    else:
        sex_counts[sex] = 1
        
race_counts = {}

races = [row[7] for row in data]
for race in races:
    if race in race_counts:
        race_counts[race] += 1
    else:
        race_counts[race] = 1

print(race_counts)
print(sex_counts)

{'Hispanic': 9022, 'Black': 23296, 'White': 66237, 'Asian/Pacific Islander': 1326, 'Native American/Native Alaskan': 917}
{'M': 86349, 'F': 14449}


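In this data set, far more males than females die from guns: about 86% of the recorded deaths are male. White individuals account for the largest number of deaths, roughly 2.8 times as many as Black individuals, the next largest group. These are raw counts, though, so they don't yet tell us how likely a member of each group is to be killed.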
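
As a quick check on these comparisons, the male share and the ratio between the two largest race counts can be computed directly from the dictionaries printed above. This is only a minimal sketch: the counts are copied in as literals, and the names `male_share`, `ranked`, and `top_to_next` are illustrative rather than part of the original notebook.

```python
# Counts copied from the notebook output above
race_counts = {
    "Hispanic": 9022,
    "Black": 23296,
    "White": 66237,
    "Asian/Pacific Islander": 1326,
    "Native American/Native Alaskan": 917,
}
sex_counts = {"M": 86349, "F": 14449}

# Share of all recorded deaths that are male
total = sum(sex_counts.values())
male_share = sex_counts["M"] / total

# Rank races by raw count and compare the two largest
ranked = sorted(race_counts.items(), key=lambda kv: kv[1], reverse=True)
(top_race, top_n), (next_race, next_n) = ranked[0], ranked[1]
top_to_next = top_n / next_n

print(f"Male share of deaths: {male_share:.1%}")
print(f"{top_race} / {next_race} count ratio: {top_to_next:.2f}")
```

The ratio comes out a little under 3, and it compares counts only, not risk.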

Something we should look into next is the population of each race. A larger group will naturally record more deaths, so without population figures we can't draw meaningful conclusions about which groups are most at risk.

Let's do that now:

In [10]:
with open("census.csv", 'r') as f:
    census = list(csv.reader(f))
#print(census)

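# Population of each race, hard-coded and keyed by the race labels used in guns.csv
mapping = {
    "Asian/Pacific Islander": 16508766,
    "Black": 40250635,
    "Native American/Native Alaskan": 3739506,
    "Hispanic": 44618105,
    "White": 197318956,
}
race_per_hundredk = {}

# Convert each raw count into deaths per 100,000 people
for race in race_counts:
    race_per_hundredk[race] = race_counts[race] / mapping[race] * 100000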
    
# Let's now look at the intent column to see what other conclusions we can draw

intents = [row[3] for row in data]

races = [row[7] for row in data]

# First count homicides per race
homicide_race_counts = {}
for i, race in enumerate(races):
    if intents[i] == "Homicide":
        if race in homicide_race_counts:
            homicide_race_counts[race] += 1
        else:
            homicide_race_counts[race] = 1
print(homicide_race_counts)

# Then convert each count into homicides per 100,000 people
homicide_race_per_hundredk = {}
for race in homicide_race_counts:
    homicide_race_per_hundredk[race] = homicide_race_counts[race] / mapping[race] * 100000

print(homicide_race_per_hundredk)

    
    

{'Hispanic': 5634, 'Native American/Native Alaskan': 326, 'Asian/Pacific Islander': 559, 'White': 9147, 'Black': 19510}
{'Hispanic': 12.627161104219914, 'Native American/Native Alaskan': 8.717729026240365, 'Asian/Pacific Islander': 3.386079855998928, 'White': 4.6356417981453335, 'Black': 48.471284987180944}


Once the population of each race in the United States is taken into account, the highest gun homicide rate by far is among Black individuals: about 48 per 100,000, almost 4 times the next highest rate (about 13 per 100,000 for Hispanic individuals). Note that these rates are totals over all years in the data set, not annual rates. Needless to say, something needs to be done about this!
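
The comparison between the two highest homicide rates can be reproduced from the homicide counts printed above and the population figures in `mapping`. The following is a minimal sketch with those values copied in as literals; `population`, `rates`, `ranked`, and `ratio` are illustrative names, not part of the original notebook.

```python
# Homicide counts copied from the notebook output above
homicide_counts = {
    "Hispanic": 5634,
    "Native American/Native Alaskan": 326,
    "Asian/Pacific Islander": 559,
    "White": 9147,
    "Black": 19510,
}
# Same population figures as the notebook's `mapping` dictionary
population = {
    "Asian/Pacific Islander": 16508766,
    "Black": 40250635,
    "Native American/Native Alaskan": 3739506,
    "Hispanic": 44618105,
    "White": 197318956,
}

# Homicides per 100,000 people for each race
rates = {race: n / population[race] * 100000 for race, n in homicide_counts.items()}

# Rank from highest to lowest rate and compare the top two
ranked = sorted(rates.items(), key=lambda kv: kv[1], reverse=True)
(top_race, top_rate), (next_race, next_rate) = ranked[0], ranked[1]
ratio = top_rate / next_rate

for race, rate in ranked:
    print(f"{race}: {rate:.1f} per 100,000")
print(f"{top_race} rate is {ratio:.2f}x the {next_race} rate")
```

Sorting by rate rather than by count is what changes the picture: White individuals have the second-largest homicide count but one of the lowest rates.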