## Exploring Gun Deaths in the US

The dataset comes from FiveThirtyEight and is stored in the guns.csv file. It contains information on gun deaths in the US from 2012 to 2014.

In [2]:
# Load the US gun deaths data
import csv
with open('guns.csv') as f:
    csvreader = csv.reader(f)
    data = list(csvreader)
print(data[:5])

[['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education'], ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4']]


In [3]:
# Removing headers from a list of lists
headers = data[0]
data = data[1:]
print(headers)
print(data[:5])

['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education']
[['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'], ['5', '2012', '02', 'Suicide', '0', 'M', '31', 'White', '100', 'Other specified', '2']]


In [11]:
# Counting gun deaths by year
years = [i[1] for i in data]
year_counts = {}
for year in years:
    if year not in year_counts:
        year_counts[year] = 1
    else:  
        year_counts[year] += 1

year_counts

{'2012': 33563, '2013': 33636, '2014': 33599}
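
The manual tally above can also be written with `collections.Counter` from the standard library. The following is a minimal sketch, not the notebook's own method; `sample_rows` is a hypothetical stand-in for `data`, and its third row is invented for illustration.

```python
from collections import Counter

# Hypothetical sample rows in the same column layout as guns.csv
# (index 1 is the year). The third row is invented for illustration;
# in the notebook, `data` would be used instead.
sample_rows = [
    ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'],
    ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'],
    ['3', '2013', '02', 'Homicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'],
]

# Counter tallies each distinct year in a single pass over the rows.
year_counts = Counter(row[1] for row in sample_rows)
print(dict(year_counts))
```

A `Counter` behaves like a dictionary, so the later cells that index into the counts would work unchanged.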

In [12]:
# Exploring gun deaths by month and year
import datetime

dates = [datetime.datetime(year = int(i[1]), month = int(i[2]),
                           day = 1) for i in data]
dates[:5]

[datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0)]

In [18]:
date_counts = {}
for i in dates:
    if i not in date_counts:
        date_counts[i] = 1
    else:
        date_counts[i] += 1
date_counts

{datetime.datetime(2012, 1, 1, 0, 0): 2758,
 datetime.datetime(2012, 2, 1, 0, 0): 2357,
 datetime.datetime(2012, 3, 1, 0, 0): 2743,
 datetime.datetime(2012, 4, 1, 0, 0): 2795,
 datetime.datetime(2012, 5, 1, 0, 0): 2999,
 datetime.datetime(2012, 6, 1, 0, 0): 2826,
 datetime.datetime(2012, 7, 1, 0, 0): 3026,
 datetime.datetime(2012, 8, 1, 0, 0): 2954,
 datetime.datetime(2012, 9, 1, 0, 0): 2852,
 datetime.datetime(2012, 10, 1, 0, 0): 2733,
 datetime.datetime(2012, 11, 1, 0, 0): 2729,
 datetime.datetime(2012, 12, 1, 0, 0): 2791,
 datetime.datetime(2013, 1, 1, 0, 0): 2864,
 datetime.datetime(2013, 2, 1, 0, 0): 2375,
 datetime.datetime(2013, 3, 1, 0, 0): 2862,
 datetime.datetime(2013, 4, 1, 0, 0): 2798,
 datetime.datetime(2013, 5, 1, 0, 0): 2806,
 datetime.datetime(2013, 6, 1, 0, 0): 2920,
 datetime.datetime(2013, 7, 1, 0, 0): 3079,
 datetime.datetime(2013, 8, 1, 0, 0): 2859,
 datetime.datetime(2013, 9, 1, 0, 0): 2742,
 datetime.datetime(2013, 10, 1, 0, 0): 2808,
 datetime.datetime(2013, 11,

In [19]:
# Exploring gun deaths by race and sex
sex_counts = {}
sexs = [i[5] for i in data]
for i in sexs:
    if i not in sex_counts:
        sex_counts[i] = 1
    else:
        sex_counts[i] += 1
sex_counts

{'F': 14449, 'M': 86349}

In [20]:
race_counts = {}
races = [i[7] for i in data]
for i in races:
    if i not in race_counts:
        race_counts[i] = 1
    else:
        race_counts[i] += 1
race_counts

{'Asian/Pacific Islander': 1326,
 'Black': 23296,
 'Hispanic': 9022,
 'Native American/Native Alaskan': 917,
 'White': 66237}

The large majority of gun deaths are male (about 86%). By race, Black (23%) and Hispanic (9%) victims also look disproportionately numerous relative to the general population. To check how these shares actually compare with the population as a whole, we next look at census data.
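
The percentages quoted above can be reproduced from the counts. This is a small sketch with the counts copied from the cell outputs; `percent_shares` is a hypothetical helper name, not something defined elsewhere in the notebook.

```python
# Counts copied from the sex_counts and race_counts outputs above.
sex_counts = {'F': 14449, 'M': 86349}
race_counts = {
    'Asian/Pacific Islander': 1326,
    'Black': 23296,
    'Hispanic': 9022,
    'Native American/Native Alaskan': 917,
    'White': 66237,
}

def percent_shares(counts):
    """Return each category's share of the total, in percent (hypothetical helper)."""
    total = sum(counts.values())
    return {key: round(100 * value / total, 1) for key, value in counts.items()}

sex_shares = percent_shares(sex_counts)
race_shares = percent_shares(race_counts)
print(sex_shares)
print(race_shares)
```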

In [21]:
# Read in Census data
with open('census.csv') as f:
    reader = csv.reader(f)
    census = list(reader)
census

[['Id',
  'Year',
  'Id',
  'Sex',
  'Id',
  'Hispanic Origin',
  'Id',
  'Id2',
  'Geography',
  'Total',
  'Race Alone - White',
  'Race Alone - Hispanic',
  'Race Alone - Black or African American',
  'Race Alone - American Indian and Alaska Native',
  'Race Alone - Asian',
  'Race Alone - Native Hawaiian and Other Pacific Islander',
  'Two or More Races'],
 ['cen42010',
  'April 1, 2010 Census',
  'totsex',
  'Both Sexes',
  'tothisp',
  'Total',
  '0100000US',
  '',
  'United States',
  '308745538',
  '197318956',
  '44618105',
  '40250635',
  '3739506',
  '15159516',
  '674625',
  '6984195']]

In [22]:
# Computing rates of gun deaths per race
mapping = {
    "Asian/Pacific Islander": 15159516 + 674625,
    "Native American/Native Alaskan": 3739506,
    "Black": 40250635,
    "Hispanic": 44618105,
    "White": 197318956
}

race_per_hundredk = {}
for i, j in race_counts.items():
    race_per_hundredk[i] = j / mapping[i] * 100000

race_per_hundredk

{'Asian/Pacific Islander': 8.374309664161762,
 'Black': 57.8773477735196,
 'Hispanic': 20.220491210910907,
 'Native American/Native Alaskan': 24.521955573811088,
 'White': 33.56849303419181}
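
The population figures in `mapping` were typed in by hand. As an alternative sketch, they could be looked up by column name from the census rows shown earlier; `census_header`, `census_row`, and `census_lookup` are hypothetical names, with values copied from the `census` output.

```python
# Header and data row copied from the census.csv output above.
census_header = [
    'Id', 'Year', 'Id', 'Sex', 'Id', 'Hispanic Origin', 'Id', 'Id2',
    'Geography', 'Total', 'Race Alone - White', 'Race Alone - Hispanic',
    'Race Alone - Black or African American',
    'Race Alone - American Indian and Alaska Native',
    'Race Alone - Asian',
    'Race Alone - Native Hawaiian and Other Pacific Islander',
    'Two or More Races',
]
census_row = [
    'cen42010', 'April 1, 2010 Census', 'totsex', 'Both Sexes', 'tothisp',
    'Total', '0100000US', '', 'United States', '308745538', '197318956',
    '44618105', '40250635', '3739506', '15159516', '674625', '6984195',
]

# Pair each column name with its value. The repeated 'Id' keys collapse into
# one entry, which is harmless here because only the uniquely named race
# columns are read below.
census_lookup = dict(zip(census_header, census_row))

mapping = {
    "Asian/Pacific Islander": (
        int(census_lookup['Race Alone - Asian'])
        + int(census_lookup['Race Alone - Native Hawaiian and Other Pacific Islander'])
    ),
    "Native American/Native Alaskan": int(
        census_lookup['Race Alone - American Indian and Alaska Native']
    ),
    "Black": int(census_lookup['Race Alone - Black or African American']),
    "Hispanic": int(census_lookup['Race Alone - Hispanic']),
    "White": int(census_lookup['Race Alone - White']),
}
print(mapping)
```

Looking values up by name avoids transcription slips, at the cost of depending on the exact column labels in the file.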

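My intuition that the Hispanic gun death rate is also high was wrong: at about 20 deaths per 100,000 people, it is lower than the rates for the Black (58), White (34), and Native American/Native Alaskan (25) groups. Note that these rates are cumulative over 2012 to 2014 and use 2010 census populations. This is why it is worth checking against other data sources before drawing conclusions.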
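
One way to see why raw counts can mislead is to compare each group's share of gun deaths with its share of the population. The sketch below uses the counts and census figures from the cells above; the `representation` ratio is an illustrative measure added here, and it assumes shares are taken over the five mapped groups only (the census "Two or More Races" column is left out).

```python
# Death counts and population figures copied from the cells above.
race_counts = {
    'Asian/Pacific Islander': 1326,
    'Black': 23296,
    'Hispanic': 9022,
    'Native American/Native Alaskan': 917,
    'White': 66237,
}
mapping = {
    'Asian/Pacific Islander': 15159516 + 674625,
    'Native American/Native Alaskan': 3739506,
    'Black': 40250635,
    'Hispanic': 44618105,
    'White': 197318956,
}

total_deaths = sum(race_counts.values())
# Assumption: the population total covers only the five mapped groups.
total_population = sum(mapping.values())

# Ratio of death share to population share; a value above 1 means the group
# accounts for more gun deaths than its population share alone would suggest.
representation = {}
for race, deaths in race_counts.items():
    death_share = deaths / total_deaths
    population_share = mapping[race] / total_population
    representation[race] = death_share / population_share

for race, ratio in sorted(representation.items(), key=lambda item: item[1], reverse=True):
    print(f"{race}: {ratio:.2f}")
```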

In [24]:
# Filtering by Intent
intents = [i[3] for i in data]

homicide_race_counts = {}
for i, j in enumerate(races):
    if j not in homicide_race_counts:
        homicide_race_counts[j] = 0
    if intents[i] == 'Homicide':
        homicide_race_counts[j] += 1

homicide_race_counts

{'Asian/Pacific Islander': 559,
 'Black': 19510,
 'Hispanic': 5634,
 'Native American/Native Alaskan': 326,
 'White': 9147}

In [25]:
race_per_hundredk = {}
for i, j in homicide_race_counts.items():
    race_per_hundredk[i] = j / mapping[i] * 100000

race_per_hundredk

{'Asian/Pacific Islander': 3.530346230970155,
 'Black': 48.471284987180944,
 'Hispanic': 12.627161104219914,
 'Native American/Native Alaskan': 8.717729026240365,
 'White': 4.6356417981453335}

When filtered down to homicides only, the Hispanic rate (about 12.6 per 100,000) does come back as high: it is the second highest, after the Black rate (about 48.5).

### Some areas to investigate further:
- The link between month and homicide rate.
- Homicide rate by gender.
- The rates of other intents by gender and race.
- Gun death rates by location and education.
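
As a possible starting point for these follow-ups, the filter-then-count pattern used for homicides by race can be generalized to any column. This is a rough sketch only: `count_by` is a hypothetical helper, and `sample_rows` contains invented rows that are not taken from guns.csv.

```python
from collections import Counter

# Column positions taken from the guns.csv header shown earlier.
INTENT, SEX = 3, 5

# Hypothetical rows for illustration only; they are not taken from guns.csv.
sample_rows = [
    ['1', '2012', '01', 'Homicide', '0', 'M', '34', 'Black', '100', 'Street', '2'],
    ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Home', '3'],
    ['3', '2012', '02', 'Homicide', '0', 'F', '45', 'White', '100', 'Home', '4'],
    ['4', '2012', '02', 'Homicide', '0', 'M', '27', 'Hispanic', '100', 'Street', '2'],
]

def count_by(rows, group_index, intent=None):
    """Tally rows by one column, optionally keeping only a single intent."""
    return Counter(
        row[group_index]
        for row in rows
        if intent is None or row[INTENT] == intent
    )

homicides_by_sex = count_by(sample_rows, SEX, intent='Homicide')
print(dict(homicides_by_sex))
```

In the notebook, passing `data` in place of `sample_rows` and a different column index would cover the other breakdowns listed above.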