# Exploring US Gun Deaths

This dataset contains information on gun deaths in the US from 2012 to 2014. Each row represents a single fatality, and the columns hold demographic and other details about the victim.

## Introduction To The Dataset

- Open and view the data set
- Convert data into a list of lists (rows)

In [1]:
import csv

with open(r"C:\projectdatasets\guns.csv", "r") as f:
    reader = csv.reader(f)
    data = list(reader)

In [2]:
print(data[:5])

[['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education'], ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', 'BA+'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', 'Some college'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', 'BA+'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', 'BA+']]


## Removing Headers

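- Separate the headers from the data
- Run this cell only once: each run strips the first row from `data`, and the output below (data row 1 stored in `headers`) appears to come from a second run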

In [4]:
headers = data[:1]
data = data[1:]
print(headers)
print()
print(data[:5])

[['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', 'BA+']]

[['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', 'Some college'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', 'BA+'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', 'BA+'], ['5', '2012', '02', 'Suicide', '0', 'M', '31', 'White', '100', 'Other specified', 'HS/GED'], ['6', '2012', '02', 'Suicide', '0', 'M', '17', 'Native American/Native Alaskan', '100', 'Home', 'Less than HS']]
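A minimal sketch of a header split that is safe to re-run, assuming the file is read fresh each time. The in-memory `sample` text and the `sample_headers` / `sample_rows` names are hypothetical stand-ins for `guns.csv` and are not part of the original notebook. Note that `next(reader)` yields a flat list, whereas `data[:1]` above yields a list containing one list.

```python
import csv
import io

# Hypothetical two-row sample in the same column layout as guns.csv.
sample = io.StringIO(
    ",year,month,intent,police,sex,age,race,hispanic,place,education\n"
    "1,2012,01,Suicide,0,M,34,Asian/Pacific Islander,100,Home,BA+\n"
    "2,2012,01,Suicide,0,F,21,White,100,Street,Some college\n"
)

reader = csv.reader(sample)
sample_headers = next(reader)  # consume the header row exactly once
sample_rows = list(reader)     # everything that remains is data

print(sample_headers[:4])
print(len(sample_rows))
```

Because the header is taken while reading, re-running the cell rebuilds both lists from scratch instead of slicing an already-trimmed `data`.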


## Counting Gun Deaths By Year

- Store in a dictionary

In [7]:
years = [row[1] for row in data]  # create list of all years, across all rows

year_counts = {}
for year in years:
    if year not in year_counts:
        year_counts[year] = 1
    else:  
        year_counts[year] += 1

print(year_counts)

{'2012': 33562, '2013': 33636, '2014': 33599}
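The same tally can be written with `collections.Counter` from the standard library. This is an alternative sketch, not the notebook's method; `sample_rows` is a hypothetical three-row stand-in for `data`.

```python
from collections import Counter

# Hypothetical sample rows; only column 1 (year) matters here.
sample_rows = [
    ["1", "2012", "01"],
    ["2", "2012", "01"],
    ["3", "2013", "02"],
]

# Counter does the "initialise to zero, then increment" bookkeeping itself.
sample_year_counts = Counter(row[1] for row in sample_rows)

print(dict(sample_year_counts))
```

A `Counter` returns 0 for keys it has never seen, so looking up a year with no rows does not raise `KeyError`.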


## Exploring Gun Deaths By Month And Year

- Store in a dictionary

In [8]:
import datetime

dates = [datetime.datetime(year=int(row[1]), month=int(row[2]), day=1) for row in data]
# build one datetime per row from its year and month; the day is fixed to 1 because the data has no day field
dates[:5]


[datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0)]

In [9]:
# store each unique instance of year and month, with the number of rows associated (number of deaths)
date_counts = {}

for date in dates:
    if date not in date_counts:
        date_counts[date] = 0
    date_counts[date] += 1

date_counts

{datetime.datetime(2012, 1, 1, 0, 0): 2757,
 datetime.datetime(2012, 2, 1, 0, 0): 2357,
 datetime.datetime(2012, 3, 1, 0, 0): 2743,
 datetime.datetime(2012, 4, 1, 0, 0): 2795,
 datetime.datetime(2012, 5, 1, 0, 0): 2999,
 datetime.datetime(2012, 6, 1, 0, 0): 2826,
 datetime.datetime(2012, 7, 1, 0, 0): 3026,
 datetime.datetime(2012, 8, 1, 0, 0): 2954,
 datetime.datetime(2012, 9, 1, 0, 0): 2852,
 datetime.datetime(2012, 10, 1, 0, 0): 2733,
 datetime.datetime(2012, 11, 1, 0, 0): 2729,
 datetime.datetime(2012, 12, 1, 0, 0): 2791,
 datetime.datetime(2013, 1, 1, 0, 0): 2864,
 datetime.datetime(2013, 2, 1, 0, 0): 2375,
 datetime.datetime(2013, 3, 1, 0, 0): 2862,
 datetime.datetime(2013, 4, 1, 0, 0): 2798,
 datetime.datetime(2013, 5, 1, 0, 0): 2806,
 datetime.datetime(2013, 6, 1, 0, 0): 2920,
 datetime.datetime(2013, 7, 1, 0, 0): 3079,
 datetime.datetime(2013, 8, 1, 0, 0): 2859,
 datetime.datetime(2013, 9, 1, 0, 0): 2742,
 datetime.datetime(2013, 10, 1, 0, 0): 2808,
 datetime.datetime(2013, 11,
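One way to read a dictionary like `date_counts` is to look up its highest and lowest months. The sketch below uses only the twelve 2012 values shown in the output above; `counts_2012`, `peak` and `low` are hypothetical names introduced for illustration.

```python
import datetime

# Monthly counts for 2012, copied from the date_counts output above.
monthly_2012 = [2757, 2357, 2743, 2795, 2999, 2826,
                3026, 2954, 2852, 2733, 2729, 2791]
counts_2012 = {
    datetime.datetime(2012, month, 1): count
    for month, count in enumerate(monthly_2012, start=1)
}

# max/min over the keys, ranked by their associated count.
peak = max(counts_2012, key=counts_2012.get)
low = min(counts_2012, key=counts_2012.get)

print(peak.month, counts_2012[peak])
print(low.month, counts_2012[low])
print(sum(counts_2012.values()))
```

The monthly values sum to the 2012 total in `year_counts`, which is a quick consistency check between the two tallies.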

## Exploring Gun Deaths By Race And Sex

- Store in a dictionary

In [14]:
sexes = [row[5] for row in data] # create list of all sexes, across all rows
sex_counts = {}
for sex in sexes:
    if sex not in sex_counts:
        sex_counts[sex] = 0
    sex_counts[sex] += 1
sex_counts

{'M': 86349, 'F': 14449}

In [10]:
races = [row[7] for row in data] # create list of all races, across all rows
race_counts = {}
for race in races:
    if race not in race_counts:
        race_counts[race] = 0
    race_counts[race] += 1
race_counts

{'White': 66237,
 'Native American/Native Alaskan': 917,
 'Black': 23296,
 'Asian/Pacific Islander': 1325,
 'Hispanic': 9022}
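Raw counts are easier to compare as percentage shares. A minimal sketch, using the `sex_counts` values from the output above; the `shares` helper and the `sex_totals` name are hypothetical and not part of the original notebook.

```python
def shares(counts):
    """Return each key's percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {key: round(100 * value / total, 1) for key, value in counts.items()}

# Counts copied from the sex_counts output above.
sex_totals = {"M": 86349, "F": 14449}
sex_shares = shares(sex_totals)

print(sex_shares)
```

The same helper could be applied to `race_counts`, although shares of deaths only become meaningful for race once they are compared with each group's share of the population.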

## Findings and Analysis

Gun deaths in the US disproportionately affect men: 86,349 of the 100,798 victims counted above (about 86%) were male. They also seem to disproportionately affect some minorities, although confirming this requires the share of each race in the overall US population.

There appears to be a mild seasonal pattern, with gun deaths peaking in the summer and dipping in the winter. In 2012, July had the most deaths (3,026) and February the fewest (2,357), though February is also the shortest month. It would be useful to filter by intent to see whether this factor varies with season, race, or sex.

## Reading In A Second Dataset (Census Data)

- Open and view the data set
- Convert data into a list of lists (rows)

In [13]:
import csv

with open(r"C:\projectdatasets\census.csv", "r") as f:
    reader = csv.reader(f)
    census = list(reader)
    
census

[['Id',
  'Year',
  'Id',
  'Sex',
  'Id',
  'Hispanic Origin',
  'Id',
  'Id2',
  'Geography',
  'Total',
  'Race Alone - White',
  'Race Alone - Hispanic',
  'Race Alone - Black or African American',
  'Race Alone - American Indian and Alaska Native',
  'Race Alone - Asian',
  'Race Alone - Native Hawaiian and Other Pacific Islander',
  'Two or More Races'],
 ['cen42010',
  'April 1, 2010 Census',
  'totsex',
  'Both Sexes',
  'tothisp',
  'Total',
  '0100000US',
  '',
  'United States',
  '308745538',
  '197318956',
  '44618105',
  '40250635',
  '3739506',
  '15159516',
  '674625',
  '6984195']]
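Rather than retyping population figures by hand, the header and value rows can be paired up by position. This is a sketch under the assumption that the census file keeps the two-row layout shown above. The lists are abbreviated to the race columns, and `census_header`, `census_values` and `pop` are hypothetical names.

```python
# Race columns and values copied from the census output above.
census_header = [
    "Race Alone - White",
    "Race Alone - Hispanic",
    "Race Alone - Black or African American",
    "Race Alone - American Indian and Alaska Native",
    "Race Alone - Asian",
    "Race Alone - Native Hawaiian and Other Pacific Islander",
]
census_values = [
    "197318956", "44618105", "40250635", "3739506", "15159516", "674625",
]

# Pair each column name with its value and convert the text to integers.
pop = {name: int(value) for name, value in zip(census_header, census_values)}

# The gun data has a single Asian/Pacific Islander category,
# so the two census columns are added together.
asian_pi = (pop["Race Alone - Asian"]
            + pop["Race Alone - Native Hawaiian and Other Pacific Islander"])

print(asian_pi)
```

With the full file, the same pairing would be `zip(census[0], census[1])`. The header repeats the name `Id` several times, so a dictionary built this way keeps only the last `Id` value, which does not matter for the race columns.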

## Computing Rates Of Gun Deaths Per Race

In [15]:
# use census data to create mapping table containing populations of each race
mapping = {
    "Asian/Pacific Islander": 15159516 + 674625,
    "Native American/Native Alaskan": 3739506,
    "Black": 40250635,
    "Hispanic": 44618105,
    "White": 197318956
}

# for each race in race_counts (race -> number of gun deaths), divide the death count by
# that race's population from mapping, then scale the result to deaths per 100,000 people
race_per_hundredk = {}
for k,v in race_counts.items():
    race_per_hundredk[k] = (v / mapping[k]) * 100000

race_per_hundredk

{'White': 33.56849303419181,
 'Native American/Native Alaskan': 24.521955573811088,
 'Black': 57.8773477735196,
 'Asian/Pacific Islander': 8.36799419684339,
 'Hispanic': 20.220491210910907}
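The per-100,000 calculation can be wrapped in a small helper. In the sketch below, `rate_per_100k` and its `years` parameter are hypothetical names that are not in the original notebook. Dividing by `years` is an assumption added here to get an average annual figure, since the deaths span three years (2012 to 2014) while the population is a single 2010 census snapshot.

```python
def rate_per_100k(deaths, population, years=1):
    """Deaths per 100,000 people, optionally averaged over several years."""
    return deaths / population * 100000 / years

# Figures for the Black category, copied from race_counts and mapping above.
black_total = rate_per_100k(23296, 40250635)            # all three years combined
black_annual = rate_per_100k(23296, 40250635, years=3)  # average per year

print(black_total)
print(black_annual)
```

With `years=1` the helper matches the three-year figure computed in the notebook. The annualised value is the one that is comparable with rates usually quoted per year.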

## Filtering By Intent

See how filtering the 'intent' column on the value 'Homicide' affects the gun death rate for each race.

In [19]:
intents = [row[3] for row in data]
homicide_race_counts = {}
for i,race in enumerate(races):
    if race not in homicide_race_counts:
        homicide_race_counts[race] = 0
    if intents[i] == "Homicide":
        homicide_race_counts[race] += 1

race_per_hundredk = {}
for k,v in homicide_race_counts.items():
    race_per_hundredk[k] = (v / mapping[k]) * 100000

race_per_hundredk

{'Asian/Pacific Islander': 3.530346230970155,
 'White': 4.6356417981453335,
 'Native American/Native Alaskan': 8.717729026240365,
 'Black': 48.471284987180944,
 'Hispanic': 12.627161104219914}
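The filter-then-count step can also be done in one pass over the rows, without keeping parallel `races` and `intents` lists in step by index. A minimal sketch on hypothetical sample rows. The column positions (3 for intent, 7 for race) follow the header shown earlier, and `homicides_by_race` is an illustrative name.

```python
from collections import Counter

# Hypothetical sample rows, truncated after the race column (index 7).
sample_rows = [
    ["1", "2012", "01", "Suicide", "0", "M", "34", "White"],
    ["2", "2012", "01", "Homicide", "0", "M", "21", "Black"],
    ["3", "2012", "02", "Homicide", "0", "F", "40", "Black"],
    ["4", "2012", "02", "Accidental", "0", "M", "17", "Hispanic"],
]

# Count the race column only for rows whose intent is "Homicide".
homicides_by_race = Counter(
    row[7] for row in sample_rows if row[3] == "Homicide"
)

print(dict(homicides_by_race))
```

Unlike the loop above, this leaves out races with no homicides instead of storing a zero for them, although a lookup on the `Counter` still returns 0 for a missing race.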

## Findings and Analysis

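Gun-related homicides in the US appear to disproportionately affect people in the Black and Hispanic racial categories: about 48.5 and 12.6 deaths per 100,000 respectively, compared with about 4.6 for the White category. These rates divide three years of deaths (2012 to 2014) by the 2010 census population, so they are not annual rates.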