# Exploring Gun Deaths in the US

The dataset covers gun deaths occurring in the USA from 2012 to 2014. Each row represents one fatality, with the following columns:

- `row identifier` - record count
- `year` - year of incident
- `month` - month of incident
- `intent` - the intent of the perpetrator of the crime. Possible values: Suicide, Accidental, NA, Homicide, Undetermined
- `police` - whether police were involved: 1 (true), 0 (false)
- `sex` - the gender of the victim
- `age` - victim's age
- `race` - victim's race: 
>- Asian/Pacific Islander 
>- Native American/Native Alaskan 
>- Black 
>- Hispanic 
>- White

- `hispanic` - a code indicating the Hispanic origin of the victim
- `place` - where the shooting occurred
- `education` - victim's level of education:
>- 1 - Less than High School
>- 2 - Graduated from High School or equivalent
>- 3 - Some College
>- 4 - At least graduated from College
>- 5 - Not available 
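
The cells below address columns by position (for example `row[3]` for intent and `row[7]` for race). As a minimal sketch, assuming the columns appear in exactly the order listed above, named indexes can make those lookups easier to read. The constant names and the sample row are illustrative and are not part of the original notebook.

```python
# Hypothetical named indexes, assuming the column order listed above
# (position 0 is the row identifier).
YEAR, MONTH, INTENT, POLICE, SEX, AGE, RACE, HISPANIC, PLACE, EDUCATION = range(1, 11)

# A made-up row in the documented column order, for illustration only.
sample_row = ["1", "2012", "01", "Suicide", "0", "M", "34",
              "Asian/Pacific Islander", "100", "Home", "4"]

# Equivalent to sample_row[1], sample_row[3], sample_row[7].
print(sample_row[YEAR], sample_row[INTENT], sample_row[RACE])
```

With these names, `row[RACE]` reads the same value as `row[7]` in the cells that follow.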


In [1]:
# read in the csv file; the header row is split off in the next cell
import csv
with open("gun-data.csv", "r") as f:
    data = list(csv.reader(f))
print(data[:5])

FileNotFoundError: [Errno 2] No such file or directory: 'guns.csv'

In [None]:
headers = data[0]
data = data[1:]
print(headers)
print(data[:5])

In [None]:
# count number of deaths per year
years = [row[1] for row in data]
year_counts = {}
for y in years:
    if y not in year_counts:
        year_counts[y] = 1
    else:
        year_counts[y] += 1
year_counts

In [None]:
# convert year/month columns to datetime objects
import datetime as d
dates = [d.datetime(year=int(row[1]), month=int(row[2]), day=1) for row in data]
print(dates[:5])

In [None]:
date_counts = {}
for date in dates:
    if date not in date_counts:
        date_counts[date] = 1
    else:
        date_counts[date] += 1
date_counts


In [None]:
sex_counts = {}
for row in data:
    if row[5] not in sex_counts:
        sex_counts[row[5]] = 1
    else:
        sex_counts[row[5]] += 1
sex_counts


In [None]:
race_counts = {}
for row in data:
    if row[7] not in race_counts:
        race_counts[row[7]] = 1
    else:
        race_counts[row[7]] += 1
race_counts

# Analysis Pt. 1: Race vs. Sex
- Incident counts remain relatively consistent by month, generally averaging ~2,700
- Victims of gun violence are predominantly male
- Whites and Blacks comprise the majority of victims based on total counts
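
The counting loops above all follow the same pattern, so they could be sketched with `collections.Counter` from the standard library. The example below uses a handful of made-up rows (only the year and month columns are filled in) to show how a per-month tally and its average might be computed. The names `toy_rows`, `monthly`, and `average_per_month` are illustrative.

```python
from collections import Counter
import datetime as d

# Toy rows in the notebook's column order (id, year, month); not real data.
toy_rows = [
    ["1", "2012", "01"],
    ["2", "2012", "01"],
    ["3", "2012", "02"],
    ["4", "2013", "02"],
]

# Counter replaces the manual "if key not in dict" bookkeeping.
monthly = Counter(
    d.datetime(year=int(r[1]), month=int(r[2]), day=1) for r in toy_rows
)

# Average number of deaths per calendar month present in the toy data.
average_per_month = sum(monthly.values()) / len(monthly)
print(monthly)
print(average_per_month)
```

Run against the full `data` list instead of `toy_rows`, the same two steps would give the monthly average referred to in the first bullet.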

In [None]:
with open("census.csv", "r") as f:
    census = list(csv.reader(f))
census

In [None]:
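# map each race label used in the gun data to its census population count
mapping = {}
# csv.reader yields strings, so convert the last seven census columns to int
for i in range(-7,0):
    census[1][i] = int(census[1][i])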

mapping["White"] = census[1][-7]
mapping["Hispanic"] = census[1][-6]
mapping["Black"] = census[1][-5]
mapping["Native American/Native Alaskan"] = census[1][-4]
mapping["Asian/Pacific Islander"] = census[1][-2] + census[1][-3]
mapping


In [None]:
race_per_hundredk = {}
for k, v in race_counts.items():
    race_per_hundredk[k] = round(v / mapping[k] * 100000, 2)
race_per_hundredk

In [None]:
intents = [row[3] for row in data]
races = [row[7] for row in data]
print(intents[:5])
print(races[:5])

homicide_race_counts = {}
for i, race in enumerate(races):
    if intents[i] == "Homicide":
        if race not in homicide_race_counts:
            homicide_race_counts[race] = 1
        else:
            homicide_race_counts[race] += 1
homicide_race_counts

In [None]:
for k, v in homicide_race_counts.items():
    homicide_race_counts[k] = round(v / mapping[k] * 100000, 2)
homicide_race_counts

# Analysis Pt. 2: Gun Death Rates by Race
- Based on total counts alone, Whites appear to be most affected by gun violence.
- After putting each race on the same scale (deaths per 100,000 people), the data shows that Blacks are the most likely to be victims of gun violence.
- Looking at homicides only, the disparity between Blacks and other races becomes more apparent. The homicide rate for Blacks (48.5 per 100k) is roughly 10.5x the rate for Whites (4.6 per 100k) and about 4x the rate for Hispanics (12.6 per 100k), the race with the second-highest homicide rate.
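
The size of the homicide disparity can be recomputed directly from the per-100k figures quoted above. This is a small sketch, assuming 48.5 is the homicide rate for Black victims (consistent with the 48.47 reported in the conclusion); `rate_ratio` is a hypothetical helper, not a function from the notebook.

```python
# Homicide rates per 100,000 people, as quoted in the text above.
homicide_rate = {"Black": 48.5, "White": 4.6, "Hispanic": 12.6}

def rate_ratio(rates, group, reference):
    """Return how many times higher group's rate is than reference's."""
    return rates[group] / rates[reference]

black_vs_white = round(rate_ratio(homicide_rate, "Black", "White"), 1)
black_vs_hispanic = round(rate_ratio(homicide_rate, "Black", "Hispanic"), 1)
print(black_vs_white, black_vs_hispanic)
```

Because both rates are already expressed per 100,000, the ratio needs no further population adjustment.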

## Next Steps:
1. Identify how other intents of gun violence incidents vary by race:
> - Ratios of each intent per race
> - Rates per hundred thousand of each intent

2. Analyze trends in gun deaths over time
> - Total gun deaths per year/month
> - Total gun deaths per year/month categorized by intent
> - Total gun deaths per year/month categorized by race

3. Analyze correlations in education for groups with the highest gun deaths

In [None]:
def rates_per_100k_race(intent):
    intents = [row[3] for row in data]
    races = [row[7] for row in data]
    race_counts = {}
    for i, race in enumerate(races):
        if intents[i] == intent:
            if race not in race_counts:
                race_counts[race] = 1
            else:
                race_counts[race] += 1
    for k, v in race_counts.items():
        race_counts[k] = round(v / mapping[k] * 100000, 2)
    return race_counts

accidental_race_counts = rates_per_100k_race("Accidental")
suicide_race_counts = rates_per_100k_race("Suicide")

In [None]:
accidental_race_counts

In [None]:
suicide_race_counts

In [None]:
def intent_ratios(race):
    intents = [row[3] for row in data]
    races = [row[7] for row in data]
    
    # count gun incidents for each intent under given race input
    counts = {}

    for i, intent in enumerate(intents):
        if races[i] == race:
            if intent not in counts:
                counts[intent] = 1
            else:
                counts[intent] += 1
    
    # calculate total counts per race in gun data
    race_counts = {}
    for row in data:
        if row[7] not in race_counts:
            race_counts[row[7]] = 1
        else:
            race_counts[row[7]] += 1
    
    # calculate ratios of each intent
    for k, v in counts.items():
        counts[k] = round(v / race_counts[race] * 100, 2)
    
    counts[".Race"] = race
    return counts


# use a separate name so the result does not overwrite the intent_ratios function
intent_ratios_by_race = [intent_ratios(race) for race in set(races)]
intent_ratios_by_race


In [None]:
# convert year columns to datetime objects
import datetime as d
dates = [d.datetime(year=int(row[1]), month=1, day=1) for row in data]
print(dates[:5])

# calculate total gun death counts by year
date_counts = {}
for date in dates:
    if date not in date_counts:
        date_counts[date] = 1
    else:
        date_counts[date] += 1
date_counts

In [None]:
# calculates gun death counts by intent per unit: "month" or "year"
# defaults to month if no unit is given
def intent_date_counts(intent, unit="month"):
    import datetime as d
    dates = []
    if unit == "year":
        dates = [d.datetime(year=int(row[1]), month=1, day=1) for row in data]
    else:
        dates = [d.datetime(year=1, month=int(row[2]), day=1) for row in data]        
    
    intents = [row[3] for row in data]
    
    # count number of gun deaths by intent based on enumerated lists
    date_counts = {}
    for i, date in enumerate(dates):
        if intents[i] == intent:
            if date not in date_counts:
                date_counts[date] = 1
            else:
                date_counts[date] += 1
    
    # create sorted list for representation
    date_counts = sorted([(k, v) for k, v in date_counts.items()])
    date_counts.insert(0,"Intent: " + intent)
    
    return date_counts

intent_per_year = [intent_date_counts(intent, "year") for intent in set(intents)]
intent_per_year

In [None]:
intent_per_month = [intent_date_counts(intent, "month") for intent in set(intents)]
intent_per_month

In [None]:
# calculates gun death counts by race per unit: month or year
# defaults to month if no unit given
def race_date_counts(race, unit="month"):
    import datetime as d
    dates = []
    if unit == "year":
        dates = [d.datetime(year=int(row[1]), month=1, day=1) for row in data]
    else:
        dates = [d.datetime(year=1, month=int(row[2]), day=1) for row in data]        
    
    races = [row[7] for row in data]

    # count number of gun deaths by race based on enumerated lists
    date_counts = {}
    for i, date in enumerate(dates):
        if races[i] == race:
            if date not in date_counts:
                date_counts[date] = 1
            else:
                date_counts[date] += 1
    
    # create sorted list for representation
    date_counts = sorted([(k, v) for k, v in date_counts.items()])
    date_counts.insert(0,"Race: " + race)
    
    return date_counts

counts_per_year_by_race = [race_date_counts(race, "year") for race in set(races)]
counts_per_year_by_race

In [None]:
counts_per_month_by_race = [race_date_counts(race, "month") for race in set(races)]
counts_per_month_by_race

# Analysis Pt. 3: Conclusion

## Gun deaths categorized by intent
- After analyzing gun deaths by intent, the data shows that **Blacks have the highest rate of Accidental gun deaths (0.81) and Homicide gun deaths (48.47) per 100,000, while Whites have the highest rate of Suicide gun deaths (28.3) per 100,000.**
- These findings are further supported by the intent ratios for each race shown below. **83.75% of gun deaths among Blacks are Homicide related, whereas 83.6% of gun deaths among Whites are Suicide related.**
- The intent ratios show that nearly all gun deaths for each race (roughly 96% to 98%) are either Homicide or Suicide related. Asians and Native American/Native Alaskans are more likely to be involved in Suicide related cases, whereas Hispanics are more likely to be involved in Homicide related cases.

>  **Race: Black**
  'Accidental': 1.41,
  'Homicide': 83.75,
  'Suicide': 14.3,
  'Undetermined': 0.54
  
>  **Race: White**
  'Accidental': 1.71,
  'Homicide': 13.81,
  'NA': 0.0,
  'Suicide': 83.6,
  'Undetermined': 0.88
  
>  **Race: Asian/Pacific Islander**
  'Accidental': 0.9,
  'Homicide': 42.16,
  'Suicide': 56.18,
  'Undetermined': 0.75
  
> **Race: Native American/Native Alaskan**
  'Accidental': 2.4,
  'Homicide': 35.55,
  'Suicide': 60.52,
  'Undetermined': 1.53
  
> **Race: Hispanic**
  'Accidental': 1.61,
  'Homicide': 62.45,
  'Suicide': 35.15,
  'Undetermined': 0.8
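
As a quick check on how much of each group's total is covered by Homicide and Suicide together, the two shares can be added per race. The sketch below copies the percentages from the listing above; the names `intent_share` and `combined` are illustrative.

```python
# Homicide and Suicide shares (percent of each race's gun deaths),
# copied from the listing above.
intent_share = {
    "Black": {"Homicide": 83.75, "Suicide": 14.3},
    "White": {"Homicide": 13.81, "Suicide": 83.6},
    "Asian/Pacific Islander": {"Homicide": 42.16, "Suicide": 56.18},
    "Native American/Native Alaskan": {"Homicide": 35.55, "Suicide": 60.52},
    "Hispanic": {"Homicide": 62.45, "Suicide": 35.15},
}

# Combined Homicide + Suicide share for each race.
combined = {
    race: round(shares["Homicide"] + shares["Suicide"], 2)
    for race, shares in intent_share.items()
}
for race, pct in combined.items():
    print(f"{race}: {pct}%")
```

The remainder for each race is the Accidental, Undetermined, and NA share.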
  
## Gun deaths categorized by dates
- After analyzing gun deaths by year and intent, two things become evident:
    1. **Homicide counts have slowly declined from 2012 to 2014.** While we don't have census data for each of these years to show that rates per 100,000 are also decreasing, it is probably safe to assume that they are, since the population has generally been increasing.
    2. **Suicide counts have slowly increased from 2012 to 2014.** However, without census data for each of these years, it is unclear whether counts have simply risen in step with population growth or whether the suicide rate itself has increased.

> **Homicide gun deaths per year**
   -  2012: 12,093
   -  2013: 11,674
   -  2014: 11,409
  
> **Suicide gun deaths per year**
  -  2012: 20,666
  -  2013: 21,175
  -  2014: 21,334
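
The size of these trends can be expressed as a percent change between 2012 and 2014. The sketch below uses the yearly counts listed above; `pct_change` is a hypothetical helper, and the result describes raw counts only, not rates per 100,000.

```python
# Yearly gun death counts copied from the lists above.
homicides = {2012: 12093, 2013: 11674, 2014: 11409}
suicides = {2012: 20666, 2013: 21175, 2014: 21334}

def pct_change(counts, start, end):
    """Percent change in the count from the start year to the end year."""
    return round((counts[end] - counts[start]) / counts[start] * 100, 2)

homicide_change = pct_change(homicides, 2012, 2014)
suicide_change = pct_change(suicides, 2012, 2014)
print(f"Homicide: {homicide_change}%  Suicide: {suicide_change}%")
```

Turning these into rate changes would require a population figure for each year, which the census file used here is not assumed to provide.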
