# Exploring Gun Deaths in the US

The dataset is from [FiveThirtyEight](https://fivethirtyeight.com/), and can be found [here](https://raw.githubusercontent.com/fivethirtyeight/guns-data/master/full_data.csv).
It contains information on gun deaths in the US from 2012 to 2014. Each row in the dataset represents a single death by gunshot. The columns contain demographic and other information about the victims.

**year** -- the year in which the fatality occurred.

**month** -- the month in which the fatality occurred.

**intent** -- the intent of the perpetrator of the crime. This can be Suicide, Accidental, NA, Homicide, or Undetermined.

**police** -- whether a police officer was involved with the shooting. Either 0 (false) or 1 (true).

**sex** -- the gender of the victim. Either M or F.

**age** -- the age of the victim.

**race** -- the race of the victim. Either Asian/Pacific Islander, Native American/Native Alaskan, Black, Hispanic, or White.

**hispanic** -- a code indicating the Hispanic origin of the victim.

**place** -- where the shooting occurred. Has several categories, which you're encouraged to explore on your own.

**education** -- educational status of the victim. Can be one of the following:

1. Less than High School
2. Graduated from High School or equivalent
3. Some College
4. At least graduated from College
5. Not available

### In this project, I'll explore the dataset to find patterns in the demographics of the victims.

In [2]:
import csv


# Read the whole file into a list of rows; the with-block closes the file afterwards
with open('guns.csv', newline='') as f:
    data = list(csv.reader(f))

In [3]:
# view the first 5 rows
data[:5]

[['',
  'year',
  'month',
  'intent',
  'police',
  'sex',
  'age',
  'race',
  'hispanic',
  'place',
  'education'],
 ['1',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '34',
  'Asian/Pacific Islander',
  '100',
  'Home',
  '4'],
 ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'],
 ['3',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '60',
  'White',
  '100',
  'Other specified',
  '4'],
 ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4']]

In [3]:
headers = data[0]
del data[0]  # remove the header row
print(headers)
data[:5]

['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education']


[['1',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '34',
  'Asian/Pacific Islander',
  '100',
  'Home',
  '4'],
 ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'],
 ['3',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '60',
  'White',
  '100',
  'Other specified',
  '4'],
 ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'],
 ['5',
  '2012',
  '02',
  'Suicide',
  '0',
  'M',
  '31',
  'White',
  '100',
  'Other specified',
  '2']]
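
Positional indexes such as `i[1]` or `i[5]` work, but they are easy to get wrong. As a minimal sketch of an alternative, `csv.DictReader` keys each row by the header names instead. The inline `sample_csv` string below is a stand-in built from the header and first three rows shown above, not the real file, so the block runs on its own.

```python
import csv
import io

# Stand-in for guns.csv: the header and first three rows shown above.
sample_csv = (
    ",year,month,intent,police,sex,age,race,hispanic,place,education\n"
    "1,2012,01,Suicide,0,M,34,Asian/Pacific Islander,100,Home,4\n"
    "2,2012,01,Suicide,0,F,21,White,100,Street,3\n"
    "3,2012,01,Suicide,0,M,60,White,100,Other specified,4\n"
)

# DictReader uses the first line as field names, so no manual header removal is needed.
rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Columns are now addressed by name rather than by position.
print(rows[0]["intent"], rows[1]["sex"], rows[2]["place"])
```

With the real file, `io.StringIO(sample_csv)` would be replaced by an open file handle. The values are still strings, so numeric columns such as `age` need converting with `int()`.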

## How many gun deaths happened in each year


In [4]:
years = [i[1] for i in data]  # column 1 holds the year
year_counts = {}  # maps each year to its number of deaths
for year in years:
    if year in year_counts:
        year_counts[year] += 1
    else:
        year_counts[year] = 1
print(year_counts)



{'2012': 33563, '2014': 33599, '2013': 33636}
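
The counting loop above is a pattern the standard library already covers. A minimal sketch with `collections.Counter` follows; `sample_years` is a made-up list for illustration, not taken from the dataset.

```python
from collections import Counter

# Hypothetical year values, standing in for the `years` list built from the data.
sample_years = ['2012', '2013', '2012', '2014', '2013', '2012']

# Counter tallies how many times each distinct value appears.
year_counts = Counter(sample_years)
print(year_counts)
```

A `Counter` behaves like a dictionary, so the rest of the analysis could use it unchanged.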


### The number of gun deaths did not change much by year from **2012** to **2014**
Next, explore whether gun deaths in the US change by month.


In [5]:
import datetime

# Represent each death by the first day of its month, since the data has no day column
dates = [datetime.datetime(year=int(i[1]), month=int(i[2]), day=1) for i in data]

date_count = {}

for date in dates:
    if date in date_count:
        date_count[date] += 1
    else:
        date_count[date] = 1
        
date_count

{datetime.datetime(2012, 1, 1, 0, 0): 2758,
 datetime.datetime(2012, 2, 1, 0, 0): 2357,
 datetime.datetime(2012, 3, 1, 0, 0): 2743,
 datetime.datetime(2012, 4, 1, 0, 0): 2795,
 datetime.datetime(2012, 5, 1, 0, 0): 2999,
 datetime.datetime(2012, 6, 1, 0, 0): 2826,
 datetime.datetime(2012, 7, 1, 0, 0): 3026,
 datetime.datetime(2012, 8, 1, 0, 0): 2954,
 datetime.datetime(2012, 9, 1, 0, 0): 2852,
 datetime.datetime(2012, 10, 1, 0, 0): 2733,
 datetime.datetime(2012, 11, 1, 0, 0): 2729,
 datetime.datetime(2012, 12, 1, 0, 0): 2791,
 datetime.datetime(2013, 1, 1, 0, 0): 2864,
 datetime.datetime(2013, 2, 1, 0, 0): 2375,
 datetime.datetime(2013, 3, 1, 0, 0): 2862,
 datetime.datetime(2013, 4, 1, 0, 0): 2798,
 datetime.datetime(2013, 5, 1, 0, 0): 2806,
 datetime.datetime(2013, 6, 1, 0, 0): 2920,
 datetime.datetime(2013, 7, 1, 0, 0): 3079,
 datetime.datetime(2013, 8, 1, 0, 0): 2859,
 datetime.datetime(2013, 9, 1, 0, 0): 2742,
 datetime.datetime(2013, 10, 1, 0, 0): 2808,
 datetime.datetime(2013, 11,

## Explore the gender and race of the victims


In [6]:
gender_counts = {}

genders = [i[5] for i in data]
for gender in genders:
    if gender in gender_counts:
        gender_counts[gender] += 1
    else:
        gender_counts[gender] = 1
            
gender_counts 

{'F': 14449, 'M': 86349}

In [7]:
race_counts = {}

races = [i[7] for i in data]
for race in races:
    if race in race_counts:
        race_counts[race] += 1
    else:
        race_counts[race] = 1
            
race_counts 

{'Asian/Pacific Islander': 1326,
 'Black': 23296,
 'Hispanic': 9022,
 'Native American/Native Alaskan': 917,
 'White': 66237}

## Findings so far

Gun deaths in the US disproportionately affect men compared with women. They also seem to affect some races disproportionately, but judging that requires data on each race's share of the overall US population.
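
To put a number on the gender gap, the counts can be turned into shares of the total. This small sketch reuses the `gender_counts` output from above; `gender_share` is a name introduced here for illustration.

```python
# Counts copied from the gender_counts output above.
gender_counts = {'F': 14449, 'M': 86349}

total = sum(gender_counts.values())

# Share of all gun deaths for each gender, as a percentage rounded to one decimal.
gender_share = {g: round(100 * n / total, 1) for g, n in gender_counts.items()}
print(gender_share)  # {'F': 14.3, 'M': 85.7}
```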

There also seems to be a seasonal trend: gun deaths rise in the warmer months (May to September), peak in the summer, and decline in the winter.
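
Raw monthly totals slightly understate February because it is the shortest month. As a rough check on the seasonal pattern, the sketch below converts the 2012 totals from the output above into deaths per day. Only 2012 is used, and `deaths_2012` and `per_day` are names introduced here for illustration.

```python
import calendar

# Monthly totals for 2012, copied from the date_count output above.
deaths_2012 = {
    1: 2758, 2: 2357, 3: 2743, 4: 2795, 5: 2999, 6: 2826,
    7: 3026, 8: 2954, 9: 2852, 10: 2733, 11: 2729, 12: 2791,
}

# Divide by the number of days so that short months are not penalised.
per_day = {
    month: count / calendar.monthrange(2012, month)[1]
    for month, count in deaths_2012.items()
}

peak_month = max(per_day, key=per_day.get)
low_month = min(per_day, key=per_day.get)
print(peak_month, low_month)  # 7 2
```

Even after adjusting for month length, July remains the highest month and February the lowest in 2012.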

Intent can be explored further to see whether it correlates with age and race. We can also explore police involvement.


## Comparing US gun deaths by race

The total number of deaths per race was explored above. To compare the numbers meaningfully, we need the rate of gun deaths per **100000** people of each race.

Import data that contains information on the total population of the US, as well as the total population of each racial group.


In [8]:
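# Load the census table; the with-block closes the file afterwards
with open('census.csv', newline='') as census_file:
    census = list(csv.reader(census_file))
census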

[['Id',
  'Year',
  'Id',
  'Sex',
  'Id',
  'Hispanic Origin',
  'Id',
  'Id2',
  'Geography',
  'Total',
  'Race Alone - White',
  'Race Alone - Hispanic',
  'Race Alone - Black or African American',
  'Race Alone - American Indian and Alaska Native',
  'Race Alone - Asian',
  'Race Alone - Native Hawaiian and Other Pacific Islander',
  'Two or More Races'],
 ['cen42010',
  'April 1, 2010 Census',
  'totsex',
  'Both Sexes',
  'tothisp',
  'Total',
  '0100000US',
  '',
  'United States',
  '308745538',
  '197318956',
  '44618105',
  '40250635',
  '3739506',
  '15159516',
  '674625',
  '6984195']]

To get from the raw counts of gun deaths by race to a rate per **100000** people, divide the number of gun deaths in each race by that race's population and multiply by 100000.

In [13]:
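# Population of each racial group, taken from the 2010 census row above.
# The deaths data combines Asian and Pacific Islander, so those two census columns are summed.
mapping = {
    'Asian/Pacific Islander': 15159516 + 674625,
    'Black': 40250635,
    'Hispanic': 44618105,
    'Native American/Native Alaskan': 3739506,
    'White': 197318956,
}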

race_per_hundredk = {}
for k,v in race_counts.items():
    race_per_hundredk[k] = (v / mapping[k])*100000
race_per_hundredk    

{'Asian/Pacific Islander': 8.374309664161762,
 'Black': 57.8773477735196,
 'Hispanic': 20.220491210910907,
 'Native American/Native Alaskan': 24.521955573811088,
 'White': 33.56849303419181}
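
One caveat when reading these figures: the death counts cover the three years 2012 to 2014, while the population figures come from the 2010 census, so each value is a three-year total per 100000 people rather than an annual rate. The helper below, `rate_per_100k`, is a hypothetical function sketched here to show the difference. Its `years` parameter is an assumption of this example, and it treats the 2010 population as constant over the period.

```python
def rate_per_100k(deaths, population, years=1):
    """Deaths per 100000 people, averaged over `years` years."""
    return deaths / population / years * 100000

# Black victims: count from race_counts, population from the census mapping.
three_year_rate = rate_per_100k(23296, 40250635)
annual_rate = rate_per_100k(23296, 40250635, years=3)

print(round(three_year_rate, 2), round(annual_rate, 2))  # 57.88 19.29
```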

## Murder by Gun


In [15]:
intents = [i[3] for i in data]
races = [i[7] for i in data]

homicide_race_counts = {}

for i,v in enumerate(races):
    if intents[i] == 'Homicide':
        if v in homicide_race_counts:
            homicide_race_counts[v] += 1
        else:
            homicide_race_counts[v] = 1
            
homicide_race_counts    

race_per_hundredk_homicides ={}
for k,v in homicide_race_counts.items():
    race_per_hundredk_homicides[k] = (v / mapping[k]) * 100000
    
race_per_hundredk_homicides    

{'Asian/Pacific Islander': 3.530346230970155,
 'Black': 48.471284987180944,
 'Hispanic': 12.627161104219914,
 'Native American/Native Alaskan': 8.717729026240365,
 'White': 4.6356417981453335}
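
Because both rate tables use the same population figures, dividing one by the other gives the share of each group's gun deaths that were homicides. The sketch below does this with the two outputs above, rounded to four decimals; `homicide_share` is a name introduced here for illustration.

```python
# Rates per 100000 copied from the outputs above (rounded to 4 decimals).
all_deaths = {
    'Asian/Pacific Islander': 8.3743,
    'Black': 57.8773,
    'Hispanic': 20.2205,
    'Native American/Native Alaskan': 24.5220,
    'White': 33.5685,
}
homicides = {
    'Asian/Pacific Islander': 3.5303,
    'Black': 48.4713,
    'Hispanic': 12.6272,
    'Native American/Native Alaskan': 8.7177,
    'White': 4.6356,
}

# The population cancels out, leaving homicides as a fraction of all gun deaths.
homicide_share = {
    race: round(homicides[race] / all_deaths[race], 2) for race in all_deaths
}
print(homicide_share)
```

Under this calculation, roughly 84% of gun deaths among Black victims were homicides, against roughly 14% among White victims.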

# Conclusion

It appears that gun-related homicides disproportionately affect Black and Hispanic people: their homicide rates are about 48.5 and 12.6 per 100000, compared with about 4.6 for White people.

Some areas to explore further:

- The link between month and homicide rate.
- Homicide rate by gender.
- The rates of other intents by gender and race.
- Gun death rates by location and education.
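
As a starting point for these follow-ups, most of them reduce to counting one column while filtering on another. The helper below, `count_by`, is a hypothetical function sketched for illustration, and `sample_rows` contains made-up rows in the dataset's column order rather than real records.

```python
from collections import Counter

# Column positions, matching the header row of the dataset.
INTENT, SEX = 3, 5

def count_by(rows, column, where=None):
    """Count the values in `column`, keeping only rows for which `where` is true."""
    return Counter(row[column] for row in rows if where is None or where(row))

# Made-up rows for illustration only.
sample_rows = [
    ['1', '2012', '01', 'Homicide', '0', 'M', '25', 'Black', '100', 'Street', '2'],
    ['2', '2012', '07', 'Homicide', '0', 'F', '31', 'White', '100', 'Home', '3'],
    ['3', '2013', '07', 'Suicide', '0', 'M', '60', 'White', '100', 'Home', '4'],
    ['4', '2014', '12', 'Homicide', '0', 'M', '19', 'Hispanic', '210', 'Street', '1'],
]

# Homicides by gender: filter on intent, count on sex.
homicides_by_sex = count_by(sample_rows, SEX, where=lambda r: r[INTENT] == 'Homicide')
print(homicides_by_sex)
```

Swapping the column index or the `where` condition covers the other questions, such as counting `place` or `education` for a given intent.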
