# Gun deaths in the US from 2012 to 2014

This dataset contains information about gun deaths in the United States from 2012 to 2014.

The data is available in the [fivethirtyeight/guns-data](https://github.com/fivethirtyeight/guns-data) repository.

Each row in the dataset represents a single fatality. The columns contain demographic and other information about the victim.

* (unnamed) -- an identifier column with an empty header, which contains the row number. It's common in CSV files to include a unique identifier for each row, but we can ignore it in this analysis.
* year -- the year in which the fatality occurred.
* month -- the month in which the fatality occurred.
* intent -- the intent behind the shooting. This can be Suicide, Accidental, NA, Homicide, or Undetermined.
* police -- whether a police officer was involved with the shooting. Either 0 (false) or 1 (true).
* sex -- the gender of the victim. Either M or F.
* age -- the age of the victim.
* race -- the race of the victim. Either Asian/Pacific Islander, Native American/Native Alaskan, Black, Hispanic, or White.
* hispanic -- a code indicating the Hispanic origin of the victim.
* place -- where the shooting occurred. Has several categories, which you're encouraged to explore on your own.
* education -- educational status of the victim. Can be one of the following:
    - 1 -- Less than High School
    - 2 -- Graduated from High School or equivalent
    - 3 -- Some College
    - 4 -- At least graduated from College
    - 5 -- Not available
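
As a minimal sketch of how the columns described above can be read by name rather than by position, the example below uses `csv.DictReader` on a few inlined rows. The inlined text, the `EDUCATION_LABELS` dictionary, and the `"Unknown code"` fallback are assumptions made for illustration; the real `guns.csv` may be quoted differently.

```python
import csv
import io

# A few rows in the shape of guns.csv, inlined so the sketch runs on its own.
# The exact quoting of the real file may differ; this layout is an assumption.
sample_csv = """,year,month,intent,police,sex,age,race,hispanic,place,education
1,2012,01,Suicide,0,M,34,Asian/Pacific Islander,100,Home,4
2,2012,01,Suicide,0,F,21,White,100,Street,3
3,2012,01,Suicide,0,M,60,White,100,Other specified,4
"""

# Education codes as documented in the column list (hypothetical helper dict).
EDUCATION_LABELS = {
    "1": "Less than High School",
    "2": "Graduated from High School or equivalent",
    "3": "Some College",
    "4": "At least graduated from College",
    "5": "Not available",
}

# DictReader keys each row by the header names, so columns are read by name.
rows = list(csv.DictReader(io.StringIO(sample_csv)))

for row in rows:
    education = EDUCATION_LABELS.get(row["education"], "Unknown code")
    print(row["year"], row["sex"], row["race"], education)
```

The rest of this analysis uses plain `csv.reader` and positional indexes instead, which is why the column order matters there.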
    
    


In [15]:
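import csv

# Read the whole file into a list of lists (one inner list per CSV row).
# The with-block closes the file once reading is done.
with open('guns.csv', 'r', newline='') as open_file:
    csv_reader = csv.reader(open_file)
    data = list(csv_reader)

data[:5]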

[['',
  'year',
  'month',
  'intent',
  'police',
  'sex',
  'age',
  'race',
  'hispanic',
  'place',
  'education'],
 ['1',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '34',
  'Asian/Pacific Islander',
  '100',
  'Home',
  '4'],
 ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'],
 ['3',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '60',
  'White',
  '100',
  'Other specified',
  '4'],
 ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4']]

### Removing Headers

In [16]:
headers = data[0]  # Assigning the headers
data = data[1:]  # Removing the headers

headers
data[:5]

[['1',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '34',
  'Asian/Pacific Islander',
  '100',
  'Home',
  '4'],
 ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'],
 ['3',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '60',
  'White',
  '100',
  'Other specified',
  '4'],
 ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'],
 ['5',
  '2012',
  '02',
  'Suicide',
  '0',
  'M',
  '31',
  'White',
  '100',
  'Other specified',
  '2']]

### Counting Gun Deaths By Year

In [21]:
# Extracting year information from the data
year_counts = {}
years = [each[1] for each in data]

# Count of how many times each year occurs in the year column.
for year in years:
    if year in year_counts:
        year_counts[year] = year_counts[year] + 1
    else:
        year_counts[year] = 1

year_counts

    

{'2012': 33563, '2013': 33636, '2014': 33599}
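
The same counting pattern recurs for every column in this analysis, so it could be factored out. The sketch below uses `collections.Counter` from the standard library; `count_column` is a hypothetical helper name, and `sample_rows` holds only the first five data rows shown earlier, so its counts are not the full-dataset counts.

```python
from collections import Counter

# First five data rows, copied from the notebook output above.
sample_rows = [
    ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'],
    ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'],
    ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'],
    ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'],
    ['5', '2012', '02', 'Suicide', '0', 'M', '31', 'White', '100', 'Other specified', '2'],
]


def count_column(rows, index):
    """Count how often each value appears in the column at `index`."""
    return Counter(row[index] for row in rows)


print(count_column(sample_rows, 1))  # year column
print(count_column(sample_rows, 5))  # sex column
```

Calling `count_column(data, 1)` on the full dataset should give the same counts as the loop above, since a `Counter` compares equal to a plain dictionary with the same keys and values.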

### Counting Gun Deaths By Month & Year

In [43]:
import datetime

# Extracting month & year information from the data.
# The day is fixed at 1 because only the year and month are recorded.
dates = [datetime.datetime(year=int(each[1]), month=int(each[2]), day=1) for each in data]
dates[:5]

# Count of how many times each month & year occurs in the data.
date_counts = {}
for date in dates:
    if date in date_counts:
        date_counts[date] = date_counts[date] + 1
    else:
        date_counts[date] = 1
date_counts

{datetime.datetime(2014, 10, 1, 0, 0): 2865, datetime.datetime(2014, 1, 1, 0, 0): 2651, datetime.datetime(2013, 10, 1, 0, 0): 2808, datetime.datetime(2014, 6, 1, 0, 0): 2931, datetime.datetime(2012, 10, 1, 0, 0): 2733, datetime.datetime(2014, 5, 1, 0, 0): 2864, datetime.datetime(2013, 4, 1, 0, 0): 2798, datetime.datetime(2014, 4, 1, 0, 0): 2862, datetime.datetime(2012, 4, 1, 0, 0): 2795, datetime.datetime(2013, 1, 1, 0, 0): 2864, datetime.datetime(2012, 9, 1, 0, 0): 2852, datetime.datetime(2013, 2, 1, 0, 0): 2375, datetime.datetime(2013, 3, 1, 0, 0): 2862, datetime.datetime(2012, 1, 1, 0, 0): 2758, datetime.datetime(2013, 9, 1, 0, 0): 2742, datetime.datetime(2014, 8, 1, 0, 0): 2970, datetime.datetime(2014, 9, 1, 0, 0): 2914, datetime.datetime(2012, 11, 1, 0, 0): 2729, datetime.datetime(2014, 2, 1, 0, 0): 2361, datetime.datetime(2012, 12, 1, 0, 0): 2791, datetime.datetime(2014, 7, 1, 0, 0): 2884, datetime.datetime(2013, 5, 1, 0, 0): 2806, datetime.datetime(2014, 3, 1, 0, 0): 2684, datet
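
Because the dictionary is unordered and hard to scan, the lowest and highest months are easier to find programmatically. The sketch below shows one way to do that with `min`, `max`, and `sorted`. The name `date_counts_subset` is an assumption: it holds only four entries copied from the output above, so the results apply to that subset and not to the full dataset.

```python
import datetime

# Four entries copied from the date_counts output above (a subset, not the full dict).
date_counts_subset = {
    datetime.datetime(2013, 2, 1): 2375,
    datetime.datetime(2014, 2, 1): 2361,
    datetime.datetime(2014, 6, 1): 2931,
    datetime.datetime(2014, 8, 1): 2970,
}

# min/max over the keys, ranked by their counts.
lowest_month = min(date_counts_subset, key=date_counts_subset.get)
highest_month = max(date_counts_subset, key=date_counts_subset.get)

print("Lowest: ", lowest_month.strftime("%Y-%m"), date_counts_subset[lowest_month])
print("Highest:", highest_month.strftime("%Y-%m"), date_counts_subset[highest_month])

# Sorting the items puts the months in chronological order for easier reading.
for month, count in sorted(date_counts_subset.items()):
    print(month.strftime("%Y-%m"), count)
```

Running the same lines against the full `date_counts` dictionary would identify the extreme months across all three years.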

## Counting Gun Deaths by Sex & Race

In [42]:
# Extracting sex information from the data
sex_counts = {}
sex_list = [each[5] for each in data]

for sex in sex_list:
    if sex in sex_counts:
        sex_counts[sex] = sex_counts[sex] + 1
    else:
        sex_counts[sex] = 1

# Extracting race information from the data
race_counts = {}
race_list = [each[7] for each in data]

for race in race_list:
    if race in race_counts:
        race_counts[race] = race_counts[race] + 1
    else:
        race_counts[race] = 1
        
print(race_counts)
print(sex_counts)

{'Native American/Native Alaskan': 917, 'Black': 23296, 'White': 66237, 'Hispanic': 9022, 'Asian/Pacific Islander': 1326}
{'F': 14449, 'M': 86349}


## Analysis So Far

* Gun deaths per year barely varied between 2012 and 2014.
* Each year had between 33.5K and 33.7K gun deaths.
* The lowest monthly count was 2357 deaths, in February 2012.
* The highest monthly count was 3079 deaths, in July 2013.
* Gun deaths differ sharply by sex: 14449 victims were female and 86349 were male, a ratio of almost 1:6.
* The Native American/Native Alaskan category had the fewest gun deaths, with a total of 917.
* The White category had the most gun deaths, with a total of 66237.

Further analysis could build on these observations to look for patterns in the intents behind the gun deaths.
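
The ratio and ranking quoted above can be checked with a few lines of arithmetic. This is a small sketch using the counts printed in the previous cell; the names `male_to_female`, `female_share`, and `races_by_count` are introduced here for illustration.

```python
# Counts copied from the notebook output above.
sex_counts = {'F': 14449, 'M': 86349}
race_counts = {
    'Native American/Native Alaskan': 917,
    'Black': 23296,
    'White': 66237,
    'Hispanic': 9022,
    'Asian/Pacific Islander': 1326,
}

total_deaths = sum(sex_counts.values())

# How many male victims there are for each female victim.
male_to_female = sex_counts['M'] / sex_counts['F']

# Female victims as a percentage of all victims.
female_share = sex_counts['F'] / total_deaths * 100

# Race categories ordered from most to fewest deaths.
races_by_count = sorted(race_counts.items(), key=lambda item: item[1], reverse=True)

print(f"Total deaths: {total_deaths}")
print(f"Male-to-female ratio: {male_to_female:.2f}")
print(f"Female share: {female_share:.1f}%")
for race, count in races_by_count:
    print(race, count)
```

Both dictionaries sum to the same total, which is a quick consistency check that no row was dropped from either count.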

## Reading Census Data

In [44]:
with open('census.csv','r') as file_read:
    csv_reader = csv.reader(file_read)
    census = list(csv_reader)
census

[['Id',
  'Year',
  'Id',
  'Sex',
  'Id',
  'Hispanic Origin',
  'Id',
  'Id2',
  'Geography',
  'Total',
  'Race Alone - White',
  'Race Alone - Hispanic',
  'Race Alone - Black or African American',
  'Race Alone - American Indian and Alaska Native',
  'Race Alone - Asian',
  'Race Alone - Native Hawaiian and Other Pacific Islander',
  'Two or More Races'],
 ['cen42010',
  'April 1, 2010 Census',
  'totsex',
  'Both Sexes',
  'tothisp',
  'Total',
  '0100000US',
  '',
  'United States',
  '308745538',
  '197318956',
  '44618105',
  '40250635',
  '3739506',
  '15159516',
  '674625',
  '6984195']]

## Mapping Between guns.csv and census.csv Header Columns

In [48]:
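# Map each race category in guns.csv to its population in the census row above.
# The census reports Asian and Pacific Islander populations separately, so they are summed.
mapping = {
    "Asian/Pacific Islander": 15159516 + 674625,
    "Native American/Native Alaskan": 3739506,
    "Black": 40250635,
    "Hispanic": 44618105,
    "White": 197318956
}

# Gun deaths per 100K people for each race category.
race_per_hundredk = {}
for each in race_counts:
    race_per_hundredk[each] = (race_counts[each] / mapping[each]) * 100000

print(race_per_hundredk)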
    

{'Native American/Native Alaskan': 24.521955573811088, 'Black': 57.8773477735196, 'White': 33.56849303419181, 'Hispanic': 20.220491210910907, 'Asian/Pacific Islander': 8.374309664161762}
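
The populations in `mapping` were typed in by hand. As a sketch of an alternative, they could be looked up from the census rows by column name, which avoids copy errors. The lists below are copied from the census output above; `census_by_name` is a name introduced here, and the approach assumes the census file has a single data row, as shown.

```python
# Header and data row copied from the census.csv output above.
census_header = [
    'Id', 'Year', 'Id', 'Sex', 'Id', 'Hispanic Origin', 'Id', 'Id2', 'Geography', 'Total',
    'Race Alone - White', 'Race Alone - Hispanic',
    'Race Alone - Black or African American',
    'Race Alone - American Indian and Alaska Native',
    'Race Alone - Asian',
    'Race Alone - Native Hawaiian and Other Pacific Islander',
    'Two or More Races',
]
census_row = [
    'cen42010', 'April 1, 2010 Census', 'totsex', 'Both Sexes', 'tothisp', 'Total',
    '0100000US', '', 'United States', '308745538',
    '197318956', '44618105', '40250635', '3739506', '15159516', '674625', '6984195',
]

# The repeated 'Id' headers overwrite each other here, which is harmless
# because only the uniquely named race columns are used below.
census_by_name = dict(zip(census_header, census_row))

mapping = {
    "Asian/Pacific Islander": (
        int(census_by_name['Race Alone - Asian'])
        + int(census_by_name['Race Alone - Native Hawaiian and Other Pacific Islander'])
    ),
    "Native American/Native Alaskan": int(
        census_by_name['Race Alone - American Indian and Alaska Native']
    ),
    "Black": int(census_by_name['Race Alone - Black or African American']),
    "Hispanic": int(census_by_name['Race Alone - Hispanic']),
    "White": int(census_by_name['Race Alone - White']),
}

print(mapping)
```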


## Counting the Gun Deaths With the Intent of 'Homicide'

In [59]:
# Extracting intent information
intents = [each[3] for each in data]

# Extracting race information
races = [each[7] for each in data]

# Counting gun deaths with Homicide intent for each race.
homicide_race_counts = {}
for i, race in enumerate(races):
    if intents[i] == 'Homicide':
        if race not in homicide_race_counts:
            homicide_race_counts[race] = 1
        else:
            homicide_race_counts[race] = homicide_race_counts[race] + 1

# Calculating homicide counts per race per 100K people.
homicide_per_hundredk = {}
for each in homicide_race_counts:
    homicide_per_hundredk[each] = (homicide_race_counts[each] / mapping[each]) * 100000

homicide_per_hundredk

{'Asian/Pacific Islander': 3.530346230970155,
 'Black': 48.471284987180944,
 'Hispanic': 12.627161104219914,
 'Native American/Native Alaskan': 8.717729026240365,
 'White': 4.6356417981453335}
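
The per-100K calculation now appears twice, so it could live in one helper. The sketch below is one possible shape for it; `per_100k` and its `years` parameter are hypothetical names. It is demonstrated on the all-intent `race_counts`, because the raw homicide counts are not printed above. Note that the counts cover three years (2012 to 2014) while the population is a single 2010 census snapshot, so dividing by `years=3` gives only a rough annual rate.

```python
# Counts and populations copied from the notebook cells above.
race_counts = {
    'Native American/Native Alaskan': 917,
    'Black': 23296,
    'White': 66237,
    'Hispanic': 9022,
    'Asian/Pacific Islander': 1326,
}
mapping = {
    "Asian/Pacific Islander": 15159516 + 674625,
    "Native American/Native Alaskan": 3739506,
    "Black": 40250635,
    "Hispanic": 44618105,
    "White": 197318956,
}


def per_100k(counts, populations, years=1):
    """Return counts per 100K people; years > 1 averages a multi-year total."""
    return {
        group: counts[group] / populations[group] * 100000 / years
        for group in counts
    }


three_year_rates = per_100k(race_counts, mapping)
annual_rates = per_100k(race_counts, mapping, years=3)

for race in race_counts:
    print(f"{race}: {three_year_rates[race]:.2f} total, {annual_rates[race]:.2f} per year")
```

Passing `homicide_race_counts` instead of `race_counts` would reproduce the homicide figures from the cell above.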

## Findings Per Race

For every 100K people:
* The highest rate of gun-related homicides is among the Black population (about 48.5).
* The lowest rate of gun-related homicides is among the Asian/Pacific Islander population (about 3.5).
* Gun-related homicide rates are highest in the *Black* and *Hispanic* racial categories (about 48.5 and 12.6).

##### Further analysis can be done based on the findings so far.

- Which time of the year saw the most homicides?
- Gun-related homicide analysis based on geographic locations.
- Gun-related homicide analysis based on sex within each race.
- Was a police officer involved in the shooting?
- How do education levels vary across age, sex, and racial categories?
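
Several of the questions above come down to counting pairs of column values for one intent. The following is a minimal sketch of that idea; `count_pairs` and the column-index constants are names introduced here, not part of the original analysis. The five sample rows are the ones shown at the top of the notebook and are all suicides, so the example filters on `'Suicide'`; on the full dataset the same call with `'Homicide'` would address the sex-within-race question.

```python
from collections import Counter

# First five data rows, copied from the notebook output above.
sample_rows = [
    ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'],
    ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'],
    ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'],
    ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'],
    ['5', '2012', '02', 'Suicide', '0', 'M', '31', 'White', '100', 'Other specified', '2'],
]

# Column positions in guns.csv, following the header row.
MONTH, INTENT, POLICE, SEX, RACE = 2, 3, 4, 5, 7


def count_pairs(rows, intent, first, second):
    """Count (first, second) column value pairs among rows with the given intent."""
    return Counter(
        (row[first], row[second]) for row in rows if row[INTENT] == intent
    )


by_race_and_sex = count_pairs(sample_rows, 'Suicide', RACE, SEX)

for (race, sex), count in sorted(by_race_and_sex.items()):
    print(race, sex, count)
```

Swapping `RACE, SEX` for `MONTH, POLICE` or another pair of indexes covers the other questions in the same way.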
