## US Gun Deaths Data
In this project, you'll be analyzing data on gun deaths in the US. 
The dataset came from **FiveThirtyEight**, and can be found at https://github.com/fivethirtyeight/guns-data.

The dataset is stored in the **guns.csv** file. It contains information on gun deaths in the US from 2012 to 2014. Each row in the dataset represents a single fatality. The columns contain demographic and other information about the victim. 

In [4]:
import csv

# Read the whole CSV into a list of rows; the with-block closes the file afterwards
with open("guns.csv", "r") as f:
    data = list(csv.reader(f))
data[:5]

[['',
  'year',
  'month',
  'intent',
  'police',
  'sex',
  'age',
  'race',
  'hispanic',
  'place',
  'education'],
 ['1',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '34',
  'Asian/Pacific Islander',
  '100',
  'Home',
  '4'],
 ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'],
 ['3',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '60',
  'White',
  '100',
  'Other specified',
  '4'],
 ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4']]

## Remove Header

In [5]:
headers = data[0]
data = data[1:]
print(headers)
data[:5]

['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education']


[['1',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '34',
  'Asian/Pacific Islander',
  '100',
  'Home',
  '4'],
 ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'],
 ['3',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '60',
  'White',
  '100',
  'Other specified',
  '4'],
 ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'],
 ['5',
  '2012',
  '02',
  'Suicide',
  '0',
  'M',
  '31',
  'White',
  '100',
  'Other specified',
  '2']]

## Counting Gun Deaths By Year

In [6]:
years = [row[1] for row in data]
year_counts = {}
for year in years:
    if year not in year_counts:
        year_counts[year] = 1
    else:
        year_counts[year] += 1

year_counts

{'2012': 33563, '2013': 33636, '2014': 33599}

## Exploring Gun Deaths By Month And Year

In [7]:
import datetime
dates = [datetime.datetime(year=int(row[1]), month=int(row[2]), day=1) for row in data]
dates[:5]

[datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0)]

In [8]:
date_counts = {}
for date in dates:
    if date not in date_counts:
        date_counts[date] = 1
    else:
        date_counts[date] += 1
        
date_counts

{datetime.datetime(2012, 1, 1, 0, 0): 2758,
 datetime.datetime(2012, 2, 1, 0, 0): 2357,
 datetime.datetime(2012, 3, 1, 0, 0): 2743,
 datetime.datetime(2012, 4, 1, 0, 0): 2795,
 datetime.datetime(2012, 5, 1, 0, 0): 2999,
 datetime.datetime(2012, 6, 1, 0, 0): 2826,
 datetime.datetime(2012, 7, 1, 0, 0): 3026,
 datetime.datetime(2012, 8, 1, 0, 0): 2954,
 datetime.datetime(2012, 9, 1, 0, 0): 2852,
 datetime.datetime(2012, 10, 1, 0, 0): 2733,
 datetime.datetime(2012, 11, 1, 0, 0): 2729,
 datetime.datetime(2012, 12, 1, 0, 0): 2791,
 datetime.datetime(2013, 1, 1, 0, 0): 2864,
 datetime.datetime(2013, 2, 1, 0, 0): 2375,
 datetime.datetime(2013, 3, 1, 0, 0): 2862,
 datetime.datetime(2013, 4, 1, 0, 0): 2798,
 datetime.datetime(2013, 5, 1, 0, 0): 2806,
 datetime.datetime(2013, 6, 1, 0, 0): 2920,
 datetime.datetime(2013, 7, 1, 0, 0): 3079,
 datetime.datetime(2013, 8, 1, 0, 0): 2859,
 datetime.datetime(2013, 9, 1, 0, 0): 2742,
 datetime.datetime(2013, 10, 1, 0, 0): 2808,
 datetime.datetime(2013, 11,

## Exploring Gun Deaths By Race And Sex

In [9]:
sex = [row[5] for row in data]
sex_counts = {}
for it in sex:
    if it not in sex_counts:
        sex_counts[it] = 1
    else:
        sex_counts[it] += 1
        
race = [row[7] for row in data]
race_counts = {}
for it in race:
    if it not in race_counts:
        race_counts[it] = 1
    else:
        race_counts[it] += 1
          
print(race_counts)
print(sex_counts)

{'Hispanic': 9022, 'Asian/Pacific Islander': 1326, 'Black': 23296, 'Native American/Native Alaskan': 917, 'White': 66237}
{'M': 86349, 'F': 14449}


So far, I have learned to use a list comprehension to extract a column from a list of lists, and a for loop to count how often each value occurs.

The results show that far more gun deaths involve men (86,349) than women (14,449). Gun deaths also seem to disproportionately affect minorities, although the raw counts need to be compared against each group's share of the population to confirm this.
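
The column-extraction and counting pattern used so far can be sketched on a few made-up rows. This is a minimal illustration rather than part of the original analysis: `sample_rows` is hypothetical data laid out like guns.csv, and `collections.Counter` from the standard library is shown as an alternative that should produce the same tallies as the manual loop.

```python
from collections import Counter

# Hypothetical rows with the same column layout as guns.csv (header removed)
sample_rows = [
    ["1", "2012", "01", "Suicide", "0", "M", "34", "Asian/Pacific Islander", "100", "Home", "4"],
    ["2", "2012", "01", "Suicide", "0", "F", "21", "White", "100", "Street", "3"],
    ["3", "2012", "02", "Homicide", "0", "M", "60", "White", "100", "Other specified", "4"],
]

# List comprehension: pull column 5 (sex) out of every row
sexes = [row[5] for row in sample_rows]

# Manual tally, as in the cells above
manual_counts = {}
for value in sexes:
    if value not in manual_counts:
        manual_counts[value] = 1
    else:
        manual_counts[value] += 1

# Counter builds the same tally in one call
counter_counts = Counter(sexes)

print(manual_counts)
print(dict(counter_counts))
```

Either approach works on the full dataset; the manual loop makes the logic explicit, while `Counter` is shorter once the pattern is familiar.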

## Read In Data About What Percentage Of The US Population Falls Into Each Racial Category

In [10]:
import csv

# The with-block closes the file once the rows have been read
with open("census.csv", "r") as f:
    census = list(csv.reader(f))

census

[['Id',
  'Year',
  'Id',
  'Sex',
  'Id',
  'Hispanic Origin',
  'Id',
  'Id2',
  'Geography',
  'Total',
  'Race Alone - White',
  'Race Alone - Hispanic',
  'Race Alone - Black or African American',
  'Race Alone - American Indian and Alaska Native',
  'Race Alone - Asian',
  'Race Alone - Native Hawaiian and Other Pacific Islander',
  'Two or More Races'],
 ['cen42010',
  'April 1, 2010 Census',
  'totsex',
  'Both Sexes',
  'tothisp',
  'Total',
  '0100000US',
  '',
  'United States',
  '308745538',
  '197318956',
  '44618105',
  '40250635',
  '3739506',
  '15159516',
  '674625',
  '6984195']]
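
Rather than copying population figures by hand, they could be looked up by column name. The sketch below assumes the header and data row shown above, abridged to the columns that are used; `census_header`, `census_row`, `population`, and `population_by_race` are illustrative names and not part of the original notebook. The full header repeats the name `Id`, so only the uniquely named columns are paired here.

```python
# Header and data row from census.csv, abridged to the columns used below
census_header = [
    "Total",
    "Race Alone - White",
    "Race Alone - Hispanic",
    "Race Alone - Black or African American",
    "Race Alone - American Indian and Alaska Native",
    "Race Alone - Asian",
    "Race Alone - Native Hawaiian and Other Pacific Islander",
    "Two or More Races",
]
census_row = ["308745538", "197318956", "44618105", "40250635",
              "3739506", "15159516", "674625", "6984195"]

# Pair each column name with its value, converted to an integer
population = {name: int(value) for name, value in zip(census_header, census_row)}

# Map the race labels used in guns.csv onto the census columns
population_by_race = {
    "Asian/Pacific Islander": population["Race Alone - Asian"]
    + population["Race Alone - Native Hawaiian and Other Pacific Islander"],
    "Black": population["Race Alone - Black or African American"],
    "Hispanic": population["Race Alone - Hispanic"],
    "Native American/Native Alaskan": population["Race Alone - American Indian and Alaska Native"],
    "White": population["Race Alone - White"],
}

print(population_by_race)
```

Looking values up by name makes it harder to assign one category's population to another by accident.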

## Computing Rates Of Gun Deaths Per Race

In [11]:
# 2010 census population for each racial category used in guns.csv
mapping_dic = {
    "Asian/Pacific Islander": 15159516 + 674625,  # Asian + Native Hawaiian and Other Pacific Islander
    "Black": 40250635,
    "Hispanic": 44618105,
    "Native American/Native Alaskan": 3739506,
    "White": 197318956
}

In [12]:
race_per_hundredk = {}
for key, value in race_counts.items():
    race_per_hundredk[key] = value / mapping_dic[key] * 100000

# Rounded to two decimals for display
print({key: round(value, 2) for key, value in race_per_hundredk.items()})

{'Hispanic': 20.22, 'Asian/Pacific Islander': 8.37, 'Black': 57.88, 'Native American/Native Alaskan': 24.52, 'White': 33.57}


In [14]:
intents = [row[3] for row in data]
races = [row[7] for row in data]
homicide_race_per_hundredk = {}
for i, race in enumerate(races):
    if intents[i] == "Homicide":
        if race not in homicide_race_per_hundredk:
            homicide_race_per_hundredk[race] = 0
        homicide_race_per_hundredk[race] += 1

# Convert the raw homicide counts into rates per 100,000 people
for key, value in homicide_race_per_hundredk.items():
    homicide_race_per_hundredk[key] = value / mapping_dic[key] * 100000

# Rounded to two decimals for display
print({key: round(value, 2) for key, value in homicide_race_per_hundredk.items()})

{'Hispanic': 12.63, 'Asian/Pacific Islander': 3.53, 'Black': 48.47, 'Native American/Native Alaskan': 8.72, 'White': 4.64}


## Link Between Month And Homicide Rate

In [15]:
months = [row[2] for row in data]
homicide_rate_per_month = {}
for i, month in enumerate(months):
    if intents[i] == "Homicide":
        if month not in homicide_rate_per_month:
            homicide_rate_per_month[month] = 0
        homicide_rate_per_month[month] += 1

print(homicide_rate_per_month)

{'06': 3130, '02': 2178, '08': 3125, '05': 2976, '10': 2968, '04': 2845, '03': 2780, '09': 2966, '07': 3269, '11': 2919, '01': 2829, '12': 3191}


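These are homicide counts per month, summed over 2012 to 2014, rather than rates. February has the fewest homicides (2,178) and July the most (3,269), but otherwise the counts show no obvious link between month and homicides.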
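
Months differ in length, so raw monthly totals are not directly comparable. As a rough check, the counts printed above can be divided by the number of days each month contributes across 2012 to 2014. This is only a sketch under that assumption: `homicides_per_day` is an illustrative name that does not appear in the original notebook, and `calendar.monthrange` from the standard library supplies the number of days in each month.

```python
import calendar

# Homicide counts per month, copied from the output above
homicide_rate_per_month = {
    '06': 3130, '02': 2178, '08': 3125, '05': 2976, '10': 2968, '04': 2845,
    '03': 2780, '09': 2966, '07': 3269, '11': 2919, '01': 2829, '12': 3191,
}

homicides_per_day = {}
for month, count in sorted(homicide_rate_per_month.items()):
    # Total days this calendar month covers over the three years in the dataset
    days = sum(calendar.monthrange(year, int(month))[1] for year in (2012, 2013, 2014))
    homicides_per_day[month] = count / days

for month, per_day in homicides_per_day.items():
    print(month, round(per_day, 1))
```

Comparing per-day averages removes the effect of month length, which makes any remaining seasonal pattern easier to judge.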

## Exploring The Homicide Rate By Gender

In [17]:
genders = [row[5] for row in data]
homicide_rate_gender = {}

for i, gender in enumerate(genders):
    if intents[i] == "Homicide":
        if gender not in homicide_rate_gender:
            homicide_rate_gender[gender] = 0
        homicide_rate_gender[gender] += 1
        
print(homicide_rate_gender)

{'M': 29803, 'F': 5373}
