# Analyzing Gun Deaths in the US

## About the Dataset

The dataset comes from FiveThirtyEight and is stored in the gun_deaths.csv file. It contains information on gun deaths in the US from 2012 to 2014. Each row represents a single fatality, and the columns contain demographic and other information about the victim.

The first row is a header row that names the data in each column of the CSV file. Every following row describes one fatality and its victim. Here's an explanation of each column:

1. "" -- this is an identifier column, which contains the row number. It's common in CSV files to include a unique identifier for each row, but we can ignore it in this analysis.
2. year -- the year in which the fatality occurred.
3. month -- the month in which the fatality occurred.
4. intent -- the intent behind the shooting. This can be Suicide, Accidental, NA, Homicide, or Undetermined.
5. police -- whether a police officer was involved with the shooting. Either 0 (false) or 1 (true).
6. sex -- the gender of the victim. Either M or F.
7. age -- the age of the victim.
8. race -- the race of the victim. Either Asian/Pacific Islander, Native American/Native Alaskan, Black, Hispanic, or White.
9. hispanic -- a code indicating the Hispanic origin of the victim.
10. place -- where the shooting occurred. Has several categories, which you're encouraged to explore on your own.
11. education -- educational status of the victim. Can be one of the following:
    1 -- Less than High School
    2 -- Graduated from High School or equivalent
    3 -- Some College
    4 -- At least graduated from College
    5 -- Not available
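
The cells in this notebook refer to columns by position (for example `row[3]` for intent). As a minimal sketch, a name-to-index lookup built from the header row can make those positions easier to follow. The name `col_index` is hypothetical and introduced only for illustration; the sample row is the first record shown in the notebook output.

```python
# Header row as it appears in the CSV (the first column name is an empty string).
headers = ['', 'year', 'month', 'intent', 'police', 'sex', 'age',
           'race', 'hispanic', 'place', 'education']

# Hypothetical lookup: column name -> position within each row.
col_index = {name: i for i, name in enumerate(headers)}

# First data row of the file, used here as a sample.
sample_row = ['1', '2012', '01', 'Suicide', '0', 'M', '34',
              'Asian/Pacific Islander', '100', 'Home', '4']

# Read fields by name instead of by a bare number.
intent = sample_row[col_index['intent']]
race = sample_row[col_index['race']]
print(intent, '|', race)
```

Every value is read as a string, so numeric columns such as `age` need an explicit `int()` conversion before any arithmetic.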

## Importing US Gun Deaths Data as Lists

In [7]:
import csv
# Open the file in a with-block so it is closed once the rows are read.
with open("gun_deaths.csv", "r", newline="") as f:
    data = list(csv.reader(f))
data[:5]

[['',
  'year',
  'month',
  'intent',
  'police',
  'sex',
  'age',
  'race',
  'hispanic',
  'place',
  'education'],
 ['1',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '34',
  'Asian/Pacific Islander',
  '100',
  'Home',
  '4'],
 ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'],
 ['3',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '60',
  'White',
  '100',
  'Other specified',
  '4'],
 ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4']]

## Removing Headers

In [8]:
headers = data[0]
data = data[1:]
data[:5]

[['1',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '34',
  'Asian/Pacific Islander',
  '100',
  'Home',
  '4'],
 ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'],
 ['3',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '60',
  'White',
  '100',
  'Other specified',
  '4'],
 ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'],
 ['5',
  '2012',
  '02',
  'Suicide',
  '0',
  'M',
  '31',
  'White',
  '100',
  'Other specified',
  '2']]

## Gun Deaths by Year

In [9]:
years = [row[1] for row in data]
year_counts = {}
for yr in years:
    if yr in year_counts:
        year_counts[yr] += 1
    else:
        year_counts[yr] = 1
year_counts

{'2012': 33563, '2013': 33636, '2014': 33599}
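
The same tallying pattern recurs in most of the cells that follow. As one possible shortcut, `collections.Counter` from the standard library does the counting in a single call. The `count_column` helper below is a hypothetical name introduced for illustration, not part of the original analysis, and the sample rows are three records copied from the output earlier in this notebook.

```python
from collections import Counter

def count_column(rows, index):
    """Tally how often each value appears in one column (hypothetical helper)."""
    return dict(Counter(row[index] for row in rows))

# Three records in the same shape as the dataset, copied from the notebook output.
sample_rows = [
    ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'],
    ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'],
    ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'],
]

print(count_column(sample_rows, 1))  # counts by year
print(count_column(sample_rows, 5))  # counts by sex
```

Calling `count_column(data, 1)` on the full dataset should give the same tallies as the explicit loop above.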

## Gun Deaths by Month and Year

In [10]:
import datetime
# The dataset records only the year and month, so the day is fixed to 1.
dates = [datetime.datetime(year=int(row[1]), month=int(row[2]), day=1) for row in data]
print(dates[:5])
date_counts = {}
for date in dates:
    date_str = date.strftime("%Y-%m")
    if date_str in date_counts:
        date_counts[date_str] += 1
    else:
        date_counts[date_str] = 1
date_counts

[datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 1, 1, 0, 0), datetime.datetime(2012, 2, 1, 0, 0), datetime.datetime(2012, 2, 1, 0, 0)]


{'2012-01': 2758,
 '2012-02': 2357,
 '2012-03': 2743,
 '2012-04': 2795,
 '2012-05': 2999,
 '2012-06': 2826,
 '2012-07': 3026,
 '2012-08': 2954,
 '2012-09': 2852,
 '2012-10': 2733,
 '2012-11': 2729,
 '2012-12': 2791,
 '2013-01': 2864,
 '2013-02': 2375,
 '2013-03': 2862,
 '2013-04': 2798,
 '2013-05': 2806,
 '2013-06': 2920,
 '2013-07': 3079,
 '2013-08': 2859,
 '2013-09': 2742,
 '2013-10': 2808,
 '2013-11': 2758,
 '2013-12': 2765,
 '2014-01': 2651,
 '2014-02': 2361,
 '2014-03': 2684,
 '2014-04': 2862,
 '2014-05': 2864,
 '2014-06': 2931,
 '2014-07': 2884,
 '2014-08': 2970,
 '2014-09': 2914,
 '2014-10': 2865,
 '2014-11': 2756,
 '2014-12': 2857}

## Gun Deaths by Race and Sex

In [11]:
sex_counts = {}
race_counts = {}
for row in data:
    if row[5] in sex_counts:
        sex_counts[row[5]] += 1
    else:
        sex_counts[row[5]] = 1
    if row[7] in race_counts:
        race_counts[row[7]] += 1
    else:
        race_counts[row[7]] = 1
print(sex_counts)
race_counts

{'F': 14449, 'M': 86349}


{'Asian/Pacific Islander': 1326,
 'Black': 23296,
 'Hispanic': 9022,
 'Native American/Native Alaskan': 917,
 'White': 66237}

## Analysis So Far

Yearly totals are nearly flat (33,563 to 33,636 deaths per year), and the monthly counts show no clear upward or downward trend.

Male victims far outnumber female victims (86,349 vs. 14,449).

White and Black victims account for far more deaths than the other racial groups (66,237 and 23,296 respectively).
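
To put the raw counts in proportion, the sketch below converts them to percentage shares. The counts are copied from the outputs above; `shares` is a hypothetical helper name introduced here for illustration.

```python
# Counts copied from the notebook outputs above.
sex_counts = {'F': 14449, 'M': 86349}
race_counts = {
    'Asian/Pacific Islander': 1326,
    'Black': 23296,
    'Hispanic': 9022,
    'Native American/Native Alaskan': 917,
    'White': 66237,
}

def shares(counts):
    """Return each category's share of the total, in percent (hypothetical helper)."""
    total = sum(counts.values())
    return {key: round(100 * value / total, 1) for key, value in counts.items()}

print(shares(sex_counts))
print(shares(race_counts))
```

These shares describe the victims only. They do not account for the size of each group in the population, which is why the notebook brings in census data next.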

## Importing Census Data

In [12]:
# Open the file in a with-block so it is closed once the rows are read.
with open("gun_census.csv", "r", newline="") as f:
    census = list(csv.reader(f))
census

[['Id',
  'Year',
  'Id',
  'Sex',
  'Id',
  'Hispanic Origin',
  'Id',
  'Id2',
  'Geography',
  'Total',
  'Race Alone - White',
  'Race Alone - Hispanic',
  'Race Alone - Black or African American',
  'Race Alone - American Indian and Alaska Native',
  'Race Alone - Asian',
  'Race Alone - Native Hawaiian and Other Pacific Islander',
  'Two or More Races'],
 ['cen42010',
  'April 1, 2010 Census',
  'totsex',
  'Both Sexes',
  'tothisp',
  'Total',
  '0100000US',
  '',
  'United States',
  '308745538',
  '197318956',
  '44618105',
  '40250635',
  '3739506',
  '15159516',
  '674625',
  '6984195']]

## Gun Deaths per 100,000 People by Race

In [13]:
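# 2010 census population for each race label used in the deaths data.
# 'Asian/Pacific Islander' combines the census columns 'Race Alone - Asian' (15159516)
# and 'Race Alone - Native Hawaiian and Other Pacific Islander' (674625).
mapping = {
    'Asian/Pacific Islander': 15834141,
    'Black': 40250635,
    'Native American/Native Alaskan': 3739506,
    'Hispanic': 44618105,
    'White': 197318956,
}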
race_per_hundredk = {}
for race in race_counts:
    race_per_hundredk[race] = race_counts[race]/mapping[race]*100000
race_per_hundredk

{'Asian/Pacific Islander': 8.374309664161762,
 'Black': 57.8773477735196,
 'Hispanic': 20.220491210910907,
 'Native American/Native Alaskan': 24.521955573811088,
 'White': 33.56849303419181}
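
The population figures in `mapping` were typed in by hand from the census row. As a hedged alternative, the sketch below derives them from the census header and data row, so the link between each race label and its census column is explicit. The header and row are copied from the census output above; the intermediate name `pop` is an assumption introduced for illustration.

```python
# Header and single data row, copied from the census output above.
census_header = [
    'Id', 'Year', 'Id', 'Sex', 'Id', 'Hispanic Origin', 'Id', 'Id2', 'Geography', 'Total',
    'Race Alone - White', 'Race Alone - Hispanic',
    'Race Alone - Black or African American',
    'Race Alone - American Indian and Alaska Native',
    'Race Alone - Asian',
    'Race Alone - Native Hawaiian and Other Pacific Islander',
    'Two or More Races',
]
census_row = [
    'cen42010', 'April 1, 2010 Census', 'totsex', 'Both Sexes', 'tothisp', 'Total',
    '0100000US', '', 'United States', '308745538',
    '197318956', '44618105', '40250635', '3739506', '15159516', '674625', '6984195',
]

# Keep only the 'Race Alone' columns, converted to integers.
pop = {name: int(value)
       for name, value in zip(census_header, census_row)
       if name.startswith('Race Alone')}

# Match each race label in the deaths data to its census population.
mapping = {
    'Asian/Pacific Islander': pop['Race Alone - Asian']
        + pop['Race Alone - Native Hawaiian and Other Pacific Islander'],
    'Black': pop['Race Alone - Black or African American'],
    'Native American/Native Alaskan': pop['Race Alone - American Indian and Alaska Native'],
    'Hispanic': pop['Race Alone - Hispanic'],
    'White': pop['Race Alone - White'],
}

# Death counts by race, copied from the earlier output.
race_counts = {
    'Asian/Pacific Islander': 1326,
    'Black': 23296,
    'Hispanic': 9022,
    'Native American/Native Alaskan': 917,
    'White': 66237,
}

race_per_hundredk = {race: race_counts[race] / mapping[race] * 100000
                     for race in race_counts}
print(race_per_hundredk)
```

The death counts cover three years (2012 to 2014) while the population figures come from the 2010 census, so these values are best read as cumulative three-year rates, not annual ones.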

## Gun Deaths per 100,000 People by Race for 'Homicide' Intent

In [14]:
intents = [row[3] for row in data]
races = [row[7] for row in data]
# First tally the homicide deaths for each race.
homicide_race_counts = {}
for i, race in enumerate(races):
    if intents[i] == 'Homicide':
        if race in homicide_race_counts:
            homicide_race_counts[race] += 1
        else:
            homicide_race_counts[race] = 1
# Then scale each tally by the census population to get a rate per 100,000.
homicide_race_per_hundredk = {}
for race in homicide_race_counts:
    homicide_race_per_hundredk[race] = homicide_race_counts[race]/mapping[race]*100000
homicide_race_per_hundredk

{'Asian/Pacific Islander': 3.530346230970155,
 'Black': 48.471284987180944,
 'Hispanic': 12.627161104219914,
 'Native American/Native Alaskan': 8.717729026240365,
 'White': 4.6356417981453335}

## Homicide Counts by Month

In [15]:
homicide_month_count = {}
for i,date in enumerate(dates):
    if intents[i] == 'Homicide':
        date_str = date.strftime("%Y-%m")
        if date_str in homicide_month_count:
            homicide_month_count[date_str] += 1
        else:
            homicide_month_count[date_str] = 1
homicide_month_count

{'2012-01': 972,
 '2012-02': 749,
 '2012-03': 966,
 '2012-04': 999,
 '2012-05': 1003,
 '2012-06': 1044,
 '2012-07': 1160,
 '2012-08': 1090,
 '2012-09': 1070,
 '2012-10': 979,
 '2012-11': 978,
 '2012-12': 1083,
 '2013-01': 986,
 '2013-02': 721,
 '2013-03': 923,
 '2013-04': 916,
 '2013-05': 955,
 '2013-06': 1066,
 '2013-07': 1137,
 '2013-08': 1000,
 '2013-09': 954,
 '2013-10': 1009,
 '2013-11': 979,
 '2013-12': 1028,
 '2014-01': 871,
 '2014-02': 708,
 '2014-03': 891,
 '2014-04': 930,
 '2014-05': 1018,
 '2014-06': 1020,
 '2014-07': 972,
 '2014-08': 1035,
 '2014-09': 942,
 '2014-10': 980,
 '2014-11': 962,
 '2014-12': 1080}

## Homicides by Gender

In [16]:
genders = [row[5] for row in data]
homicide_gender_count = {}
for i,gender in enumerate(genders):
    if intents[i] == 'Homicide':
        if gender in homicide_gender_count:
            homicide_gender_count[gender] += 1
        else:
            homicide_gender_count[gender] = 1
homicide_gender_count

{'F': 5373, 'M': 29803}

## Gun Deaths by Intent

In [17]:
intents_count = {}
for row in data:
    intent = row[3] 
    if intent in intents_count:
        intents_count[intent] += 1
    else:
        intents_count[intent] = 1
intents_count

{'Accidental': 1639,
 'Homicide': 35176,
 'NA': 1,
 'Suicide': 63175,
 'Undetermined': 807}

## Gun Deaths by Location

In [18]:
locations_count = {}
for row in data:
    location = row[9] 
    if location in locations_count:
        locations_count[location] += 1
    else:
        locations_count[location] = 1
locations_count

{'Farm': 470,
 'Home': 60486,
 'Industrial/construction': 248,
 'NA': 1384,
 'Other specified': 13751,
 'Other unspecified': 8867,
 'Residential institution': 203,
 'School/instiution': 671,
 'Sports': 128,
 'Street': 11151,
 'Trade/service area': 3439}

## Gun Deaths by Education

In [19]:
education_count = {}
for row in data:
    education = row[10] 
    if education in education_count:
        education_count[education] += 1
    else:
        education_count[education] = 1
education_count

{'1': 21823, '2': 42927, '3': 21680, '4': 12946, '5': 1369, 'NA': 53}
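
The education codes are easier to read once they are translated back into the labels from the column description. A minimal sketch, assuming the code-to-label table given in "About the Dataset"; the names `education_labels` and `labeled` are hypothetical. Codes without a label, such as `'NA'`, are kept as they are.

```python
# Counts copied from the output above.
education_count = {'1': 21823, '2': 42927, '3': 21680, '4': 12946, '5': 1369, 'NA': 53}

# Code-to-label table from the column description.
education_labels = {
    '1': 'Less than High School',
    '2': 'Graduated from High School or equivalent',
    '3': 'Some College',
    '4': 'At least graduated from College',
    '5': 'Not available',
}

# Replace each code with its label; unknown codes fall back to the raw value.
labeled = {education_labels.get(code, code): count
           for code, count in education_count.items()}

for label, count in labeled.items():
    print(f"{label}: {count}")
```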

## Gun Deaths by Age

In [20]:
age_count = {}
for row in data:
    age = row[6] 
    if age in age_count:
        age_count[age] += 1
    else:
        age_count[age] = 1
age_count

{'0': 33,
 '1': 38,
 '10': 53,
 '100': 1,
 '101': 2,
 '102': 2,
 '107': 1,
 '11': 61,
 '12': 117,
 '13': 229,
 '14': 364,
 '15': 561,
 '16': 864,
 '17': 1185,
 '18': 1753,
 '19': 2065,
 '2': 50,
 '20': 2219,
 '21': 2504,
 '22': 2712,
 '23': 2472,
 '24': 2437,
 '25': 2230,
 '26': 2231,
 '27': 2070,
 '28': 1986,
 '29': 1955,
 '3': 66,
 '30': 1869,
 '31': 1833,
 '32': 1824,
 '33': 1700,
 '34': 1699,
 '35': 1631,
 '36': 1512,
 '37': 1500,
 '38': 1491,
 '39': 1389,
 '4': 54,
 '40': 1414,
 '41': 1485,
 '42': 1492,
 '43': 1527,
 '44': 1449,
 '45': 1372,
 '46': 1437,
 '47': 1532,
 '48': 1621,
 '49': 1669,
 '5': 43,
 '50': 1674,
 '51': 1755,
 '52': 1715,
 '53': 1708,
 '54': 1684,
 '55': 1596,
 '56': 1625,
 '57': 1472,
 '58': 1510,
 '59': 1430,
 '6': 50,
 '60': 1361,
 '61': 1306,
 '62': 1099,
 '63': 1041,
 '64': 1126,
 '65': 1039,
 '66': 998,
 '67': 865,
 '68': 868,
 '69': 879,
 '7': 43,
 '70': 883,
 '71': 791,
 '72': 736,
 '73': 737,
 '74': 671,
 '75': 676,
 '76': 582,
 '77': 575,
 '78': 598,
 