# Exploring Gun Deaths in the US
This dataset contains records of gun deaths in the US. The dataset came from <a href="https://fivethirtyeight.com/">FiveThirtyEight</a>, and can be found <a href="https://github.com/fivethirtyeight/guns-data">here.</a> <br>
<br>
The dataset is stored in the <mark><font color="red">guns.csv</font></mark> file. It contains information on gun deaths in the US from <mark><font color="red">2012</font></mark> to <mark><font color="red">2014</font></mark>. Each row in the dataset represents a single fatality. The columns contain demographic and other information about the victim. Here are the first few rows of the dataset:

<table style="width:100%">
  <tr>
    <th></th>
    <th>year</th> 
    <th>month</th>
    <th>intent</th>
    <th>police</th> 
    <th>sex</th>
    <th>age</th>
    <th>race</th>
    <th>hispanic</th> 
    <th>place</th>
    <th>education</th>
  </tr>
  <tr>
    <td>1</td>
    <td>2012</td> 
    <td>1</td>
    <td>Suicide</td>
    <td>0</td> 
    <td>M</td>
    <td>34.0</td>
    <td>Asian/Pacific Islander</td> 
    <td>100</td>
    <td>Home</td>
    <td>4.0</td>
  </tr>
  <tr>
    <td>2</td>
    <td>2012</td> 
    <td>1</td>
    <td>Suicide</td>
    <td>0</td> 
    <td>F</td>
    <td>21.0</td>
    <td>White</td> 
    <td>100</td>
    <td>Street</td>
    <td>3.0</td>
  </tr>
</table>
<br>
As you can see above, the first row of the data is a header row, which tells you what kind of data is in each column of the CSV file. Each subsequent row describes a single fatality and its victim. Here's an explanation of each column:

<ul>
<li>
<mark><font color="red">   </font></mark>  -- this is an identifier column, which contains the row number. It's common in CSV files to include a unique identifier for each row, but we can ignore it in this analysis.
</li>
<li>
<mark><font color="red">year</font></mark>  -- the year in which the fatality occurred.
</li>
<li> <mark><font color="red">month</font></mark>  -- the month in which the fatality occurred.
</li>
<li> <mark><font color="red">intent</font></mark>  -- the intent of the perpetrator of the crime. This can be <mark><font color="red">Suicide</font></mark>, <mark><font color="red">Accidental</font></mark>, <mark><font color="red">NA</font></mark>, <mark><font color="red">Homicide</font></mark>, or <mark><font color="red">Undetermined</font></mark>.
</li>
<li><mark><font color="red">police</font></mark> -- whether a police officer was involved with the shooting. Either <mark><font color="red">0</font></mark> (false) or <mark><font color="red">1</font></mark> (true).
</li>
<li> <mark><font color="red">sex</font></mark> -- the gender of the victim. Either <mark><font color="red">M</font></mark> or <mark><font color="red">F</font></mark>.
</li>
<li> <mark><font color="red">age</font></mark> -- the age of the victim.
</li>
<li><mark><font color="red">race</font></mark> -- the race of the victim. One of <mark><font color="red">Asian/Pacific Islander</font></mark>, <mark><font color="red">Native American/Native Alaskan</font></mark>, <mark><font color="red">Black</font></mark>, <mark><font color="red">Hispanic</font></mark>, or <mark><font color="red">White</font></mark>.
</li>
<li><mark><font color="red">hispanic</font></mark> -- a code indicating the Hispanic origin of the victim.
</li>
<li><mark><font color="red">place</font></mark> -- where the shooting occurred.
</li>
<li><mark><font color="red">education</font></mark> -- educational status of the victim. Can be one of the following:
<ul>
<li><mark><font color="red">1</font></mark> -- Less than High School
</li>
<li><mark><font color="red">2</font></mark> -- Graduated from High School or equivalent
</li>
<li><mark><font color="red">3</font></mark> -- Some College </li>
<li> <mark><font color="red">4</font></mark> -- At least graduated from College </li>
<li><mark><font color="red">5</font></mark> -- Not Available </li>
</ul>
</li>
</ul>
<br>
We'll explore the dataset and try to find patterns in the demographics of the victims. Our first step is to read the data in and take a look at it.



In [4]:
import csv
with open("guns.csv",'r') as f:
    data = list(csv.reader(f))


In [5]:
print(data[:5])

[['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education'], ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4']]
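Because the first row names each column, the same layout can also be read so that values are looked up by column name rather than by position. The following is a minimal sketch, assuming a two-row in-memory sample (`sample`, a stand-in for `guns.csv`) and using `csv.DictReader` from the standard library; it is an alternative view of the data, not the approach used in the rest of this notebook.

```python
import csv
import io

# Hypothetical in-memory stand-in for guns.csv, copied from the rows shown above.
sample = io.StringIO(
    ",year,month,intent,police,sex,age,race,hispanic,place,education\n"
    "1,2012,01,Suicide,0,M,34,Asian/Pacific Islander,100,Home,4\n"
    "2,2012,01,Suicide,0,F,21,White,100,Street,3\n"
)

# DictReader consumes the header row and uses its values as dictionary keys.
rows = list(csv.DictReader(sample))

# Columns are now addressed by name; the unnamed identifier column has the key "".
print(rows[0]["intent"], rows[0]["race"])
print(rows[1]["sex"], rows[1]["age"])
```

Note that every value is still a string, exactly as with `csv.reader`, so numeric columns such as `age` need an explicit conversion before doing arithmetic.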


# Removing Headers from a List of Lists

In [6]:
headers = data[0]  # the header row itself, rather than a one-element list of rows
data = data[1:]

In [7]:
print(data[:5])

[['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', '4'], ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', '3'], ['3', '2012', '01', 'Suicide', '0', 'M', '60', 'White', '100', 'Other specified', '4'], ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', '4'], ['5', '2012', '02', 'Suicide', '0', 'M', '31', 'White', '100', 'Other specified', '2']]
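Once the header row is separated from the data, it is still useful as a lookup for column positions, since the later cells index rows with numbers such as `row[1]` and `row[7]`. A small sketch, assuming the header names are held as a flat list (copied here from the output above); the name `col_index` is introduced only for illustration.

```python
# Header names and one data row, copied from the outputs shown above.
headers = ['', 'year', 'month', 'intent', 'police', 'sex', 'age',
           'race', 'hispanic', 'place', 'education']
first_row = ['1', '2012', '01', 'Suicide', '0', 'M', '34',
             'Asian/Pacific Islander', '100', 'Home', '4']

# Map each column name to its position in a row.
col_index = {name: position for position, name in enumerate(headers)}

print(col_index["year"], col_index["sex"], col_index["race"])
print(first_row[col_index["intent"]])
```

Writing `row[col_index["race"]]` instead of `row[7]` is longer, but it makes the intent of each list comprehension easier to check against the column descriptions.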


# Counting Gun Deaths By Year

In [8]:
years = [row[1] for row in data]

year_counts = {}
for year in years:
    if year in year_counts:
        year_counts[year]+= 1
    else:
        year_counts[year] = 1

year_counts

{'2012': 33563, '2013': 33636, '2014': 33599}
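The if/else counting loop above is a common pattern, and the standard library's `collections.Counter` does the same bookkeeping. Here is a minimal sketch on a short made-up list (`sample_years` is hypothetical, not taken from the dataset):

```python
from collections import Counter

# Hypothetical stand-in for the `years` list built from the dataset.
sample_years = ['2012', '2012', '2013', '2014', '2014', '2014']

# Counter tallies how many times each distinct value appears.
sample_year_counts = Counter(sample_years)

print(dict(sample_year_counts))
```

A `Counter` behaves like a dictionary, so it could be used in the same way as `year_counts` in the cells that follow.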

# Exploring Gun Deaths By Year and Month

In [9]:
import datetime
dates = [datetime.datetime(year=int(row[1]),month=int(row[2]),day=1) for row in data]
dates[:5]

[datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0)]

In [10]:
date_counts = {}
for date in dates:
    if date not in date_counts:
        date_counts[date] = 1
    else:
        date_counts[date]+=1
    
date_counts

{datetime.datetime(2012, 1, 1, 0, 0): 2758,
 datetime.datetime(2012, 2, 1, 0, 0): 2357,
 datetime.datetime(2012, 3, 1, 0, 0): 2743,
 datetime.datetime(2012, 4, 1, 0, 0): 2795,
 datetime.datetime(2012, 5, 1, 0, 0): 2999,
 datetime.datetime(2012, 6, 1, 0, 0): 2826,
 datetime.datetime(2012, 7, 1, 0, 0): 3026,
 datetime.datetime(2012, 8, 1, 0, 0): 2954,
 datetime.datetime(2012, 9, 1, 0, 0): 2852,
 datetime.datetime(2012, 10, 1, 0, 0): 2733,
 datetime.datetime(2012, 11, 1, 0, 0): 2729,
 datetime.datetime(2012, 12, 1, 0, 0): 2791,
 datetime.datetime(2013, 1, 1, 0, 0): 2864,
 datetime.datetime(2013, 2, 1, 0, 0): 2375,
 datetime.datetime(2013, 3, 1, 0, 0): 2862,
 datetime.datetime(2013, 4, 1, 0, 0): 2798,
 datetime.datetime(2013, 5, 1, 0, 0): 2806,
 datetime.datetime(2013, 6, 1, 0, 0): 2920,
 datetime.datetime(2013, 7, 1, 0, 0): 3079,
 datetime.datetime(2013, 8, 1, 0, 0): 2859,
 datetime.datetime(2013, 9, 1, 0, 0): 2742,
 datetime.datetime(2013, 10, 1, 0, 0): 2808,
 datetime.datetime(2013, 11,
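
To look for a seasonal pattern, it helps to pull out the lowest and highest months rather than scanning the dictionary by eye. The sketch below does this for 2012 only, with the twelve monthly counts copied from the output above; the names `counts_2012`, `lowest`, and `highest` are introduced here for illustration.

```python
import datetime

# Monthly counts for 2012, January through December, copied from the output above.
monthly_2012 = [2758, 2357, 2743, 2795, 2999, 2826,
                3026, 2954, 2852, 2733, 2729, 2791]
counts_2012 = {
    datetime.datetime(2012, month, 1): count
    for month, count in enumerate(monthly_2012, start=1)
}

# min/max over the keys, compared by their associated counts.
lowest = min(counts_2012, key=counts_2012.get)
highest = max(counts_2012, key=counts_2012.get)

print("lowest month:", lowest.month, counts_2012[lowest])
print("highest month:", highest.month, counts_2012[highest])
```

February is also the shortest month, so a per-day rate would be a fairer basis for comparison than the raw monthly totals.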

# Exploring Gun Deaths By Sex And Race

In [11]:
sexes = [row[5] for row in data]
sex_counts = {}

for sex in sexes:
    if sex not in sex_counts:
        sex_counts[sex]=1
    else:
        sex_counts[sex]+=1

sex_counts

{'F': 14449, 'M': 86349}

In [12]:
races = [row[7] for row in data]
race_counts = {}

for race in races:
    if race not in race_counts:
        race_counts[race]=1
    else:
        race_counts[race]+=1

race_counts

{'Asian/Pacific Islander': 1326,
 'Black': 23296,
 'Hispanic': 9022,
 'Native American/Native Alaskan': 917,
 'White': 66237}

## Findings So Far
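Gun deaths in the US disproportionately affect men: of the victims in this dataset, 86,349 are male and 14,449 are female, a ratio of roughly six to one.
They also seem to affect minorities disproportionately, although the raw counts need to be compared against each group's share of the US population to confirm this.

There appears to be only a weak seasonal correlation in gun deaths, with counts dipping in February and peaking around July. It may be worth filtering by intent to see whether different categories of intent show different patterns by season, race, or sex.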
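
The male-to-female comparison can be made concrete with a little arithmetic on the `sex_counts` output shown earlier. This is a small worked example, with the two counts copied in by hand:

```python
# Counts copied from the sex_counts output above.
sex_counts = {'F': 14449, 'M': 86349}

total = sum(sex_counts.values())
male_share = sex_counts['M'] / total              # fraction of victims who are male
male_to_female = sex_counts['M'] / sex_counts['F']  # male victims per female victim

print(total)
print(round(male_share * 100, 1), "percent male")
print(round(male_to_female, 1), "male victims per female victim")
```

The total also matches the sum of the three yearly counts, which is a quick consistency check on the two tallies.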

# Reading In A Second Dataset

In [13]:
import csv
with open("census.csv",'r') as f:
    reader = csv.reader(f)
    census = list(reader)
census

[['Id',
  'Year',
  'Id',
  'Sex',
  'Id',
  'Hispanic Origin',
  'Id',
  'Id2',
  'Geography',
  'Total',
  'Race Alone - White',
  'Race Alone - Hispanic',
  'Race Alone - Black or African American',
  'Race Alone - American Indian and Alaska Native',
  'Race Alone - Asian',
  'Race Alone - Native Hawaiian and Other Pacific Islander',
  'Two or More Races'],
 ['cen42010',
  'April 1, 2010 Census',
  'totsex',
  'Both Sexes',
  'tothisp',
  'Total',
  '0100000US',
  '',
  'United States',
  '308745538',
  '197318956',
  '44618105',
  '40250635',
  '3739506',
  '15159516',
  '674625',
  '6984195']]
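
The census file has a single data row, so the population for each race can be read from it by column name instead of being typed in by hand. A sketch under that assumption, with the header and data row copied from the output above; `census_row` and `populations` are names introduced here for illustration.

```python
# Header and data row copied from the census output above.
census_header = ['Id', 'Year', 'Id', 'Sex', 'Id', 'Hispanic Origin', 'Id', 'Id2',
                 'Geography', 'Total', 'Race Alone - White', 'Race Alone - Hispanic',
                 'Race Alone - Black or African American',
                 'Race Alone - American Indian and Alaska Native',
                 'Race Alone - Asian',
                 'Race Alone - Native Hawaiian and Other Pacific Islander',
                 'Two or More Races']
census_values = ['cen42010', 'April 1, 2010 Census', 'totsex', 'Both Sexes', 'tothisp',
                 'Total', '0100000US', '', 'United States', '308745538', '197318956',
                 '44618105', '40250635', '3739506', '15159516', '674625', '6984195']

# The repeated 'Id' names collapse into one key, but the race columns are unique.
census_row = dict(zip(census_header, census_values))

# The gun deaths data has one Asian/Pacific Islander category, so two census columns are summed.
populations = {
    "Asian/Pacific Islander": int(census_row['Race Alone - Asian'])
    + int(census_row['Race Alone - Native Hawaiian and Other Pacific Islander']),
    "Black": int(census_row['Race Alone - Black or African American']),
    "Hispanic": int(census_row['Race Alone - Hispanic']),
    "Native American/Native Alaskan": int(census_row['Race Alone - American Indian and Alaska Native']),
    "White": int(census_row['Race Alone - White']),
}
print(populations)
```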

In [15]:
mapping = {
    "Asian/Pacific Islander":15159516+674625,
    "Black":40250635,
    "Hispanic":44618105,
    "Native American/Native Alaskan":3739506,
    "White":197318956
}

race_per_hundredk = {}
for k,v in race_counts.items():
    race_per_hundredk[k] = (v/mapping[k])*100000
race_per_hundredk

{'Asian/Pacific Islander': 8.374309664161762,
 'Black': 57.8773477735196,
 'Hispanic': 20.220491210910907,
 'Native American/Native Alaskan': 24.521955573811088,
 'White': 33.56849303419181}
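
Dividing by population changes the picture considerably compared with the raw counts. The sketch below wraps the same formula in a small helper (`rate_per_100k`, a name introduced here) and compares which group ranks highest by count and by rate, using the figures from the cells above. The counts cover 2012 to 2014 and the populations come from the 2010 census, so these are rates over the whole period rather than annual rates.

```python
def rate_per_100k(count, population):
    """Return the number of deaths per 100,000 people."""
    return (count / population) * 100000

# Counts and populations copied from the cells above.
race_counts = {'Asian/Pacific Islander': 1326, 'Black': 23296, 'Hispanic': 9022,
               'Native American/Native Alaskan': 917, 'White': 66237}
populations = {'Asian/Pacific Islander': 15159516 + 674625, 'Black': 40250635,
               'Hispanic': 44618105, 'Native American/Native Alaskan': 3739506,
               'White': 197318956}

rates = {race: rate_per_100k(race_counts[race], populations[race])
         for race in race_counts}

highest_by_count = max(race_counts, key=race_counts.get)
highest_by_rate = max(rates, key=rates.get)
print("most deaths:", highest_by_count)
print("highest rate:", highest_by_rate)
```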

# Filtering By Intent

In [19]:
intents = [row[3] for row in data]
homicide_race_counts = {}
for i,value in enumerate(races):
    if intents[i]=="Homicide":
        if value not in homicide_race_counts:
            # The first homicide seen for a race counts as 1, not 0.
            homicide_race_counts[value] = 1
        else:
            homicide_race_counts[value]+=1

race_per_hundredk = {}
for k,v in homicide_race_counts.items():
    race_per_hundredk[k] = (v/mapping[k])*100000
race_per_hundredk

{'Asian/Pacific Islander': 3.530346230970155,
 'Black': 48.471284987180944,
 'Hispanic': 12.627161104219914,
 'Native American/Native Alaskan': 8.717729026240365,
 'White': 4.6356417981453335}
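
The filter-and-count step can also be written with `zip` and `dict.get`, which pairs each intent with its race directly and removes the need for a separate branch for the first occurrence. A minimal sketch on made-up values (`sample_intents` and `sample_races` are hypothetical, standing in for the `intents` and `races` lists):

```python
# Hypothetical stand-ins for the intents and races lists built from the dataset.
sample_intents = ["Homicide", "Suicide", "Homicide", "Homicide", "Accidental"]
sample_races = ["Black", "White", "Black", "White", "Hispanic"]

sample_homicide_counts = {}
for intent, race in zip(sample_intents, sample_races):
    if intent == "Homicide":
        # get() returns 0 for a race not seen yet, so the first match is counted as 1.
        sample_homicide_counts[race] = sample_homicide_counts.get(race, 0) + 1

print(sample_homicide_counts)
```

Changing the string compared against `intent` would give the same breakdown for any other intent category.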

# Findings
It appears that gun-related homicides in the US disproportionately affect people in the Black and Hispanic racial categories.

Some areas remain to be explored:
<ul>
<li> Deaths according to gender</li>
<li> The rates of other intents by gender and race</li>
<li> Exploration based on education and location</li>