# Project: Exploring Gun Deaths in the US
### <p style="color:Tomato">Use list comprehensions, modules, and the datetime package to find patterns in US gun death data.<p/>
#### <p style="color:Gray">guns.csv<p/><hr>
The dataset comes from FiveThirtyEight and can be found [here](https://github.com/dataquestio/solutions/blob/master/Mission218Solution.ipynb). It is stored in the guns.csv file.<br/>
It contains information on gun deaths in the US from 2012 to 2014. Each row represents a single fatality, and the columns contain **demographic and other information about the victim**.<br/>
[FiveThirtyEight, Gun Deaths In America](https://fivethirtyeight.com/features/gun-deaths/)

## <p style="color:Gray">Introducing US Gun Deaths Data<p/><hr>

In [1]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = 'all'

In [2]:
import csv

# Read every row into a list of lists; the with block closes the file afterwards
with open('full_data.csv', 'r') as f:
    data = list(csv.reader(f))
data[:5]

[['',
  'year',
  'month',
  'intent',
  'police',
  'sex',
  'age',
  'race',
  'hispanic',
  'place',
  'education'],
 ['1',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '34',
  'Asian/Pacific Islander',
  '100',
  'Home',
  'BA+'],
 ['2',
  '2012',
  '01',
  'Suicide',
  '0',
  'F',
  '21',
  'White',
  '100',
  'Street',
  'Some college'],
 ['3',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '60',
  'White',
  '100',
  'Other specified',
  'BA+'],
 ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', 'BA+']]

## <p style="color:Gray">Removing Headers From A List Of Lists<p/><hr>

In [32]:
class Dataset:
    def __init__(self, data):
        # The first row holds the column names; the remaining rows are records
        self.header = data[0]
        self.data = data[1:]

    def column(self, label):
        # Return every value in the column named `label`, or None if it does not exist
        if label not in self.header:
            return None
        index = 0
        for idx, element in enumerate(self.header):
            if label == element:
                index = idx
        column = []
        for row in self.data:
            column.append(row[index])
        return column

In [33]:
full_dataset = Dataset(data)
headers = full_dataset.header
guns_data = full_dataset.data

In [34]:
print(headers)
guns_data[:10]

['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education']


[['1',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '34',
  'Asian/Pacific Islander',
  '100',
  'Home',
  'BA+'],
 ['2',
  '2012',
  '01',
  'Suicide',
  '0',
  'F',
  '21',
  'White',
  '100',
  'Street',
  'Some college'],
 ['3',
  '2012',
  '01',
  'Suicide',
  '0',
  'M',
  '60',
  'White',
  '100',
  'Other specified',
  'BA+'],
 ['4', '2012', '02', 'Suicide', '0', 'M', '64', 'White', '100', 'Home', 'BA+'],
 ['5',
  '2012',
  '02',
  'Suicide',
  '0',
  'M',
  '31',
  'White',
  '100',
  'Other specified',
  'HS/GED'],
 ['6',
  '2012',
  '02',
  'Suicide',
  '0',
  'M',
  '17',
  'Native American/Native Alaskan',
  '100',
  'Home',
  'Less than HS'],
 ['7',
  '2012',
  '02',
  'Undetermined',
  '0',
  'M',
  '48',
  'White',
  '100',
  'Home',
  'HS/GED'],
 ['8',
  '2012',
  '03',
  'Suicide',
  '0',
  'M',
  '41',
  'Native American/Native Alaskan',
  '100',
  'Home',
  'HS/GED'],
 ['9',
  '2012',
  '02',
  'Accidental',
  '0',
  'M',
  '50',
  'White',
  '100',
  'Other speci

## <p style="color:Gray">Counting Gun Deaths By Year<p/><hr>

In [7]:
years = full_dataset.column('year')

In [8]:
years[:10]

['2012',
 '2012',
 '2012',
 '2012',
 '2012',
 '2012',
 '2012',
 '2012',
 '2012',
 '2012']

In [9]:
year_counts = {}
for year in years:
    if year not in year_counts:
        year_counts[year] = 0
    year_counts[year] += 1
    
year_counts

{'2012': 33563, '2013': 33636, '2014': 33599}
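
The counting loop above recurs for several columns in this notebook. As a hedged alternative, the standard library's `collections.Counter` can do the same tally in one call. The helper name `count_values` and the sample list below are illustrative assumptions, not part of the original notebook.

```python
from collections import Counter

def count_values(values):
    """Return a plain dict mapping each distinct value to how often it occurs."""
    return dict(Counter(values))

# Tiny illustrative sample, not the real 'year' column
sample_years = ['2012', '2012', '2013', '2014', '2013', '2012']
sample_counts = count_values(sample_years)
print(sample_counts)
```

With the real data, `count_values(full_dataset.column('year'))` should give the same counts as the explicit loop.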

Gun deaths changed very little from year to year between 2012 and 2014.<br/>
Let's see whether they vary by month and year.<br/>
To do that, we create a datetime.datetime object from the year and month columns,<br/>
then **count gun deaths by date**.

In [10]:
print(headers)

['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education']


## <p style="color:Gray">Exploring Gun Deaths By Month And Year<p/><hr>
The sex and race columns contain potentially interesting information on how gun deaths in the US vary by gender and race.

In [11]:
import datetime

In [12]:
dates = [datetime.datetime(year=int(row[1]), month=int(row[2]), day=1) for row in guns_data]
dates[:5]

[datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 1, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0),
 datetime.datetime(2012, 2, 1, 0, 0)]

In [13]:
date_counts = {}

for date in dates:
    if date not in date_counts:
        date_counts[date] = 0
    date_counts[date] += 1
    
date_counts

{datetime.datetime(2012, 1, 1, 0, 0): 2758,
 datetime.datetime(2012, 2, 1, 0, 0): 2357,
 datetime.datetime(2012, 3, 1, 0, 0): 2743,
 datetime.datetime(2012, 4, 1, 0, 0): 2795,
 datetime.datetime(2012, 5, 1, 0, 0): 2999,
 datetime.datetime(2012, 6, 1, 0, 0): 2826,
 datetime.datetime(2012, 7, 1, 0, 0): 3026,
 datetime.datetime(2012, 8, 1, 0, 0): 2954,
 datetime.datetime(2012, 9, 1, 0, 0): 2852,
 datetime.datetime(2012, 10, 1, 0, 0): 2733,
 datetime.datetime(2012, 11, 1, 0, 0): 2729,
 datetime.datetime(2012, 12, 1, 0, 0): 2791,
 datetime.datetime(2013, 1, 1, 0, 0): 2864,
 datetime.datetime(2013, 2, 1, 0, 0): 2375,
 datetime.datetime(2013, 3, 1, 0, 0): 2862,
 datetime.datetime(2013, 4, 1, 0, 0): 2798,
 datetime.datetime(2013, 5, 1, 0, 0): 2806,
 datetime.datetime(2013, 6, 1, 0, 0): 2920,
 datetime.datetime(2013, 7, 1, 0, 0): 3079,
 datetime.datetime(2013, 8, 1, 0, 0): 2859,
 datetime.datetime(2013, 9, 1, 0, 0): 2742,
 datetime.datetime(2013, 10, 1, 0, 0): 2808,
 datetime.datetime(2013, 11,

## <p style="color:Gray">Exploring Gun Deaths By Race And Sex<p/><hr>

In [14]:
print(headers)

['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education']


In [15]:
for item in guns_data[:5]:
    print("sex")
    print(item[5])
    print("race")
    print(item[7])

sex
M
race
Asian/Pacific Islander
sex
F
race
White
sex
M
race
White
sex
M
race
White
sex
M
race
White


In [16]:
sexes = full_dataset.column('sex')

In [17]:
sex_counts = {}
for sex in sexes:
    if sex not in sex_counts:
        sex_counts[sex] = 0
    sex_counts[sex] += 1
sex_counts

{'F': 14449, 'M': 86349}

In [18]:
races = full_dataset.column('race')

In [19]:
races[:5]

['Asian/Pacific Islander', 'White', 'White', 'White', 'White']

In [20]:
race_counts = {}
for race in races:
    if race not in race_counts:
        race_counts[race] = 0
    race_counts[race] += 1
race_counts

{'Asian/Pacific Islander': 1326,
 'Black': 23296,
 'Hispanic': 9022,
 'Native American/Native Alaskan': 917,
 'White': 66237}

#### <p style="color:Gray">Findings so far<p/>
> Gun deaths in the US seem to disproportionately affect men compared with women. They also seem to disproportionately affect minorities, although data on each race's share of the overall US population is needed to confirm this.<br/>
There appears to be a minor seasonal correlation, with gun deaths peaking in the summer and declining in the winter. It might be useful to filter by intent, to see whether different categories of intent have different correlations with season, race, or gender.
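
To put rough numbers on these observations, the sketch below reuses counts already printed in this notebook: the `sex_counts` output and the twelve 2012 entries of `date_counts`. The variable names `male_share`, `monthly_2012`, `peak_month`, and `low_month` are illustrative assumptions.

```python
# Counts copied from the sex_counts output earlier in the notebook
sex_counts = {'F': 14449, 'M': 86349}
total = sum(sex_counts.values())
male_share = sex_counts['M'] / total

# 2012 monthly totals copied from the date_counts output (month number -> deaths)
monthly_2012 = {1: 2758, 2: 2357, 3: 2743, 4: 2795, 5: 2999, 6: 2826,
                7: 3026, 8: 2954, 9: 2852, 10: 2733, 11: 2729, 12: 2791}

# Months with the most and the fewest recorded deaths in 2012
peak_month = max(monthly_2012, key=monthly_2012.get)
low_month = min(monthly_2012, key=monthly_2012.get)

print(f"Share of victims who were male: {male_share:.1%}")
print(f"2012 peak month: {peak_month}, lowest month: {low_month}")
```

February is also the shortest month, so a deaths-per-day figure would probably be a fairer basis for the seasonal comparison than raw monthly totals.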

In [21]:
print(headers)

['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education']


In [23]:
class Dataset:
    def __init__(self, data):
        # The first row holds the column names; the remaining rows are records
        self.header = data[0]
        self.data = data[1:]

    def column(self, label):
        # Return every value in the column named `label`, or None if it does not exist
        if label not in self.header:
            return None
        index = 0
        for idx, element in enumerate(self.header):
            if label == element:
                index = idx
        column = []
        for row in self.data:
            column.append(row[index])
        return column

    def count_unique(self, label):
        # Number of distinct values in the column
        return len(set(self.column(label)))

    def set_column(self, label):
        # The distinct values themselves, as a set
        return set(self.column(label))

In [25]:
full_dataset = Dataset(data)
intents = full_dataset.column('intent')
intents[:10]

['Suicide',
 'Suicide',
 'Suicide',
 'Suicide',
 'Suicide',
 'Suicide',
 'Undetermined',
 'Suicide',
 'Accidental',
 'Suicide']

In [28]:
intent_unique = full_dataset.count_unique('intent')
print(intent_unique)
intent_set = full_dataset.set_column('intent')
print(intent_set)

5
{'Homicide', 'NA', 'Undetermined', 'Suicide', 'Accidental'}


In [30]:
intent_counts = {}
for intent in intents:
    if intent not in intent_counts:
        intent_counts[intent] = 0
    intent_counts[intent] += 1
print(intent_counts)

{'Suicide': 63175, 'Undetermined': 807, 'Accidental': 1639, 'Homicide': 35176, 'NA': 1}
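
Filtering by intent before counting another column is one way to follow up on these totals. The sketch below is a minimal version of that idea. The function `count_by` and its parameters are hypothetical names, and the column positions (3 for intent, 7 for race) are read off the header printed above.

```python
def count_by(rows, column_index, intent=None, intent_index=3):
    """Count values in one column, optionally keeping only rows with a given intent."""
    counts = {}
    for row in rows:
        # Skip rows whose intent does not match the requested filter
        if intent is not None and row[intent_index] != intent:
            continue
        key = row[column_index]
        counts[key] = counts.get(key, 0) + 1
    return counts

# Three sample rows copied from the guns_data preview earlier in the notebook
sample_rows = [
    ['1', '2012', '01', 'Suicide', '0', 'M', '34', 'Asian/Pacific Islander', '100', 'Home', 'BA+'],
    ['2', '2012', '01', 'Suicide', '0', 'F', '21', 'White', '100', 'Street', 'Some college'],
    ['7', '2012', '02', 'Undetermined', '0', 'M', '48', 'White', '100', 'Home', 'HS/GED'],
]

suicide_by_race = count_by(sample_rows, column_index=7, intent='Suicide')
print(suicide_by_race)
```

Calling `count_by(guns_data, 7, intent='Homicide')` on the full data would give homicide counts by race under the same assumptions.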


In [35]:
print(headers)

['', 'year', 'month', 'intent', 'police', 'sex', 'age', 'race', 'hispanic', 'place', 'education']


## <p style="color:Gray">Reading In A Second Dataset<p/><hr>
> Goal: a rate of gun deaths per 100,000 people of each race. Computing it requires:
* The total number of gun deaths by race in the US. <br/>
* The proportion of each race in the US population.
#### <p style="color:Gray">census.csv<p/>
The data contains information on the total population of the US, as well as the total population of each racial group in the US.

In [36]:
# Read the census CSV the same way; the with block closes the file afterwards
with open('census.csv', 'r') as f:
    census = list(csv.reader(f))

In [37]:
census

[['Id',
  'Year',
  'Id',
  'Sex',
  'Id',
  'Hispanic Origin',
  'Id',
  'Id2',
  'Geography',
  'Total',
  'Race Alone - White',
  'Race Alone - Hispanic',
  'Race Alone - Black or African American',
  'Race Alone - American Indian and Alaska Native',
  'Race Alone - Asian',
  'Race Alone - Native Hawaiian and Other Pacific Islander',
  'Two or More Races'],
 ['cen42010',
  'April 1, 2010 Census',
  'totsex',
  'Both Sexes',
  'tothisp',
  'Total',
  '0100000US',
  '',
  'United States',
  '308745538',
  '197318956',
  '44618105',
  '40250635',
  '3739506',
  '15159516',
  '674625',
  '6984195']]

In [38]:
census_dataset = Dataset(census)
census_headers = census_dataset.header
census_data = census_dataset.data

In [39]:
print(census_headers)
print(census_data)

['Id', 'Year', 'Id', 'Sex', 'Id', 'Hispanic Origin', 'Id', 'Id2', 'Geography', 'Total', 'Race Alone - White', 'Race Alone - Hispanic', 'Race Alone - Black or African American', 'Race Alone - American Indian and Alaska Native', 'Race Alone - Asian', 'Race Alone - Native Hawaiian and Other Pacific Islander', 'Two or More Races']
[['cen42010', 'April 1, 2010 Census', 'totsex', 'Both Sexes', 'tothisp', 'Total', '0100000US', '', 'United States', '308745538', '197318956', '44618105', '40250635', '3739506', '15159516', '674625', '6984195']]


In [41]:
print(race_counts)

{'Asian/Pacific Islander': 1326, 'White': 66237, 'Native American/Native Alaskan': 917, 'Black': 23296, 'Hispanic': 9022}


In [43]:
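# Use the counted value (66237) instead of retyping it, which avoids transposed digits
white_gun_death_rate = race_counts['White'] / 197318956
print(round(white_gun_death_rate, 7))

0.0003357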
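
The same division can be extended to every race and scaled to deaths per 100,000 people. The sketch below copies the counts from the `race_counts` output and the populations from the census row shown above. How the gun-data race labels pair with census columns is an assumption made here for illustration, in particular treating 'Asian/Pacific Islander' as the sum of the 'Asian' and 'Native Hawaiian and Other Pacific Islander' columns.

```python
# Gun death counts by race, copied from the race_counts output (2012 to 2014)
race_counts = {
    'Asian/Pacific Islander': 1326,
    'Black': 23296,
    'Hispanic': 9022,
    'Native American/Native Alaskan': 917,
    'White': 66237,
}

# Populations copied from the census row (April 1, 2010 Census).
# ASSUMPTION: this label-to-column pairing is illustrative, not confirmed by the source.
population = {
    'Asian/Pacific Islander': 15159516 + 674625,  # Asian + Native Hawaiian/Other Pacific Islander
    'Black': 40250635,
    'Hispanic': 44618105,
    'Native American/Native Alaskan': 3739506,
    'White': 197318956,
}

# Deaths per 100,000 people of each race
rates_per_100k = {
    race: count / population[race] * 100000
    for race, count in race_counts.items()
}

for race, rate in sorted(rates_per_100k.items()):
    print(f"{race}: {rate:.2f}")
```

The counts cover three years while the population is a single 2010 snapshot, so these figures are three-year totals per 100,000 people rather than annual rates.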
