# FBI Gun Data Analysis

## Table of Contents
<ul>
<li><a href = '#introduction'>Introduction</a></li>
<li><a href = '#data_wrangling'>Data Wrangling</a></li>
<li><a href = '#exploratory_data_analysis'>Exploratory Data Analysis</a></li>
<li><a href = '#conclusion'>Conclusion</a></li>
</ul>

<a id='introduction'></a>
## Introduction



<a id = 'data_wrangling'></a>
## Data Wrangling

In [7]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sea
%matplotlib inline


In [32]:
# Read 'us_census data.csv', clean it, and show the result
df1 = pd.read_csv('us_census data.csv')
df1 = df1.drop(columns=['Fact Note'])  # drop the Fact Note column; it holds nothing useful for the analysis
df1 = df1[:64]  # keep the first 64 rows (the facts with values) and drop the footnote rows below them
df1 = df1.replace('Z', 0)  # replace every 'Z' with 0, following the footnote definition
df1  # show the census data after wrangling

,Fact,Alabama,Alaska,Arizona,Arkansas,California,Colorado,Connecticut,Delaware,Florida,...,South Dakota,Tennessee,Texas,Utah,Vermont,Virginia,Washington,West Virginia,Wisconsin,Wyoming
0,"Population estimates, July 1, 2016, (V2016)",4863300,741894,6931071,2988248,39250017,5540545,3576452,952065,20612439,...,865454,6651194,27862596,3051217,624594,8411808,7288000,1831102,5778708,585501
1,"Population estimates base, April 1, 2010, (V2...",4780131,710249,6392301,2916025,37254522,5029324,3574114,897936,18804592,...,814195,6346298,25146100,2763888,625741,8001041,6724545,1853011,5687289,563767
2,"Population, percent change - April 1, 2010 (es...",1.70%,4.50%,8.40%,2.50%,5.40%,10.20%,0.10%,6.00%,9.60%,...,0.063,0.048,10.80%,10.40%,-0.20%,5.10%,8.40%,-1.20%,1.60%,3.90%
3,"Population, Census, April 1, 2010",4779736,710231,6392017,2915918,37253956,5029196,3574097,897934,18801310,...,814180,6346105,25145561,2763885,625741,8001024,6724540,1852994,5686986,563626
4,"Persons under 5 years, percent, July 1, 2016, ...",6.00%,7.30%,6.30%,6.40%,6.30%,6.10%,5.20%,5.80%,5.50%,...,0.071,0.061,7.20%,8.30%,4.90%,6.10%,6.20%,5.50%,5.80%,6.50%
5,"Persons under 5 years, percent, April 1, 2010",6.40%,7.60%,7.10%,6.80%,6.80%,6.80%,5.70%,6.20%,5.70%,...,0.073,0.064,7.70%,9.50%,5.10%,6.40%,6.50%,5.60%,6.30%,7.10%
6,"Persons under 18 years, percent, July 1, 2016,...",22.60%,25.20%,23.50%,23.60%,23.20%,22.80%,21.10%,21.50%,20.10%,...,0.246,0.226,26.20%,30.20%,19.00%,22.20%,22.40%,20.50%,22.30%,23.70%
7,"Persons under 18 years, percent, April 1, 2010",23.70%,26.40%,25.50%,24.40%,25.00%,24.40%,22.90%,22.90%,21.30%,...,0.249,0.236,27.30%,31.50%,20.70%,23.20%,23.50%,20.90%,23.60%,24.00%
8,"Persons 65 years and over, percent, July 1, 2...",16.10%,10.40%,16.90%,16.30%,13.60%,13.40%,16.10%,17.50%,19.90%,...,0.16,0.157,12.00%,10.50%,18.10%,14.60%,14.80%,18.80%,16.10%,15.00%
9,"Persons 65 years and over, percent, April 1, 2010",13.80%,7.70%,13.80%,14.40%,11.40%,10.90%,14.20%,14.40%,17.30%,...,0.143,0.134,10.30%,9.00%,14.60%,12.20%,12.30%,16.00%,13.70%,12.40%


In [31]:
df1.info()  # show the column dtypes and non-null counts of df1

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 64 entries, 0 to 63
Data columns (total 51 columns):
Fact              64 non-null object
Alabama           64 non-null object
Alaska            64 non-null object
Arizona           64 non-null object
Arkansas          64 non-null object
California        64 non-null object
Colorado          64 non-null object
Connecticut       64 non-null object
Delaware          64 non-null object
Florida           64 non-null object
Georgia           64 non-null object
Hawaii            64 non-null object
Idaho             64 non-null object
Illinois          64 non-null object
Indiana           64 non-null object
Iowa              64 non-null object
Kansas            64 non-null object
Kentucky          64 non-null object
Louisiana         64 non-null object
Maine             64 non-null object
Maryland          64 non-null object
Massachusetts     64 non-null object
Michigan          64 non-null object
Minnesota         64 non-null object
Mississip
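
Every column in `df1` is stored as `object` (strings), and the preview above mixes formats such as `4863300`, `1.70%`, and `0.063`. The following is a minimal sketch of one way to turn such strings into floats. The helper name `to_number` and the sample values are hypothetical, and reading a bare value like `0.063` as a fraction (6.3%) is an assumption that should be checked against the source file.

```python
import pandas as pd

def to_number(value):
    """Convert a census-style string to a float (hypothetical helper).

    Assumption: values ending in '%' are percentages and are returned as
    fractions ('1.70%' -> 0.017), so they line up with entries that are
    already stored as fractions ('0.063').
    """
    if isinstance(value, (int, float)):
        return float(value)  # e.g. the 0 that replaced 'Z'
    text = str(value).strip().replace(',', '').replace('$', '')
    if text.endswith('%'):
        return float(text[:-1]) / 100
    try:
        return float(text)
    except ValueError:
        return float('nan')  # any remaining non-numeric code becomes NaN

# Made-up sample in the same layout as df1 (facts as rows, states as columns)
sample = pd.DataFrame({
    'Fact': ['Population estimates', 'Percent change'],
    'Alabama': ['4,863,300', '1.70%'],
    'Tennessee': ['6651194', '0.048'],
})
numeric = sample.set_index('Fact').apply(lambda col: col.map(to_number))
```

Applying the helper column by column keeps the `Fact` labels as the index, so the result can be transposed later if states are needed as rows.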

In [21]:
# Read gun_data.csv (monthly data from 1998-11 to 2017-09) and show each state's monthly average, sorted by permit checks
df2 = pd.read_csv('gun_data.csv')
df2 = df2.groupby('state').mean(numeric_only=True).sort_values('permit', ascending=False)  # numeric_only skips the text 'month' column
df2


state,permit,permit_recheck,handgun,long_gun,other,multiple,admin,prepawn_handgun,prepawn_long_gun,prepawn_other,...,returned_other,rentals_handgun,rentals_long_gun,private_sale_handgun,private_sale_long_gun,private_sale_other,return_to_seller_handgun,return_to_seller_long_gun,return_to_seller_other,totals
Kentucky,109809.599119,0.0,7723.550661,9867.237885,198.69,486.515419,162.933921,14.078125,26.078125,0.258065,...,0.030303,0.0,0.0,6.82,5.42,0.22,0.288889,0.28,0.170732,131112.044053
Illinois,41844.22467,57065.95,10726.797357,10083.881057,0.01,473.788546,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,68156.537445
California,28768.911894,0.0,26039.118943,26153.171806,3426.35,915.700441,73.480176,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,83762.39207
North Carolina,17061.559471,0.0,665.682819,13052.118943,448.11,158.942731,0.0,11.177083,22.734375,0.236559,...,0.030303,1.611111,0.866667,3.32,9.2,0.76,0.4,0.9,0.292683,34262.947137
Michigan,14595.048458,1023.2,4144.753304,12601.493392,289.24,91.762115,14.110132,0.135417,5.796875,0.086022,...,0.787879,0.0,0.0,5.22,3.44,0.28,0.2,0.2,0.02439,31957.176211
Indiana,13065.022026,0.0,9609.136564,10293.929515,750.32,335.185022,6.229075,0.427083,3.942708,0.11828,...,0.333333,0.0,0.0,17.66,9.9,1.24,1.177778,0.92,0.04878,34084.45815
Texas,12588.46696,0.0,28607.189427,33706.590308,1934.65,1620.977974,110.207048,45.380208,50.708333,0.913978,...,0.484848,0.388889,0.0,21.96,17.22,2.44,0.955556,1.26,0.317073,85617.559471
Utah,12129.770925,28.95,2498.955947,4098.45815,110.57,112.013216,0.017621,0.427083,0.256545,0.225806,...,24.787879,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,19111.334802
Minnesota,10808.973568,0.0,5307.682819,10603.15859,403.23,209.660793,0.004405,0.697917,4.145833,0.075269,...,0.969697,0.0,0.0,5.78,6.02,0.4,0.133333,0.24,0.097561,27661.986784
Georgia,7870.960352,0.0,9373.46696,10080.995595,286.92,389.409692,1819.013216,11.901042,18.088542,0.27957,...,0.0,0.0,0.0,5.66,3.6,0.44,0.244444,0.36,0.04878,31761.859031
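
One caveat when reading these averages: `mean()` skips missing values, so a column that was only reported in some months (for example `permit_recheck`, which has 1100 non-null values out of 12485 rows in the raw file) is averaged over just the months where it was recorded. The sketch below uses made-up numbers, not values from `gun_data.csv`, to show how that differs from treating unreported months as zero. The frame `toy` and both result names are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up monthly records for two fictional states
toy = pd.DataFrame({
    'state': ['A', 'A', 'A', 'B', 'B', 'B'],
    'permit': [10.0, 20.0, 30.0, 5.0, 5.0, 5.0],
    'permit_recheck': [np.nan, np.nan, 60.0, np.nan, np.nan, np.nan],
})

# Default behaviour: NaN months are ignored, so state A's permit_recheck
# is averaged over the single month in which it was reported.
mean_skipna = toy.groupby('state').mean(numeric_only=True)

# Alternative: treat unreported months as 0 before averaging.
mean_zero_filled = (
    toy.fillna({'permit_recheck': 0})
       .groupby('state')
       .mean(numeric_only=True)
)
```

Which of the two is appropriate depends on whether a missing value means "not yet collected" or "no checks"; the source does not say, so neither is presented here as the correct choice.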


In [13]:
df2.info()  # show the column dtypes and non-null counts (this output is from the raw data, before the groupby above)

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12485 entries, 0 to 12484
Data columns (total 27 columns):
month                        12485 non-null object
state                        12485 non-null object
permit                       12461 non-null float64
permit_recheck               1100 non-null float64
handgun                      12465 non-null float64
long_gun                     12466 non-null float64
other                        5500 non-null float64
multiple                     12485 non-null int64
admin                        12462 non-null float64
prepawn_handgun              10542 non-null float64
prepawn_long_gun             10540 non-null float64
prepawn_other                5115 non-null float64
redemption_handgun           10545 non-null float64
redemption_long_gun          10544 non-null float64
redemption_other             5115 non-null float64
returned_handgun             2200 non-null float64
returned_long_gun            2145 non-null float64
returned_other   
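
The `info()` output shows that `month` is stored as text and that many columns have far fewer non-null values than the 12485 rows. A short sketch of two follow-up checks is given below: parsing `month` into dates to confirm the covered period, and counting missing values per column. It runs on a made-up three-row frame (`toy` is hypothetical) and assumes the `month` strings follow the `YYYY-MM` pattern.

```python
import numpy as np
import pandas as pd

# Made-up rows in the same shape as the raw gun data
toy = pd.DataFrame({
    'month': ['2017-09', '2017-08', '1998-11'],
    'state': ['Alabama', 'Alabama', 'Alabama'],
    'permit': [100.0, 120.0, np.nan],  # illustrative values only
})

# Parse 'YYYY-MM' strings into timestamps (first day of each month)
toy['month'] = pd.to_datetime(toy['month'], format='%Y-%m')
first_month, last_month = toy['month'].min(), toy['month'].max()

# Count missing values per column, largest first
missing_per_column = toy.isnull().sum().sort_values(ascending=False)
```

Run against the raw file before any grouping, the same two steps would confirm the date range stated in the comment and show which columns are sparse.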

<a id = 'exploratory_data_analysis'></a>
## Exploratory Data Analysis

<a id = 'conclusion'></a>
## Conclusion