
# Project: Investigating FBI NICS Firearm Background Checks

## Table of Contents
<ul>
<li><a href="#intro">Introduction</a></li>
<li><a href="#wrangling">Data Wrangling</a></li>
<li><a href="#eda">Exploratory Data Analysis</a></li>
<li><a href="#conclusions">Conclusions</a></li>
</ul>

<a id='intro'></a>
## Introduction
The data comes from the FBI's National Instant Criminal Background Check System (NICS) and covers November 1998 to July 2020. NICS is used to determine whether a prospective buyer is eligible to buy firearms or explosives. Gun shops call into this system to ensure that each customer does not have a criminal record and is not otherwise ineligible to make a purchase.



### Objectives (Research Questions)

* Descriptive statistics
  1. What census data is most associated with high guns per capita?
  2. Which states have had the highest growth in gun registrations?
  3. Which states have had the lowest growth in gun registrations?
  4. What is the overall trend of gun purchases?
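
Questions 2 and 3 depend on how growth is measured, which the questions leave open. One possible definition, sketched here, is the ratio of a state's total checks in a late year to its total in an early year. The tiny `checks` frame uses made-up numbers and only mirrors the `month`, `state`, and `totals` columns of the NICS file; the helper name `yearly_growth` is hypothetical.

```python
import pandas as pd

# Made-up numbers for illustration only; they mirror the month/state/totals
# columns of the NICS file but are not real NICS figures.
checks = pd.DataFrame({
    "month": ["1999-01", "1999-02", "2019-01", "2019-02"] * 2,
    "state": ["A"] * 4 + ["B"] * 4,
    "totals": [100, 100, 300, 300, 200, 200, 250, 250],
})

def yearly_growth(df, first_year, last_year):
    """Hypothetical helper: ratio of last-year to first-year totals per state."""
    year = df["month"].str[:4].astype(int)  # '1999-01' -> 1999
    yearly = df.groupby(["state", year])["totals"].sum().unstack()
    return (yearly[last_year] / yearly[first_year]).sort_values(ascending=False)

growth = yearly_growth(checks, 1999, 2019)
print(growth)
```

Comparing full calendar years rather than single months avoids seasonal swings distorting the ratio; other definitions (for example, a fitted slope) would be equally defensible.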



In [59]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from scipy.stats import pearsonr as pr

%matplotlib inline


<a id='wrangling'></a>
## Data Wrangling


### General Properties

In [113]:
data = pd.read_csv('nics-firearm-background-checks.csv')
data.head(2)

| | month | state | permit | permit_recheck | handgun | long_gun | other | multiple | admin | prepawn_handgun | ... | returned_other | rentals_handgun | rentals_long_gun | private_sale_handgun | private_sale_long_gun | private_sale_other | return_to_seller_handgun | return_to_seller_long_gun | return_to_seller_other | totals |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2020-07 | Alabama | 48126.0 | 682.0 | 34909.0 | 17250.0 | 2498.0 | 1170 | 0.0 | 33.0 | ... | 0.0 | 0.0 | 0.0 | 43.0 | 23.0 | 10.0 | 1.0 | 2.0 | 0.0 | 107490 |
| 1 | 2020-07 | Alaska | 69.0 | 152.0 | 4949.0 | 3779.0 | 435.0 | 283 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 14.0 | 6.0 | 2.0 | 0.0 | 0.0 | 0.0 | 10108 |


To make the data easier to inspect, we drop columns that are not needed for this analysis, starting with the `return_to_seller_*` and `returned_*` columns.


In [119]:
data1 = data.drop(columns=['return_to_seller_handgun','return_to_seller_long_gun','return_to_seller_other','returned_other','returned_handgun','returned_long_gun'])

Furthermore, we drop three more columns that we do not need for this analysis: `permit_recheck`, `multiple`, and `admin`.


In [120]:
data1.drop(columns=['permit_recheck', 'multiple', 'admin'], inplace=True)

Finally, we drop the rental columns.

In [121]:
data1.drop(columns=['rentals_handgun', 'rentals_long_gun'], inplace=True)
data1.head(10)

| | month | state | permit | handgun | long_gun | other | prepawn_handgun | prepawn_long_gun | prepawn_other | redemption_handgun | redemption_long_gun | redemption_other | private_sale_handgun | private_sale_long_gun | private_sale_other | totals |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2020-07 | Alabama | 48126.0 | 34909.0 | 17250.0 | 2498.0 | 33.0 | 20.0 | 5.0 | 1908.0 | 770.0 | 12.0 | 43.0 | 23.0 | 10.0 | 107490 |
| 1 | 2020-07 | Alaska | 69.0 | 4949.0 | 3779.0 | 435.0 | 0.0 | 1.0 | 0.0 | 177.0 | 129.0 | 1.0 | 14.0 | 6.0 | 2.0 | 10108 |
| 2 | 2020-07 | Arizona | 9402.0 | 31040.0 | 12361.0 | 2411.0 | 7.0 | 3.0 | 1.0 | 827.0 | 314.0 | 5.0 | 23.0 | 10.0 | 4.0 | 60330 |
| 3 | 2020-07 | Arkansas | 4292.0 | 12391.0 | 7322.0 | 738.0 | 12.0 | 8.0 | 0.0 | 865.0 | 696.0 | 5.0 | 19.0 | 16.0 | 0.0 | 27647 |
| 4 | 2020-07 | California | 35097.0 | 67672.0 | 38618.0 | 7530.0 | 0.0 | 0.0 | 0.0 | 449.0 | 315.0 | 4.0 | 0.0 | 0.0 | 0.0 | 149685 |
| 5 | 2020-07 | Colorado | 11124.0 | 29171.0 | 15184.0 | 2077.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 59666 |
| 6 | 2020-07 | Connecticut | 8500.0 | 6815.0 | 2062.0 | 1791.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 19494 |
| 7 | 2020-07 | Delaware | 445.0 | 4034.0 | 1840.0 | 249.0 | 0.0 | 0.0 | 0.0 | 7.0 | 4.0 | 0.0 | 81.0 | 43.0 | 11.0 | 6935 |
| 8 | 2020-07 | District of Columbia | 575.0 | 296.0 | 11.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 898 |
| 9 | 2020-07 | Florida | 33826.0 | 106745.0 | 33102.0 | 7751.0 | 13.0 | 6.0 | 1.0 | 2830.0 | 767.0 | 9.0 | 522.0 | 265.0 | 100.0 | 190975 |
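
The three `drop` calls above could also be collapsed into one by selecting columns on their name prefixes. This is only a sketch, not the method used in this notebook: the `prefixes` tuple, the `extra` list, and the empty stand-in `frame` are assumptions chosen to mirror the columns dropped above, so that the example runs without the CSV file.

```python
import pandas as pd

# Empty stand-in frame carrying the column names that appear in this
# notebook, so the sketch runs without the CSV.
columns = [
    "month", "state", "permit", "permit_recheck", "handgun", "long_gun",
    "other", "multiple", "admin",
    "prepawn_handgun", "prepawn_long_gun", "prepawn_other",
    "redemption_handgun", "redemption_long_gun", "redemption_other",
    "returned_handgun", "returned_long_gun", "returned_other",
    "rentals_handgun", "rentals_long_gun",
    "private_sale_handgun", "private_sale_long_gun", "private_sale_other",
    "return_to_seller_handgun", "return_to_seller_long_gun",
    "return_to_seller_other", "totals",
]
frame = pd.DataFrame(columns=columns)

# Assumed grouping of the columns to drop: by prefix, plus three single names.
prefixes = ("return_to_seller_", "returned_", "rentals_")
extra = ["permit_recheck", "multiple", "admin"]

# str.startswith accepts a tuple and matches any of its entries.
to_drop = [c for c in frame.columns if c.startswith(prefixes)] + extra
trimmed = frame.drop(columns=to_drop)
print(trimmed.shape[1])
```

Returning a new frame instead of using `inplace=True` keeps the original available for comparison.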


Now we can check the properties of our data to build some intuition about it.


In [122]:
data1.dtypes

month                     object
state                     object
permit                   float64
handgun                  float64
long_gun                 float64
other                    float64
prepawn_handgun          float64
prepawn_long_gun         float64
prepawn_other            float64
redemption_handgun       float64
redemption_long_gun      float64
redemption_other         float64
private_sale_handgun     float64
private_sale_long_gun    float64
private_sale_other       float64
totals                     int64
dtype: object
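
`month` is stored as `object`, that is, as strings such as `2020-07`. For the growth and trend questions it may be convenient to parse it into a datetime first. A minimal sketch on a stand-in frame follows; the values are copied from the first two rows shown earlier, and the name `sample` is illustrative.

```python
import pandas as pd

# Stand-in frame with values copied from the first two rows of the data.
sample = pd.DataFrame({
    "month": ["2020-07", "2020-07"],
    "state": ["Alabama", "Alaska"],
    "totals": [107490, 10108],
})

# Parse 'YYYY-MM' strings; each value becomes the first day of that month.
sample["month"] = pd.to_datetime(sample["month"], format="%Y-%m")
print(sample.dtypes)
print(sample["month"].dt.year.tolist())
```

Once parsed, the column supports the `.dt` accessor, which makes grouping by year straightforward.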

Let's check the general information about our data.


data1.info()

In [126]:
'Our data contains: 13 float columns, 1 integer column, and 2 object columns'


'Our data contains: 13 float columns, 1 integer column, and 2 object columns'
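
Rather than typing the counts by hand, they can be read off `dtypes`. A sketch on a small stand-in frame follows; the three columns and their values are taken from the first rows of the data shown earlier, and the name `sample` is illustrative. With the full dataset the same expression would be applied to `data1`.

```python
import pandas as pd

# Stand-in frame with one text, one float, and one integer column.
sample = pd.DataFrame({
    "state": ["Alabama", "Alaska"],
    "permit": [48126.0, 69.0],
    "totals": [107490, 10108],
})

# Convert each dtype to its name, then count how many columns share it.
dtype_counts = sample.dtypes.astype(str).value_counts()
print(dtype_counts)
```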

Let's check the shape of our data


In [125]:
f'Our data has {data1.shape[0]} rows and {data1.shape[1]} columns'


'Our data has 14355 rows and 16 columns'

Next, we check for null and duplicate values in our data.


In [130]:
f'Are there duplicate rows in our dataset: {data1.duplicated().any()}'


'Are there duplicate rows in our dataset: False'

In [131]:
f'Are there null values in our dataset: {data1.isnull().any().any()}'


'Are there null values in our dataset: True'

In [132]:
print('Number of null values in each column:\n')
data1.isnull().sum()

Number of null values in each column:



month                       0
state                       0
permit                     24
handgun                    20
long_gun                   19
other                    6985
prepawn_handgun          1943
prepawn_long_gun         1945
prepawn_other            7370
redemption_handgun       1940
redemption_long_gun      1941
redemption_other         7370
private_sale_handgun     9735
private_sale_long_gun    9735
private_sale_other       9735
totals                      0
dtype: int64
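
The raw counts are easier to judge as a share of all rows. The sketch below computes that share on a stand-in frame; the values are made up, and only the column names come from the dataset. With the real data the same expression would be applied to `data1`.

```python
import numpy as np
import pandas as pd

# Made-up values; only the column names come from the dataset.
sample = pd.DataFrame({
    "permit": [10.0, 5.0, np.nan, 7.0],
    "other": [np.nan, np.nan, 3.0, np.nan],
    "totals": [12, 6, 4, 8],
})

# isnull().mean() gives the fraction of missing values per column.
null_share = sample.isnull().mean().sort_values(ascending=False)
print(null_share)
```

For reference, the counts above mean that each `private_sale_*` column is null in 9735 of 14355 rows, roughly 68%, while `permit`, `handgun`, and `long_gun` are each missing in fewer than 25 rows.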


<a id='eda'></a>
## Exploratory Data Analysis


### Research Question 1: What census data is most associated with high guns per capita?



<a id='conclusions'></a>
## Conclusions
