<div align="center">

# Investigating NICS and Census Datasets

</div>

## Table of Contents
<ul>
<li><a href="#intro">Introduction</a></li>
<li><a href="#wrangling">Data Wrangling</a></li>
<li><a href="#eda">Exploratory Data Analysis</a></li>
<li><a href="#conclusions">Conclusions</a></li>
</ul>

<a id='intro'></a>
## Introduction

### Dataset Description 

- Give a brief introduction to the dataset you've selected/downloaded for analysis. Read through the description available on the `Project Details` page of the `Investigate a Dataset` lesson for this course.
- List all column names in each table and their significance. If there are multiple tables, describe the relationships between them.

The data comes from the FBI's National Instant Criminal Background Check System (NICS). NICS is used to determine whether a prospective buyer is eligible to buy firearms or explosives.
Gun shops call into this system to ensure that each customer does not have a criminal record and isn’t otherwise ineligible to make a purchase.
The data has been supplemented with state-level data from census.gov.
- The NICS data is found in one sheet of an .xlsx file. It contains the number of firearm checks by month, state, and type.
- The U.S. census data is found in a .csv file. It contains several variables at the state level. Most variables have just one data point per state (2016), but a few have data for more than one year.

### Question(s) for Analysis
- State one or more questions that you plan on exploring over the course of the report.
- Address these questions in the **data analysis** and **conclusion** sections.
- Try to build your report around the analysis of at least one dependent variable and three independent variables.
- If you're not sure what questions to ask, familiarize yourself with the dataset, its variables, and the dataset context for ideas of what to explore.

Example questions from the project description:
- Which census variables are most associated with a high number of guns per capita?
- Which states have had the highest growth in gun registrations? 
- What is the overall trend of gun purchases?

Necessary imports:

In [15]:
import pandas as pd
import numpy as np
import os

<a id='wrangling'></a>
## Data Wrangling
- Load the data.
- Check it for cleanliness.
- Trim and clean your dataset for analysis. Make sure that you justify your cleaning decisions.
- Use NumPy arrays, Pandas Series, and DataFrames where appropriate rather than Python lists and dictionaries.

In [19]:
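def load_data(path="./Database_Ncis_and_Census_data"):
    """
    Load every .csv file found in a directory into a pandas DataFrame.
    Note: only .csv files are read; .xlsx files in the directory are skipped.
    Args:
        path (str): Path to the directory holding the dataset files.
    Returns:
        dfs (list): List of pandas DataFrames (pd.DataFrame), in file name order.
    """
    dfs = []  # a plain list: NumPy arrays have no append() method
    for file in sorted(os.listdir(path)):
        if file.endswith(".csv"):
            # os.listdir returns bare file names, so join them with the directory
            dfs.append(pd.read_csv(os.path.join(path, file)))
    return dfs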

In [None]:
dfs = load_data()
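
The dataset description says the NICS data ships as an .xlsx sheet, which a .csv-only loader would skip. The sketch below is one possible variant, not part of the original notebook: `load_tables` is a hypothetical helper that reads both file types and returns a dictionary keyed by file name, so each table can be picked out by name rather than by list position. It assumes an Excel engine such as `openpyxl` is installed for the `.xlsx` branch.

```python
from pathlib import Path

import pandas as pd


def load_tables(path="./Database_Ncis_and_Census_data"):
    """Return {file stem: DataFrame} for every .csv and .xlsx file in `path`.

    Hypothetical helper for illustration; the default path is the one used
    elsewhere in this notebook.
    """
    tables = {}
    for file in sorted(Path(path).iterdir()):
        suffix = file.suffix.lower()
        if suffix == ".csv":
            tables[file.stem] = pd.read_csv(file)
        elif suffix == ".xlsx":
            # read_excel loads the first sheet by default and needs an
            # Excel engine (for example openpyxl) to be installed.
            tables[file.stem] = pd.read_excel(file)
    return tables
```

Calling `tables = load_tables()` and then `tables.keys()` shows which files were picked up.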

In [None]:
# Load your data and print out a few lines. What is the size of your dataframe? 
#   Perform operations to inspect data types and look for instances of missing
#   or possibly errant data. There are at least 4 - 6 methods you can call on your
#   dataframe to obtain this information.
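
A minimal sketch of the inspection calls hinted at above. The small frame is made up for illustration; its column names (`month`, `state`, `permit`, `totals`) and values are assumptions, so substitute one of the DataFrames you actually loaded.

```python
import pandas as pd

# Made-up stand-in for a loaded table; names and values are illustrative only.
df = pd.DataFrame({
    "month": ["2017-02", "2017-02", "2017-01", "2017-01"],
    "state": ["Alabama", "Alaska", "Alabama", "Alaska"],
    "permit": [100.0, 20.0, 110.0, None],
    "totals": [1000, 200, 1100, 210],
})

print(df.head())    # first few rows
print(df.shape)     # (number of rows, number of columns)
print(df.dtypes)    # data type of each column
df.info()           # dtypes together with non-null counts

missing = df.isna().sum()             # missing values per column
n_duplicates = df.duplicated().sum()  # fully duplicated rows
summary = df.describe()               # summary statistics for numeric columns

print(missing)
print(n_duplicates)
print(summary)
```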

### Data Cleaning
- Keep your reader informed about the steps that you are taking in your investigation.
- Follow every code cell, or every set of related code cells, with a markdown cell that describes what was found in the preceding cell(s).
- Try to make it so that the reader can then understand what they will be seeing in the following cell(s).

In [None]:
# After discussing the structure of the data and any problems that need to be
#   cleaned, perform those cleaning steps in the second part of this section.
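
The cleaning steps depend on what the inspection turns up, so the following is only a sketch under stated assumptions: that the census file stores numbers as text (thousands separators, percent signs) with one column per state and a `Fact` column naming each variable, and that the NICS sheet has a `month` column in `YYYY-MM` form. All column names and values here are made up; `to_number` is a hypothetical helper.

```python
import pandas as pd

# Made-up census-style table: one row per variable, one column per state.
census_raw = pd.DataFrame({
    "Fact": ["Population estimate", "Population change, percent"],
    "Alabama": ["1,000,000", "2.50%"],
    "Alaska": ["500,000", "4.00%"],
})


def to_number(text):
    """Turn '1,000,000' or '2.50%' into a number; percents become fractions."""
    text = str(text).strip().replace(",", "")
    if text.endswith("%"):
        return float(text[:-1]) / 100
    # Anything that still cannot be parsed becomes NaN instead of raising.
    return pd.to_numeric(text, errors="coerce")


# Transpose so each state is a row, then convert every cell to a number.
census = census_raw.set_index("Fact").T
census = census.apply(lambda col: col.map(to_number))
census.index.name = "state"

# Made-up NICS-style table with one exact duplicate row.
nics_raw = pd.DataFrame({
    "month": ["2017-02", "2017-01", "2017-01"],
    "state": ["Alabama", "Alabama", "Alabama"],
    "totals": [1000, 1100, 1100],
})

# Drop exact duplicates and parse the month so it can be sorted and grouped.
nics = nics_raw.drop_duplicates().copy()
nics["month"] = pd.to_datetime(nics["month"], format="%Y-%m")
```

Whether duplicates should really be dropped, or missing values filled rather than left as NaN, is a decision to justify in the accompanying markdown cell.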

<a id='eda'></a>
## Exploratory Data Analysis

Compute statistics and create visualizations with the goal of addressing the research questions that you posed in the Introduction section.

- **Compute statistics**: Compute the relevant statistics throughout the analysis whenever an inference is made about the data.
- **Create visualizations**: Create at least two kinds of plots as part of the exploration, and compare and show trends across the varied visualizations. Remember to use the visualizations that the pandas library already provides.

- Investigate the stated question(s) from multiple angles. It is recommended that you be systematic with your approach. Look at one variable at a time, and then follow it up by looking at relationships between variables.
- You should explore at least three variables in relation to the primary question.
- This can be an exploratory relationship between three variables of interest, or looking at how two independent variables relate to a single dependent variable of interest.
- Lastly, you should perform both single-variable (1d) and multiple-variable (2d) explorations.
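
As a rough illustration of the 1d and 2d explorations described above, the sketch below works on a made-up state-level table. The column names (`population`, `total_checks`, `checks_per_capita`) and the helper `plot_overview` are assumptions for the example, not names taken from the datasets. The plotting helper uses the pandas plotting interface and needs matplotlib installed.

```python
import pandas as pd

# Made-up state-level table; real values would come from the cleaned data.
states = pd.DataFrame({
    "state": ["A", "B", "C", "D"],
    "population": [1_000_000, 2_000_000, 4_000_000, 8_000_000],
    "total_checks": [50_000, 80_000, 240_000, 320_000],
}).set_index("state")

# Derived dependent variable: background checks per resident.
states["checks_per_capita"] = states["total_checks"] / states["population"]

# 1d exploration: summary statistics of a single variable.
per_capita_summary = states["checks_per_capita"].describe()

# 2d exploration: pairwise Pearson correlations between variables.
correlations = states.corr()


def plot_overview(df):
    """Draw one 1d plot (histogram) and one 2d plot (scatter) side by side."""
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    df["checks_per_capita"].plot.hist(bins=10, ax=axes[0])
    axes[0].set_title("Checks per capita")
    df.plot.scatter(x="population", y="total_checks", ax=axes[1])
    axes[1].set_title("Checks vs. population")
    fig.tight_layout()
    return fig
```

Calling `plot_overview(states)` in a notebook cell renders both plots. A correlation computed this way describes association only and does not support causal claims.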


### Research Question 1 (Replace this header name!)

In [None]:
# Use this, and more code cells, to explore your data. Don't forget to add
#   Markdown cells to document your observations and findings.
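
If the question chosen here is the overall trend of gun purchases (one of the example questions in the Introduction), a first pass could aggregate checks over time. This is a sketch on made-up numbers; the column names `month`, `state`, and `totals` are assumptions about the NICS sheet.

```python
import pandas as pd

# Made-up monthly checks for two states.
nics = pd.DataFrame({
    "month": pd.to_datetime(
        ["2016-01", "2016-01", "2016-02", "2016-02", "2017-01", "2017-01"],
        format="%Y-%m",
    ),
    "state": ["A", "B", "A", "B", "A", "B"],
    "totals": [100, 200, 110, 190, 150, 250],
})

# Sum across states for each month to get the nationwide series.
monthly_totals = nics.groupby("month")["totals"].sum().sort_index()

# Coarser view: total checks per calendar year.
yearly_totals = nics.groupby(nics["month"].dt.year)["totals"].sum()

print(monthly_totals)
print(yearly_totals)
```

`monthly_totals.plot()` then draws the trend as a line chart. Years with only some months present are not directly comparable to full years.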

### Research Question 2  (Replace this header name!)

In [None]:
# Continue to explore the data to address your additional research
#   questions. Add more headers as needed if you have more questions to
#   investigate.
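
For a question such as which states have had the highest growth (another example question from the Introduction), one option is to compare each state's first and last year. The sketch below uses made-up numbers and assumes the same `month`, `state`, and `totals` columns; relative growth between two years is only one of several reasonable definitions of growth.

```python
import pandas as pd

# Made-up monthly checks covering the same months in two years.
nics = pd.DataFrame({
    "month": pd.to_datetime(
        ["2016-01", "2016-02", "2017-01", "2017-02"] * 2, format="%Y-%m"
    ),
    "state": ["A"] * 4 + ["B"] * 4,
    "totals": [50, 50, 70, 80, 100, 100, 110, 110],
})

# One row per state, one column per year, cells hold the yearly sum.
yearly = nics.assign(year=nics["month"].dt.year).pivot_table(
    index="state", columns="year", values="totals", aggfunc="sum"
)

first_year, last_year = yearly.columns.min(), yearly.columns.max()

# Relative growth from the first to the last year, largest first.
growth = (
    (yearly[last_year] - yearly[first_year]) / yearly[first_year]
).sort_values(ascending=False)

print(growth)
```

`growth.head(10).plot.bar()` would show the fastest-growing states as a bar chart.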


<a id='conclusions'></a>
## Conclusions

- Summarize the results accurately, and point out where additional research can be done or where additional information could be useful.
- Make sure that you are clear about the limitations of your exploration. You should have at least one limitation explained clearly.
- If you haven't done any statistical tests, do not imply any statistical conclusions. Also make sure you avoid implying causation from correlation!
- Check all the areas of the rubric.

Create a .html or .pdf version of this notebook in the workspace here.

In [None]:
# Running this cell will execute a bash command to convert this notebook to an .html file
!python -m nbconvert --to html Investigate_a_Dataset.ipynb