# Exploratory Data Analysis
## CFPB Complaints

The purpose of this project is to download and explore a dataset using Python and associated libraries.

The dataset contains consumer complaints about financial products and services collected by the Consumer Financial Protection Bureau (CFPB). It can be downloaded from [data.gov](https://www.data.gov), which hosts the U.S. Government's open data.

### Import Libraries

In [2]:
%matplotlib inline

In [6]:
import pandas as pd
import numpy as np
import json
import requests
import re
import matplotlib.pyplot as plt
import seaborn as sns

In [4]:
pd.set_option('display.max_colwidth', 1000)  # Show up to 1000 characters of text per cell before truncating.

### Gather

I downloaded the dataset manually as a CSV file and saved it locally, because the file is large enough that downloading it automatically with the requests library crashed the app. (To do: investigate why.)
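
One plausible reason for the crash (an assumption, not verified against this notebook) is that a plain `requests.get(url).content` call holds the entire file in memory. The sketch below streams the response to disk in chunks instead. `save_stream` and `download_csv` are hypothetical helper names, and the URL in the usage comment is a placeholder, not the real download link.

```python
import requests


def save_stream(response, dest_path, chunk_size=1024 * 1024):
    """Write a streamed response to disk chunk by chunk; return bytes written."""
    written = 0
    with open(dest_path, "wb") as fh:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                fh.write(chunk)
                written += len(chunk)
    return written


def download_csv(url, dest_path):
    """Download url to dest_path without holding the whole body in memory."""
    # stream=True defers fetching the body until iter_content() is consumed.
    with requests.get(url, stream=True, timeout=60) as response:
        response.raise_for_status()
        return save_stream(response, dest_path)


# Usage (placeholder URL, replace with the actual CSV link):
# download_csv("https://example.com/Consumer_Complaints.csv", "Consumer_Complaints.csv")
```

Only one chunk (1 MiB by default here) is held in memory at a time, so peak memory use stays roughly constant regardless of file size.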

In [32]:
complaints_df = pd.read_csv("Consumer_Complaints.csv") # Read csv file into Pandas DataFrame.
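
The CSV stores dates as text such as `10/03/2016`, so pandas reads `Date received` and `Date sent to company` as `object` columns. A minimal sketch of converting them to datetimes, using a tiny in-memory stand-in for the file (two rows copied from this notebook's sample output). The `Days to forward` column is a hypothetical name added for illustration.

```python
import io

import pandas as pd

# Tiny stand-in for Consumer_Complaints.csv with only three of the real columns.
sample_csv = io.StringIO(
    "Date received,Product,Date sent to company\n"
    "10/03/2016,Student loan,10/06/2016\n"
    "04/17/2015,Debt collection,04/20/2015\n"
)

demo_df = pd.read_csv(sample_csv)

# Convert the date strings explicitly; the format is month/day/year.
for col in ["Date received", "Date sent to company"]:
    demo_df[col] = pd.to_datetime(demo_df[col], format="%m/%d/%Y")

# Days between receiving a complaint and forwarding it to the company.
demo_df["Days to forward"] = (
    demo_df["Date sent to company"] - demo_df["Date received"]
).dt.days

print(demo_df.dtypes)
```

The same loop should apply to `complaints_df`, assuming every row follows the month/day/year format seen in the sample.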

### Univariate Analysis
In this section, I will explore variables individually.
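
For a categorical variable, a frequency table is a common first look. The sketch below uses a hypothetical five-value series (the product names come from rows shown in this notebook, but the counts are made up) rather than the full dataset.

```python
import pandas as pd

# Hypothetical mini-sample of the Product column.
products = pd.Series(
    [
        "Debt collection",
        "Mortgage",
        "Debt collection",
        "Student loan",
        "Checking or savings account",
    ],
    name="Product",
)

counts = products.value_counts()                # absolute frequency per category
shares = products.value_counts(normalize=True)  # relative frequency, sums to 1

print(counts)
print(shares.round(2))
```

On the real data the equivalent call would be `complaints_df["Product"].value_counts()`.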

In [34]:
complaints_df.shape

(932473, 18)

In [31]:
complaints_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 932473 entries, 0 to 932472
Data columns (total 18 columns):
Date received                   932473 non-null object
Product                         932473 non-null object
Sub-product                     697303 non-null object
Issue                           932473 non-null object
Sub-issue                       450184 non-null object
Consumer complaint narrative    227328 non-null object
Company public response         271533 non-null object
Company                         932473 non-null object
State                           922555 non-null object
ZIP code                        918556 non-null object
Tags                            129864 non-null object
Consumer consent provided?      410778 non-null object
Submitted via                   932473 non-null object
Date sent to company            932473 non-null object
Company response to consumer    932473 non-null object
Timely response?                932473 non-null object
Consumer 
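
The non-null counts above show that several columns are sparsely populated. As a rough sketch, the share of missing values can be computed directly from those counts and the row total reported by `shape`. Only the columns with gaps that are visible in the output are included.

```python
import pandas as pd

total_rows = 932473  # from complaints_df.shape

# Non-null counts copied from the info() output (columns with gaps only).
non_null = pd.Series(
    {
        "Sub-product": 697303,
        "Sub-issue": 450184,
        "Consumer complaint narrative": 227328,
        "Company public response": 271533,
        "State": 922555,
        "ZIP code": 918556,
        "Tags": 129864,
        "Consumer consent provided?": 410778,
    }
)

# Percentage of rows where each column is missing, largest first.
missing_pct = ((1 - non_null / total_rows) * 100).round(1).sort_values(ascending=False)
print(missing_pct)
```

With the DataFrame loaded, `complaints_df.isna().mean() * 100` should give the same percentages without copying counts by hand.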

In [37]:
complaints_df.sample(5)

|  | Date received | Product | Sub-product | Issue | Sub-issue | Consumer complaint narrative | Company public response | Company | State | ZIP code | Tags | Consumer consent provided? | Submitted via | Date sent to company | Company response to consumer | Timely response? | Consumer disputed? | Complaint ID |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 755702 | 10/03/2016 | Student loan | Federal student loan servicing | Dealing with my lender or servicer | Received bad information about my loan | Scam and charge for consolidating student loan. Great Lakes Servicer Referred me to a scam called XXXX XXXX XXXX. Illegally charged me XXXX to consolidate my loan. It was the fault of Great Lakes who did the referral. | Company believes it acted appropriately as authorized by contract or law | GREAT LAKES | CA | 921XX | Servicemember | Consent provided | Web | 10/06/2016 | Closed with explanation | Yes | No | 2144036 |
| 75617 | 04/17/2015 | Debt collection | Other (i.e. phone, health club, etc.) | False statements or representation | Attempted to collect wrong amount | they are charging me more than i owed. |  | Diversified Consultants, Inc. | CA | 945XX | Servicemember | Consent provided | Web | 04/20/2015 | Closed with explanation | Yes | No | 1334752 |
| 294009 | 04/13/2014 | Mortgage | Conventional adjustable mortgage (ARM) | Loan modification,collection,foreclosure |  |  |  | SELECT PORTFOLIO SERVICING, INC. | OK | 73026 |  |  | Web | 04/13/2014 | Closed | Yes | No | 806548 |
| 277183 | 06/16/2017 | Checking or savings account | Checking account | Managing an account | Problem accessing account | Trying to connect my bank to app called active hour they are having issue with verifying account using a third party vendor. We do not have a connection with your bank because the third party bank provider is not seeing able any recent direct deposits, but not because we need a screenshot. Your account was looked at by our team and we were able to see that the issue was the third party provider. There is nothing you and we can do at this point besides wait for our third party provider to fix the issue. : - ( | Company has responded to the consumer and the CFPB and chooses not to provide a public response | M&T BANK CORPORATION | NY | 142XX |  | Consent provided | Web | 06/16/2017 | Closed with explanation | Yes |  | 2551692 |
| 823688 | 06/23/2017 | Debt collection | Credit card debt | Written notification about debt | Didn't receive notice of right to dispute | My husband and I have had an XXXX XXXX XXXX XXXX for 9 years and have never had an experience like this. We are 2 hard working honest people and not use to dealing with criminals. Some bad people are using stolen credit cards and purchasing products from our company, and had the products shipped to an address in XXXX XXXX, FL that turns out to be a freight forwarding company. The credit cards used are stolen and the owners are not aware of the use till its to late. Orders for our merchandise are processed by Pay Pal. We collect the name, address, phone number, credit card number, CVV number for each transaction and the information is processed by the card holders bank and approved back to PayPal. They approved the purchases in question ( approximately {$6000.00} ) and Pay Pal sends us an OK to ship the products. We then send order information to our suppliers who ship the products to the purchaser. After a few weeks, the REAL Owner of the credit card files a complaint with Pay Pal ... |  | PAYPAL HOLDINGS INC. | FL | 321XX | Older American, Servicemember | Consent provided | Web | 06/23/2017 | Closed with explanation | Yes |  | 2557296 |
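
Note that `sample(5)` draws different rows on every run. If the same rows are needed again, for example to discuss specific complaints, a fixed `random_state` makes the draw repeatable. A small sketch with a hypothetical 20-row frame (the seed value 42 is arbitrary):

```python
import pandas as pd

# Hypothetical stand-in for complaints_df.
demo_df = pd.DataFrame({"Complaint ID": range(100, 120)})

first = demo_df.sample(5, random_state=42)
second = demo_df.sample(5, random_state=42)

# Both draws return the same rows in the same order.
print(first.equals(second))
```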
