# **EXPLORATORY DATA ANALYSIS OF FINANCIAL CONSUMER COMPLAINTS FOR BANK OF AMERICA (2017-2023)**

### AIM:
The aim of this project is to perform an end-to-end exploratory data analysis of Bank of America's consumer complaints (2017-2023) in order to uncover seasonal trends, identify which products receive the most complaints and their most common issues, examine how complaints are typically resolved, and explore insights from complaints with untimely responses.

### STEPS:
1. Data loading and initial overview
2. Data Pre-processing
3. Exploratory Data Analysis (EDA)
4. Visualizations
5. Insight Generation and Report

## DATA LOADING AND INITIAL OVERVIEW

### Install and Import pandas library

Before loading the dataset, we need to ensure that the required Python libraries are installed.
The main library used for data manipulation and analysis is pandas; openpyxl is installed alongside it because pandas uses it as the engine for reading `.xlsx` files.


In [9]:
pip install pandas openpyxl

Note: you may need to restart the kernel to use updated packages.


In [5]:
import pandas as pd

### Load the dataset
Read the Excel file into a DataFrame named `df`.


In [6]:
file_path = "/Users/arsha/Downloads/Consumer_Complaints.xlsx"
df = pd.read_excel(file_path)
print("Dataset Loaded Successfully!")

Dataset Loaded Successfully!
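The loading cell above prints a success message unconditionally and relies on a hard-coded path. A minimal sketch of a slightly more defensive loader is shown below; the helper name `load_complaints` is hypothetical and not part of the original notebook, and it only adds an existence check before calling `pd.read_excel`.

```python
from pathlib import Path

import pandas as pd


def load_complaints(path):
    """Load the complaints workbook, failing early if the file is missing.

    `load_complaints` is a hypothetical helper introduced for illustration.
    """
    path = Path(path).expanduser()
    if not path.is_file():
        # Raise a clear error instead of letting read_excel fail later.
        raise FileNotFoundError(f"Dataset not found: {path}")
    return pd.read_excel(path)


# Example usage (same path as in the cell above):
# df = load_complaints("/Users/arsha/Downloads/Consumer_Complaints.xlsx")
# print("Dataset loaded:", df.shape)
```

Printing the shape after loading gives a quick confirmation that the expected number of rows and columns was actually read.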


### Overview of Dataset


In [13]:
# Number of rows and columns
print("Dataset Shape (Rows, Columns):", df.shape)

Dataset Shape (Rows, Columns): (62516, 12)


In [14]:
# Data types of each column
print("\nData Types:")
print(df.dtypes)


Data Types:
Complaint ID                             int64
Submitted via                           object
Date submitted                  datetime64[ns]
Date received                   datetime64[ns]
State                                   object
Product                                 object
Sub-product                             object
Issue                                   object
Sub-issue                               object
Company public response                 object
Company response to consumer            object
Timely response?                        object
dtype: object


### Basic Dataset Information


In [15]:
print("\nDataset Info")
df.info()



Dataset Info
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 62516 entries, 0 to 62515
Data columns (total 12 columns):
 #   Column                        Non-Null Count  Dtype         
---  ------                        --------------  -----         
 0   Complaint ID                  62516 non-null  int64         
 1   Submitted via                 62516 non-null  object        
 2   Date submitted                62516 non-null  datetime64[ns]
 3   Date received                 62516 non-null  datetime64[ns]
 4   State                         62516 non-null  object        
 5   Product                       62516 non-null  object        
 6   Sub-product                   62509 non-null  object        
 7   Issue                         62516 non-null  object        
 8   Sub-issue                     51658 non-null  object        
 9   Company public response       60341 non-null  object        
 10  Company response to consumer  62516 non-null  object        
 11  Timely respons
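The non-null counts above imply missing values in three columns: `Sub-product` (62516 - 62509 = 7), `Sub-issue` (62516 - 51658 = 10858) and `Company public response` (62516 - 60341 = 2175). A minimal sketch of how such a summary could be computed directly is shown below. It runs on a small toy DataFrame with made-up rows rather than the real dataset, and `missing_summary` is a hypothetical helper name.

```python
import pandas as pd

# Toy data for illustration only; these rows are NOT taken from the dataset.
toy = pd.DataFrame({
    "Product": ["Mortgage", "Credit card or prepaid card",
                "Mortgage", "Credit card or prepaid card"],
    "Sub-issue": [None, "Example sub-issue", None, "Example sub-issue"],
})


def missing_summary(frame):
    """Return the count and percentage of missing values per column."""
    missing = frame.isna().sum()
    summary = pd.DataFrame({
        "missing": missing,
        "percent": (missing / len(frame) * 100).round(2),
    })
    # Keep only columns that actually have gaps, largest first.
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


summary = missing_summary(toy)
print(summary)
```

On the real data the same helper could be called as `missing_summary(df)`.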

In [16]:
print("\nFirst 5 Rows")
print(df.head())


First 5 Rows
   Complaint ID Submitted via Date submitted Date received State  \
0       4848023      Referral     2021-10-24    2021-10-27    NY   
1       3621464           Web     2020-04-24    2020-04-24    FL   
2       5818349           Web     2022-07-27    2022-07-27    CA   
3       7233015      Referral     2023-07-10    2023-07-11    CA   
4       5820224      Referral     2022-07-27    2022-07-28    VA   

                                             Product  \
0                                           Mortgage   
1  Money transfer, virtual currency, or money ser...   
2  Credit reporting, credit repair services, or o...   
3                        Credit card or prepaid card   
4                        Credit card or prepaid card   

                                  Sub-product  \
0                  Conventional home mortgage   
1                   Refund anticipation check   
2                            Credit reporting   
3                General-purpose prepaid car
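The `head()` output above wraps across several blocks because the frame is wider than the display. One possible way to keep each row on a single line is to relax the pandas display options temporarily, as sketched below on a toy frame with made-up values (the column names are borrowed from the dataset, the rows are not).

```python
import pandas as pd

# Toy frame for illustration only; values are invented.
wide = pd.DataFrame({
    "Complaint ID": [1, 2],
    "Submitted via": ["Web", "Referral"],
    "Product": ["Mortgage", "Credit card or prepaid card"],
    "Sub-product": ["Conventional home mortgage", "Example sub-product"],
})

# option_context restores the previous settings when the block exits.
with pd.option_context("display.max_columns", None,
                       "display.expand_frame_repr", False):
    text = repr(wide.head())

print(text)
```

An alternative that needs no display options is `df.head().T`, which transposes the preview so that each column becomes a row.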

### Descriptive Statistics

In [17]:
print("\nDescriptive Statistics")
print(df.describe(include='all'))


Descriptive Statistics
        Complaint ID Submitted via                 Date submitted  \
count   6.251600e+04         62516                          62516   
unique           NaN             7                            NaN   
top              NaN           Web                            NaN   
freq             NaN         45423                            NaN   
mean    4.512642e+06           NaN  2020-11-24 16:07:14.883869696   
min     2.471340e+06           NaN            2017-05-01 00:00:00   
25%     3.254020e+06           NaN            2019-05-22 00:00:00   
50%     4.178582e+06           NaN            2021-03-02 00:00:00   
75%     5.771284e+06           NaN            2022-07-14 00:00:00   
max     7.458912e+06           NaN            2023-08-28 00:00:00   
std     1.442917e+06           NaN                            NaN   

                        Date received  State                      Product  \
count                           62516  62516                        62
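In the statistics above, `Web` is the most frequent value of `Submitted via` with 45423 of 62516 complaints, roughly 72.7%. Note also that summary statistics such as the mean of `Complaint ID` carry no analytical meaning, since the column is an identifier. A minimal sketch of computing category shares with `value_counts(normalize=True)` follows; it uses a toy Series with made-up values rather than the real column.

```python
import pandas as pd

# Toy data for illustration only; not taken from the dataset.
channels = pd.Series(["Web", "Web", "Web", "Referral"], name="Submitted via")

# normalize=True returns proportions instead of raw counts.
shares = channels.value_counts(normalize=True)
print((shares * 100).round(1))
```

On the real data, `df["Submitted via"].value_counts(normalize=True)` would give the share of each of the 7 submission channels.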