goswamimohit/EDA-on-Credit-Card-Anslysis

Statlog Dataset

Perform exploratory data analysis and provide key insights derived from the same backed with suitable graphs and plots.

Dataset Description: The dataset is based on the “Statlog Dataset” from the UCI Machine Learning Repository. The columns of the dataset and their meanings are as follows:

Age (numeric)
Sex (text: male, female)
Job (numeric: 0 - unskilled and non-resident, 1 - unskilled and resident, 2 - skilled, 3 - highly skilled)
Housing (text: own, rent, or free)
Saving accounts (text: little, moderate, quite rich, rich)
Checking account (text: little, moderate, rich)
Credit amount (numeric, in Deutsche Mark)
Duration (numeric, in months)
Purpose (text: car, furniture/equipment, radio/TV, domestic appliances, repairs, education, business, vacation/others)
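
The column description above can be turned into a small schema check. The following is a minimal sketch, not part of the original repository: the helper name `apply_schema`, the extra `Job label` column, and the example rows are all assumptions made for illustration. It converts the text columns to pandas categoricals restricted to the documented values, so anything outside that set shows up as a missing value, and it maps the numeric `Job` codes to their meanings.

```python
import pandas as pd

# Job codes and their meanings, as listed in the column description.
JOB_LABELS = {
    0: "unskilled and non-resident",
    1: "unskilled and resident",
    2: "skilled",
    3: "highly skilled",
}

# Documented values for the text columns with a short, closed value set.
ALLOWED_VALUES = {
    "Sex": ["male", "female"],
    "Housing": ["own", "rent", "free"],
    "Saving accounts": ["little", "moderate", "quite rich", "rich"],
    "Checking account": ["little", "moderate", "rich"],
}


def apply_schema(df):
    """Return a copy with categorical text columns and a readable job label.

    Hypothetical helper: values outside the documented set become NaN,
    which makes erroneous entries easy to spot.
    """
    out = df.copy()
    for col, values in ALLOWED_VALUES.items():
        out[col] = pd.Categorical(out[col], categories=values)
    out["Job label"] = out["Job"].map(JOB_LABELS)
    return out


# Made-up rows for illustration only; they are not taken from the real dataset.
example = pd.DataFrame({
    "Age": [35, 28],
    "Sex": ["male", "female"],
    "Job": [2, 0],
    "Housing": ["own", "rent"],
    "Saving accounts": ["little", "quite rich"],
    "Checking account": ["moderate", "unknown"],  # "unknown" is not a documented value
    "Credit amount": [2500, 1200],
    "Duration": [12, 6],
    "Purpose": ["car", "education"],
})

typed = apply_schema(example)
print(typed.dtypes)
print(typed[["Job", "Job label", "Checking account"]])
```

Restricting the categories is one design choice among several; if the real file contains spellings that differ from the description, it may be better to inspect `value_counts()` first and adjust `ALLOWED_VALUES` accordingly.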

Assignment questions:

  1. Load the dataset into pandas and get a peek at the underlying data in the dataframe.

  2. Provide the following information about the dataframe:

    Dimensions of the dataframe
    Information about the schema
    Statistical metrics of each column

  3. Conduct the following data pre-processing steps only as necessary, and give the reason for each step you take:

    Missing values
    Erroneous/wrong values
    Skewed data
    Outliers

  4. Perform exploratory data analysis and provide key insights derived from the same backed with suitable graphs and plots.
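
Questions 1 to 3 can be sketched roughly as follows. This is an illustrative outline under stated assumptions rather than a reference solution: the rows in `SAMPLE_CSV` are made up so that the block runs on its own, the helper `iqr_outliers` and the column `Log credit amount` are hypothetical names, and the 1.5 × IQR rule and the log transform are common choices, not requirements of the assignment. With the real data, pass the path of the CSV file to `pd.read_csv` instead of the in-memory sample.

```python
import io

import numpy as np
import pandas as pd

# Made-up rows for illustration only; they are not taken from the real dataset.
SAMPLE_CSV = """Age,Sex,Job,Housing,Saving accounts,Checking account,Credit amount,Duration,Purpose
35,male,2,own,little,moderate,2500,12,car
28,female,1,rent,moderate,,1200,6,education
52,male,3,free,,little,9000,36,business
41,female,0,own,rich,rich,800,24,repairs
30,male,2,own,little,little,1500,12,radio/TV
46,female,2,rent,little,moderate,40000,18,car
"""

# Question 1: load the data and peek at the first rows.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))
print(df.head())

# Question 2: dimensions, schema, and summary statistics.
print(df.shape)                     # (rows, columns)
df.info()                           # column names, dtypes, non-null counts
print(df.describe(include="all"))   # numeric and text columns together

# Question 3: diagnostics that tell you whether pre-processing is needed.
missing = df.isna().sum()           # missing values per column
numeric_cols = ["Age", "Credit amount", "Duration"]
skewness = df[numeric_cols].skew()  # positive values indicate a long right tail


def iqr_outliers(series, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (hypothetical helper)."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


outlier_mask = iqr_outliers(df["Credit amount"])

# One common remedy for right-skewed amounts: a log transform.
df["Log credit amount"] = np.log1p(df["Credit amount"])

print(missing)
print(skewness)
print(df.loc[outlier_mask, ["Credit amount"]])
```

Whether to impute, drop, or keep flagged rows depends on what the real data shows, so the diagnostics are kept separate from any fix.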

A few hints to get you started:

Distribution of numerical variables
Distribution of categorical variables
Numerical vs Categorical plots
Numerical vs Numerical plots
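
As one possible starting point for these hints, the sketch below draws one plot of each kind with pandas' matplotlib-based plotting. The rows are made up for illustration, the choice of columns for each panel is arbitrary, and the output file name `eda_overview.png` is hypothetical. The `Agg` backend is selected only so the script runs without a display; drop that line in a notebook.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; remove when working in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows for illustration only; they are not taken from the real dataset.
df = pd.DataFrame({
    "Age": [35, 28, 52, 41, 30, 46],
    "Housing": ["own", "rent", "free", "own", "own", "rent"],
    "Credit amount": [2500, 1200, 9000, 800, 1500, 4000],
    "Duration": [12, 6, 36, 24, 12, 18],
    "Purpose": ["car", "education", "business", "repairs", "radio/TV", "car"],
})

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Distribution of a numerical variable: histogram.
df["Credit amount"].plot.hist(bins=10, ax=axes[0, 0], title="Credit amount")

# Distribution of a categorical variable: bar chart of counts.
df["Purpose"].value_counts().plot.bar(ax=axes[0, 1], title="Purpose")

# Numerical vs categorical: box plot of the amount per housing type.
df.boxplot(column="Credit amount", by="Housing", ax=axes[1, 0])

# Numerical vs numerical: scatter plot.
df.plot.scatter(x="Duration", y="Credit amount", ax=axes[1, 1],
                title="Credit amount vs Duration")

fig.suptitle("")  # clear the automatic "Boxplot grouped by ..." figure title
fig.tight_layout()
fig.savefig("eda_overview.png")  # hypothetical output file name
```

The same four plot types can be repeated for other column pairs, for example `Age` against `Credit amount`, to look for further patterns.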
