Data Analysis Project

Dream Housing Finance Company Customer's Data

Data Cleaning
Data Exploration
Model Development
Model Interpretation/Evaluation

Data Cleaning
Data cleaning focused on handling missing data. I used KNNImputer to impute the missing values. For each row with a missing value, it finds the k most similar rows (the nearest neighbours, measured on the features that are present) and fills the gap with the average of those neighbours' values for that variable.
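A minimal sketch of KNN imputation with scikit-learn, to make the idea concrete. The toy DataFrame, its column names, and `n_neighbors=2` are assumptions chosen for illustration; they are not taken from the project's notebooks.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Hypothetical toy data: names and values are illustrative only.
df = pd.DataFrame({
    "ApplicantIncome": [5000.0, 3000.0, 4000.0, 6000.0, 3500.0],
    "LoanAmount": [130.0, np.nan, 110.0, 160.0, 95.0],
    "Loan_Amount_Term": [360.0, 360.0, np.nan, 360.0, 180.0],
})

# Each missing entry is replaced by the mean of that column over the
# 2 nearest rows, where distance is computed on the non-missing features.
imputer = KNNImputer(n_neighbors=2)
imputed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)

print(imputed)
```

Because the neighbours are chosen by distance, features on very different scales (income versus loan term, for example) may need scaling first so that one column does not dominate the neighbour search.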
Data Exploration
Exploration helped me understand the variables by analyzing them through visualizations and summary statistics. The dataset contains 12 columns. The exploration was done in two stages: univariate and bivariate.
- Univariate: each variable was explored independently. There are two types of variables, categorical and continuous, and they are analyzed differently.
  a. Continuous Variables - I explored the distributions of the continuous variables (Applicant's Income, Coapplicant's Income, Loan Amount and Loan Amount Term) to find the mean, median, mode, skewness, etc.
  b. Categorical Variables - I counted the occurrences of each category of a variable across the dataset, e.g. I explored the Gender variable by counting the males and females in the dataset.
- Bivariate: I explored the relationships between the variables in order to remove highly correlated independent variables and to identify which variables correlate strongly or weakly with the dependent variable. Categorical and continuous variables are also explored differently in bivariate exploration.
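The two exploration stages can be sketched with pandas roughly as follows. The sample rows and the column names (`Gender`, `ApplicantIncome`, `CoapplicantIncome`, `LoanAmount`, `Loan_Status`) are assumptions made for this example, not the actual data.

```python
import pandas as pd

# Hypothetical sample: names and values are illustrative only.
df = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Male", "Female", "Male"],
    "ApplicantIncome": [5000, 3000, 4000, 6000, 3500, 12000],
    "CoapplicantIncome": [0, 1500, 2000, 0, 1800, 0],
    "LoanAmount": [130, 100, 110, 160, 95, 300],
    "Loan_Status": ["Y", "N", "Y", "Y", "N", "Y"],
})
continuous = ["ApplicantIncome", "CoapplicantIncome", "LoanAmount"]

# Univariate, continuous: centre and skewness of each distribution.
summary = pd.DataFrame({
    "mean": df[continuous].mean(),
    "median": df[continuous].median(),
    "skew": df[continuous].skew(),
})

# Univariate, categorical: number of rows in each category.
gender_counts = df["Gender"].value_counts()

# Bivariate, continuous vs continuous: pairwise Pearson correlation,
# useful for spotting highly correlated independent variables.
corr = df[continuous].corr()

# Bivariate, categorical vs target: share of each outcome per category.
status_by_gender = pd.crosstab(df["Gender"], df["Loan_Status"], normalize="index")

print(summary, gender_counts, corr, status_by_gender, sep="\n\n")
```

A correlation matrix only covers numeric pairs, which is why the categorical variable is compared with the target through a cross-tabulation instead.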
Model Development
After data preparation and exploratory data analysis, I trained models with several different algorithms and selected XGBClassifier. I also performed feature engineering and hyperparameter optimization with Optuna to improve the model's accuracy.
Model Interpretation
The model predicts whether a loan applicant is eligible for a loan. The classifier learns the relationship between the features and the target variable (the eligibility status) so that it can predict the eligibility status of new loan applicants.
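The fit, evaluate, and predict cycle described here might look like the sketch below. To keep it dependent on scikit-learn only, it uses `GradientBoostingClassifier` as a stand-in for the XGBClassifier chosen in the project. The data, the rule that generates the target, and the column names are invented for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Hypothetical applicants; the eligibility rule below is synthetic.
rng = np.random.default_rng(0)
n = 400
data = pd.DataFrame({
    "Credit_History": rng.integers(0, 2, n),
    "ApplicantIncome": rng.uniform(1500, 10000, n),
    "LoanAmount": rng.uniform(50, 400, n),
})
data["Loan_Status"] = (
    (data["Credit_History"] == 1)
    & (data["LoanAmount"] / data["ApplicantIncome"] < 0.06)
).astype(int)

features = ["Credit_History", "ApplicantIncome", "LoanAmount"]
X_train, X_test, y_train, y_test = train_test_split(
    data[features], data["Loan_Status"], test_size=0.25, random_state=0
)

# Stand-in classifier (assumption): learns features -> eligibility status.
model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)

# Evaluation on applicants the model has not seen during training.
test_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, test_pred)
cm = confusion_matrix(y_test, test_pred, labels=[0, 1])

# Prediction for a new set of applicants (hypothetical values).
new_applicants = pd.DataFrame({
    "Credit_History": [1, 0],
    "ApplicantIncome": [8000.0, 2500.0],
    "LoanAmount": [120.0, 300.0],
})
new_pred = model.predict(new_applicants)
new_proba = model.predict_proba(new_applicants)[:, 1]

print(accuracy, cm, new_pred, new_proba, sep="\n")
```

The confusion matrix separates the two kinds of error (eligible applicants rejected, ineligible applicants approved), which a single accuracy figure hides.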

Folders and files
- .ipynb_checkpoints
- app
- catboost_info
- Data_Cleaning_&_Preparation.ipynb
- Exploratory_Data_Analysis.ipynb
- Model_Improvement.ipynb
- Model_Improvement_xgb.ipynb
- Model_Init.ipynb
- README.md
- cat_loan
- data_cleaned.csv
- test.csv
- train.csv
- xgb_loan
