LendingClub Loan Default Prediction

Build a loan default prediction model through machine learning and deep learning

List of Python libraries used

pandas
numpy
seaborn
matplotlib
lightgbm
sklearn

Files in the repository

Jupyter notebook including all the code for this project
loan.csv dataset
Figures from the model evaluation results

Overview

I. Business Understanding
II. Data Understanding
III. Prepare Data
IV. Data Modeling
V. Evaluate the Results

Business Understanding

LendingClub is an American peer-to-peer lending company. It lets borrowers apply for unsecured personal loans, while investors choose which loans to fund based on the information provided.

Borrowers pay interest on the loans they take out to relieve financial hardship, and investors earn money from that interest. Meanwhile, the platform charges startup fees to borrowers and service fees to investors.

A main concern for investors is whether a borrower will default on a loan. If a default happens, investors lose their investment. In this project, I used LightGBM and a multilayer perceptron (MLP) to predict borrowers' loan status and achieved an accuracy of 85%.

Data Understanding

The dataset contains data for loans issued from 2007 to 2011, including the current loan status (Charged Off, Fully Paid, etc.), loan amount, loan grade, loan purpose, and latest payment information.

Prepare Data

When building a prediction model on this dataset, it is important to watch for information leakage. The dataset contains a lot of up-to-date information recorded after the loan was issued, such as the number of months of delinquency and charge-off collection fees, and these features directly indicate that a loan has been charged off. During the preprocessing stage, in addition to handling missing values and categorical features, the columns that leak this information should be removed before model training.
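
The preprocessing steps described above could look roughly like the following. This is a minimal sketch, not the notebook's actual code: the column names in `LEAKAGE_COLUMNS`, the `prepare_features` helper, the median and `"missing"` fill strategies, and the binary target encoding are all assumptions made for illustration, and the toy frame stands in for loan.csv.

```python
import pandas as pd

# Hypothetical leakage columns (fields only known after a loan goes bad).
# The real list should come from inspecting the columns of loan.csv.
LEAKAGE_COLUMNS = ["recoveries", "collection_recovery_fee", "last_pymnt_amnt"]


def prepare_features(df, target="loan_status"):
    """Drop leaky columns, fill missing values, one-hot encode categoricals."""
    # Assumed encoding: 1 = fully paid, 0 = anything else (e.g. charged off).
    y = (df[target] == "Fully Paid").astype(int)

    # errors="ignore" skips leakage columns that are absent from this frame.
    X = df.drop(columns=[target] + LEAKAGE_COLUMNS, errors="ignore")

    numeric_cols = X.select_dtypes(include="number").columns
    categorical_cols = X.columns.difference(numeric_cols)

    # Median for numeric gaps, an explicit "missing" level for categorical gaps.
    X[numeric_cols] = X[numeric_cols].fillna(X[numeric_cols].median())
    X[categorical_cols] = X[categorical_cols].fillna("missing")
    X = pd.get_dummies(X, columns=list(categorical_cols))
    return X, y


# Toy data standing in for a few rows of the real dataset.
raw = pd.DataFrame({
    "loan_amnt": [5000, 12000, None, 8000],
    "grade": ["A", "C", None, "B"],
    "recoveries": [0.0, 350.5, 0.0, 0.0],
    "loan_status": ["Fully Paid", "Charged Off", "Fully Paid", "Fully Paid"],
})
X, y = prepare_features(raw)
print(X.columns.tolist())
```

Dropping the leaky columns before any imputation or encoding keeps them from influencing the remaining features.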

Data Modeling

LightGBM and MLP models were trained on this dataset. The hyperparameters for LightGBM were chosen through 5-fold cross-validation. The feature importance plot shows which features were most influential in the prediction.
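
As an illustration of choosing hyperparameters with 5-fold cross-validation, here is a hedged sketch using scikit-learn's `GridSearchCV`. To keep it runnable with scikit-learn alone, it swaps in `GradientBoostingClassifier` as a stand-in; `lightgbm.LGBMClassifier` follows the same scikit-learn estimator interface and could be passed as the estimator instead. The synthetic data and the parameter grid are assumptions, since the values searched in the notebook are not stated here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic, imbalanced stand-in for the prepared loan features.
X, y = make_classification(
    n_samples=300, n_features=10, weights=[0.2, 0.8], random_state=0
)

# Hypothetical search grid, for illustration only.
param_grid = {"n_estimators": [50, 100], "learning_rate": [0.05, 0.1]}

search = GridSearchCV(
    estimator=GradientBoostingClassifier(random_state=0),  # stand-in model
    param_grid=param_grid,
    cv=5,                # 5-fold cross-validation (stratified for classifiers)
    scoring="accuracy",
)
search.fit(X, y)

best_params = search.best_params_
# Importances of the refitted best model, the data behind an importance plot.
importances = search.best_estimator_.feature_importances_

print("best parameters:", best_params)
print("mean CV accuracy of best setting:", round(search.best_score_, 3))
```

Grid search is only one option; a randomized search over the same ranges is a common alternative when the grid grows large.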

Evaluate the Results

Our model achieved a high precision of 0.86 on the test set, with an F1 score of 0.92.

Comparing the LightGBM results with those from the MLP, we can see that the MLP did better on the training set, with a training accuracy of 87%, but its accuracy dropped by about 2 percentage points on the test set.

From the confusion matrix, we can see that our classifier has high precision for the fully paid class but low recall for the charged-off class. In other words, a high proportion of the borrowers predicted to repay are indeed those who fully pay off their loans, but when the true label is charged off, the classifier is not sensitive enough to detect it. This might be caused by the class imbalance in the dataset. In this case, we do care more about precision, as investors want to invest in borrowers who are less likely to default.
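
A train/test gap like the one described for the MLP can be measured as in the sketch below. It uses scikit-learn's `MLPClassifier` on synthetic data; the network size, the scaling step, and the 80/20 split are assumptions, not the configuration used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the prepared loan features.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scale the inputs first; MLPs are sensitive to feature scale.
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
)
mlp.fit(X_train, y_train)

train_acc = mlp.score(X_train, y_train)
test_acc = mlp.score(X_test, y_test)
gap = train_acc - test_acc  # a positive gap suggests some overfitting

print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
print(f"gap:            {gap:.3f}")
```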
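
To make the precision and recall discussion concrete, the following sketch computes both from a small confusion matrix with scikit-learn. The labels are made up for illustration (they are not results from this project), and treating "fully paid" as the positive class (1) is an assumption.

```python
from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score

# Made-up labels: 1 = fully paid, 0 = charged off (imbalanced, 8 vs 2).
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]  # one charged-off loan is missed

# Rows are true labels, columns are predictions, both ordered [0, 1].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

# Of the loans predicted as fully paid, how many really were?
precision_paid = precision_score(y_true, y_pred, pos_label=1)
# Of the loans that were fully paid, how many did the model find?
recall_paid = recall_score(y_true, y_pred, pos_label=1)
# Of the loans that were charged off, how many did the model find?
recall_charged_off = recall_score(y_true, y_pred, pos_label=0)
f1_paid = f1_score(y_true, y_pred, pos_label=1)

print(f"precision (fully paid): {precision_paid:.3f}")
print(f"recall (fully paid):    {recall_paid:.3f}")
print(f"recall (charged off):   {recall_charged_off:.3f}")
print(f"F1 (fully paid):        {f1_paid:.3f}")
```

With only two charged-off loans in ten, missing one of them halves the recall for that class while barely moving the scores for the majority class, which is the pattern an imbalanced dataset tends to produce.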
