Risky Business

Background

To mitigate risk, I built and evaluated several machine-learning models that predict credit risk using free data from LendingClub. I used the imbalanced-learn and scikit-learn libraries to build and evaluate the models with the following two techniques:

Resampling
Ensemble Learning

Files

Resampling

For this approach, I used the imbalanced-learn library to resample the LendingClub data, then built and evaluated logistic regression classifiers on the resampled data.
Refer to: Resampling Notebook

Conclusion

Which model had the best balanced accuracy score?

SMOTEENN had the best balanced accuracy score: 0.7975462408998795, versus 0.7966770207605626 for SMOTE, 0.7856360112968401 for Random Oversampler, and 0.7752245065690078 for Cluster Centroids.

Which model had the best recall score?

SMOTE had the best recall score: 0.88.

Which model had the best geometric mean score?

SMOTEENN had the best geometric mean score: 0.79.
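
For readers unfamiliar with the three metrics compared above, the following sketch shows how they relate in the binary case: balanced accuracy is the arithmetic mean of recall and specificity, and the geometric mean score is the square root of their product. The function name and the confusion-matrix counts are hypothetical and are not taken from the notebook.

```python
import math


def imbalanced_metrics(tp, fn, tn, fp):
    """Binary-case metrics computed from confusion-matrix counts."""
    recall = tp / (tp + fn)        # share of actual positives that were caught
    specificity = tn / (tn + fp)   # share of actual negatives that were caught
    return {
        "recall": recall,
        "specificity": specificity,
        "balanced_accuracy": (recall + specificity) / 2,
        "geometric_mean": math.sqrt(recall * specificity),
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = imbalanced_metrics(tp=80, fn=20, tn=700, fp=300)
for metric_name, value in metrics.items():
    print(f"{metric_name}: {value:.4f}")
```

Because both summary metrics combine recall with specificity, a model can have the highest recall without having the highest balanced accuracy or geometric mean.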

Ensemble Learning

For this method, I trained, evaluated, and compared two ensemble classifiers that predict loan risk: the Balanced Random Forest Classifier and the Easy Ensemble Classifier. Both models used 100 estimators (n_estimators=100).
Refer to: Ensemble Notebook

Conclusion

Which model had the best balanced accuracy score?

Easy Ensemble Classifier had the best balanced accuracy score: 0.931601605553446 versus 0.7855345052746622 for Balanced Random Forest Classifier.

Which model had the best recall score?

Easy Ensemble Classifier had the best recall score: 0.94 versus 0.90 for Balanced Random Forest Classifier.

Which model had the best geometric mean score?

Easy Ensemble Classifier had the best geometric mean score: 0.93 versus 0.78 for Balanced Random Forest Classifier.

What are the top three features?

The top three features, listed as (importance, feature name), are the following: (0.09175752102205247, 'total_rec_prncp'), (0.06410003199501778, 'total_pymnt_inv'), (0.05764917485461809, 'total_pymnt')
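
The (importance, name) tuples above have the shape produced by pairing importance scores with column names and sorting in descending order. The sketch below assumes the scores came from a fitted tree-based model's feature_importances_ attribute, which this README does not state. The three named values are copied from the list above; placeholder_feature and its score are hypothetical and exist only so the sort has something to exclude.

```python
# Names and scores in arbitrary (unsorted) order, as they might come from
# a DataFrame's columns and a model's feature_importances_ (assumption).
feature_names = [
    "total_pymnt",
    "placeholder_feature",  # hypothetical, not a real result
    "total_rec_prncp",
    "total_pymnt_inv",
]
importances = [
    0.05764917485461809,
    0.01,                   # hypothetical score for the placeholder
    0.09175752102205247,
    0.06410003199501778,
]

# Pair each score with its name, then sort by score, largest first.
ranked = sorted(zip(importances, feature_names), reverse=True)
top_three = ranked[:3]

for importance, name in top_three:
    print(f"{name}: {importance:.4f}")
```

Putting the score first in each tuple is what lets a plain sorted call rank the features by importance without a key function.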
