Credit Risk Analysis

Supervised ML evaluation of credit risk data

Purpose

This analysis trains a machine learning model to predict high risk credit loans. The number of low risk loans in the dataset far outweighs the number of high risk loans, so the imbalanced-learn library was used to deal with this asymmetry.

Methods

Several algorithms were compared to find the best-performing model. In particular, resampling and ensemble techniques were used.

Resampling

Oversampling of the high risk cases with the RandomOverSampler and SMOTE algorithms was used to balance the number of cases in each class. The ClusterCentroids algorithm was used to undersample the low risk cases for the same purpose. Additionally, SMOTEENN, which combines oversampling and undersampling, was used to balance the inputs.

Ensemble Classifiers

The BalancedRandomForestClassifier and Easy Ensemble AdaBoost algorithms were used to combine many decision trees into a more robust and accurate model.

Analysis

The recall score is the best indicator of how many high risk cases are caught, because it measures the share of actual high risk loans that the model flags as high risk. Optimizing for recall may label some low risk cases as high risk, but a higher recall score means that more of the high risk cases are identified.
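To make the trade-off concrete, the following minimal sketch computes high risk recall and precision by hand. The ten labels are made up for illustration and are not taken from the loan data, and the `recall` and `precision` helpers are hypothetical names written for this example.

```python
def recall(y_true, y_pred, positive):
    """Share of actual positive cases that were predicted positive."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    actual_pos = true_pos + false_neg
    return true_pos / actual_pos if actual_pos else 0.0


def precision(y_true, y_pred, positive):
    """Share of predicted positive cases that were actually positive."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    predicted_pos = true_pos + false_pos
    return true_pos / predicted_pos if predicted_pos else 0.0


# Illustrative labels only (assumption): 4 high risk loans, 6 low risk loans.
y_true = ["high_risk"] * 4 + ["low_risk"] * 6
y_pred = ["high_risk", "high_risk", "high_risk", "low_risk",
          "high_risk", "high_risk", "low_risk", "low_risk", "low_risk", "low_risk"]

high_risk_recall = recall(y_true, y_pred, "high_risk")        # 3 of 4 caught
high_risk_precision = precision(y_true, y_pred, "high_risk")  # 3 of 5 flags correct

print(f"High risk recall: {high_risk_recall:.2f}")
print(f"High risk precision: {high_risk_precision:.2f}")
```

In this toy case the model catches three of the four high risk loans, but two of its five high risk flags are actually low risk loans. That is the cost accepted when recall is the priority.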

Naive Random Oversampling
High risk recall score of 0.72

SMOTE Oversampling
High risk recall score of 0.61

Cluster Centroids Undersampling
High risk recall score of 0.69

SMOTEENN Combination Sampling
High risk recall score of 0.78

Balanced Random Forest Classifier
High risk recall score of 0.70

Easy Ensemble AdaBoost Classifier
High risk recall score of 0.92

Recommendation

The Easy Ensemble AdaBoost Classifier is the recommended choice because it achieved the highest high risk recall of the six models, correctly identifying 92% of the high risk cases.
