Assessing credit risk with supervised machine learning models in Python

ChrisBarton107/Credit_Risk_Analysis

Credit_Risk_Analysis

Overview

The purpose of this analysis was to predict credit risk by training and comparing several supervised machine learning models.

Results

The analysis compared six approaches, four resampling techniques and two ensemble classifiers:

  • RandomOverSampler: Oversampling
  • SMOTE (Synthetic Minority Oversampling Technique): Oversampling
  • Cluster Centroids: Undersampling
  • SMOTEENN (Synthetic Minority Oversampling Technique - Edited Nearest Neighbor): Oversampling & undersampling combinatorial approach
  • Balanced Random Forest Classifier: Ensemble, bias reduction
  • Easy Ensemble AdaBoost Classifier: Ensemble, bias reduction
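To make the resampling idea concrete, here is a minimal plain-Python sketch of the simplest technique in the list, random oversampling: minority-class rows are duplicated at random until the classes are the same size. It is an illustration only, not this repository's code and not any library's implementation. The function name `random_oversample`, the toy data, and the `low_risk` / `high_risk` label strings are assumptions made for the example.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    have as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_features = list(features)
    out_labels = list(labels)
    for cls, count in counts.items():
        # Rows that belong to this class in the original data.
        rows = [x for x, y in zip(features, labels) if y == cls]
        # Sample with replacement to make up the shortfall.
        for _ in range(target - count):
            out_features.append(rng.choice(rows))
            out_labels.append(cls)
    return out_features, out_labels


# Hypothetical toy data: 8 low-risk rows and 2 high-risk rows.
X = [[i] for i in range(10)]
y = ["low_risk"] * 8 + ["high_risk"] * 2

X_res, y_res = random_oversample(X, y)
print(Counter(y_res))
```

Resampling like this is normally applied to the training split only, so that the test set keeps the original class balance and the reported metrics reflect realistic conditions.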

All precision, recall, and F1 values reported below are for the high-risk class.
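As a reminder of how these statistics relate to each other, the sketch below computes them from confusion-matrix counts with high-risk as the positive class. The function name `high_risk_metrics` and the counts are hypothetical, chosen only to show how balanced accuracy can be high while precision stays low on imbalanced data. They are not taken from this analysis.

```python
def high_risk_metrics(tp, fp, fn, tn):
    """Metrics with high-risk as the positive class.

    Assumes at least one predicted and one actual case of each class,
    so no denominator is zero.
    """
    precision = tp / (tp + fp)      # share of flagged loans that are truly high-risk
    recall = tp / (tp + fn)         # share of high-risk loans that were flagged
    specificity = tn / (tn + fp)    # share of low-risk loans that were not flagged
    balanced_accuracy = (recall + specificity) / 2
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "balanced_accuracy": balanced_accuracy,
        "f1": f1,
    }


# Hypothetical counts: 10 high-risk loans among 10,000 in total.
metrics = high_risk_metrics(tp=9, fp=91, fn=1, tn=9899)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because low-risk loans vastly outnumber high-risk ones in this toy example, a small false-positive rate still produces far more false alarms than true detections, which drags precision and F1 down.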

RandomOverSampler

  • Balanced Accuracy Score: 65%
  • Precision: 1%
  • Recall/Sensitivity: 63%
  • F1: 2%

SMOTE (Synthetic Minority Oversampling Technique)

  • Balanced Accuracy Score: 65%
  • Precision: 1%
  • Recall/Sensitivity: 64%
  • F1: 2%

Cluster Centroids

  • Balanced Accuracy Score: 52%
  • Precision: 1%
  • Recall/Sensitivity: 69%
  • F1: 2%

SMOTEENN (Synthetic Minority Oversampling Technique - Edited Nearest Neighbor)

  • Balanced Accuracy Score: 62%
  • Precision: 1%
  • Recall/Sensitivity: 69%
  • F1: 2%

Balanced Random Forest Classifier

  • Balanced Accuracy Score: 79%
  • Precision: 4%
  • Recall/Sensitivity: 67%
  • F1: 7%

Easy Ensemble Classifier

  • Balanced Accuracy Score: 93%
  • Precision: 7%
  • Recall/Sensitivity: 91%
  • F1: 14%

Summary

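The ensemble models, the Balanced Random Forest Classifier and the Easy Ensemble Classifier, achieved the highest balanced accuracy (79% and 93%) and the highest high-risk precision (4% and 7%) of the six approaches. The Easy Ensemble Classifier also had by far the best recall (91%), while the Balanced Random Forest Classifier's recall (67%) was in the same range as the resampling techniques (63% to 69%). Even so, precision never exceeded 7%, so the large majority of applicants flagged as high-risk would not actually be high-risk, which makes these models a potential liability in real-life situations. I consider them unreliable for their intended task and would not recommend any of them for predicting credit risk.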
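To put the precision figures in practical terms, the short sketch below converts each reported precision into the number of false alarms expected per correctly flagged high-risk applicant, using (1 - precision) / precision. The helper name `false_alarms_per_hit` is made up for this example, and the inputs are the rounded percentages listed in the Results section, so the outputs are approximate.

```python
def false_alarms_per_hit(precision):
    """Expected number of wrongly flagged applicants for each correctly
    flagged high-risk applicant (hypothetical helper)."""
    return (1 - precision) / precision


# Rounded high-risk precision values from the Results section.
reported_precision = {
    "RandomOverSampler": 0.01,
    "SMOTE": 0.01,
    "Cluster Centroids": 0.01,
    "SMOTEENN": 0.01,
    "Balanced Random Forest Classifier": 0.04,
    "Easy Ensemble Classifier": 0.07,
}

for model, precision in reported_precision.items():
    ratio = false_alarms_per_hit(precision)
    print(f"{model}: about {ratio:.0f} false alarms per true high-risk detection")
```

Even for the best model, about 13 applicants would be wrongly flagged for every genuine high-risk case caught. For the resampling techniques the figure is about 99.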