- Overview of the analysis: Explain the purpose of this analysis.
Credit risk is an inherently unbalanced classification problem, as good loans easily outnumber risky loans, so different techniques are needed to train and evaluate models with unbalanced classes. Jill asked me to use the imbalanced-learn and scikit-learn libraries to build and evaluate models using resampling. Using the credit card credit dataset from LendingClub, a peer-to-peer lending services company, I oversampled the data with the RandomOverSampler and SMOTE algorithms and undersampled it with the ClusterCentroids algorithm. I then applied a combinatorial approach of over- and undersampling with the SMOTEENN algorithm. Next, I compared two newer machine learning models that reduce bias, BalancedRandomForestClassifier and EasyEnsembleClassifier, at predicting credit risk. Finally, I evaluated the performance of these models and made a written recommendation on whether they should be used to predict credit risk.
- Results: Using bulleted lists, describe the balanced accuracy scores and the precision and recall scores of all six machine learning models. Use screenshots of your outputs to support your results.
- RandomOverSampler: balanced accuracy score of 66.29%
- Combination sampling with SMOTEENN: balanced accuracy score of 66.29%
- BalancedRandomForestClassifier: balanced accuracy score of 78.78%
- EasyEnsembleClassifier: balanced accuracy score of 92.5%
- Confusion matrix accuracy score: 64.39%
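For reference, the balanced accuracy score reported above is the average of the recall on each class, which is why it differs from plain accuracy on unbalanced data. A toy example (not the project data) shows the computation:

```python
# Balanced accuracy = mean of per-class recall
# (sensitivity and specificity for a binary problem). Toy data only.
from sklearn.metrics import balanced_accuracy_score, confusion_matrix

y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
recall_pos = tp / (tp + fn)   # recall on class 1
recall_neg = tn / (tn + fp)   # recall on class 0
manual = (recall_pos + recall_neg) / 2

print(manual)                                      # 0.625
print(balanced_accuracy_score(y_true, y_pred))     # 0.625
```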
- Summary: Summarize the results of the machine learning models, and include a recommendation on the model to use, if any. If you do not recommend any of the models, justify your reasoning.
Based on the results, the EasyEnsembleClassifier, an ensemble of AdaBoost learners, is the best method: it achieved a 92.54% balanced accuracy score. Its class balancing is achieved by random under-sampling of the majority class in each bag.