zatafa/4_OC_AI_Credit_Risk_Model

🎯 Scoring Model

Use historical loan application data to predict whether an applicant will be able to repay a loan.

Expected output: a predicted probability of default, between 0 and 1.

🗂️ Dataset

The dataset is provided by Home Credit.

📜 Tasks

  • ✔️ Exploratory Data Analysis (EDA);
  • ✔️ Data cleaning;
  • ✔️ Feature engineering;
  • ✔️ Handling of imbalanced classes;
  • ✔️ Model training: Naïve Bayes, Logistic Regression, Stochastic Gradient Descent (SGD) classifier, Random Forest, LightGBM;
  • ✔️ Model evaluation: AUC (Area Under the ROC Curve), Recall, F1-Score;
  • ✔️ Hyperparameter optimization;
  • ✔️ Evaluation of variable importance with SHAP (SHapley Additive exPlanations).
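The evaluation metrics in the task list can be illustrated with a minimal, dependency-free sketch. The labels and scores below are made up for illustration (they are not results from this project), and the 0.5 decision threshold is an assumption. In practice, scikit-learn provides `roc_auc_score`, `recall_score`, and `f1_score` for the same purpose.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive (default) is scored
    higher than a random negative; ties count as half a win."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recall_and_f1(y_true, y_score, threshold=0.5):
    """Turn probabilities into 0/1 decisions, then compute recall and F1."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return recall, f1


# Hypothetical data: 1 = default, 0 = repaid; scores are predicted default probabilities.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_score = [0.05, 0.10, 0.20, 0.30, 0.60, 0.15, 0.70, 0.40]

auc = roc_auc(y_true, y_score)
recall, f1 = recall_and_f1(y_true, y_score)
print(f"AUC={auc:.3f} recall={recall:.2f} F1={f1:.2f}")
```

AUC is computed on the raw probabilities, while recall and F1 depend on the chosen threshold. With few defaulters in the data, lowering the threshold is one way to trade precision for recall.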

Distribution of the classes
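This README does not say which technique was used to handle the imbalance. As one common option, the sketch below computes "balanced" class weights, using the same heuristic as scikit-learn's `class_weight="balanced"` setting: `n_samples / (n_classes * class_count)`. The 92/8 split is a made-up example, not the actual class distribution of the dataset.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency so that every class
    contributes the same total weight to the training loss."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical imbalanced target: 92 repaid loans (0) and 8 defaults (1).
labels = [0] * 92 + [1] * 8
weights = balanced_class_weights(labels)
print(weights)
```

With these weights, the 8 defaults carry the same total weight as the 92 repaid loans. Resampling (over- or under-sampling) is an alternative that changes the data instead of the loss.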

Performance of LightGBM

Global explanation with SHAP
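To give an intuition for what SHAP reports, the sketch below computes exact Shapley values for a toy scoring function by enumerating every feature coalition. It does not use the SHAP library or this project's model. The function `toy_score`, its three features, and the all-zero baseline are hypothetical and chosen only for illustration. The SHAP library computes or approximates the same quantity efficiently for real models.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values: features in a coalition take their value from x,
    all other features stay at the baseline value."""
    n = len(x)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_score(features):
    """Hypothetical risk score with one interaction term (not the project's model)."""
    debt_ratio, income, late_payments = features
    return 0.1 + 0.5 * debt_ratio - 0.2 * income + 0.3 * debt_ratio * late_payments


x = [0.8, 0.5, 1.0]         # applicant to explain (made-up values)
baseline = [0.0, 0.0, 0.0]  # reference point (assumed)
phi = shapley_values(toy_score, x, baseline)
print(phi)
```

The contributions sum to the difference between the score for `x` and the score for the baseline, which is the additivity property that SHAP plots rely on. A global explanation is then typically built by averaging the absolute contributions over many applicants.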

💻 Dependencies

scikit-learn, LightGBM, SHAP

📌 References

▶️ Further reading