This repository contains our work on building fair and explainable models that predict credit default risk. The original dataset comes from Kaggle's Home Credit Default Risk competition. To run the code, all CSVs generated for modeling are stored in a folder named cleaned_tables, and CSVs generated for exploratory data analysis (EDA) are stored in a folder named EDA-helpers.
The Jupyter notebooks are numbered in the order in which we approached the problem and are described below:
- EDA of the main table containing applicant information, e.g., detecting outliers and treating null values
- EDA of five supplemental tables that contain additional information for some of the applicants, e.g., summarizing key information for each applicant
- Preparing datasets for modeling, e.g., checking for bias in the three protected attributes in the dataset (age, gender, marital status), merging all tables, setting up training and testing sets, and preparing a SMOTEd dataset
- Building BRCG and GLRM models (both from IBM's AIX 360 toolkit) based on the SMOTEd top 20 features found from a Random Forest (RF) model
- Building RF and XGBoost models based on the SMOTEd top 20 features identified by the RF model in #3 above; explaining the models with SHAP
- Building various Decision Tree, Logistic Regression, and Random Forest models to compare with the models built in #4 and #5.
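The SMOTE step in the dataset-preparation notebook can be illustrated with a minimal sketch of the underlying idea: synthesizing new minority-class samples by interpolating between existing minority samples and their nearest minority neighbors. The notebooks presumably use a library implementation (e.g., imbalanced-learn); the toy data, function name, and parameters below are hypothetical, not taken from the repository.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote_oversample(X_min, n_new, k=5, seed=0):
    """Minimal SMOTE sketch: create n_new synthetic minority samples by
    interpolating between each chosen sample and one of its k nearest
    minority-class neighbors."""
    rng = np.random.default_rng(seed)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    # Neighbor indices for each point, dropping the point itself (column 0)
    neigh = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)      # base minority points
    pick = neigh[base, rng.integers(0, k, size=n_new)]  # one neighbor per base
    gap = rng.random((n_new, 1))                        # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[pick] - X_min[base])

# Toy imbalanced data: 100 majority samples, 10 minority samples, 3 features
rng = np.random.default_rng(1)
X_maj = rng.normal(0, 1, (100, 3))
X_min = rng.normal(3, 1, (10, 3))

X_syn = smote_oversample(X_min, n_new=90)   # synthesize 90 minority samples
X_bal = np.vstack([X_maj, X_min, X_syn])    # balanced: 100 vs. 100
print(X_bal.shape)
```

Because each synthetic sample is a convex combination of two real minority samples, it always falls inside the minority class's feature-wise range, which is what distinguishes SMOTE from naive duplication.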
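The top-20 feature selection that feeds the modeling notebooks can be sketched as below, using scikit-learn's impurity-based feature importances from a Random Forest. The data here is synthetic and the column names are hypothetical; the real notebooks rank features from the merged Home Credit tables.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Stand-in for the merged application data: 500 rows, 40 candidate features
X, y = make_classification(n_samples=500, n_features=40,
                           n_informative=8, random_state=0)
X = pd.DataFrame(X, columns=[f"feat_{i}" for i in range(40)])

# Stratified split keeps the default/non-default ratio in both sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)

# Rank features by impurity-based importance and keep the top 20
importances = pd.Series(rf.feature_importances_, index=X.columns)
top20 = importances.sort_values(ascending=False).head(20).index.tolist()
print(top20)
```

Downstream models (BRCG, GLRM, RF, XGBoost) would then be trained on `X_train[top20]` only, which keeps the rule-based explainers tractable.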
For a more detailed description of the project, see here.