# Credit Risk Scorecard Model Development

## Introduction

The **Credit Risk Scorecard** model, built on the Lending Club dataset, is used to compute the Probability of Default (PD), a key input to Expected Credit Loss (ECL) calculations. The scorecard assesses several credit characteristics of a potential borrower, such as credit history, income, and outstanding debts, and assigns a score to each. Summing these scores gives a total score for each borrower, which translates into an estimated Point-in-Time (PiT) PD. The PiT PD reflects the borrower's likelihood of default at a specific point in time, accounting for both current and foreseeable future conditions.
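To make the link between characteristic scores, the total score, and PD concrete, here is a minimal sketch. It assumes the widely used log-odds scaling, where a base score corresponds to reference odds and a fixed number of points doubles the odds. The constants `BASE_SCORE`, `BASE_ODDS`, and `PDO`, the function `score_to_pd`, and the example points are illustrative assumptions, not values or helpers from this notebook.

```python
import math

# Hypothetical scaling constants (assumptions for illustration only)
BASE_SCORE = 600  # score assigned at the reference odds
BASE_ODDS = 50    # good:bad odds at the base score
PDO = 20          # points needed to double the good:bad odds

FACTOR = PDO / math.log(2)
OFFSET = BASE_SCORE - FACTOR * math.log(BASE_ODDS)


def score_to_pd(score: float) -> float:
    """Convert a total score to a PD under the log-odds scaling above."""
    odds_good = math.exp((score - OFFSET) / FACTOR)  # good:bad odds
    return 1.0 / (1.0 + odds_good)


# Hypothetical points per characteristic for one borrower
points = {"base": 180, "int_rate": 95, "annual_inc": 110, "grade": 130, "term": 85}
total_score = sum(points.values())
print(total_score, score_to_pd(total_score))
```

Under these assumed constants, a borrower at the base score has odds of 50:1 and therefore a PD of 1/51, roughly 2%. Each additional 20 points doubles the odds and roughly halves the PD.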

For a complete view of credit risk, it is also necessary to estimate the Lifetime PD: the borrower's likelihood of default over the full life of the exposure, taking into account potential future changes in economic and financial conditions.
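One common way to relate per-period PDs to a lifetime figure is through survival probabilities: the borrower defaults over the life of the exposure unless they survive every period. The sketch below assumes a hypothetical term structure of conditional annual PDs. The notebook does not show how the Lifetime PD is actually derived, so the function `lifetime_pd` and its inputs are illustrative only.

```python
from typing import Sequence


def lifetime_pd(conditional_pds: Sequence[float]) -> float:
    """Cumulative PD given per-period PDs conditional on prior survival."""
    survival = 1.0
    for pd_t in conditional_pds:
        if not 0.0 <= pd_t <= 1.0:
            raise ValueError(f"PD must lie in [0, 1], got {pd_t}")
        survival *= 1.0 - pd_t  # probability of surviving this period too
    return 1.0 - survival


# Hypothetical conditional PDs for a three-year exposure
annual_pds = [0.02, 0.03, 0.04]
print(lifetime_pd(annual_pds))
```

For the assumed inputs, the survival probability is 0.98 × 0.97 × 0.96, so the lifetime PD is about 8.7%. That is slightly less than the simple sum of the annual PDs.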

## Setup

### Import Libraries

In [None]:
from notebooks.probability_of_default.helpers.Developer import Developer
from notebooks.probability_of_default.helpers.scorecard_model import *
from notebooks.probability_of_default.helpers.model_development import *

from IPython.display import HTML


### Input Parameters

In [None]:
default_column = "default"

lending_club_url = "https://vmai.s3.us-west-1.amazonaws.com/datasets/lending_club_loan_data_2007_2014.csv"

preliminary_features_to_drop = [
    "Unnamed: 0",
    "id", "member_id", "funded_amnt", "emp_title", "url", "desc", "application_type",
    "title", "zip_code", "delinq_2yrs", "mths_since_last_delinq", "mths_since_last_record",
    "revol_bal", "total_rec_prncp", "total_rec_late_fee", "recoveries", "out_prncp_inv", "out_prncp",
    "collection_recovery_fee", "next_pymnt_d", "initial_list_status", "pub_rec",
    "collections_12_mths_ex_med", "policy_code", "acc_now_delinq", "pymnt_plan",
    "tot_coll_amt", "tot_cur_bal", "total_rev_hi_lim", "last_pymnt_d", "last_credit_pull_d",
    "earliest_cr_line", "issue_d",
]

final_features_to_drop = [
    "addr_state", "total_rec_int", "loan_amnt",
    "funded_amnt_inv", "dti", "revol_util", "total_pymnt",
    "total_pymnt_inv", "last_pymnt_amnt", "inq_last_6mths",
]

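# Missing-value threshold (as a fraction) passed to data_preparation
min_missing_percentage = 0.8

# IQR multiplier passed to data_preparation for outlier handling
iqr_threshold = 1.5

# Fraction of observations held out as the test set
test_size = 0.2

# Manual WoE bin break points, keyed by feature name
woe_breaks_adj = {"int_rate": [5, 10, 15]}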

# Import Data
id_params = {"source": lending_club_url}

# Data Preparation
dp_params = {
    "features_to_drop": preliminary_features_to_drop,
    "default_column": default_column,
    "min_missing_percentage": min_missing_percentage,
    "iqr_threshold": iqr_threshold,
}

# Data Split
ds_params = {
    "target_column": default_column,
    "test_size": test_size
}

# Feature Selection
fs_params = {
    "features_to_drop": final_features_to_drop
}

# Feature Engineering
fe_params = {
    "target_column": default_column,
    "woe_breaks_adj": woe_breaks_adj
}

# Model Training
mt_params = {
    "target_column": default_column,
    "add_constant": False
}

## Model Development

In [None]:
df_raw = import_data(id_params)

In [None]:
df_prepared = data_preparation(df_raw, dp_params)
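The `iqr_threshold` of 1.5 passed to `data_preparation` suggests an interquartile-range outlier rule. How the helper applies it is not shown here, so the following is only a sketch of the standard Tukey fences on hypothetical values. The function `iqr_bounds` is an illustrative name, not part of the project helpers.

```python
import statistics


def iqr_bounds(values, threshold=1.5):
    """Return (lower, upper) Tukey fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - threshold * iqr, q3 + threshold * iqr


# Hypothetical feature values with one extreme observation
sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
lower, upper = iqr_bounds(sample, threshold=1.5)
outliers = [v for v in sample if v < lower or v > upper]
print(lower, upper, outliers)
```

Whether flagged observations are dropped, capped, or only reported depends on the helper's implementation, which this sketch does not try to reproduce.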

In [None]:
df_train, df_test = data_split(df_prepared, ds_params)

In [None]:
df_train_feature_selection = feature_selection(df_train, fs_params)
df_test_feature_selection = feature_selection(df_test, fs_params)

In [None]:
df_train_feature_eng = feature_engineering(df_train_feature_selection, fe_params)
df_test_feature_eng = feature_engineering(df_test_feature_selection, fe_params)
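The `woe_breaks_adj` parameter supplies manual break points for Weight of Evidence (WoE) binning of `int_rate`. As a rough illustration of what such a transformation computes, the sketch below bins a feature at the given breaks and evaluates WoE per bin. It assumes left-closed bins and the convention WoE = ln(share of goods / share of bads). Sign conventions differ between libraries, and `feature_engineering` may use a different one. The function `woe_by_bin` and the data are hypothetical.

```python
import math
from bisect import bisect_right
from collections import Counter


def woe_by_bin(values, defaults, breaks):
    """Return {bin_index: WoE}, with WoE = ln(%good / %bad) per bin.

    Bin 0 is below the first break; bin i covers [breaks[i-1], breaks[i]).
    """
    goods, bads = Counter(), Counter()
    for value, is_default in zip(values, defaults):
        bin_idx = bisect_right(breaks, value)
        (bads if is_default else goods)[bin_idx] += 1
    total_good, total_bad = sum(goods.values()), sum(bads.values())
    woe = {}
    for bin_idx in range(len(breaks) + 1):
        g, b = goods[bin_idx], bads[bin_idx]
        if g == 0 or b == 0:
            continue  # WoE is undefined when a bin lacks one class
        woe[bin_idx] = math.log((g / total_good) / (b / total_bad))
    return woe


# Hypothetical interest rates and default flags (1 = default)
rates = [4.0, 4.5, 7.0, 8.0, 9.0, 12.0, 13.0, 14.0, 16.0, 18.0, 20.0, 22.0]
flags = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1]
woe = woe_by_bin(rates, flags, breaks=[5, 10, 15])
print(woe)
```

In this toy data, the highest-rate bin has a negative WoE because defaults are over-represented there. The lowest bin is skipped because it contains no defaults, which is one reason manual breaks are sometimes adjusted.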

In [None]:
# Add the intercept column explicitly; mt_params sets "add_constant" to False
df_train_feature_eng = add_constant(df_train_feature_eng)
df_test_feature_eng = add_constant(df_test_feature_eng)

model_fit_candidate = model_training(df_train_feature_eng, mt_params)

In [None]:
print(model_fit_candidate.summary())

## Save Data and Models

In [None]:
developer = Developer()

objects_to_store = {
    "df_raw": df_raw,
    "df_prepared": df_prepared,
    "df_train": df_train,
    "df_train_feature_selection": df_train_feature_selection,
    "df_train_feature_eng": df_train_feature_eng,
    "df_test_feature_eng": df_test_feature_eng,
    "model_fit_final": model_fit_candidate,
}

developer.save_objects_to_pickle(
    filename="datasets/scorecard_data_and_models.pkl",
    objects_to_save=objects_to_store)
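For readers who want to reload the stored objects in a later notebook, the sketch below shows a plain `pickle` round trip of a dictionary. It assumes `save_objects_to_pickle` writes a standard pickle of the dictionary it receives, which is not confirmed by this notebook. The functions `save_objects` and `load_objects` are hypothetical stand-ins, and the example writes to a temporary directory rather than `datasets/`.

```python
import pickle
import tempfile
from pathlib import Path


def save_objects(path: Path, objects: dict) -> None:
    """Write a dictionary of objects to a single pickle file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(objects, fh)


def load_objects(path: Path) -> dict:
    """Read the dictionary back; only unpickle files you trust."""
    with path.open("rb") as fh:
        return pickle.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "datasets" / "example.pkl"
    save_objects(target, {"test_size": 0.2, "breaks": [5, 10, 15]})
    restored = load_objects(target)

print(restored)
```

Storing everything under named keys, as `objects_to_store` does, lets a downstream notebook pick out only the objects it needs, for example `restored["breaks"]` here.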