# Credit Risk Scorecard Model Validation

## Before you begin

::: {.callout-tip}
### New to ValidMind? 
To access the ValidMind Platform UI, you'll need an account.

Signing up is free: **[Create your account](https://app.prod.validmind.ai)**.
:::

If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).

## Install the client library

In [None]:
%pip install -q validmind

## Initialize the client library

ValidMind generates a unique _code snippet_ for each registered model to connect with your developer environment. You initialize the client library with this code snippet, which ensures that your documentation and tests are uploaded to the correct model when you run the notebook.

Get your code snippet:

1. In a browser, log into the [Platform UI](https://app.prod.validmind.ai).

2. In the left sidebar, navigate to **Model Inventory** and click **+ Register new model**.

3. Enter the model details, making sure to select **Credit Risk Scorecard** as the template and **Credit Risk - Underwriting - Credit Cards** as the use case, and click **Continue**. ([Need more help?](https://docs.validmind.ai/guide/register-models-in-model-inventory.html))

4. Go to **Getting Started** and click **Copy snippet to clipboard**.

Next, replace this placeholder with your own code snippet:

In [None]:
# Replace with your code snippet

import validmind as vm

vm.init(
    api_host="...",
    api_key="...",
    api_secret="...",
    project="..."
)

## Introduction

The **Credit Risk Scorecard** model, built from the Lending Club dataset, computes the Probability of Default (PD), a key input to Expected Credit Loss (ECL) calculations. The scorecard assesses several credit characteristics of potential borrowers, such as credit history, income, and outstanding debts, and assigns a score to each one. Combining these scores gives a total score for each borrower, which is used to estimate the Point-in-Time (PiT) PD.
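
As a rough illustration of how per-characteristic scores combine into a total, the sketch below sums points from a small lookup table. The characteristic names, bins, point values, and base score are all hypothetical; they are not taken from the scorecard fitted in this notebook.

```python
# Hypothetical points table: characteristic -> {bin label: points}.
# Illustrative values only, not the notebook's fitted scorecard.
POINTS = {
    "credit_history": {"short": 10, "long": 35},
    "income": {"low": 5, "medium": 20, "high": 30},
    "outstanding_debt": {"high": 0, "low": 25},
}


def total_score(borrower, base_points=500):
    """Add the points of each characteristic's bin to a base score."""
    return base_points + sum(
        POINTS[name][bin_label] for name, bin_label in borrower.items()
    )


borrower = {"credit_history": "long", "income": "medium", "outstanding_debt": "low"}
print(total_score(borrower))  # 500 + 35 + 20 + 25 = 580
```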

## Setup

### Import Libraries

In [None]:
# Load API key and secret from environment variables
%load_ext dotenv
%dotenv .env

from IPython.display import HTML
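# Helper modules that ship alongside this notebook. Helpers used in later
# cells, such as get_numerical_columns and get_categorical_columns, are
# expected to come from the wildcard imports below.
from notebooks.probability_of_default.helpers.Developer import Developer
from notebooks.probability_of_default.helpers.scorecard_model import *
from notebooks.probability_of_default.helpers.model_development import *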

# Visualization imports
%matplotlib inline

### Input Parameters

In [None]:
default_column = "default"

### Load Datasets and Models

In [None]:
developer = Developer()
scorecard = developer.load_objects_from_pickle("datasets/scorecard_data_and_models.pkl")

df_raw = scorecard["df_raw"]
df_prepared = scorecard["df_prepared"]
df_train = scorecard["df_train"]
df_train_feature_selection = scorecard["df_train_feature_selection"]

df_train_feature_eng = scorecard["df_train_feature_eng"]
df_test_feature_eng = scorecard["df_test_feature_eng"]

model_fit_final = scorecard["model_fit_final"]

### Create ValidMind Datasets and Model

In [None]:
from validmind.vm_models.test_context import TestContext

vm_df_raw = vm.init_dataset(dataset=df_raw, target_column=default_column)
vm_df_prepared = vm.init_dataset(dataset=df_prepared, target_column=default_column)
vm_df_train_feature_selection = vm.init_dataset(
    dataset=df_train_feature_selection, target_column=default_column)

vm_df_train_feature_eng = vm.init_dataset(
    dataset=df_train_feature_eng, target_column=default_column)
vm_df_test_feature_eng = vm.init_dataset(
    dataset=df_test_feature_eng, target_column=default_column)

vm_model_fit_final = vm.init_model(
    model=model_fit_final,
    train_ds=vm_df_train_feature_eng,
    test_ds=vm_df_test_feature_eng)


test_context_raw = TestContext(dataset=vm_df_raw)
test_context_prepared = TestContext(dataset=vm_df_prepared)
test_context_train_feature_selection = TestContext(
    dataset=vm_df_train_feature_selection)

# `models` takes a list of models; `model` takes a single model object
test_context_models_fit_final = TestContext(models=[vm_model_fit_final])
test_context_model_fit_final = TestContext(model=vm_model_fit_final)

## Model validation

### Data Description

In [None]:
from validmind.tests.data_validation.DescriptiveStatistics import DescriptiveStatistics

metric = DescriptiveStatistics(test_context_raw, params=None)
metric.run()
metric.result.log()
metric.result.show()

In [None]:
from validmind.tests.data_validation.MissingValuesBarPlot import MissingValuesBarPlot

params = {"threshold": 80,
          "fig_height": 1100}

metric = MissingValuesBarPlot(test_context_raw, params)
metric.run()
metric.result.log()
metric.result.show()
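
To make the `threshold` parameter more concrete, here is a minimal sketch of a per-column missing-value percentage check. It assumes the threshold is a percentage and that a column is flagged when it meets or exceeds it; the actual behavior of `MissingValuesBarPlot` may differ. The toy rows are invented for illustration.

```python
def missing_percentages(rows):
    """Return the percentage of missing (None) values per column."""
    columns = rows[0].keys()
    n = len(rows)
    return {
        col: 100.0 * sum(row[col] is None for row in rows) / n
        for col in columns
    }


# Toy rows, invented for illustration only.
rows = [
    {"int_rate": 7.5, "emp_title": None},
    {"int_rate": None, "emp_title": None},
    {"int_rate": 12.0, "emp_title": None},
    {"int_rate": 9.1, "emp_title": None},
    {"int_rate": 15.3, "emp_title": "nurse"},
]

threshold = 80  # same value as in the cell above, read here as a percentage
pct = missing_percentages(rows)
flagged = [col for col, p in pct.items() if p >= threshold]
print(pct)      # {'int_rate': 20.0, 'emp_title': 80.0}
print(flagged)  # ['emp_title']
```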

### Data Preparation

In [None]:
from validmind.tests.data_validation.ClassImbalance import ClassImbalance

metric = ClassImbalance(test_context_prepared, params=None)
metric.run()
metric.result.log()
metric.result.show()

In [None]:
from validmind.tests.data_validation.IQROutliersTable import IQROutliersTable

numerical_features = get_numerical_columns(df_prepared)
params = {"num_features": numerical_features,
          "threshold": 1.5}

metric = IQROutliersTable(test_context_prepared, params)
metric.run()
metric.result.log()
metric.result.show()
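
The `threshold` of 1.5 above matches the conventional interquartile range (IQR) rule, where a value is an outlier if it lies more than 1.5 × IQR below the first quartile or above the third. The sketch below shows that rule in plain Python; it is an assumption that `IQROutliersTable` applies exactly this definition, and the sample values are made up.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between neighboring order statistics."""
    pos = (len(sorted_values) - 1) * q
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


def iqr_outliers(values, threshold=1.5):
    """Return values outside [Q1 - threshold*IQR, Q3 + threshold*IQR]."""
    ordered = sorted(values)
    q1, q3 = quantile(ordered, 0.25), quantile(ordered, 0.75)
    iqr = q3 - q1
    low, high = q1 - threshold * iqr, q3 + threshold * iqr
    return [v for v in values if v < low or v > high]


values = [10, 12, 11, 13, 12, 11, 95]  # made-up sample
print(iqr_outliers(values))  # [95]
```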

In [None]:
from validmind.tests.data_validation.IQROutliersBarPlot import IQROutliersBarPlot

numerical_features = get_numerical_columns(df_prepared)
params = {"num_features": numerical_features,
          "threshold": 1.5,
          "fig_width": 500}

metric = IQROutliersBarPlot(test_context_prepared, params)
metric.run()
metric.result.log()
metric.result.show()

### Exploratory Data Analysis

In [None]:
from validmind.tests.data_validation.TabularNumericalHistograms import TabularNumericalHistograms

metric = TabularNumericalHistograms(test_context_train_feature_selection, params=None)
metric.run()
metric.result.log()
metric.result.show()

In [None]:
from validmind.tests.data_validation.HighCardinality import HighCardinality

metric = HighCardinality(test_context_train_feature_selection, params=None)
metric.run()
metric.result.log()
metric.result.show()

In [None]:
from validmind.tests.data_validation.TabularCategoricalBarPlots import TabularCategoricalBarPlots

metric = TabularCategoricalBarPlots(test_context_train_feature_selection, params=None)
metric.run()
metric.result.log()
metric.result.show()

In [None]:
from validmind.tests.data_validation.TargetRateBarPlots import TargetRateBarPlots

# Configure the metric
params = {
    "default_column": default_column,
    "columns": None
}

metric = TargetRateBarPlots(test_context_train_feature_selection, params=params)
metric.run()
metric.result.log()
metric.result.show()

In [None]:
from validmind.tests.data_validation.ChiSquaredFeaturesTable import ChiSquaredFeaturesTable

categorical_features = get_categorical_columns(df_train_feature_selection)
params = {"cat_features": categorical_features,
          "p_threshold": 0.05}

metric = ChiSquaredFeaturesTable(test_context_train_feature_selection, params)
metric.run()
metric.result.log()
metric.result.show()

In [None]:
from validmind.tests.data_validation.ANOVAOneWayTable import ANOVAOneWayTable

numerical_features = get_numerical_columns(df_train_feature_selection)
params = {"num_features": numerical_features,
          "p_threshold": 0.05}

metric = ANOVAOneWayTable(test_context_train_feature_selection, params)
metric.run()
metric.result.log()
metric.result.show()

In [None]:
from validmind.tests.data_validation.PearsonCorrelationMatrix import PearsonCorrelationMatrix

params = {"declutter": False,
          "features": None,
          "fontsize": 13}

metric = PearsonCorrelationMatrix(test_context_train_feature_selection, params)
metric.run()
metric.result.log()
metric.result.show()

In [None]:
from validmind.tests.data_validation.FeatureTargetCorrelationPlot import FeatureTargetCorrelationPlot

params = {"features": None}

metric = FeatureTargetCorrelationPlot(test_context_train_feature_selection, params)
metric.run()
metric.result.log()
metric.result.show()

### Feature Engineering

In [None]:
from validmind.tests.data_validation.WOEBinTable import WOEBinTable

metric = WOEBinTable(test_context_train_feature_selection, params=None)
metric.run()
metric.result.log()
metric.result.show()

In [None]:
# Re-run the WoE table with manually adjusted bin breaks for int_rate
params = {
    "breaks_adj": {
        "int_rate": [5, 10, 15]}
}

metric = WOEBinTable(test_context_train_feature_selection, params)
metric.run()
metric.result.log()
metric.result.show()

In [None]:
from validmind.tests.data_validation.WOEBinPlots import WOEBinPlots

params = {
    "breaks_adj": {"int_rate": [5, 10, 15]},
    "fig_height": 500,
}

metric = WOEBinPlots(test_context_train_feature_selection, params=params)
metric.run()
metric.result.log()
metric.result.show()
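
For readers unfamiliar with Weight of Evidence (WoE), the following sketch bins a numeric feature at the same breaks used above (`[5, 10, 15]`) and computes WoE and Information Value (IV) per bin. It uses the convention WoE = ln(share of defaults / share of non-defaults); sign conventions vary between libraries, and this is not claimed to be the exact computation behind `WOEBinTable`. The sample data is invented.

```python
import math
from bisect import bisect_right


def woe_table(values, defaults, breaks):
    """WoE and IV per bin for bins [-inf, b0), [b0, b1), ..., [b_last, inf).

    Assumes every bin contains at least one default and one non-default.
    """
    n_bins = len(breaks) + 1
    bad = [0] * n_bins
    good = [0] * n_bins
    for value, is_default in zip(values, defaults):
        idx = bisect_right(breaks, value)  # left-closed bins
        if is_default:
            bad[idx] += 1
        else:
            good[idx] += 1
    total_bad, total_good = sum(bad), sum(good)
    table = []
    for i in range(n_bins):
        bad_share = bad[i] / total_bad
        good_share = good[i] / total_good
        woe = math.log(bad_share / good_share)
        iv = (bad_share - good_share) * woe
        table.append({"bin": i, "bad": bad[i], "good": good[i], "woe": woe, "iv": iv})
    return table


# Invented interest rates and default flags (1 = default).
int_rate = [3, 4, 4.5, 4.9, 6, 7, 8, 9, 11, 12, 13, 14, 16, 18, 20, 22]
default = [0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1]

table = woe_table(int_rate, default, breaks=[5, 10, 15])
for row in table:
    print(row["bin"], row["bad"], row["good"], round(row["woe"], 3))
print("total IV:", round(sum(row["iv"] for row in table), 3))
```

With this sign convention, bins with a higher-than-average default share get a positive WoE.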

### Model Training

In [None]:
from validmind.tests.model_validation.statsmodels.RegressionCoeffsPlot import RegressionCoeffsPlot

metric = RegressionCoeffsPlot(test_context_models_fit_final, params=None)
metric.run()
metric.result.log()
metric.result.show()

In [None]:
from validmind.tests.model_validation.statsmodels.RegressionModelsCoeffs import RegressionModelsCoeffs

metric = RegressionModelsCoeffs(test_context_models_fit_final, params=None)
metric.run()
metric.result.log()
metric.result.show()

### Model Evaluation

In [None]:
from validmind.tests.model_validation.statsmodels.GINITable import GINITable

metric = GINITable(test_context_model_fit_final, params=None)
metric.run()
metric.result.log()
metric.result.show()
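
The Gini coefficient is commonly derived from the area under the ROC curve as Gini = 2 × AUC − 1. The sketch below computes AUC by pairwise comparison and then applies that relationship; whether `GINITable` computes it this way internally is an assumption, and the labels and scores are toy values.

```python
def auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def gini(y_true, y_score):
    """Gini coefficient derived from AUC."""
    return 2 * auc(y_true, y_score) - 1


# Toy labels (1 = default) and predicted probabilities.
y_true = [0, 0, 1, 0, 1, 1]
y_score = [0.10, 0.30, 0.35, 0.60, 0.70, 0.90]

print(auc(y_true, y_score))   # 8 of the 9 pairs are ordered correctly
print(gini(y_true, y_score))  # about 0.78
```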

In [None]:
from validmind.tests.model_validation.statsmodels.LogRegressionConfusionMatrix import LogRegressionConfusionMatrix

params = {
    "cut_off_threshold": 0.5
}

metric = LogRegressionConfusionMatrix(test_context_model_fit_final, params)
metric.run()
metric.result.log()
metric.result.show()
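
The `cut_off_threshold` turns predicted probabilities into class labels before the confusion matrix is counted. A minimal sketch, assuming a probability at or above the cut-off is classified as a default (the exact comparison used by `LogRegressionConfusionMatrix` is not confirmed here), with toy inputs:

```python
def confusion_matrix_at(y_true, y_prob, cut_off_threshold=0.5):
    """Count TP, FP, TN, FN after thresholding predicted probabilities."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for actual, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= cut_off_threshold else 0
        if predicted == 1 and actual == 1:
            counts["tp"] += 1
        elif predicted == 1 and actual == 0:
            counts["fp"] += 1
        elif predicted == 0 and actual == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


# Toy labels (1 = default) and predicted probabilities.
y_true = [0, 0, 1, 0, 1, 1]
y_prob = [0.10, 0.30, 0.35, 0.60, 0.70, 0.90]

print(confusion_matrix_at(y_true, y_prob, 0.5))
# {'tp': 2, 'fp': 1, 'tn': 2, 'fn': 1}
print(confusion_matrix_at(y_true, y_prob, 0.3))
# {'tp': 3, 'fp': 2, 'tn': 1, 'fn': 0}
```

Lowering the cut-off catches more defaults (fewer false negatives) at the cost of more false positives.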

In [None]:
from validmind.tests.model_validation.sklearn.ROCCurve import ROCCurve

metric = ROCCurve(test_context_model_fit_final, params=None)
metric.run()
metric.result.log()
metric.result.show()

In [None]:
from validmind.tests.model_validation.sklearn.PrecisionRecallCurve import PrecisionRecallCurve

metric = PrecisionRecallCurve(test_context_model_fit_final, params=None)
metric.run()
metric.result.log()
metric.result.show()

In [None]:
from validmind.tests.model_validation.statsmodels.LogisticRegPredictionHistogram import LogisticRegPredictionHistogram

# Configure test parameters
params = {
    "title": "Histogram of Probability of Default",
}

metric = LogisticRegPredictionHistogram(test_context_model_fit_final, params)
metric.run()
metric.result.log()
metric.result.show()

In [None]:
from validmind.tests.model_validation.statsmodels.LogisticRegCumulativeProb import LogisticRegCumulativeProb

# Configure test parameters
params = {
    "title": "Cumulative Probability of Default",
}

metric = LogisticRegCumulativeProb(test_context_model_fit_final, params)
metric.run()
metric.result.log()
metric.result.show()

In [None]:
from validmind.tests.model_validation.statsmodels.ScorecardHistogram import ScorecardHistogram

# Configure test parameters
params = {
    "target_score": 600,
    "target_odds": 50,
    "pdo": 20,
    "title": "Histogram of Credit Scores",
}

metric = ScorecardHistogram(test_context_model_fit_final, params)
metric.run()
metric.result.log()
metric.result.show()
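
The `target_score`, `target_odds`, and `pdo` parameters correspond to the standard "points to double the odds" scaling used for credit scorecards. The sketch below shows that scaling under two assumptions that are not confirmed by this notebook: odds are expressed as good:bad, that is (1 − PD) / PD, and `ScorecardHistogram` follows the same formula.

```python
import math


def scaling_constants(target_score=600, target_odds=50, pdo=20):
    """Return (factor, offset) such that score = offset + factor * ln(odds)."""
    factor = pdo / math.log(2)
    offset = target_score - factor * math.log(target_odds)
    return factor, offset


def pd_to_score(pd, target_score=600, target_odds=50, pdo=20):
    """Map a probability of default to a score (assumes good:bad odds)."""
    factor, offset = scaling_constants(target_score, target_odds, pdo)
    odds = (1 - pd) / pd
    return offset + factor * math.log(odds)


def score_to_pd(score, target_score=600, target_odds=50, pdo=20):
    """Inverse mapping: recover the probability of default from a score."""
    factor, offset = scaling_constants(target_score, target_odds, pdo)
    odds = math.exp((score - offset) / factor)
    return 1 / (1 + odds)


print(round(pd_to_score(1 / 51), 6))   # odds of 50:1 -> 600.0
print(round(pd_to_score(1 / 101), 6))  # odds of 100:1 -> 620.0 (one pdo higher)
print(round(score_to_pd(600), 4))      # 0.0196
```

Under this scaling, every additional 20 points doubles the good:bad odds, so higher scores mean lower PD.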