# Macroeconomic Model Validation

## Before you begin
To use the ValidMind Developer Framework with a Jupyter notebook, first prepare your Python environment, then install and initialize the client library.

If you don't already have one, you should also create a documentation project on the ValidMind platform. You will use this project to upload your documentation and test results.

## Install the client library

In [1]:
# %pip install --upgrade validmind

## Initialize the client library
In a browser, go to the Client Integration page of your documentation project and click Copy to clipboard next to the code snippet. This code snippet gives you the API key, API secret, and project identifier to link your notebook to your documentation project.

This step requires a documentation project. If you have not created one yet, do so on the ValidMind platform before continuing.

Next, replace this placeholder with your own code snippet:

In [2]:
import validmind as vm

vm.init(
  api_host = "http://localhost:3000/api/v1/tracking",
  api_key = "<your-api-key>",        # placeholder: paste the value from your own code snippet
  api_secret = "<your-api-secret>",  # placeholder: never commit real secrets to a notebook
  project = "cllaz74gb067dszy6donpqm98"
)

2023-08-16 14:44:27,596 - INFO(validmind.api_client): Connected to ValidMind. Project: [9] Credit Risk Scorecard - Initial Validation (cllaz74gb067dszy6donpqm98)
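
Hard-coding the API key and secret in a notebook makes them easy to leak. As a minimal sketch, the values can instead be read from environment variables (for example, ones loaded from a `.env` file) and passed to `vm.init`. The variable names and the `load_vm_credentials` helper below are assumptions made for illustration, not part of the ValidMind API.

```python
import os

# Hypothetical environment variable names; adjust them to match your own .env file.
REQUIRED_VARS = {
    "api_host": "VM_API_HOST",
    "api_key": "VM_API_KEY",
    "api_secret": "VM_API_SECRET",
    "project": "VM_API_PROJECT",
}


def load_vm_credentials(environ=os.environ):
    """Collect the keyword arguments for vm.init() from environment variables."""
    missing = [name for name in REQUIRED_VARS.values() if not environ.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {arg: environ[name] for arg, name in REQUIRED_VARS.items()}


# Usage, once the variables are set:
# vm.init(**load_vm_credentials())
```

Failing early with the list of missing variables tends to be easier to debug than a rejected connection later on.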


## Introduction

## Setup

### Import Libraries

In [3]:
# Load API key and secret from environment variables
%load_ext dotenv
%dotenv .env

from IPython.display import HTML
from notebooks.probability_of_default.helpers.Developer import Developer
from notebooks.probability_of_default.helpers.econometric_model import *

### Input Parameters

In [4]:
target_column = "DRSFRMACBS"

### Load Datasets and Models

In [5]:
developer = Developer()
macro_model = developer.load_objects_from_pickle("datasets/macroeconomic_data_and_models.pkl")

df_raw = macro_model["df_raw"]
df_preparation = macro_model["df_prepared"]

df_train = macro_model["df_train"]
df_test = macro_model["df_test"]
model_fit = macro_model["model_fit"]

df_train_final = macro_model["df_train_final"]
df_test_final = macro_model["df_test_final"]
model_fit_final = macro_model["model_fit_final"]

INFO: Loaded 8 objects from datasets/macroeconomic_data_and_models.pkl


### Create ValidMind Datasets and Models

In [6]:
from validmind.vm_models.test_context import TestContext

# Create ValidMind Datasets
vm_df_raw = vm.init_dataset(
    dataset=df_raw,
    target_column=target_column,
)
vm_df_preparation = vm.init_dataset(
    dataset=df_preparation,
    target_column=target_column,
)
vm_df_train = vm.init_dataset(
    dataset=df_train,
    target_column=target_column,
)
vm_df_test = vm.init_dataset(
    dataset=df_test,
    target_column=target_column,
)
vm_df_train_final = vm.init_dataset(
    dataset=df_train_final,
    target_column=target_column,
)
vm_df_test_final = vm.init_dataset(
    dataset=df_test_final,
    target_column=target_column,
)

# Create ValidMind Models
vm_model_fit = vm.init_model(
    model=model_fit,
    train_ds=vm_df_train,
    test_ds=vm_df_test,
)
vm_model_fit_final = vm.init_model(
    model=model_fit_final,
    train_ds=vm_df_train_final,
    test_ds=vm_df_test_final,
)

# Create test contexts: one per dataset, plus single-model and multi-model contexts
test_context_raw = TestContext(dataset=vm_df_raw)
test_context_preparation = TestContext(dataset=vm_df_preparation)
test_context_train = TestContext(dataset=vm_df_train)
test_context_test = TestContext(dataset=vm_df_test)
test_context_train_final = TestContext(dataset=vm_df_train_final)
test_context_test_final = TestContext(dataset=vm_df_test_final)

test_context_model = TestContext(model=vm_model_fit)
test_context_models = TestContext(models=[vm_model_fit, vm_model_fit_final])


2023-08-16 14:44:27,841 - INFO(validmind.client): Pandas dataset detected. Initializing VM Dataset instance...
2023-08-16 14:44:27,851 - INFO(validmind.client): Pandas dataset detected. Initializing VM Dataset instance...
2023-08-16 14:44:27,855 - INFO(validmind.client): Pandas dataset detected. Initializing VM Dataset instance...
2023-08-16 14:44:27,859 - INFO(validmind.client): Pandas dataset detected. Initializing VM Dataset instance...
2023-08-16 14:44:27,862 - INFO(validmind.client): Pandas dataset detected. Initializing VM Dataset instance...
2023-08-16 14:44:27,866 - INFO(validmind.client): Pandas dataset detected. Initializing VM Dataset instance...

## Model Validation

### Raw Data

In [7]:
from validmind.tests.data_validation.TimeSeriesMissingValues import TimeSeriesMissingValues

params = {"min_threshold": 2}

metric = TimeSeriesMissingValues(test_context_raw, params)
metric.run()
await metric.result.log()
metric.result.show()

VBox(children=(HTML(value='\n            <h2>Time Series Missing Values ❌</h2>\n            <p>Test that the n…
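
To see roughly what a missing-values check involves, here is a minimal pandas sketch that runs outside the framework. It counts missing values per column and marks a column as passing when the count is below `min_threshold`. That pass rule is an assumption about the truncated test description, and the `missing_value_report` helper and sample data are illustrative only.

```python
import numpy as np
import pandas as pd


def missing_value_report(df: pd.DataFrame, min_threshold: int = 2) -> pd.DataFrame:
    """Count NaNs per column and flag columns whose count is below
    min_threshold as passing (assumed pass rule)."""
    counts = df.isna().sum()
    return pd.DataFrame({"n_missing": counts, "passed": counts < min_threshold})


# Illustrative data only, not the notebook's dataset.
sample = pd.DataFrame(
    {"a": [1.0, np.nan, 3.0, np.nan], "b": [1.0, 2.0, 3.0, 4.0]},
    index=pd.date_range("2020-01-01", periods=4, freq="MS"),
)
report = missing_value_report(sample, min_threshold=2)
```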

In [8]:
from validmind.tests.data_validation.TimeSeriesOutliers import TimeSeriesOutliers

params = {"zscore_threshold": 3}

metric = TimeSeriesOutliers(test_context_raw, params)
metric.run()
await metric.result.log()
metric.result.show()

INFO: No artists with labels found to put in legend.  Note that artists whose label start with an underscore are ignored when legend() is called with no argument.

VBox(children=(HTML(value='\n            <h2>Time Series Outliers ❌</h2>\n            <p>Test that find outlie…
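
The `zscore_threshold` parameter suggests a z-score rule: an observation is flagged when it lies more than that many standard deviations from the series mean. The sketch below illustrates the idea with pandas. The `zscore_outliers` helper and the use of the population standard deviation are assumptions, and the framework's own computation may differ.

```python
import pandas as pd


def zscore_outliers(series: pd.Series, zscore_threshold: float = 3.0) -> pd.Series:
    """Return the observations whose absolute z-score exceeds the threshold."""
    std = series.std(ddof=0)  # population standard deviation (assumed convention)
    if std == 0:
        return series.iloc[0:0]  # a constant series has no outliers
    z = (series - series.mean()) / std
    return series[z.abs() > zscore_threshold]


# Illustrative data: nineteen identical values and one extreme value.
values = pd.Series([10.0] * 19 + [100.0])
outliers = zscore_outliers(values, zscore_threshold=3)
```

Because a single extreme value inflates the standard deviation, a z-score rule can miss outliers in short series; a robust alternative such as the median absolute deviation may be worth considering.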

In [9]:
from validmind.tests.data_validation.TimeSeriesFrequency import TimeSeriesFrequency

metric = TimeSeriesFrequency(test_context_raw)
metric.run()
await metric.result.log()
metric.result.show()

VBox(children=(HTML(value='\n            <h2>Time Series Frequency ❌</h2>\n            <p>Test that detects fr…
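
A frequency check asks whether the timestamps are evenly spaced. As a rough sketch, `pandas.infer_freq` returns a frequency alias for a regular `DatetimeIndex` and `None` for an irregular one. The `inferred_frequency` wrapper and the sample indexes are hypothetical, and this is not necessarily how the framework's test is implemented.

```python
import pandas as pd


def inferred_frequency(index: pd.DatetimeIndex):
    """Return the pandas frequency alias for a regular index, or None when
    the spacing is irregular or there are fewer than three timestamps."""
    if len(index) < 3:
        return None  # pd.infer_freq needs at least three timestamps
    return pd.infer_freq(index)


# Illustrative indexes only.
regular = pd.date_range("2020-01-01", periods=5, freq="D")
irregular = pd.DatetimeIndex(
    ["2020-01-01", "2020-01-02", "2020-01-05", "2020-01-06", "2020-01-20"]
)

freq_regular = inferred_frequency(regular)
freq_irregular = inferred_frequency(irregular)
```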

### Prepared Data 

In [10]:
from validmind.tests.data_validation.TimeSeriesMissingValues import TimeSeriesMissingValues

params = {"min_threshold": 2}

metric = TimeSeriesMissingValues(test_context_preparation, params)
metric.run()
await metric.result.log()
metric.result.show()

VBox(children=(HTML(value='\n            <h2>Time Series Missing Values ✅</h2>\n            <p>Test that the n…

In [11]:
from validmind.tests.data_validation.TimeSeriesFrequency import TimeSeriesFrequency

metric = TimeSeriesFrequency(test_context_preparation)
metric.run()
await metric.result.log()
metric.result.show()

VBox(children=(HTML(value='\n            <h2>Time Series Frequency ✅</h2>\n            <p>Test that detects fr…

### Train and Test Data

In [12]:
from validmind.tests.data_validation.TimeSeriesLinePlot import TimeSeriesLinePlot

metric = TimeSeriesLinePlot(test_context_train)
metric.run()
await metric.result.log()
metric.result.show()

VBox(children=(HTML(value='<p>Generates a visual analysis of time series data by plotting the raw time series.…

In [13]:
metric = TimeSeriesLinePlot(test_context_test)
metric.run()
await metric.result.log()
metric.result.show()

VBox(children=(HTML(value='<p>Generates a visual analysis of time series data by plotting the raw time series.…

In [14]:
from validmind.tests.data_validation.LaggedCorrelationHeatmap import LaggedCorrelationHeatmap

metric = LaggedCorrelationHeatmap(test_context_train)
metric.run()
await metric.result.log()
metric.result.show()

VBox(children=(HTML(value='<p>Generates a heatmap of correlations between the target variable and the lags of …
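
The numbers behind a lagged correlation heatmap can be sketched with pandas alone: shift each feature by 0 to `max_lag` periods and correlate it with the target. The `lagged_correlations` helper, the `max_lag` parameter, and the sample data are assumptions for illustration and do not reproduce the framework's plot.

```python
import pandas as pd


def lagged_correlations(df: pd.DataFrame, target: str, max_lag: int = 3) -> pd.DataFrame:
    """Correlation between the target and each feature lagged by 0..max_lag periods.

    Rows are features, columns are lags; Series.corr drops the NaN pairs
    that shifting introduces.
    """
    features = [col for col in df.columns if col != target]
    rows = {
        col: [df[target].corr(df[col].shift(lag)) for lag in range(max_lag + 1)]
        for col in features
    }
    columns = [f"lag_{lag}" for lag in range(max_lag + 1)]
    return pd.DataFrame.from_dict(rows, orient="index", columns=columns)


# Illustrative data: y follows x with a one-period delay.
x = pd.Series([1.0, 3.0, 2.0, 5.0, 4.0, 7.0, 6.0, 9.0])
sample = pd.DataFrame({"y": x.shift(1).fillna(0.0), "x": x})
lag_table = lagged_correlations(sample, target="y", max_lag=3)
```

In this toy example the correlation peaks at `lag_1`, which is the pattern a heatmap would highlight when a driver leads the target by one period.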

In [15]:
from validmind.tests.data_validation.EngleGrangerCoint import EngleGrangerCoint

metric = EngleGrangerCoint(test_context_train)
metric.run()
await metric.result.log()
metric.result.show()

VBox(children=(HTML(value='<p>Test for cointegration between pairs of time series variables in a given dataset…

### Model Training

In [16]:
from validmind.tests.data_validation.DatasetSplit import DatasetSplit

metric = DatasetSplit(test_context_model)
metric.run()
await metric.result.log()
metric.result.show()

VBox(children=(HTML(value='<p>This section shows the size of the dataset split into training, test (and valida…

In [17]:
from validmind.tests.model_validation.ModelMetadata import ModelMetadata

metric = ModelMetadata(test_context_model)
metric.run()
await metric.result.log()
metric.result.show()

VBox(children=(HTML(value="<p>This section describes attributes of the selected model such as its modeling tec…

In [18]:
from validmind.tests.model_validation.statsmodels.RegressionCoeffsPlot import RegressionCoeffsPlot

metric = RegressionCoeffsPlot(test_context_models)
metric.run()
await metric.result.log()
metric.result.show()

VBox(children=(HTML(value="<p>Regression Coefficients with Confidence Intervals Plot</p>\n<p>This class is use…

In [19]:
from validmind.tests.model_validation.statsmodels.RegressionModelsCoeffs import RegressionModelsCoeffs

metric = RegressionModelsCoeffs(test_context_models)
metric.run()
await metric.result.log()
metric.result.show()

VBox(children=(HTML(value='<p>This section shows the coefficients of different regression models that were tra…

### Model Evaluation

In [20]:
from validmind.tests.model_validation.statsmodels.RegressionModelsPerformance import RegressionModelsPerformance

metric = RegressionModelsPerformance(test_context_models)
metric.run()
await metric.result.log()
metric.result.show()

VBox(children=(HTML(value='<p>This section shows the in-sample and out-of-sample comparison of regression mode…
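
An in-sample versus out-of-sample comparison fits the model on the training window and scores it on both the training and the held-out window. The sketch below shows the idea with a plain least-squares fit in NumPy on synthetic data. The metric choice (RMSE and R²), the helper names, and the data are all assumptions, and the framework's report may use different metrics.

```python
import numpy as np


def rmse(y_true, y_pred) -> float:
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r_squared(y_true, y_pred) -> float:
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic data with a chronological split (first 30 points train, last 10 test).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 40)
y = 2.0 + 3.0 * x + rng.normal(scale=0.1, size=x.size)
X = np.column_stack([np.ones_like(x), x])  # intercept plus one regressor
X_train, X_test, y_train, y_test = X[:30], X[30:], y[:30], y[30:]

# Fit on the training window only.
beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

comparison = {
    "in_sample": {"rmse": rmse(y_train, X_train @ beta), "r2": r_squared(y_train, X_train @ beta)},
    "out_of_sample": {"rmse": rmse(y_test, X_test @ beta), "r2": r_squared(y_test, X_test @ beta)},
}
```

A large gap between the two rows is the usual sign of overfitting or of a structural change between the training and test periods.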