# Configure parameters for a specific test

This notebook walks you through documenting a simple classification model trained on a bank customer churn dataset. It shows how to set up the ValidMind Developer Framework and how to configure parameters for a test, or a set of tests, in a specific section of the model documentation.

For this simple demonstration, we will use the following bank customer churn dataset from Kaggle: https://www.kaggle.com/code/kmalit/bank-customer-churn-prediction/data.

We will train a sample model and demonstrate the following documentation functionalities:

- Initializing the ValidMind Developer Framework
- Using a sample dataset provided by the library to train a simple classification model
- Configuring the parameters of a set of tests to generate documentation about the data and model

### ValidMind at a glance

We offer a platform for managing model risk, including risk associated with AI and statistical models. As a model developer, you use the ValidMind Developer Framework to automate documentation and validation tests, and then use the ValidMind AI Risk Platform UI to collaborate on documentation projects. Together, these products simplify model risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and model validators.

If this is your first time trying out ValidMind:

<a href="https://docs.validmind.ai/guide/get-started.html">Get started</a> — The basics, including key concepts, and how our products work

<a href="https://docs.validmind.ai/guide/get-started-developer-framework.html">Get started with the ValidMind Developer Framework</a> — The path for developers, more code samples, and our developer reference


### Before you begin
To use the ValidMind Developer Framework with a Jupyter notebook, you first need to install and initialize the client library and get your Python environment ready. This includes using `pip install` to install any prerequisite modules that turn out to be missing.

If you don't already have one, <a href= "https://docs.validmind.ai/guide/create-your-first-documentation-project.html">create a documentation project</a> for yourself in the Platform UI.

### Install ValidMind Developer Framework


In [None]:
# %pip install -q validmind

### Initializing the Python environment

In [6]:
import pandas as pd
import xgboost as xgb

from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

%matplotlib inline

## Initializing the ValidMind Client Library

Log in to the ValidMind platform with your registered email address, and navigate to the Documentation Projects page.

### Creating a new Documentation Project 

***(Note: if a documentation project has already been created, you can skip this section and head directly to "Finding the project API key and secret")***

Clicking "Create a new project" allows you to register a new documentation project for our demo model.

Select "Customer Churn model" from the Model drop-down, and "Initial Validation" as Type. Finally, click on "Create Project".

### Finding the project API key and secret 

In the "Client Integration" page of the newly created project, you will find the initialization code that allows the client library to associate documentation and tests with the appropriate project. The initialization code configures the following arguments: 

* api_host: Location of the ValidMind API.
* api_key: Account API key.
* api_secret: Account Secret key.
* project: The project identifier. The `project` argument is mandatory since it allows the library to associate all data collected with a specific account project.


The code snippet can be copied and pasted directly in the cell below to initialize the ValidMind Developer Framework when run:  

In [1]:
## Replace the code below with the code snippet from your project ## 

import validmind as vm

vm.init(
    api_host = "https://api.prod.validmind.ai/api/v1/tracking",
    api_key = "...",
    api_secret = "...",
    project = "..."
)

2023-10-03 17:57:36,139 - INFO(validmind.api_client): Connected to ValidMind. Project: Churn model - Initial Validation (clkinvmhv003mkmrlvv9wkxlj)
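
If you would rather not paste credentials into the notebook, one option is to read them from environment variables and pass them to `vm.init()`. The following is a minimal sketch of that idea, not part of the ValidMind library: the helper `vm_init_kwargs` and the `VM_*` variable names are hypothetical, so substitute whatever names your environment defines.

```python
import os


def vm_init_kwargs(env=None):
    """Build keyword arguments for vm.init() from environment variables.

    The VM_* variable names are hypothetical, chosen for illustration.
    """
    env = os.environ if env is None else env
    names = {
        "api_host": "VM_API_HOST",
        "api_key": "VM_API_KEY",
        "api_secret": "VM_API_SECRET",
        "project": "VM_API_PROJECT",
    }
    # Fail early with a clear message if any variable is unset or empty
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {arg: env[var] for arg, var in names.items()}


# Usage in the notebook, once the variables are set:
# import validmind as vm
# vm.init(**vm_init_kwargs())
```

The helper only assembles the four arguments described above; the call to `vm.init()` itself is unchanged.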


### Preview the model's documentation template

All models are assigned a documentation template when registered. The template defines a list of sections that are used to document the model. Each section can contain any number of rich-text and test-driven blocks that populate the documentation. Test-driven blocks are populated by running tests against the model.

We can preview the model documentation template for this project by running the following code:

In [2]:
vm.preview_template()

Accordion(children=(Accordion(children=(HTML(value='<p>Empty Section</p>'), Accordion(children=(HTML(value='<p…

## Load the demo dataset

For the purpose of this demonstration, we will use a sample dataset provided by the ValidMind library. 

In [3]:
# Import the sample dataset from the library
from validmind.datasets.classification import customer_churn as demo_dataset
# You can try a different dataset with: 
#from validmind.datasets.classification import taiwan_credit as demo_dataset

df = demo_dataset.load_data()

#### Initialize a dataset object for ValidMind

Before running the documentation tests, we must first initialize a ValidMind dataset object using the `init_dataset` function from the `vm` module. This function takes the following arguments: `dataset`, the dataset that we want to analyze; `target_column`, which identifies the target variable; and `class_labels`, which identifies the labels used for classification model training.

In [4]:
vm_dataset = vm.init_dataset(
    dataset=df,
    target_column=demo_dataset.target_column,
    class_labels=demo_dataset.class_labels
)

2023-10-03 17:57:44,412 - INFO(validmind.client): Pandas dataset detected. Initializing VM Dataset instance...


### Documenting the model

We will need to preprocess the dataset and produce the training, test and validation splits first.

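#### Preprocess the raw dataset

For demonstration purposes, we simplify the preprocessing by using `demo_dataset.preprocess`, which returns the training, validation, and test splits. We then train a simple XGBoost classifier on the training split, using the validation split for early stopping: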

In [7]:
train_df, validation_df, test_df = demo_dataset.preprocess(df)

x_train = train_df.drop(demo_dataset.target_column, axis=1)
y_train = train_df[demo_dataset.target_column]
x_val = validation_df.drop(demo_dataset.target_column, axis=1)
y_val = validation_df[demo_dataset.target_column]

model = xgb.XGBClassifier(early_stopping_rounds=10)
model.set_params(
    eval_metric=["error", "logloss", "auc"],
)
model.fit(
    x_train,
    y_train,
    eval_set=[(x_val, y_val)],
    verbose=False,
)
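
For intuition about the three splits used above, here is a minimal, library-free sketch of a train/validation/test split. It is an illustration only: the function `three_way_split`, the 60/20/20 proportions, and the seed are assumptions, and the actual `demo_dataset.preprocess` may use different proportions and perform additional steps.

```python
import random


def three_way_split(rows, val_fraction=0.2, test_fraction=0.2, seed=42):
    """Shuffle rows and split them into train, validation, and test lists."""
    if not 0 < val_fraction + test_fraction < 1:
        raise ValueError("validation and test fractions must sum to less than 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)
    validation = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, validation, test


train_rows, validation_rows, test_rows = three_way_split(range(100))
print(len(train_rows), len(validation_rows), len(test_rows))  # 60 20 20
```

The validation split plays the role of `eval_set` during training, while the test split is held back for evaluating the trained model.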

We can now initialize the training and test datasets into dataset objects using vm.init_dataset():

In [8]:
vm_train_ds = vm.init_dataset(
    dataset=train_df,
    target_column=demo_dataset.target_column
)

vm_test_ds = vm.init_dataset(
    dataset=test_df,
    target_column=demo_dataset.target_column
)

2023-10-03 17:58:20,045 - INFO(validmind.client): Pandas dataset detected. Initializing VM Dataset instance...
2023-10-03 17:58:20,073 - INFO(validmind.client): Pandas dataset detected. Initializing VM Dataset instance...


We also initialize a model object using vm.init_model():

In [9]:
vm_model = vm.init_model(
    model,
    train_ds=vm_train_ds,
    test_ds=vm_test_ds,
)

#### Run the template documentation suite

We are now ready to run the model's documentation tests as defined in its template. The following function runs every test in the template and sends all documentation artifacts to the ValidMind platform.

In [10]:
full_suite = vm.run_documentation_tests(
    dataset=vm_dataset,
    model=vm_model
)

HBox(children=(Label(value='Running test suite...'), IntProgress(value=0, max=56)))

VBox(children=(HTML(value='<h2>Test Suite Results: <i style="color: #DE257E">Binary Classification V2</i></h2>…

#### Configuration of parameters for model diagnosis tests 
Each test has its own default parameters, and the appropriate values depend on the use case you are trying to solve. The ValidMind Developer Framework exposes these parameters at the user level so that they can be adjusted to your requirements.

A config can be applied to specific tests to override their default parameters.

The format of a config is:
```
config = {
    "<test1_id>": {
        "<default_param_1>": value,
        "<default_param_2>": value,
    },
     "<test2_id>": {
        "<default_param_1>": value,
        "<default_param_2>": value,
    },
}
```
You can pass the configuration to `vm.run_documentation_tests()` through the **`config`** argument, which lets you fine-tune the tests to your specific data requirements.
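
Conceptually, overriding works by layering your values on top of each test's defaults. The sketch below illustrates that idea in plain Python. It is not the library's implementation: the function `apply_overrides`, the test IDs, and the default values are all hypothetical.

```python
import copy


def apply_overrides(defaults, overrides):
    """Return per-test parameters with user overrides layered on defaults."""
    merged = copy.deepcopy(defaults)  # leave the defaults untouched
    for test_id, params in overrides.items():
        merged.setdefault(test_id, {}).update(params)
    return merged


# Hypothetical defaults, for illustration only
defaults = {
    "example_test_a": {"threshold": 4, "columns": None},
    "example_test_b": {"n_bins": 10},
}
config = {"example_test_a": {"threshold": 3}}

effective = apply_overrides(defaults, config)
print(effective["example_test_a"])  # {'threshold': 3, 'columns': None}
```

Parameters that are not mentioned in the config keep their default values, and tests that are not mentioned at all are unaffected.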

In [11]:
config={
    "overfit_regions": {
        "cut_off_percentage": 3,
        "features_columns": ["Age", "Balance", "Tenure", "NumOfProducts"]
    },
    "weak_spots":{
        "features_columns": ["Age", "Balance"],
        "accuracy_gap_threshold": 85,
    },
    "robustness":{
        "features_columns": [ "Balance", "Tenure"],
        "scaling_factor_std_dev_list": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
        "accuracy_decay_threshold": 4,
    }
}

full_suite = vm.run_documentation_tests(
    dataset=vm_dataset,
    model=vm_model,
    section="model_diagnosis",
    config=config,
)

HBox(children=(Label(value='Running test suite...'), IntProgress(value=0, max=6)))

VBox(children=(HTML(value='<h2>Test Suite Results: <i style="color: #DE257E">Binary Classification V2</i></h2>…
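
Because a config is a plain dictionary, a misspelled parameter name is easy to miss. One possible safeguard is to compare each key against the names you expect before running the tests. The following sketch uses Python's `difflib` for suggestions; the helper `find_unknown_params`, the test ID, and the parameter names are hypothetical, and you would need to supply the set of valid names yourself.

```python
import difflib


def find_unknown_params(config, known_params):
    """List (test_id, param, suggestion) for parameters not in known_params."""
    problems = []
    for test_id, params in config.items():
        allowed = known_params.get(test_id, set())
        for name in params:
            if name not in allowed:
                # Suggest the closest known name, if any is similar enough
                close = difflib.get_close_matches(
                    name, sorted(allowed), n=1, cutoff=0.8
                )
                problems.append((test_id, name, close[0] if close else None))
    return problems


# Hypothetical test ID and parameter names, for illustration only
known_params = {"example_test": {"threshold", "columns"}}
config = {"example_test": {"treshold": 3, "columns": ["Age"]}}
print(find_unknown_params(config, known_params))
```

An empty list means every parameter name in the config matched one of the expected names.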

### Next steps
You can look at the results of this test plan right in the notebook where you ran the code, as you would expect. But there is a better way: view the test results as part of your model documentation right in the ValidMind Platform UI:

1. Log back into the Platform UI.

2. Go to **Documentation Projects > YOUR_DOCUMENTATION_PROJECT > Documentation**.

3. Expand **Model Development**.

What you can see now is a more easily consumable version of the model diagnosis tests you just performed, along with other parts of your documentation project that still need to be completed.

If you want to learn more about where you are in the model documentation process, take a look at <a href="https://docs.validmind.ai/guide/get-started-developer-framework.html#how-do-i-use-the-framework">How do I use the framework?</a>.