# ValidMind for model development — 102 Start the model development process

Learn how to use ValidMind for your end-to-end model documentation process with our series of four introductory notebooks. In this second notebook, you'll run tests and investigate results, then add the results or evidence to your documentation.

You'll become familiar with the individual tests available in ValidMind, as well as how to run them and change parameters as necessary. Using ValidMind's repository of individual tests as building blocks helps you ensure that a model is being built appropriately. 

**For a full list of out-of-the-box tests,** refer to our [Test descriptions](https://docs.validmind.ai/developer/model-testing/test-descriptions.html) or try the interactive [Test sandbox](https://docs.validmind.ai/developer/model-testing/test-sandbox.html).

## Prerequisites

In order to log test results or evidence to your model documentation with this notebook, you'll need to first:

- [ ] Register a model within the ValidMind Platform with a predefined documentation template
- [ ] Install and initialize the ValidMind Library, enabling you to connect to the correct model in the ValidMind Platform
- [ ] Preview the selected documentation template for your model and verify that it's appropriate for your use case

<div class="alert alert-block alert-info" style="background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;"><span style="color: #083E44;"><b>Need help with the above steps?</b></span>
<br></br>
Refer to the first notebook in this series: <a href="101-set_up_validmind.ipynb" style="color: #DE257E;"><b>101 Set up ValidMind</b></a></div>


## Setting up

### Initialize the ValidMind Library

First, let's connect the ValidMind Library to the model we previously registered in the ValidMind Platform:

1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/configuration/log-in-to-validmind.html).

2. In the left sidebar, navigate to **Model Inventory** and select the model you registered for this "ValidMind for model development" series of notebooks.

3. Go to **Getting Started** and click **Copy snippet to clipboard**.

Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/model-documentation/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:

In [None]:
# Load your model identifier credentials from an `.env` file

%load_ext dotenv
%dotenv .env

# Or replace with your code snippet

import validmind as vm

vm.init(
    # api_host="...",
    # api_key="...",
    # api_secret="...",
    # model="...",
)
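
As a rough sketch of what the `.env` approach relies on, the snippet below parses `KEY=VALUE` lines and reports which credentials are missing. The variable names `VM_API_HOST`, `VM_API_KEY`, `VM_API_SECRET`, and `VM_API_MODEL` are assumptions, as are the helper functions; check the linked credentials guide for the names your version of the library expects.

```python
# Assumed variable names; confirm them against the ValidMind credentials guide.
REQUIRED_KEYS = ("VM_API_HOST", "VM_API_KEY", "VM_API_SECRET", "VM_API_MODEL")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_credentials(values):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


# Placeholder content for illustration only; never commit real secrets.
example_env = """
# ValidMind credentials
VM_API_HOST=https://example.invalid/api
VM_API_KEY=placeholder-key
VM_API_SECRET=
"""

parsed = parse_env_text(example_env)
print(missing_credentials(parsed))
```

A check like this can be run before `vm.init()` to catch an incomplete `.env` file early.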

### Import sample dataset

First, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle. 

In the example below, note that:

- The target column, `Exited`, has a value of `1` when a customer has churned and `0` otherwise.
- The ValidMind Library provides a wrapper to automatically load the dataset as a Pandas DataFrame object.

In [None]:
from validmind.datasets.classification import customer_churn as demo_dataset

print(
    f"Loaded demo dataset with: \n\n\t• Target column: '{demo_dataset.target_column}' \n\t• Class labels: {demo_dataset.class_labels}"
)

raw_df = demo_dataset.load_data()
raw_df.head()

### Assess data quality

Next, let's do some data quality assessments by running a few individual tests related to data assessment.

Use the `vm.tests.list_tests()` function introduced by the first notebook in this series in combination with `vm.tests.list_tags()` and `vm.tests.list_tasks()` to find which prebuilt tests are relevant for data quality assessment:


In [None]:
# Get the list of available tags
sorted(vm.tests.list_tags())

In [None]:
# Get the list of available task types
sorted(vm.tests.list_tasks())

You can pass `tags` and `task` as parameters to the `vm.tests.list_tests()` function to filter the tests by tag and task type. For example, to find tests related to tabular data quality for classification models, you can call `list_tests()` like this:

In [None]:
vm.tests.list_tests(task="classification", tags=["tabular_data", "data_quality"])
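
To illustrate the idea behind this kind of filtering, here is a minimal sketch over a small, hypothetical catalog. The catalog entries and the `filter_tests` helper are made up for illustration. The matching rule (the task must be supported and every requested tag must be present) is also an assumption, and the library's own rule may differ.

```python
# Hypothetical catalog entries; real IDs, tasks, and tags come from list_tests().
CATALOG = [
    {
        "id": "example.ClassImbalance",
        "tasks": ["classification"],
        "tags": ["tabular_data", "data_quality"],
    },
    {
        "id": "example.SeasonalDecompose",
        "tasks": ["regression"],
        "tags": ["time_series_data"],
    },
    {
        "id": "example.DescriptiveStatistics",
        "tasks": ["classification", "regression"],
        "tags": ["tabular_data"],
    },
]


def filter_tests(catalog, task=None, tags=None):
    """Keep entries that support `task` and carry every tag in `tags` (assumed rule)."""
    wanted = set(tags or [])
    return [
        entry["id"]
        for entry in catalog
        if (task is None or task in entry["tasks"]) and wanted <= set(entry["tags"])
    ]


print(filter_tests(CATALOG, task="classification", tags=["tabular_data", "data_quality"]))
```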

### Initialize the ValidMind datasets

Now, assume we have identified some tests we want to run on the data we intend to use. The next step is to connect your data with a ValidMind `Dataset` object. **This step is necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.

Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. For this example, we'll pass in the following arguments:

- **`dataset`** — The raw dataset that you want to provide as input to tests
- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test
- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset


In [None]:
# vm_raw_dataset is now a VMDataset object that you can pass to any ValidMind test
vm_raw_dataset = vm.init_dataset(
    dataset=raw_df,
    input_id="raw_dataset",
    target_column="Exited",
)

## Running tests

<div class="alert alert-block alert-info" style="background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;"><span style="color: #083E44;"><b>Want to learn more about ValidMind tests?</b></span>
<br></br>
Refer to our notebook that includes code samples and usage of key functions: <a href="https://docs.validmind.ai/notebooks/how_to/explore_tests.html" style="color: #DE257E;"><b>Explore tests</b></a></div>

### Run tabular data tests

You run individual tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module. For the examples below, we'll pass in the following arguments:

- **`test_id`** — The ID of the test to run, as seen in the `ID` column when you run `list_tests`. 
- **`params`** — A dictionary of parameters for the test. These will override any `default_params` set in the test definition. 

Tests also take an `inputs` dictionary that maps each expected input name to a ValidMind object, as the examples below show. The inputs expected by a test can be found in the test definition. Let's take `validmind.data_validation.DescriptiveStatistics` as an example: the output of [the `describe_test()` function](https://docs.validmind.ai/validmind/validmind/tests.html#describe_test) below shows that this test expects a `dataset` as input:


In [None]:
vm.tests.describe_test("validmind.data_validation.DescriptiveStatistics")

Now, let's run a few tests to assess the quality of the dataset:

In [None]:
result = vm.tests.run_test(
    test_id="validmind.data_validation.DescriptiveStatistics",
    inputs={"dataset": vm_raw_dataset},
)

In [None]:
result2 = vm.tests.run_test(
    test_id="validmind.data_validation.ClassImbalance",
    inputs={"dataset": vm_raw_dataset},
    params={"min_percent_threshold": 30},
)
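
To make the `min_percent_threshold` parameter concrete, the sketch below computes the percentage of rows in each class and checks that every class reaches the threshold. It is an illustration on toy labels, not the library's implementation. The helper names are hypothetical, and the exact comparison ValidMind applies may differ.

```python
from collections import Counter


def class_percentages(labels):
    """Return the percentage of rows that belong to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {label: 100.0 * count / total for label, count in counts.items()}


def passes_min_percent(labels, min_percent_threshold):
    """Pass only if every class makes up at least the threshold percentage."""
    percentages = class_percentages(labels)
    return all(pct >= min_percent_threshold for pct in percentages.values())


# Toy labels: 8 customers stayed (0) and 2 churned (1).
toy_labels = [0] * 8 + [1] * 2

print(class_percentages(toy_labels))
print(passes_min_percent(toy_labels, 30))
```

With a threshold of 30, the toy minority class at 20% falls short, so the check fails.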

You can see that the class imbalance test did not pass according to the value of `min_percent_threshold` we have set. Here is how you can re-run the test on some processed data to address this data quality issue. In this case we apply a very simple rebalancing technique to the dataset:


In [None]:
import pandas as pd

raw_copy_df = raw_df.sample(frac=1)  # Create a shuffled copy of the raw dataset

# Create a balanced dataset with the same number of exited and not exited customers
exited_df = raw_copy_df.loc[raw_copy_df["Exited"] == 1]
not_exited_df = raw_copy_df.loc[raw_copy_df["Exited"] == 0].sample(n=exited_df.shape[0])

balanced_raw_df = pd.concat([exited_df, not_exited_df])
balanced_raw_df = balanced_raw_df.sample(frac=1, random_state=42)
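
The rebalancing above is a form of random undersampling: the majority class is sampled down to the size of the minority class. The standard-library sketch below shows the same idea on toy rows with a fixed seed so the result is repeatable. The `undersample` helper and the toy rows are hypothetical and only meant to illustrate the technique.

```python
import random


def undersample(rows, label_key, seed=42):
    """Sample every class down to the size of the smallest class."""
    rng = random.Random(seed)
    groups = {}
    for row in rows:
        groups.setdefault(row[label_key], []).append(row)
    smallest = min(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)  # Mix the classes so they are not grouped together
    return balanced


# Toy rows: 3 churned customers and 7 who stayed.
toy_rows = [{"id": i, "Exited": 1 if i < 3 else 0} for i in range(10)]
balanced = undersample(toy_rows, "Exited")

print(len(balanced), sum(row["Exited"] for row in balanced))
```

Undersampling discards data from the majority class, so it is a simple choice for a demo rather than a general recommendation.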

With this new raw dataset, you can re-run the individual test to see if it passes the class imbalance test requirement.

**Remember to first register a new VM Dataset object** since that is the type of input required by `run_test()`:


In [None]:
# Register new data and now 'balanced_raw_dataset' is the new dataset object of interest
vm_balanced_raw_dataset = vm.init_dataset(
    dataset=balanced_raw_df,
    input_id="balanced_raw_dataset",
    target_column="Exited",
)

In [None]:
result = vm.tests.run_test(
    test_id="validmind.data_validation.ClassImbalance",
    inputs={"dataset": vm_balanced_raw_dataset},
    params={"min_percent_threshold": 30},
)

<a id='toc5_3_'></a>

### Utilize test output

Here is an example of how you can use the output from a ValidMind test for further analysis, for instance to remove highly correlated features. The example below shows how you can get the list of features with the highest correlation coefficients and use them to reduce the final list of features for modeling.


In [None]:
corr_result = vm.tests.run_test(
    test_id="validmind.data_validation.HighPearsonCorrelation",
    params={"max_threshold": 0.3},
    inputs={"dataset": vm_balanced_raw_dataset},
)
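
As a rough picture of what a correlation threshold check involves, the sketch below computes the Pearson coefficient for each pair of columns and flags pairs whose absolute value exceeds `max_threshold`. The helpers and toy columns are hypothetical. Comparing the absolute value against the threshold is an assumption about how such a test typically works, not a statement of the library's implementation.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def flag_pairs(columns, max_threshold):
    """Return (col_a, col_b, r) for pairs whose |r| exceeds the threshold."""
    flagged = []
    for (name_a, xs), (name_b, ys) in combinations(columns.items(), 2):
        r = pearson(xs, ys)
        if abs(r) > max_threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Toy columns: "b" is an exact multiple of "a"; "c" is uncorrelated with both.
toy_columns = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],
    "c": [1, -1, 0, -1, 1],
}

print(flag_pairs(toy_columns, max_threshold=0.3))
```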

Let's assume we want to remove highly correlated features from the dataset. `corr_result` is an object of type `TestResult`. We can inspect the result object to see what the test has produced.

In [None]:
print(type(corr_result))
print("Result ID: ", corr_result.result_id)
print("Params: ", corr_result.params)
print("Passed: ", corr_result.passed)
print("Tables: ", corr_result.tables)

Let's check out the table in the result, which lists the column pairs and whether each one passed or failed the test:

In [None]:
features_df = corr_result.tables[0].data
features_df

Next, filter the table for the column pairs that failed the test. We'll then remove the highly correlated features and create a new VM dataset object. Note the use of different `input_id`s, which allows tracking the inputs used when running each individual test.


In [None]:
high_correlation_features = features_df[features_df["Pass/Fail"] == "Fail"]["Columns"].tolist()
high_correlation_features

Extract the feature names from the list of strings (e.g. '(Age, Exited)' -> 'Age'):

In [None]:
high_correlation_features = [feature.split(",")[0].strip("()") for feature in high_correlation_features]
high_correlation_features
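
Taking the first name of each pair works when the target column appears second, as in `'(Age, Exited)'`. A slightly more defensive variant, sketched below, skips the target column wherever it appears and avoids duplicates. The `features_to_drop` helper is hypothetical, and the pairs other than `'(Age, Exited)'` are made-up inputs for illustration, not results from the test above.

```python
def features_to_drop(pairs, target_column):
    """Collect the first non-target feature from each '(A, B)' pair, without duplicates."""
    selected = []
    for pair in pairs:
        names = [name.strip() for name in pair.strip("()").split(",")]
        candidates = [name for name in names if name != target_column]
        if candidates and candidates[0] not in selected:
            selected.append(candidates[0])
    return selected


# Illustrative inputs only; the second and third pairs are hypothetical.
example_pairs = ["(Age, Exited)", "(Exited, Age)", "(Balance, NumOfProducts)"]

print(features_to_drop(example_pairs, target_column="Exited"))
```

This guards against accidentally dropping the target column from the dataset.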

In [None]:
# Remove the highly correlated features from the dataset
balanced_raw_no_age_df = balanced_raw_df.drop(columns=high_correlation_features)

# Re-initialize the dataset object
vm_raw_dataset_preprocessed = vm.init_dataset(
    dataset=balanced_raw_no_age_df,
    input_id="raw_dataset_preprocessed",
    target_column="Exited",
)

With the reduced feature set, the re-run test should now pass. You can also plot the correlation matrix to visualize the new correlation between features:


In [None]:
corr_result = vm.tests.run_test(
    test_id="validmind.data_validation.HighPearsonCorrelation",
    params={"max_threshold": 0.3},
    inputs={"dataset": vm_raw_dataset_preprocessed},
)

In [None]:
corr_result = vm.tests.run_test(
    test_id="validmind.data_validation.PearsonCorrelationMatrix",
    inputs={"dataset": vm_raw_dataset_preprocessed},
)

<a id='toc5_4_'></a>

### Documenting the results based on two datasets

We have now done some analysis on two different datasets, and we should be able to document why certain things were done to the raw data, with testing to support it. Every test result returned by the `run_test()` function has a `.log()` method that can be used to log the test results to ValidMind. When logging individual results to ValidMind, you need to manually add those results to a specific section of the model documentation.

When using `run_documentation_tests()`, it's possible to automatically populate a section with the results of all tests that were registered in the documentation template.

To show how to add individual results to any documentation section, we're going to populate the entire `data_preparation` section of the documentation using the clean `vm_raw_dataset_preprocessed` dataset as input, and then we're going to document an additional result for the highly correlated dataset `vm_balanced_raw_dataset`. The following two steps will accomplish this:

1. Run `run_documentation_tests()` using `vm_raw_dataset_preprocessed` as input. This populates the entire data preparation section for every test that is already part of the documentation template.
2. Log the individual result of the high correlation test that used `vm_balanced_raw_dataset` (that had a highly correlated `Age` column) as input.

After adding the result of step #2 to the documentation you will be able to explain the changes made to the raw data by editing the default description of the test result within the ValidMind Platform.


<a id='toc5_4_1_'></a>

#### Run `run_documentation_tests()` using `vm_raw_dataset_preprocessed` as input

`run_documentation_tests()` allows you to run multiple tests at once and log the results to the documentation. The function takes the following arguments:

- `inputs`: any inputs to be passed to the tests
- `config`: a dictionary `<test_id>:<test_config>` that allows configuring each test individually. Each test config has the following form:
  - `params`: individual test parameters
  - `inputs`: individual test inputs. When passed, this overrides any inputs passed from the `run_documentation_tests()` function
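
The relationship between the shared `inputs` and a per-test `config` entry can be sketched as a simple lookup with fallbacks. The `resolve_test_config` helper and the test IDs below are hypothetical, and strings stand in for dataset and model objects. The sketch assumes that a per-test `inputs` entry replaces the shared inputs as a whole; the library may resolve them differently.

```python
def resolve_test_config(test_id, shared_inputs, config):
    """Return the inputs and params a single test would run with (assumed rule)."""
    test_config = config.get(test_id, {})
    return {
        # Per-test inputs win over the shared inputs when present
        "inputs": test_config.get("inputs", shared_inputs),
        "params": test_config.get("params", {}),
    }


# Strings stand in for ValidMind dataset and model objects.
shared_inputs = {"dataset": "test_ds", "model": "model"}
example_config = {
    "example.Performance:in_sample": {
        "inputs": {"dataset": "train_ds", "model": "model"},
    },
    "example.HighCorrelation": {
        "params": {"max_threshold": 0.3},
    },
}

for test_id in ("example.Performance:in_sample", "example.HighCorrelation", "example.Other"):
    print(test_id, resolve_test_config(test_id, shared_inputs, example_config))
```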


In [None]:
test_config = {
    "validmind.data_validation.ClassImbalance": {
        "params": {"min_percent_threshold": 30},
    },
    "validmind.data_validation.HighPearsonCorrelation": {
        "params": {"max_threshold": 0.3},
    },
}

tests_suite = vm.run_documentation_tests(
    inputs={
        "dataset": vm_raw_dataset_preprocessed,
    },
    config=test_config,
    section=["data_preparation"],
)

<a id='toc5_4_2_'></a>

#### Log the individual result of the high correlation test that used `vm_balanced_raw_dataset` (that had a highly correlated `Age` column) as input

Here you can use a custom `result_id` to tag the individual result with a unique identifier. This `result_id` can be appended to `test_id` with a `:` separator. The `balanced_raw_dataset` result identifier will correspond to the `balanced_raw_dataset` input, the dataset that still has the `Age` column.


In [None]:
result = vm.tests.run_test(
    test_id="validmind.data_validation.HighPearsonCorrelation:balanced_raw_dataset",
    params={"max_threshold": 0.3},
    inputs={"dataset": vm_balanced_raw_dataset},
)
result.log()
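
The `test_id:result_id` convention is easy to take apart when you need to recover the two pieces, for example when organizing logged results. The sketch below uses a hypothetical `split_test_id` helper and plain string handling; it does not call the ValidMind Library.

```python
def split_test_id(test_id):
    """Split 'base_id:result_id' into its parts; result_id is None when absent."""
    base_id, _, result_id = test_id.partition(":")
    return base_id, result_id or None


print(split_test_id("validmind.data_validation.HighPearsonCorrelation:balanced_raw_dataset"))
print(split_test_id("validmind.data_validation.HighPearsonCorrelation"))
```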

<a id='toc5_5_'></a>

### Add individual test results to model documentation

You can now visit the documentation page for the model you connected to at the beginning of this notebook and add a new content block in the relevant section.

To do this, go to the documentation page of your model and navigate to the `Data Preparation` -> `Correlations and Interactions` section. Then hover after the "Pearson Correlation Matrix" content block to reveal the `+` button as shown in the screenshot below.

![screenshot showing insert button for test-driven blocks](../images/insert-test-driven-block-correlations.png)

Click on the `+` button and select `Test-Driven Block`. This will open a dialog where you can select `Threshold Test` as the type of the test-driven content block, and then select `High Pearson Correlation Vm Raw Dataset Test`. This will show a preview of the result and it should match the results shown above.

![screenshot showing the selected test result in the dialog](../images/selecting-high-pearson-correlation-test.png)

Finally, click on the `Insert block` button to add the test result to the documentation. You'll now see two individual results for the high correlation test in the `Correlations and Interactions` section of the documentation. To finalize the documentation, you can edit the test result's description block to explain the changes made to the raw data and the reasons behind them as we can see in the screenshot below.

![screenshot showing the high pearson correlation block](../images/high-pearson-correlation-block.png)


<a id='toc5_6_'></a>

### Model Testing

We have focused so far on the data assessment and pre-processing that usually occurs prior to any models being built. Now we are going to assume we have built a model and we want to incorporate some model results in our documentation.

Let's train a simple logistic regression model on the dataset and evaluate its performance. You will use the `LogisticRegression` class from the `sklearn.linear_model` module and use ValidMind tests to evaluate the model's performance.

Before training the model, we need to encode the categorical features in the dataset. You will use the `get_dummies` function from pandas to one-hot encode the categorical features. The categorical features in the dataset are `Geography` and `Gender`.


In [None]:
balanced_raw_no_age_df.head()

In [None]:
balanced_raw_no_age_df = pd.get_dummies(
    balanced_raw_no_age_df, columns=["Geography", "Gender"], drop_first=True
)
balanced_raw_no_age_df.head()
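
To show what `drop_first=True` does, the standard-library sketch below encodes a toy column by hand: the categories are sorted, the first one is dropped, and each remaining category becomes an indicator column. The `dummy_encode` helper and the toy values are for illustration only. The `prefix_category` column naming mirrors the pandas default.

```python
def dummy_encode(values, prefix):
    """One-hot encode values, dropping the first category in sorted order."""
    categories = sorted(set(values))
    kept = categories[1:]  # The dropped category is implied when all indicators are False
    return [{f"{prefix}_{category}": value == category for category in kept} for value in values]


toy_geography = ["France", "Germany", "Spain", "France"]
for row in dummy_encode(toy_geography, "Geography"):
    print(row)
```

Dropping one category per feature avoids perfectly collinear indicator columns, which matters for linear models such as logistic regression.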

In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Split the input and target variables
X = balanced_raw_no_age_df.drop("Exited", axis=1)
y = balanced_raw_no_age_df["Exited"]
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.2,
    random_state=42,
)

# Logistic Regression grid params
log_reg_params = {
    "penalty": ["l1", "l2"],
    "C": [0.001, 0.01, 0.1, 1, 10, 100, 1000],
    "solver": ["liblinear"],
}

# Grid search for Logistic Regression
grid_log_reg = GridSearchCV(LogisticRegression(), log_reg_params)
grid_log_reg.fit(X_train, y_train)

# Logistic Regression best estimator
log_reg = grid_log_reg.best_estimator_
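
To get a feel for how much work the grid search does, the sketch below expands the same parameter grid into individual candidates using only the standard library. The `expand_grid` helper is hypothetical. The fold count of 5 is scikit-learn's default when `cv` is not passed, which is the case in the cell above.

```python
from itertools import product

# Same grid as in the cell above
log_reg_params = {
    "penalty": ["l1", "l2"],
    "C": [0.001, 0.01, 0.1, 1, 10, 100, 1000],
    "solver": ["liblinear"],
}


def expand_grid(grid):
    """Return one dict per combination of parameter values."""
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in product(*(grid[key] for key in keys))]


candidates = expand_grid(log_reg_params)
n_folds = 5  # Assumed: scikit-learn's default number of cross-validation folds

print(len(candidates), "candidates,", len(candidates) * n_folds, "cross-validation fits")
print(candidates[0])
```

After cross-validation, the best candidate is refit on the full training set by default, which is what `best_estimator_` returns.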

<a id='toc5_7_'></a>

### Initialize model evaluation objects and assign predictions

The last step for evaluating the model's performance is to initialize the ValidMind `Dataset` and `Model` objects and assign model predictions to each dataset. You will use the `init_dataset`, `init_model` and `assign_predictions` functions to initialize these objects.


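In [None]:
# Copy the feature frames so that adding the target column does not modify X_train and X_test
train_df = X_train.copy()
train_df["Exited"] = y_train
test_df = X_test.copy()
test_df["Exited"] = y_test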

vm_train_ds = vm.init_dataset(
    input_id="train_dataset_final",
    dataset=train_df,
    target_column="Exited",
)

vm_test_ds = vm.init_dataset(
    input_id="test_dataset_final",
    dataset=test_df,
    target_column="Exited",
)

# Register the model
vm_model = vm.init_model(log_reg, input_id="log_reg_model_v1")

Once the model has been registered you can assign model predictions to the training and test datasets. The `assign_predictions()` method from the `Dataset` object can link existing predictions to any number of models. If no prediction values are passed, the method will compute predictions automatically:


In [None]:
vm_train_ds.assign_predictions(model=vm_model)
vm_test_ds.assign_predictions(model=vm_model)

<a id='toc5_8_'></a>

### Run the model evaluation tests

In this part, we run only the tests within the model development section of the model documentation. Only tests associated with the `model_development` section will be executed, and the corresponding results will be updated in the model documentation.

Note the additional config that is passed to `run_documentation_tests()`. This allows you to override inputs or params in certain tests. In our case, we want to explicitly use the `vm_train_ds` for the `validmind.model_validation.sklearn.ClassifierPerformance:in_sample` test, since it's supposed to run on the training dataset and not the test dataset.


In [None]:
test_config = {
    "validmind.model_validation.sklearn.ClassifierPerformance:in_sample": {
        "inputs": {
            "dataset": vm_train_ds,
            "model": vm_model,
        },
    }
}
results = vm.run_documentation_tests(
    section=["model_development"],
    inputs={
        "dataset": vm_test_ds,  # Any test that requires a single dataset will use vm_test_ds
        "model": vm_model,
        "datasets": (
            vm_train_ds,
            vm_test_ds,
        ),  # Any test that requires multiple datasets will use vm_train_ds and vm_test_ds
    },
    config=test_config,
)

## Next steps

### Integrate custom tests