# Explore tests

Explore the individual out-of-the-box tests available in the ValidMind Library, and identify which tests to run to evaluate different aspects of your model. Browse available tests, view their descriptions, and filter by tags or task type to find tests relevant to your use case.


::: {.content-hidden when-format="html"}
## Contents    
- [About ValidMind](#toc1_)    
  - [Before you begin](#toc1_1_)    
  - [New to ValidMind?](#toc1_2_)    
  - [Key concepts](#toc1_3_)    
- [Install the ValidMind Library](#toc2_)    
- [List all available tests](#toc3_)    
- [Understand tags and task types](#toc4_)    
- [Filter tests by tags and task types](#toc5_)
- [Store test sets for use](#toc6_)    
- [Next steps](#toc7_)    
  - [Discover more learning resources](#toc7_1_)  

:::
<!-- jn-toc-notebook-config
	numbering=false
	anchor=true
	flat=false
	minLevel=2
	maxLevel=4
	/jn-toc-notebook-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

<a id='toc1_'></a>

## About ValidMind

ValidMind is a suite of tools for managing model risk, including risk associated with AI and statistical models.

You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on model documentation. Together, these products simplify model risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between you and model validators.

<a id='toc1_1_'></a>

### Before you begin

This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook, but we recommend familiarizing yourself further with the language.

If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).

<a id='toc1_2_'></a>


### New to ValidMind?

If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting models and running tests, as well as find code samples and our Python Library API reference.

<div class="alert alert-block alert-info" style="background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;"><span style="color: #083E44;"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>
<br></br>
<a href="https://docs.validmind.ai/guide/configuration/register-with-validmind.html" style="color: #DE257E;"><b>Register with ValidMind</b></a></div>

<a id='toc1_3_'></a>

### Key concepts

**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.

**Documentation template**: Lays out the structure of model documentation, segmented into sections and sub-sections, and functions as a test suite: it specifies the tests that should be run and how the results should be displayed.

**Tests**: Functions contained in the ValidMind Library, each designed to run a specific quantitative test on a dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.

**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.

**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:

  - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).
  - **dataset**: A single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).
  - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.
  - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/run_tests_that_require_multiple_datasets.html) for more information.

**Parameters**: Additional arguments passed when running a ValidMind test to customize its behavior or provide additional context.

**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.

**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.

Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases.
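
As a rough illustration of the **Outputs** concept above, the following sketch shows a plain Python function that returns a table as a list of dictionaries, one per row. The function name `missing_value_table` and the sample records are hypothetical, and the sketch deliberately leaves out registration with the ValidMind Library:

```python
def missing_value_table(rows, columns):
    """Return one summary row per column, as a list of dictionaries.

    Hypothetical helper for illustration only; not part of the ValidMind Library.
    """
    table = []
    for column in columns:
        # Count records where the column is absent or set to None
        missing = sum(1 for row in rows if row.get(column) is None)
        table.append({"Column": column, "Missing": missing, "Total": len(rows)})
    return table


# Hypothetical sample records standing in for a dataset
sample_rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 61000},
    {"age": 45, "income": None},
    {"age": 29},
]

result = missing_value_table(sample_rows, ["age", "income"])
for row in result:
    print(row)
```

Each dictionary becomes one table row, with the keys acting as column headers.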


<a id='toc2_'></a>

## Install the ValidMind Library

<div class="alert alert-block alert-info" style="background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;"><span style="color: #083E44;"><b>Recommended Python versions</b></span>
<br></br>
Python 3.8 <= x <= 3.11</div>

To install the library:

In [None]:
%pip install -q validmind

<a id='toc3_'></a>

## List all available tests



Start by importing the functions for listing tests, tasks, tags, and task-tag pairings from the [validmind.tests](https://docs.validmind.ai/validmind/validmind/tests.html) module so that you can use them in the rest of this notebook:

In [None]:
from validmind.tests import (
    list_tests,
    list_tasks,
    list_tags,
    list_tasks_and_tags,
)

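Use [list_tests()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to retrieve all available ValidMind tests. The function returns a DataFrame that includes the following columns:

- **ID** – A unique identifier for each test.
- **Name** – The test’s name.
- **Description** – A short summary of what the test evaluates.
- **Has Figure** and **Has Table** – Whether the test produces a figure or a table.
- **Required Inputs** – The inputs the test expects, such as `dataset` or `model`.
- **Params** – The parameters the test accepts, with their types and default values.
- **Tags** – Keywords that describe what the test does or applies to.
- **Tasks** – The type of modeling task the test supports.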

In [None]:
list_tests()

ID,Name,Description,Has Figure,Has Table,Required Inputs,Params,Tags,Tasks
validmind.data_validation.ACFandPACFPlot,AC Fand PACF Plot,Analyzes time series data using Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots to...,True,False,['dataset'],{},"['time_series_data', 'forecasting', 'statistical_test', 'visualization']",['regression']
validmind.data_validation.ADF,ADF,Assesses the stationarity of a time series dataset using the Augmented Dickey-Fuller (ADF) test....,False,True,['dataset'],{},"['time_series_data', 'statsmodels', 'forecasting', 'statistical_test', 'stationarity']",['regression']
validmind.data_validation.AutoAR,Auto AR,Automatically identifies the optimal Autoregressive (AR) order for a time series using BIC and AIC criteria....,False,True,['dataset'],"{'max_ar_order': {'type': 'int', 'default': 3}}","['time_series_data', 'statsmodels', 'forecasting', 'statistical_test']",['regression']
validmind.data_validation.AutoMA,Auto MA,Automatically selects the optimal Moving Average (MA) order for each variable in a time series dataset based on...,False,True,['dataset'],"{'max_ma_order': {'type': 'int', 'default': 3}}","['time_series_data', 'statsmodels', 'forecasting', 'statistical_test']",['regression']
validmind.data_validation.AutoStationarity,Auto Stationarity,Automates Augmented Dickey-Fuller test to assess stationarity across multiple time series in a DataFrame....,False,True,['dataset'],"{'max_order': {'type': 'int', 'default': 5}, 'threshold': {'type': 'float', 'default': 0.05}}","['time_series_data', 'statsmodels', 'forecasting', 'statistical_test']",['regression']
validmind.data_validation.BivariateScatterPlots,Bivariate Scatter Plots,Generates bivariate scatterplots to visually inspect relationships between pairs of numerical predictor variables...,True,False,['dataset'],{},"['tabular_data', 'numerical_data', 'visualization']",['classification']
validmind.data_validation.BoxPierce,Box Pierce,Detects autocorrelation in time-series data through the Box-Pierce test to validate model performance....,False,True,['dataset'],{},"['time_series_data', 'forecasting', 'statistical_test', 'statsmodels']",['regression']
validmind.data_validation.ChiSquaredFeaturesTable,Chi Squared Features Table,Assesses the statistical association between categorical features and a target variable using the Chi-Squared test....,False,True,['dataset'],"{'p_threshold': {'type': '_empty', 'default': 0.05}}","['tabular_data', 'categorical_data', 'statistical_test']",['classification']
validmind.data_validation.ClassImbalance,Class Imbalance,Evaluates and quantifies class distribution imbalance in a dataset used by a machine learning model....,True,True,['dataset'],"{'min_percent_threshold': {'type': 'int', 'default': 10}}","['tabular_data', 'binary_classification', 'multiclass_classification', 'data_quality']",['classification']
validmind.data_validation.DatasetDescription,Dataset Description,Provides comprehensive analysis and statistical summaries of each column in a machine learning model's dataset....,False,True,['dataset'],{},"['tabular_data', 'time_series_data', 'text_data']","['classification', 'regression', 'text_classification', 'text_summarization']"
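
The `Params` column in the output above appears to describe each configurable parameter as a dictionary with a `type` and a `default`. As a minimal sketch, assuming that structure holds for every test, the hypothetical helper `default_params` below collapses such an entry into a plain name-to-default mapping. The sample entry is copied from the `AutoStationarity` row above:

```python
def default_params(params_spec):
    """Map each parameter name to its default value.

    Hypothetical helper; assumes each entry looks like {"type": ..., "default": ...}.
    """
    return {name: spec.get("default") for name, spec in params_spec.items()}


# Params entry copied from the validmind.data_validation.AutoStationarity row
auto_stationarity_spec = {
    "max_order": {"type": "int", "default": 5},
    "threshold": {"type": "float", "default": 0.05},
}

defaults = default_params(auto_stationarity_spec)
print(defaults)

# Override a single value while keeping the remaining defaults
custom = {**defaults, "threshold": 0.01}
print(custom)
```

A mapping like `custom` is a convenient starting point when you later want to run a test with one parameter changed.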


<a id='toc4_'></a>

## Understand tags and task types

Use [list_tasks()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks) to view all unique task types used to classify tests in the ValidMind Library.

Understanding `task` types helps you filter tests that match your model’s objective. For example:

- **classification:** Works with classification models and datasets.
- **regression:** Works with regression models and datasets.
- **text_classification:** Works with text classification models and datasets.
- **text_summarization:** Works with text summarization models and datasets.

In [3]:
list_tasks()

['text_qa',
 'classification',
 'data_validation',
 'text_classification',
 'feature_extraction',
 'regression',
 'visualization',
 'clustering',
 'time_series_forecasting',
 'text_summarization',
 'nlp',
 'residual_analysis',
 'monitoring',
 'text_generation']

Use [list_tags()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tags) to view all unique tags used to describe tests in the ValidMind Library.

`Tags` describe what a test applies to and help you filter tests for your use case. Examples include:

- **llm:** Tests that work with Large Language Models.
- **nlp:** Tests relevant for natural language processing.
- **binary_classification:** Tests for binary classification tasks.
- **forecasting:** Tests for forecasting and time-series analysis.
- **tabular_data:** Tests for tabular data like CSVs and Excel spreadsheets.



In [4]:
list_tags()

['senstivity_analysis',
 'calibration',
 'clustering',
 'anomaly_detection',
 'nlp',
 'classification_metrics',
 'dimensionality_reduction',
 'tabular_data',
 'time_series_data',
 'model_predictions',
 'feature_selection',
 'correlation',
 'frequency_analysis',
 'embeddings',
 'regression',
 'llm',
 'statsmodels',
 'ragas',
 'model_performance',
 'model_validation',
 'rag_performance',
 'model_training',
 'qualitative',
 'classification',
 'kmeans',
 'multiclass_classification',
 'linear_regression',
 'data_quality',
 'text_data',
 'binary_classification',
 'threshold_optimization',
 'stationarity',
 'bias_and_fairness',
 'scorecard',
 'model_explainability',
 'model_comparison',
 'numerical_data',
 'sklearn',
 'model_selection',
 'retrieval_performance',
 'zero_shot',
 'statistical_test',
 'descriptive_statistics',
 'seasonality',
 'analysis',
 'data_validation',
 'data_distribution',
 'feature_importance',
 'metadata',
 'few_shot',
 'visualization',
 'credit_risk',
 'forecasting',
 '

Finally, to match each task type with its related tags, use the [list_tasks_and_tags()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks_and_tags) function:

In [None]:
list_tasks_and_tags()

Task,Tags
regression,"senstivity_analysis, tabular_data, time_series_data, model_predictions, feature_selection, correlation, regression, statsmodels, model_performance, model_training, multiclass_classification, linear_regression, data_quality, text_data, model_explainability, binary_classification, stationarity, bias_and_fairness, numerical_data, sklearn, model_selection, statistical_test, descriptive_statistics, seasonality, analysis, data_validation, data_distribution, metadata, feature_importance, visualization, forecasting, model_diagnosis, model_interpretation, unit_root_test, categorical_data, data_analysis"
classification,"calibration, anomaly_detection, classification_metrics, tabular_data, time_series_data, feature_selection, correlation, statsmodels, model_performance, model_validation, model_training, classification, multiclass_classification, linear_regression, data_quality, text_data, binary_classification, threshold_optimization, bias_and_fairness, scorecard, model_comparison, numerical_data, sklearn, statistical_test, descriptive_statistics, feature_importance, data_distribution, metadata, visualization, credit_risk, AUC, logistic_regression, model_diagnosis, categorical_data, data_analysis"
text_classification,"model_performance, feature_importance, multiclass_classification, few_shot, frequency_analysis, zero_shot, text_data, visualization, llm, binary_classification, ragas, model_diagnosis, model_comparison, sklearn, nlp, retrieval_performance, tabular_data, time_series_data"
text_summarization,"qualitative, few_shot, frequency_analysis, embeddings, zero_shot, text_data, visualization, llm, rag_performance, ragas, retrieval_performance, nlp, dimensionality_reduction, tabular_data, time_series_data"
data_validation,"stationarity, statsmodels, unit_root_test, time_series_data"
time_series_forecasting,"model_training, data_validation, metadata, visualization, model_explainability, sklearn, model_performance, model_predictions, time_series_data"
nlp,"data_validation, frequency_analysis, text_data, visualization, nlp"
clustering,"clustering, model_performance, kmeans, sklearn"
residual_analysis,regression
visualization,regression
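
The table above maps each task to its tags. When you start from a tag instead, it can help to invert that mapping. The sketch below does so with the standard library only, using four rows copied by hand from the table. The `task_tags` dictionary is an assumption made for illustration, not the object that `list_tasks_and_tags()` returns:

```python
from collections import defaultdict

# Excerpt of the task-to-tags table above (four rows copied by hand)
task_tags = {
    "data_validation": "stationarity, statsmodels, unit_root_test, time_series_data",
    "time_series_forecasting": (
        "model_training, data_validation, metadata, visualization, "
        "model_explainability, sklearn, model_performance, model_predictions, "
        "time_series_data"
    ),
    "nlp": "data_validation, frequency_analysis, text_data, visualization, nlp",
    "clustering": "clustering, model_performance, kmeans, sklearn",
}

# Invert the mapping: tag -> list of tasks that use it
tag_tasks = defaultdict(list)
for task, tags in task_tags.items():
    for tag in tags.split(", "):
        tag_tasks[tag].append(task)

for tag in ("visualization", "sklearn", "time_series_data"):
    print(tag, "->", sorted(tag_tasks[tag]))
```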


<a id='toc5_'></a>

## Filter tests by tags and task types

While listing all tests is useful, you’ll often want to narrow your search. The [list_tests()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) function supports `filter`, `task`, and `tags` parameters to assist in refining your results.

Use the `filter` parameter to find tests that match a specific keyword, such as `sklearn`:


In [6]:
list_tests(filter="sklearn")

ID,Name,Description,Has Figure,Has Table,Required Inputs,Params,Tags,Tasks
validmind.model_validation.ClusterSizeDistribution,Cluster Size Distribution,Assesses the performance of clustering models by comparing the distribution of cluster sizes in model predictions...,True,False,"['dataset', 'model']",{},"['sklearn', 'model_performance']",['clustering']
validmind.model_validation.TimeSeriesR2SquareBySegments,Time Series R2 Square By Segments,Evaluates the R-Squared values of regression models over specified time segments in time series data to assess...,True,True,"['dataset', 'model']","{'segments': {'type': None, 'default': None}}","['model_performance', 'sklearn']","['regression', 'time_series_forecasting']"
validmind.model_validation.sklearn.AdjustedMutualInformation,Adjusted Mutual Information,"Evaluates clustering model performance by measuring mutual information between true and predicted labels, adjusting...",False,True,"['model', 'dataset']",{},"['sklearn', 'model_performance', 'clustering']",['clustering']
validmind.model_validation.sklearn.AdjustedRandIndex,Adjusted Rand Index,Measures the similarity between two data clusters using the Adjusted Rand Index (ARI) metric in clustering machine...,False,True,"['model', 'dataset']",{},"['sklearn', 'model_performance', 'clustering']",['clustering']
validmind.model_validation.sklearn.CalibrationCurve,Calibration Curve,Evaluates the calibration of probability estimates by comparing predicted probabilities against observed...,True,False,"['model', 'dataset']","{'n_bins': {'type': 'int', 'default': 10}}","['sklearn', 'model_performance', 'classification']",['classification']
validmind.model_validation.sklearn.ClassifierPerformance,Classifier Performance,"Evaluates performance of binary or multiclass classification models using precision, recall, F1-Score, accuracy,...",False,True,"['dataset', 'model']","{'average': {'type': 'str', 'default': 'macro'}}","['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']","['classification', 'text_classification']"
validmind.model_validation.sklearn.ClassifierThresholdOptimization,Classifier Threshold Optimization,Analyzes and visualizes different threshold optimization methods for binary classification models....,False,True,"['dataset', 'model']","{'methods': {'type': None, 'default': None}, 'target_recall': {'type': None, 'default': None}}","['model_validation', 'threshold_optimization', 'classification_metrics']",['classification']
validmind.model_validation.sklearn.ClusterCosineSimilarity,Cluster Cosine Similarity,Measures the intra-cluster similarity of a clustering model using cosine similarity....,False,True,"['model', 'dataset']",{},"['sklearn', 'model_performance', 'clustering']",['clustering']
validmind.model_validation.sklearn.ClusterPerformanceMetrics,Cluster Performance Metrics,Evaluates the performance of clustering machine learning models using multiple established metrics....,False,True,"['model', 'dataset']",{},"['sklearn', 'model_performance', 'clustering']",['clustering']
validmind.model_validation.sklearn.CompletenessScore,Completeness Score,Evaluates a clustering model's capacity to categorize instances from a single class into the same cluster....,False,True,"['model', 'dataset']",{},"['sklearn', 'model_performance', 'clustering']",['clustering']


Use the `task` parameter to find tests that match a specific task type, such as `classification`:




In [7]:
list_tests(task="classification")

ID,Name,Description,Has Figure,Has Table,Required Inputs,Params,Tags,Tasks
validmind.data_validation.BivariateScatterPlots,Bivariate Scatter Plots,Generates bivariate scatterplots to visually inspect relationships between pairs of numerical predictor variables...,True,False,['dataset'],{},"['tabular_data', 'numerical_data', 'visualization']",['classification']
validmind.data_validation.ChiSquaredFeaturesTable,Chi Squared Features Table,Assesses the statistical association between categorical features and a target variable using the Chi-Squared test....,False,True,['dataset'],"{'p_threshold': {'type': '_empty', 'default': 0.05}}","['tabular_data', 'categorical_data', 'statistical_test']",['classification']
validmind.data_validation.ClassImbalance,Class Imbalance,Evaluates and quantifies class distribution imbalance in a dataset used by a machine learning model....,True,True,['dataset'],"{'min_percent_threshold': {'type': 'int', 'default': 10}}","['tabular_data', 'binary_classification', 'multiclass_classification', 'data_quality']",['classification']
validmind.data_validation.DatasetDescription,Dataset Description,Provides comprehensive analysis and statistical summaries of each column in a machine learning model's dataset....,False,True,['dataset'],{},"['tabular_data', 'time_series_data', 'text_data']","['classification', 'regression', 'text_classification', 'text_summarization']"
validmind.data_validation.DatasetSplit,Dataset Split,"Evaluates and visualizes the distribution proportions among training, testing, and validation datasets of an ML...",False,True,['datasets'],{},"['tabular_data', 'time_series_data', 'text_data']","['classification', 'regression', 'text_classification', 'text_summarization']"
validmind.data_validation.DescriptiveStatistics,Descriptive Statistics,Performs a detailed descriptive statistical analysis of both numerical and categorical data within a model's...,False,True,['dataset'],{},"['tabular_data', 'time_series_data', 'data_quality']","['classification', 'regression']"
validmind.data_validation.Duplicates,Duplicates,"Tests dataset for duplicate entries, ensuring model reliability via data quality verification....",False,True,['dataset'],"{'min_threshold': {'type': '_empty', 'default': 1}}","['tabular_data', 'data_quality', 'text_data']","['classification', 'regression']"
validmind.data_validation.FeatureTargetCorrelationPlot,Feature Target Correlation Plot,Visualizes the correlation between input features and the model's target output in a color-coded horizontal bar...,True,False,['dataset'],"{'fig_height': {'type': '_empty', 'default': 600}}","['tabular_data', 'visualization', 'correlation']","['classification', 'regression']"
validmind.data_validation.HighCardinality,High Cardinality,Assesses the number of unique values in categorical columns to detect high cardinality and potential overfitting....,False,True,['dataset'],"{'num_threshold': {'type': 'int', 'default': 100}, 'percent_threshold': {'type': 'float', 'default': 0.1}, 'threshold_type': {'type': 'str', 'default': 'percent'}}","['tabular_data', 'data_quality', 'categorical_data']","['classification', 'regression']"
validmind.data_validation.HighPearsonCorrelation,High Pearson Correlation,Identifies highly correlated feature pairs in a dataset suggesting feature redundancy or multicollinearity....,False,True,['dataset'],"{'max_threshold': {'type': 'float', 'default': 0.3}, 'top_n_correlations': {'type': 'int', 'default': 10}, 'feature_columns': {'type': 'list', 'default': None}}","['tabular_data', 'data_quality', 'correlation']","['classification', 'regression']"


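Use the `tags` parameter to find tests based on their tags. When you pass more than one tag, such as `model_performance` and `visualization`, every test returned in the output below carries all of the listed tags: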

In [None]:
list_tests(tags=["model_performance", "visualization"])

ID,Name,Description,Has Figure,Has Table,Required Inputs,Params,Tags,Tasks
validmind.model_validation.RegressionResidualsPlot,Regression Residuals Plot,Evaluates regression model performance using residual distribution and actual vs. predicted plots....,True,False,"['model', 'dataset']","{'bin_size': {'type': 'float', 'default': 0.1}}","['model_performance', 'visualization']",['regression']
validmind.model_validation.sklearn.ConfusionMatrix,Confusion Matrix,Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...,True,False,"['dataset', 'model']","{'threshold': {'type': 'float', 'default': 0.5}}","['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']","['classification', 'text_classification']"
validmind.model_validation.sklearn.PrecisionRecallCurve,Precision Recall Curve,Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....,True,False,"['model', 'dataset']",{},"['sklearn', 'binary_classification', 'model_performance', 'visualization']","['classification', 'text_classification']"
validmind.model_validation.sklearn.ROCCurve,ROC Curve,Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...,True,False,"['model', 'dataset']",{},"['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']","['classification', 'text_classification']"
validmind.model_validation.sklearn.TrainingTestDegradation,Training Test Degradation,Tests if model performance degradation between training and test datasets exceeds a predefined threshold....,False,True,"['datasets', 'model']","{'max_threshold': {'type': 'float', 'default': 0.1}}","['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']","['classification', 'text_classification']"
validmind.ongoing_monitoring.CalibrationCurveDrift,Calibration Curve Drift,Evaluates changes in probability calibration between reference and monitoring datasets....,True,True,"['datasets', 'model']","{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}","['sklearn', 'binary_classification', 'model_performance', 'visualization']","['classification', 'text_classification']"
validmind.ongoing_monitoring.ROCCurveDrift,ROC Curve Drift,Compares ROC curves between reference and monitoring datasets....,True,False,"['datasets', 'model']",{},"['sklearn', 'binary_classification', 'model_performance', 'visualization']","['classification', 'text_classification']"


Use `filter`, `task`, and `tags` together to create more specific queries.

For example, apply all three to find tests that match the `sklearn` keyword, are designed for `classification` tasks, and are tagged with both `model_performance` and `visualization`:

In [None]:
list_tests(
    filter="sklearn",
    tags=["model_performance", "visualization"],
    task="classification",
)

ID,Name,Description,Has Figure,Has Table,Required Inputs,Params,Tags,Tasks
validmind.model_validation.sklearn.ConfusionMatrix,Confusion Matrix,Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...,True,False,"['dataset', 'model']","{'threshold': {'type': 'float', 'default': 0.5}}","['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']","['classification', 'text_classification']"
validmind.model_validation.sklearn.PrecisionRecallCurve,Precision Recall Curve,Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....,True,False,"['model', 'dataset']",{},"['sklearn', 'binary_classification', 'model_performance', 'visualization']","['classification', 'text_classification']"
validmind.model_validation.sklearn.ROCCurve,ROC Curve,Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...,True,False,"['model', 'dataset']",{},"['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']","['classification', 'text_classification']"
validmind.model_validation.sklearn.TrainingTestDegradation,Training Test Degradation,Tests if model performance degradation between training and test datasets exceeds a predefined threshold....,False,True,"['datasets', 'model']","{'max_threshold': {'type': 'float', 'default': 0.1}}","['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']","['classification', 'text_classification']"
validmind.ongoing_monitoring.CalibrationCurveDrift,Calibration Curve Drift,Evaluates changes in probability calibration between reference and monitoring datasets....,True,True,"['datasets', 'model']","{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}","['sklearn', 'binary_classification', 'model_performance', 'visualization']","['classification', 'text_classification']"
validmind.ongoing_monitoring.ROCCurveDrift,ROC Curve Drift,Compares ROC curves between reference and monitoring datasets....,True,False,"['datasets', 'model']",{},"['sklearn', 'binary_classification', 'model_performance', 'visualization']","['classification', 'text_classification']"
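
To make the combined behavior concrete, here is a small stand-alone model of the filtering logic, inferred only from the outputs shown above. It is not how `list_tests()` is implemented. The `select_tests` function is hypothetical, and the rule that a keyword matches either the test ID or a tag is an assumption:

```python
# Three rows copied from the outputs above: (test ID, tags, tasks)
catalog = [
    (
        "validmind.model_validation.sklearn.ConfusionMatrix",
        ["sklearn", "binary_classification", "multiclass_classification",
         "model_performance", "visualization"],
        ["classification", "text_classification"],
    ),
    (
        "validmind.model_validation.RegressionResidualsPlot",
        ["model_performance", "visualization"],
        ["regression"],
    ),
    (
        "validmind.model_validation.sklearn.AdjustedRandIndex",
        ["sklearn", "model_performance", "clustering"],
        ["clustering"],
    ),
]


def select_tests(catalog, keyword=None, task=None, tags=None):
    """Return IDs of entries that satisfy every criterion supplied.

    Illustrative model only; hypothetical, not part of the ValidMind Library.
    """
    selected = []
    for test_id, test_tags, test_tasks in catalog:
        # Assumption: a keyword matches when it appears in the ID or in any tag
        if keyword and not (
            keyword in test_id or any(keyword in tag for tag in test_tags)
        ):
            continue
        if task and task not in test_tasks:
            continue
        # Every requested tag must be present on the test
        if tags and not all(tag in test_tags for tag in tags):
            continue
        selected.append(test_id)
    return selected


by_tags = select_tests(catalog, tags=["model_performance", "visualization"])
combined = select_tests(
    catalog,
    keyword="sklearn",
    task="classification",
    tags=["model_performance", "visualization"],
)
print(by_tags)
print(combined)
```

Each criterion narrows the result further, so adding the keyword and task to the tag filter leaves only the confusion matrix test in this small catalog.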


<a id='toc6_'></a>

## Store test sets for use

Once you've identified specific sets of tests you'd like to run, you can store the tests in variables, enabling you to easily reuse those tests in later steps.

For example, if you're validating a summarization model, use [`list_tests()`](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) with `task="text_summarization"` to retrieve all tests for that task type. Passing `pretty=False` returns the test IDs as a plain Python list rather than a formatted table, which you can save to `text_summarization_tests` for later use:


In [None]:
text_summarization_tests = list_tests(task="text_summarization", pretty=False)
text_summarization_tests

['validmind.data_validation.DatasetDescription',
 'validmind.data_validation.DatasetSplit',
 'validmind.data_validation.nlp.CommonWords',
 'validmind.data_validation.nlp.Hashtags',
 'validmind.data_validation.nlp.LanguageDetection',
 'validmind.data_validation.nlp.Mentions',
 'validmind.data_validation.nlp.Punctuations',
 'validmind.data_validation.nlp.StopWords',
 'validmind.data_validation.nlp.TextDescription',
 'validmind.model_validation.BertScore',
 'validmind.model_validation.BleuScore',
 'validmind.model_validation.ContextualRecall',
 'validmind.model_validation.MeteorScore',
 'validmind.model_validation.RegardScore',
 'validmind.model_validation.RougeScore',
 'validmind.model_validation.TokenDisparity',
 'validmind.model_validation.ToxicityScore',
 'validmind.model_validation.embeddings.CosineSimilarityComparison',
 'validmind.model_validation.embeddings.CosineSimilarityHeatmap',
 'validmind.model_validation.embeddings.EuclideanDistanceComparison',
 'validmind.model_validation.
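
Because a stored test set is an ordinary list of ID strings, you can reorganize it with standard Python before reusing it. The sketch below groups a few IDs copied from the list above by their category segment and then narrows the set by prefix. The variable names are hypothetical, and the assumption that the second dot-separated segment names the category is based only on the IDs shown in this notebook:

```python
from collections import defaultdict

# A few IDs copied from the stored list above
stored_tests = [
    "validmind.data_validation.DatasetDescription",
    "validmind.data_validation.DatasetSplit",
    "validmind.data_validation.nlp.CommonWords",
    "validmind.data_validation.nlp.Hashtags",
    "validmind.model_validation.BertScore",
    "validmind.model_validation.embeddings.CosineSimilarityHeatmap",
]

by_category = defaultdict(list)
for test_id in stored_tests:
    # The segment after "validmind." names the category, e.g. "data_validation"
    category = test_id.split(".")[1]
    # The last segment is the test's own name
    by_category[category].append(test_id.rsplit(".", 1)[-1])

for category, names in by_category.items():
    print(category, names)

# Keep only the NLP data validation tests for a later, narrower run
nlp_tests = [t for t in stored_tests if t.startswith("validmind.data_validation.nlp.")]
print(nlp_tests)
```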

<a id='toc7_'></a>

## Next steps

Now that you know how to browse and filter tests in the ValidMind Library, you’re ready to take the next step. Use the test IDs you’ve selected to either run individual tests or batch run them with custom test suites.

<div class="alert alert-block alert-info" style="background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;"><span style="color: #083E44;"><b>Learn about the test suites available in the ValidMind Library.</b></span>
<br></br>
Check out our <a href="https://docs.validmind.ai/notebooks/how_to/explore_test_suites.html" style="color: #DE257E;"><b>Explore test suites</b></a> notebook for more code examples and usage of key functions.</div>

<a id='toc7_1_'></a>

### Discover more learning resources

We offer many interactive notebooks to help you document models:

- [Run tests & test suites](https://docs.validmind.ai/developer/model-testing/testing-overview.html)
- [Code samples](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)

Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind.
