diff --git a/README.md b/README.md index 775542ed9a..d35a5c978f 100644 --- a/README.md +++ b/README.md @@ -174,6 +174,19 @@ The script reads from: Output: Content is injected directly into `site/guide/templates/customize-document-templates.qmd` between marker comments. +#### Chatbot product map and LLM corpus + +The in-app assistant (Valerie) uses generated files under `site/llm/`, including `chatbot-product-map.md` (platform routes mapped to docs URLs and section headings). CI regenerates that map and fails if it is out of date with your changes. + +If you edit `.qmd` files that affect linked docs or headings (for example FAQ or guide pages referenced from the product UI), regenerate and commit the map before opening or updating a pull request: + +```bash +cd site +make generate-chatbot-product-map +``` + +If product routes or in-app help links changed, use `make refresh-chatbot-product-map` instead (requires a local `validmind/frontend` checkout). See [`site/llm/README.md`](site/llm/README.md) for the full LLM render pipeline, snapshot maintenance, and when to refresh each artifact. 
+ #### Stylesheet organization (IN PROGRESS) The site uses a modular stylesheet architecture to maintain organized and maintainable styles: diff --git a/site/_quarto.yml b/site/_quarto.yml index eba558917b..977562cd82 100644 --- a/site/_quarto.yml +++ b/site/_quarto.yml @@ -103,7 +103,7 @@ website: - text: "Library and platform" file: about/library-and-platform.qmd contents: - - about/overview-model-documentation.qmd + - about/overview-documentation.qmd - about/overview-llm-features.qmd - text: "Deployment options" file: about/deployment/deployment-options.qmd diff --git a/site/about/contributing/style-guide/conventions.qmd b/site/about/contributing/style-guide/conventions.qmd index b9656e55f7..17fc6e2c78 100644 --- a/site/about/contributing/style-guide/conventions.qmd +++ b/site/about/contributing/style-guide/conventions.qmd @@ -125,7 +125,7 @@ Column 2, 50% wide Sometimes, it's helpful to highlight a call to action with a button that takes you to a topic or to a notebook on JupyterHub. 
-Change any Markdown link into a our theme-styled button by appending `{.button}`: +Change any Markdown link into one of our theme-styled buttons by appending `{.button}`: :::: {.flex .flex-wrap .justify-around} @@ -167,8 +167,8 @@ Using a markdown button also enables you to link to to the `.qmd` path instead o ```markdown - The record is registered in the inventory.[^1] - - You've already customized your model lifecycle statuses for use in workflows.[^2] - - Workflows have already been set up for use with your models.[^3] + - You've already customized your record stages for use in workflows.[^2] + - Workflows have already been set up for use with your records.[^3] - You are assigned a role that has access to complete actions set up by workflows.[^5] @@ -344,7 +344,7 @@ Use backticks to enclose keyboard commands, parameters, field values, and file n | Correct | Incorrect | |------|-----| -| Learn how to store model identifier credentials in a `.env` file instead of using inline credentials. | Learn how to store model identifier credentials in a ".env" file instead of using inline credentials. | +| Learn how to store record identifier credentials in a `.env` file instead of using inline credentials. | Learn how to store record identifier credentials in a ".env" file instead of using inline credentials. | | For example, the `classifier_full_suite` test suite runs tests from the `tabular_dataset` and `classifier` test suites to fully document the data and model sections for binary classification model use cases. | For example, the "classifier_full_suite" test suite runs tests from the "tabular_dataset" and "classifier" test suites to fully document the data and model sections for binary classification model use cases. | | Under When these conditions are met, you are able to set both `AND` and `OR` conditions. 
| Under When these conditions are met, you are able to set both "AND" and "OR" conditions.| : **Backtick** examples {.hover} @@ -380,7 +380,7 @@ Within our documentation (`https://docs.validmind.ai/`), you are able to referen | Product Name | Variable Key | Description | |---:|---|---| -| {{< var validmind.product >}} | `{{{< var validmind.product >}}}` | Comphrensive suite of tools with a {{< var vm.developer >}} for documenting and testing models, alongside a {{< var vm.platform >}} hosting cloud-based tools, APIs, databases, and validation engines. | +| {{< var validmind.product >}} | `{{{< var validmind.product >}}}` | Comprehensive suite of tools with a {{< var vm.developer >}} for documenting and testing records (such as models), alongside a {{< var vm.platform >}} hosting cloud-based tools, APIs, databases, and validation engines. | | {{< var validmind.developer >}} | `{{{< var validmind.developer >}}}` | Open-source library that connects to the {{< var validmind.platform >}}. | | {{< var validmind.platform >}} | `{{{< var validmind.platform >}}}` | Hosted multi-tenant architecture that includes a cloud-based web interface. | | {{< var validmind.api >}} | `{{{< var validmind.api >}}}` | Used to make calls to the {{< var validmind.developer >}}.[^21] | @@ -438,7 +438,7 @@ From **{{< fa gear >}} Settings** in the {{< var validmind.platform >}},
yo - Set up your organization - Onboard new users - Manage roles, groups and
permissions -- Configure the model inventory +- Configure the inventory - Manage templates and workflows - And much more! diff --git a/site/about/contributing/style-guide/voice-and-tone.qmd b/site/about/contributing/style-guide/voice-and-tone.qmd index 60b20dcbd4..f43495e50b 100644 --- a/site/about/contributing/style-guide/voice-and-tone.qmd +++ b/site/about/contributing/style-guide/voice-and-tone.qmd @@ -47,7 +47,7 @@ Behind every page, there’s a person. In every word, lies an opportunity to win | Correct | Incorrect | |------|-----| | **User acknowledgement:** Documenting artifacts can be difficult and tedious for even the most seasoned of validators. | **User dismissal:** For experienced validators, documenting artifacts is a breeze. | -| **Success toast:** Nice work — you’ve successfully registered your first model! | **Inappropriate humor:** We lost your model documentation, oops! Here, have a pony! (e.g. error message for serious issue) | +| **Success toast:** Nice work — you’ve successfully registered your first record! | **Inappropriate humor:** We lost your documentation, oops! Here, have a pony! (e.g. error message for serious issue) | : **Empathy & humor** examples {.hover} ### Be positive @@ -82,7 +82,7 @@ Address the reader directly by using the second person. | Correct | Incorrect | |------|-----| -| After completing this quickstart, you will be able to view your test results as part of your model documentation right in the {{< var validmind.platform >}}. | After completing this quickstart, the model developer will be able to view the test results as part of the model documentation right in the {{< var validmind.platform >}}. | +| After completing this quickstart, you will be able to view your test results as part of your documentation right in the {{< var validmind.platform >}}. | After completing this quickstart, the developer will be able to view the test results as part of the documentation right in the {{< var validmind.platform >}}. 
| : **2nd person** examples {.hover} ### Avoid stiff formality @@ -92,7 +92,7 @@ Address the reader directly by using the second person. | Correct | Incorrect | |------|-----| -| Once you’ve registered the model, you can then grab the unique code snippet that will have been generated for you to use in the next step. | First, you must register the model as this will generate a unique code snippet that needs to be copied. Then, you need to retrieve the code snippet so that you can make use of it in the following step. | +| Once you’ve registered the record, you can then grab the unique code snippet that will have been generated for you to use in the next step. | First, you must register the record as this will generate a unique code snippet that needs to be copied. Then, you need to retrieve the code snippet so that you can make use of it in the following step. | : **Informal language** examples {.hover} ### Focus on teamwork diff --git a/site/about/contributing/validmind-community.qmd b/site/about/contributing/validmind-community.qmd index bc0996e66b..8dc36167ef 100644 --- a/site/about/contributing/validmind-community.qmd +++ b/site/about/contributing/validmind-community.qmd @@ -11,7 +11,7 @@ aliases: - /about/join-community.html --- -Work with financial models, in model risk management (MRM), or are simply enthusiastic about artificial intelligence (AI) and machine learning and how these tools are actively shaping our futures within the finance industry and beyond? Congratulations — you're already part of the {{< var vm.product >}} community! Come learn and play with us. +Work with financial models, in model risk management (MRM), in AI governance, or are simply enthusiastic about artificial intelligence (AI) and machine learning and how these tools are actively shaping our futures within the finance industry and beyond? Congratulations — you're already part of the {{< var vm.product >}} community! Come learn and play with us. 
::: {.callout} diff --git a/site/about/deployment/deployment-options.qmd b/site/about/deployment/deployment-options.qmd index 9dab1db881..acb49defa8 100644 --- a/site/about/deployment/deployment-options.qmd +++ b/site/about/deployment/deployment-options.qmd @@ -26,23 +26,23 @@ Choose the {{< var vm.product >}} deployment option that best suits your organiz ![{{< var vm.product >}} architecture overview](validmind-architecture-overview.png){fig-alt="An image showing the ValidMind architecture"} -In your own environment, model developers can continue to run models using your existing tools for data science and model development, such as Python, Jupyter Notebooks, and R, accessing data from sources such as Google Cloud Storage, Amazon S3, and Snowflake. +In your own environment, developers can continue to run records (such as models) using your existing tools for data science and development, such as Python, Jupyter Notebooks, and R, accessing data from sources such as Google Cloud Storage, Amazon S3, and Snowflake. -These models are then integrated with the {{< var validmind.developer >}}, which communicates with the {{< var validmind.platform >}} via our {{< var validmind.api >}}. +These records are then integrated with the {{< var validmind.developer >}}, which communicates with the {{< var validmind.platform >}} via our {{< var validmind.api >}}. The {{< var validmind.platform >}} provides: -- **Model inventory** — Centralized tracking and organization of models, accessible by developers, validators, and executives. +- **Inventory** — Centralized tracking and organization of records, accessible by developers, validators, and executives. - **Documentation & validation engine** — Automated testing and documentation, with validation processes, ensuring compliance with regulations and internal policies. - **Template management** — Allows for easy creation, customization, and reuse of document templates. 
-- **{{< var vm.product >}} dashboard** — A user-friendly interface providing insights, status updates, and governance reporting for model risk. +- **{{< var vm.product >}} dashboard** — A user-friendly interface providing insights, status updates, and governance reporting for risk. ## Security & data privacy -We ensure data security through strong data isolation, encryption, and role-based access controls.[^1] Personal identifiable information and customer data are not stored in model documentation. For more information, see our data privacy policy.[^2] +We ensure data security through strong data isolation, encryption, and role-based access controls.[^1] Personally identifiable information and customer data are not stored in documentation. For more information, see our data privacy policy.[^2] ## Secure access diff --git a/site/about/deployment/system-access-requirements.qmd b/site/about/deployment/system-access-requirements.qmd index 9e0275968d..37586d99bf 100644 --- a/site/about/deployment/system-access-requirements.qmd +++ b/site/about/deployment/system-access-requirements.qmd @@ -10,7 +10,7 @@ Allow list the following domains in your organization’s firewall to ensure you ## ValidMind Library Python API access -To use our documentation automation tools and test suites for model developers and validators: +To use our documentation automation tools and test suites for developers and validators: ```html *.validmind.ai diff --git a/site/about/fine-print/data-privacy-policy.qmd b/site/about/fine-print/data-privacy-policy.qmd index 734d2c7f6a..2bb274fdc2 100644 --- a/site/about/fine-print/data-privacy-policy.qmd +++ b/site/about/fine-print/data-privacy-policy.qmd @@ -38,16 +38,16 @@ Understanding our policies shouldn’t feel like deciphering code, so we’ve ma The key points of our data privacy policy include: -- **No personal identifiable information in documentation** — When the {{< var validmind.developer >}} generates documentation, it ensures that no personally
identifiable information (PII) is included. This practice is a critical part of our commitment to protecting your privacy and maintaining the confidentiality of your data. +- **No personally identifiable information in documentation** — When the {{< var validmind.developer >}} generates documentation, it ensures that no personally identifiable information (PII) is included. This practice is a critical part of our commitment to protecting your privacy and maintaining the confidentiality of your data. -- **No storage of customer data** — {{< var vm.product >}} does not retain any customer datasets or models. This policy is in place in order to protect your data privacy and security. By not storing this information, {{< var vm.product >}} minimizes the risk of unauthorized access or data breaches. +- **No storage of customer data** — {{< var vm.product >}} does not retain any customer datasets or records (models). This policy is in place in order to protect your data privacy and security. By not storing this information, {{< var vm.product >}} minimizes the risk of unauthorized access or data breaches. We believe it is important for users of {{< var vm.product >}}'s products to understand these practices as they reflect our dedication to data security and privacy. ::: {.callout-important} ## {{< var vm.product >}} does NOT: -- Include any personal identifiable information (PII) when generating documentation reports. -- Store any customer datasets or models. +- Include any personally identifiable information (PII) when generating documentation reports. +- Store any customer datasets or records (models). ::: ## Do you comply with the SOC 2 security standard? 
@@ -64,13 +64,13 @@ The {{< var validmind.vpv >}} option provides all our features and services but Access is available through AWS PrivateLink, Azure Private Link, or GCP Private Service Connect, all of which provide private connectivity between {{< var vm.product >}} and your on-premises network without exposing your traffic to the public internet. -## What model assets are imported into documentation? +## What record (model) assets are imported into documentation? When you generate documentation or run tests, {{< var vm.product >}} imports the following assets into the documentation via our {{< var validmind.api >}} endpoint integration: ![Artifacts imported into the documentation via our {{< var vm.api >}}](overview-api-integration.jpg){width=80% fig-alt="A representation of assets imported into the documentation via our Python API"} -- Metadata about datasets and models, used to look up programmatic documentation content, such as the stored definition for _common logistic regression limitations_ when a logistic regression model has been passed to the {{< var vm.product >}} test suite to be run. +- Metadata about datasets and records, used to look up programmatic documentation content, such as the stored definition for _common logistic regression limitations_ when a logistic regression model has been passed to the {{< var vm.product >}} test suite to be run. - Quality and performance metrics collected from datasets and models. - Output from tests and test suites that have been run. - Images, plots, visuals that were generated as part of extracting metrics and running tests. diff --git a/site/about/glossary/_ai-governance.qmd b/site/about/glossary/_ai-governance.qmd new file mode 100644 index 0000000000..4b9324e7c6 --- /dev/null +++ b/site/about/glossary/_ai-governance.qmd @@ -0,0 +1,42 @@ + + +AI ethics +: A set of principles and practices guiding the responsible design, development, and deployment of AI systems. 
Common tenets include fairness, transparency, accountability, privacy, and human well-being. + +AI lifecycle +: The end-to-end stages an AI system progresses through, including problem framing, data collection, model development, validation, deployment, monitoring, and retirement. Each stage carries distinct governance requirements. + +AI risk +: The potential for adverse outcomes — financial, reputational, ethical, regulatory, or societal — arising from the design, deployment, or use of AI systems. AI risk extends beyond traditional model risk to include concerns such as bias, opacity, misuse, and unintended consequences. + +algorithmic accountability +: The principle that organizations must take responsibility for the outcomes of the AI systems they deploy, including documenting decisions, monitoring performance, and providing mechanisms to identify and remediate harm. + +bias, algorithmic bias +: Systematic errors or unfair outcomes in AI system results that disproportionately affect specific groups. Sources include unrepresentative training data, flawed assumptions in system design, or feedback loops introduced during deployment. Detecting and mitigating bias is a core AI governance activity. + +EU AI Act +: A regulatory framework introduced by the European Union that classifies AI systems by risk tier^[**European Union:** [Regulation (EU) 2024/1689: Artificial Intelligence Act](https://eur-lex.europa.eu/eli/reg/2024/1689/oj)] — prohibited, high-risk, limited-risk, and minimal-risk — and imposes proportionate obligations such as risk management, data governance, transparency, human oversight, and conformity assessment. + +explainability +: The degree to which the internal mechanics or outputs of an AI system can be understood by humans. Explainability is a core requirement for high-risk AI systems and supports accountability, debugging, and regulatory review. 
+ +fairness +: The principle that AI systems should produce equitable outcomes across individuals and groups. Fairness assessments are a routine part of bias evaluation and impact assessment within AI governance programs. + +ISO/IEC 42001 +: An international management system standard for artificial intelligence published jointly by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). Provides requirements for establishing, implementing, maintaining, and continually improving an AI management system within an organization. + +model card, system card +: A standardized document that summarizes an AI system's intended use, training data, performance characteristics, limitations, and ethical considerations. Model and system cards support transparency and informed deployment decisions.^[**Refer also to:** [documentation](#documentation)] + +NIST AI Risk Management Framework (AI RMF) +: A voluntary framework published by the U.S. National Institute of Standards and Technology to help organizations manage risks associated with AI. Organized around four core functions: govern, map, measure, and manage. + +responsible AI +: An umbrella approach to designing, building, and deploying AI systems in ways that are ethical, transparent, accountable, fair, and aligned with human values and societal expectations. + +transparency +: The disclosure of meaningful information about an AI system's design, data, capabilities, limitations, and decision-making processes to relevant stakeholders. Transparency supports trust, accountability, and informed oversight. diff --git a/site/about/glossary/_ai.qmd b/site/about/glossary/_ai.qmd index 5ef7bba2ff..b2b90b0106 100644 --- a/site/about/glossary/_ai.qmd +++ b/site/about/glossary/_ai.qmd @@ -5,40 +5,48 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> Refer to [IBM's series on artificial intelligence](https://www.ibm.com/think/artificial-intelligence) for more in-depth resources.
AI governance -: The organizational framework for directing and overseeing how AI is designed, deployed, and used. It sets policy, accountability, and decision rights, covering ethics, compliance, risk appetite, lifecycle controls, and ongoing oversight across people, process, and technology. +: The organizational framework for directing and overseeing how AI is designed, deployed, and used. It sets policy, accountability, and decision rights, covering ethics, compliance, risk appetite, lifecycle controls, and ongoing oversight across people, process, and technology.^[**Refer also to:** [AI governance](#ai-governance)] AI system -: A combination of software, algorithms, and data designed to perform tasks that typically require human intelligence. In AI governance, an AI system is the primary unit of management, distinct from individual models. +: A combination of software, algorithms, and data designed to perform tasks that typically require human intelligence. In AI governance, an AI system is the primary unit of management, distinct from individual records (such as models).[^ai-system] AI use case : A specific application or deployment of AI technology to solve a business problem or achieve an objective. Use cases are often the unit of oversight in AI governance frameworks. artificial intelligence (AI) -: Artificial intelligence is a broad term used to classify machines that mimic human intelligence and human cognitive functions like problem-solving and learning. +: Artificial intelligence is a broad term used to classify machines that mimic human intelligence and human cognitive functions like problem-solving and learning. deep-learning -: A subset of machine learning that uses multi-layered neural networks (deep neural networks) to simulate the complex decision-making power of the human brain. +: A subset of machine learning that uses multi-layered neural networks (deep neural networks) to simulate the complex decision-making power of the human brain. 
generative AI (GenAI) : Generative AI refers to deep-learning models that can generate high-quality text, images, and other content based on the data they were trained on. human oversight -: Controls and processes ensuring human involvement in AI-driven decisions. Required by regulations like the EU AI Act for high-risk AI systems to enable human intervention and override capabilities. +: Controls and processes ensuring human involvement in AI-driven decisions. Required by regulations like the EU AI Act for high-risk AI systems to enable human intervention and override capabilities.^[**Refer also to:** [EU AI Act](./glossary.qmd#eu-ai-act)] impact assessment : An evaluation of the potential risks, harms, and consequences associated with deploying an AI system. Impact assessments are a core artifact in AI governance programs. large language model (LLM) -: Advanced types of artificial intelligence models designed to understand, generate, and interact with human language at a sophisticated level, such as ChatGPT.^[[ChatGPT](https://chat.openai.com)] +: An advanced type of artificial intelligence model designed to understand, generate, and interact with human language at a sophisticated level, such as ChatGPT.^[[ChatGPT](https://chat.openai.com)] -machine learning -: Machine learning is a subset of artificial intelligence that allows for optimization. It helps make predictions that minimize the errors that arise from merely guessing. +machine learning (ML) +: Machine learning is a subset of artificial intelligence that allows for optimization. It helps make predictions that minimize the errors that arise from merely guessing. risk tier -: A classification level assigned to an AI system based on its potential impact and risk. The EU AI Act defines tiers including prohibited, high-risk, limited-risk, and minimal-risk categories. +: A classification level assigned to an AI system based on its potential impact and risk. 
The EU AI Act defines tiers including prohibited, high-risk, limited-risk, and minimal-risk categories.^[**Refer also to:** [EU AI Act](./glossary.qmd#eu-ai-act)] -traditional statistical models -: Mathematical frameworks used to analyze and make inferences from data. These models are foundational in statistics and serve to explain relationships, predict outcomes, and guide decision-making across various fields, such as economics, biology, engineering, and social sciences. +traditional statistical model +: A mathematical framework used to analyze and make inferences from data. Traditional statistical models are foundational in statistics and serve to explain relationships, predict outcomes, and guide decision-making across various fields, such as economics, biology, engineering, and social sciences. use case owner : The individual accountable for an AI use case within an organization. Responsible for decisions about AI deployment, compliance, and ongoing oversight. + + + + +[^ai-system]: **Refer to:** + + - [record](#records) + - [model](#models) diff --git a/site/about/glossary/_attestation.qmd b/site/about/glossary/_attestation.qmd index 2ca29bdacf..75271ae84e 100644 --- a/site/about/glossary/_attestation.qmd +++ b/site/about/glossary/_attestation.qmd @@ -2,29 +2,29 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -attestation -: A formal process where attestation participants certify key model information at a specific time. Attestation is part of your audit trail and confirms that governance, documentation, and control requirements are met. +attestation +: A formal process where attestation participants certify key record (model) information at a specific time. Attestation is part of your audit trail and confirms that governance, documentation, and control requirements are met. 
attestation instance -: The invocation of the attestation process on the {{< var validmind.platform >}}. Created when the attestation is triggered by the schedule you set up, it includes a snapshot with record activity and artifacts, questionnaire responses and review status, forming a full record of the review and approval process. +: The invocation of the attestation process on the {{< var validmind.platform >}}. Created when the attestation is triggered by the schedule you set up, it includes a snapshot with record activity and artifacts, questionnaire responses and review status, forming a full record of the review and approval process. attestation participant -: A user who participates in the attestation workflow as a submitter, reviewer, or approver. Submitters are assigned from model stakeholders; reviewers and approvers are assigned from organizational roles. +: A user who participates in the attestation workflow as a submitter, reviewer, or approver. Submitters are assigned from inventory record stakeholders; reviewers and approvers are assigned from organizational roles. -attestation period -: The time window during which attestation is active, with fixed start and end dates. Each period creates an unchanging model snapshot. Periods are usually scheduled quarterly or annually and can align with regulatory or internal cycles. +attestation period +: The time window during which attestation is active, with fixed start and end dates. Each period creates an unchanging record (model) snapshot. Periods are usually scheduled quarterly or annually and can align with regulatory or internal cycles. -attestation questionnaire -: A structured form that submitters use to confirm model status, documentation and compliance. It supports formatted inputs like checkboxes and text fields, serving as both a compliance check and formal review record. 
+attestation questionnaire +: A structured form that submitters use to confirm record (model) status, documentation and compliance. It supports formatted inputs like checkboxes and text fields, serving as both a compliance check and formal review record. -execution schedule +execution schedule : The mechanism, manual or automated, that starts the attestation process based on set periods. It creates attestation instances, triggers snapshots and begins the workflow for attestation participants. group -: An organizational unit that associates models with specific teams or functions. When reviewers or approvers are assigned by role, they can only act on models within groups they belong to — resulting in one attestation submission per model owner per group. +: An organizational unit that associates records (models) with specific teams or functions. When reviewers or approvers are assigned by role, they can only act on records within groups they belong to — resulting in one attestation submission per owner per group. inventory scope -: The filter conditions that define which models are included in an attestation. Scope can be set using rules based on model fields, stages, or custom attributes. +: The filter conditions that define which records (models) are included in an attestation. Scope can be set using rules based on fields, stages, or custom attributes. -snapshot -: A fixed capture of model data at a specific time. It includes optional custom fields and related artifacts and stays unchanged throughout the attestation, ensuring historical accuracy. \ No newline at end of file +snapshot +: A fixed capture of record (model) data at a specific time. It includes optional custom fields and related artifacts and stays unchanged throughout the attestation, ensuring historical accuracy. 
\ No newline at end of file diff --git a/site/about/glossary/_developer-tools.qmd b/site/about/glossary/_developer-tools.qmd index 0381ba6ff1..9dd5cb07bd 100644 --- a/site/about/glossary/_developer-tools.qmd +++ b/site/about/glossary/_developer-tools.qmd @@ -14,9 +14,9 @@ Decorators are a simpler way for users to run their own code as a {{< var vm.pro {{< include key_concepts/_parameters.qmd >}} pip -: A package manager for Python, used to install and manage software packages written in the Python programming language. +: A package manager for Python, used to install and manage software packages written in the Python programming language. -{{< var vm.product >}} uses the `pip` command to install the Python client library that is part of the {{< var validmind.developer >}} so that model developers can make use of its features. +{{< var vm.product >}} uses the `pip` command to install the Python client library that is part of the {{< var validmind.developer >}} so that developers can make use of its features. JupyterHub : A multi-user server provides a platform for users to interactively work with data science and scientific computing tools in a collaborative environment. @@ -33,4 +33,4 @@ Jupyter Notebook GitHub : A cloud-based platform that provides hosting for software development and version control using Git. GitHub^[[GitHub](https://github.com/)] offers collaboration tools such as bug tracking, feature requests, task management, and continuous integration pipelines. -{{< var vm.product >}} uses GitHub to share [pen-source software^[**GitHub:** [validmind](https://github.com/validmind/)] with you. +{{< var vm.product >}} uses GitHub to share open-source software^[**GitHub:** [validmind](https://github.com/validmind/)] with you. 
diff --git a/site/about/glossary/_documentation.qmd b/site/about/glossary/_documentation.qmd new file mode 100644 index 0000000000..fc4a038cd3 --- /dev/null +++ b/site/about/glossary/_documentation.qmd @@ -0,0 +1,23 @@ + + + + +{{< include documentation/_doc-intro.qmd >}} + +{{< include documentation/_conceptual-soundness.qmd >}} + +{{< include documentation/_data-preparation.qmd >}} + +{{< include documentation/_model-development.qmd >}} + +{{< include documentation/_monitoring-governance.qmd >}} + diff --git a/site/about/glossary/_model-documentation.qmd b/site/about/glossary/_model-documentation.qmd deleted file mode 100644 index b0c983236b..0000000000 --- a/site/about/glossary/_model-documentation.qmd +++ /dev/null @@ -1,23 +0,0 @@ - - - - -{{< include model_documentation/_doc-intro.qmd >}} - -{{< include model_documentation/_conceptual-soundness.qmd >}} - -{{< include model_documentation/_data-preparation.qmd >}} - -{{< include model_documentation/_model-development.qmd >}} - -{{< include model_documentation/_monitoring-governance.qmd >}} - diff --git a/site/about/glossary/_models.qmd b/site/about/glossary/_models.qmd index b3a72d4e68..8c9ce74f80 100644 --- a/site/about/glossary/_models.qmd +++ b/site/about/glossary/_models.qmd @@ -4,14 +4,13 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> #### Models -model -: Under SR 26-2^[[SR 26-2: Interagency Guidance on Model Risk Management for Banking Organizations](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm)] (which supersedes SR 11-7), banking organizations focus MRM on *complex quantitative methods* that apply statistical, economic, or financial theories, techniques, and assumptions to produce estimates or inferences that inform business decisions. Simple arithmetic, purely deterministic rules without substantive quantitative theory, or software that does not apply such theories are generally outside the guidance’s model definition. 
+{{< include /about/glossary/key_concepts/_docs.qmd >}} + +{{< include /about/glossary/key_concepts/_models.qmd >}} model development : An iterative process in which many models are derived, tested, and built upon until a model fitting the desired criteria is achieved. -{{< include key_concepts/_docs.qmd >}} - :::: {.content-visible when-format="html" when-meta="includes.glossary"} -model inventory^[**Refer also to:** [{{< var vm.product >}} model inventory](./glossary.qmd#platform-model-inventory)] +model inventory^[**Refer also to:** [inventory](./glossary.qmd#inventory)] : A systematic and organized record of all quantitative and qualitative models used within an organization. This inventory facilitates oversight, tracking, and assessment by listing each model's purpose, characteristics, owners, validation status, and associated risks. :::: diff --git a/site/about/glossary/_mrm.qmd b/site/about/glossary/_mrm.qmd index 4872d9f341..31835071a4 100644 --- a/site/about/glossary/_mrm.qmd +++ b/site/about/glossary/_mrm.qmd @@ -13,27 +13,27 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> 3rd line of defense : Typically an internal audit function responsible for providing an independent and comprehensive review of the risk management processes and controls that the first two lines have implemented. -model developer +model developer, developer : Responsible for the design, implementation, and maintenance of models to ensure they are fit-for-purpose, accurate, and aligned with business requirements. As subject matter experts, they collaborate with model validators and other business units, ensuring the models are conceptually sound and robust. -model governance +model governance, governance : A framework of policies, procedures, and standards established to oversee the lifecycle of models within an organization. 
Ensures that models are developed, validated, implemented, and retired in a controlled and consistent manner, promoting accountability, transparency, and adherence to regulatory requirements. -model implementation +model implementation, implementation : A collaborative effort among model developers and model owners. Model implementation includes a formalized implementation plan and associated procedures, a review of results, and a record of model change procedures. -model owner +model owner, owner : Responsible for coordinating model development, model implementation, ongoing model monitoring and maintaining the model’s administration, such as model documentation and model risk reporting. -model user +model user, user : Those who rely on the model’s outputs to inform business decisions. -model validation +model validation, validation : A systematic process to evaluate and verify that a model is performing as intended, accurately represents the phenomena it is designed to capture, and is appropriate for its specified purpose. This assessment encompasses a review of the model's conceptual soundness, data integrity, calibration, and performance outcomes, as well as testing against out-of-sample datasets. Within model risk management, model validation ensures that potential risks associated with model errors, misuse, or misunderstanding are identified and mitigated. -model validator +model validator, validator : Responsible for conducting independent assessments of models to ensure their accuracy, reliability, and appropriateness for intended purposes. The role involves evaluating a model's conceptual soundness, data integrity, calibration methods, and overall performance, typically using out-of-sample datasets. Model validators identify potential risks and weaknesses, ensuring that models within an organization meet established standards and regulatory requirements, and provide recommendations to model developers for improvements or modifications. 
diff --git a/site/about/glossary/_validmind-features.qmd b/site/about/glossary/_validmind-features.qmd index 2be12d5923..4d3cc46369 100644 --- a/site/about/glossary/_validmind-features.qmd +++ b/site/about/glossary/_validmind-features.qmd @@ -8,16 +8,26 @@ client library, Python client library : Enables the interaction of your development environment with the {{< var validmind.platform >}} as part of the {{< var validmind.developer >}}. content block -: Content blocks provide you with sections that are part of a template, and are used in model documentation, validation reports, ongoing monitoring reports, and custom document types.^[[Work with content blocks](/guide/documentation/work-with-content-blocks.qmd)] +: A modular document template component. Content blocks are used to populate text and test results in documentation, validation reports, ongoing monitoring reports, and custom document types.^[[Work with content blocks](/guide/documentation/work-with-content-blocks.qmd)] documentation automation -: A core benefit of {{< var vm.product >}} that allows for the automatic creation of model documentation using predefined templates and test suites. +: A core benefit of {{< var vm.product >}} that allows for the automatic creation of documentation using predefined templates and test suites.[^test-suite] -model inventory -: A feature of the {{< var validmind.platform >}} where you can track, manage, and oversee the lifecycle of models. Covers the full model lifecycle, including customizable approval workflows for different user roles, status and activity tracking, and periodic revalidation. +inventory +: A feature of the {{< var validmind.platform >}} where you can track, manage, and oversee the lifecycle of your records (such as models). Covers the full record lifecycle, including customizable approval workflows for different user roles, status and activity tracking, and periodic revalidation. 
+ +{{< include key_concepts/_records.qmd >}} {{< include key_concepts/_template.qmd >}} {{< include key_concepts/_test.qmd >}} -{{< include key_concepts/_test-suite.qmd >}} \ No newline at end of file +{{< include key_concepts/_test-suite.qmd >}} + + + + +[^test-suite]: **Refer to:** + + - [document template](/about/glossary/glossary.qmd#document-template) + - [test suite](/about/glossary/glossary.qmd#test-suite) \ No newline at end of file diff --git a/site/about/glossary/_validmind.qmd b/site/about/glossary/_validmind.qmd index fcba5f96cc..b9d29c8ff5 100644 --- a/site/about/glossary/_validmind.qmd +++ b/site/about/glossary/_validmind.qmd @@ -4,10 +4,10 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> #### {{< var validmind.product >}} -These two features are intertwined and work in tandem to help streamline your model lifecycle. +These two features are intertwined and work in tandem to help streamline your risk management lifecycles. {{< var validmind.developer >}} ({{< var vm.developer >}}) -: An open-source^[**{{< var vm.product >}} GitHub:** [`validmind-library`](https://github.com/validmind/validmind-library/)] suite of documentation tools and test suites designed to document models, test models for weaknesses, and identify overfit areas. Enables automating the generation of model documentation by uploading documentation and test results to the {{< var validmind.platform >}}. +: An open-source^[**{{< var vm.product >}} GitHub:** [`validmind-library`](https://github.com/validmind/validmind-library/)] suite of documentation tools and test suites designed to document records (such as models), test records for weaknesses, and identify overfit areas. Enables automating the generation of documentation by uploading documentation and test results to the {{< var validmind.platform >}}. 
{{< var validmind.platform >}} ({{< var vm.platform >}}) -: A hosted multi-tenant architecture^[[Log into {{< var vm.product >}}](/guide/access/log-in-to-validmind.qmd)] that includes the {{< var vm.product >}} cloud-based web interface, APIs, databases, documentation and validation engine, and various internal services. +: A hosted multi-tenant architecture^[[Log into {{< var vm.product >}}](/guide/access/log-in-to-validmind.qmd)] that includes the {{< var vm.product >}} cloud-based web interface, APIs, databases, documentation and validation engine, and various internal services. diff --git a/site/about/glossary/documentation/_conceptual-soundness.qmd b/site/about/glossary/documentation/_conceptual-soundness.qmd new file mode 100644 index 0000000000..1dae64104f --- /dev/null +++ b/site/about/glossary/documentation/_conceptual-soundness.qmd @@ -0,0 +1,6 @@ + + +conceptual soundness +: Establishes the foundation of a selected record (such as a model), covering the overview, intended use and business use case, regulatory requirements, limitations, and the rationale behind selection. It emphasizes purpose, scope, and constraints, which are crucial for stakeholders to understand applicability and limitations. diff --git a/site/about/glossary/model_documentation/_data-preparation.qmd b/site/about/glossary/documentation/_data-preparation.qmd similarity index 88% rename from site/about/glossary/model_documentation/_data-preparation.qmd rename to site/about/glossary/documentation/_data-preparation.qmd index 9a5b539e85..84646b6c75 100644 --- a/site/about/glossary/model_documentation/_data-preparation.qmd +++ b/site/about/glossary/documentation/_data-preparation.qmd @@ -3,4 +3,4 @@ Refer to the LICENSE file in the root of this repository for details. 
SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> data preparation -: Details the data description, including dataset summary, data quality tests, descriptive statistics, correlations and interactions, and feature selection and engineering. It provides transparency into the data used for training, ensuring that the record such as a model is built on a solid and relevant dataset. \ No newline at end of file +: Details the data description, including dataset summary, data quality tests, descriptive statistics, correlations and interactions, and feature selection and engineering. It provides transparency into the data used for training, ensuring that the record (such as a model) is built on a solid and relevant dataset. \ No newline at end of file diff --git a/site/about/glossary/model_documentation/_doc-intro.qmd b/site/about/glossary/documentation/_doc-intro.qmd similarity index 100% rename from site/about/glossary/model_documentation/_doc-intro.qmd rename to site/about/glossary/documentation/_doc-intro.qmd diff --git a/site/about/glossary/model_documentation/_model-development.qmd b/site/about/glossary/documentation/_model-development.qmd similarity index 81% rename from site/about/glossary/model_documentation/_model-development.qmd rename to site/about/glossary/documentation/_model-development.qmd index 5acf1242eb..1c7eafc727 100644 --- a/site/about/glossary/model_documentation/_model-development.qmd +++ b/site/about/glossary/documentation/_model-development.qmd @@ -3,4 +3,4 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> model development, development -: Discusses the training, evaluation, explainability, interpretability, and diagnosis, including weak spots, overfit regions, and robustness. This section is vital for understanding how the record such as a model was developed, how it performs, and its areas of strength and weakness. 
+: Discusses the training, evaluation, explainability, interpretability, and diagnosis, including weak spots, overfit regions, and robustness. This section is vital for understanding how the record (such as a model) was developed, how it performs, and its areas of strength and weakness. diff --git a/site/about/glossary/documentation/_monitoring-governance.qmd b/site/about/glossary/documentation/_monitoring-governance.qmd new file mode 100644 index 0000000000..05ccf5390e --- /dev/null +++ b/site/about/glossary/documentation/_monitoring-governance.qmd @@ -0,0 +1,6 @@ + + +monitoring and governance +: Focuses on the ongoing monitoring plan, implementation, and governance plan for a record (such as a model). It outlines strategies for maintaining performance over time and ensuring that the record remains compliant with regulatory requirements and ethical standards. \ No newline at end of file diff --git a/site/about/glossary/glossary.qmd b/site/about/glossary/glossary.qmd index cbf0c8ab03..43167cf368 100644 --- a/site/about/glossary/glossary.qmd +++ b/site/about/glossary/glossary.qmd @@ -17,32 +17,39 @@ includes: This glossary of terms provides short definitions for technical terms you find commonly used in our product documentation grouped by terms related to: - [{{< var vm.product >}}](#validmind) -- [Artificial intelligence](#artificial-intelligence) -- [Models and model risk management](#models-and-model-risk-management) -- [Model documentation](#model-documentation) +- [Artificial intelligence (AI) governance](#artificial-intelligence-ai-governance) +- [Model risk management](#model-risk-management) +- [Documentation](#documentation) - [Validation reports](#validation-reports) - [Ongoing monitoring](#ongoing-monitoring) - [Attestations](#attestations) - [Integrations](#integrations) - [Developer tools](#developer-tools) +

## {{< var vm.product >}} {{< include _validmind.qmd >}} {{< include _validmind-features.qmd >}} -## Artificial intelligence +## Artificial intelligence (AI) governance + +#### AI {{< include _ai.qmd >}} -## Models and model risk management +#### AI governance + +{{< include _ai-governance.qmd >}} + +## Model risk management {{< include _models.qmd >}} {{< include _mrm.qmd >}} -## Model documentation +## Documentation -{{< include _model-documentation.qmd >}} +{{< include _documentation.qmd >}} ## Validation reports diff --git a/site/about/glossary/key_concepts/_docs.qmd b/site/about/glossary/key_concepts/_docs.qmd index 0ff2fa8d76..886d9f2f20 100644 --- a/site/about/glossary/key_concepts/_docs.qmd +++ b/site/about/glossary/key_concepts/_docs.qmd @@ -2,7 +2,12 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -model documentation -: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. - -Within the realm of model risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model's application. \ No newline at end of file + + +documentation, model documentation +: A structured and detailed account of a record (such as a model), encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. + +:::: {.content-visible when-format="html" when-meta="includes.glossary"} +Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.
+ +:::: \ No newline at end of file diff --git a/site/about/glossary/key_concepts/_inputs.qmd b/site/about/glossary/key_concepts/_inputs.qmd index 0c45358b5e..bc148ade75 100644 --- a/site/about/glossary/key_concepts/_inputs.qmd +++ b/site/about/glossary/key_concepts/_inputs.qmd @@ -5,7 +5,19 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> inputs : Objects to be evaluated and documented in the {{< var validmind.developer >}}. They can be any of the following: - - **model**: A single model that has been initialized in {{< var vm.product >}}. Refer to the [`vm.init_model()` function](/validmind/validmind.qmd#init_model){target="_blank"} for more information. - - **dataset**: Single dataset that has been initialized in {{< var vm.product >}}. Refer to the [`vm.init_dataset()` function](/validmind/validmind.qmd#init_dataset){target="_blank"} for more information. - - **models**: A list of {{< var vm.product >}} models - usually this is used when you want to compare multiple models in your custom tests. - - **datasets**: A list of {{< var vm.product >}} datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb)) \ No newline at end of file +:::: {.content-visible when-format="html" when-meta="includes.glossary"} + - **model**: A single record (such as a model) that has been initialized in {{< var vm.product >}}. 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with {{< var vm.product >}}.^[**Refer to:** [`init_model()`](/validmind/validmind.qmd#init_model){target="_blank"}] + - **dataset**: A single dataset that has been initialized in {{< var vm.product >}}.^[**Refer to:** [`init_dataset()`](/validmind/validmind.qmd#init_dataset){target="_blank"}] + - **models**: A list of {{< var vm.product >}} records — usually this is used when you want to compare multiple records in your custom tests. + - **datasets**: A list of {{< var vm.product >}} datasets — usually this is used when you want to compare multiple datasets in your custom tests.^[**Learn more:** [Run tests with multiple datasets](/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb)] + +:::: + +:::: {.content-visible when-format="html" unless-meta="includes.glossary"} + - **model**: A single record (such as a model) that has been initialized in {{< var vm.product >}}. Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with {{< var vm.product >}}. Refer to the [`vm.init_model()` function](/validmind/validmind.qmd#init_model){target="_blank"} for more information. + - **dataset**: A single dataset that has been initialized in {{< var vm.product >}}. Refer to the [`vm.init_dataset()` function](/validmind/validmind.qmd#init_dataset){target="_blank"} for more information. + - **models**: A list of {{< var vm.product >}} records — usually this is used when you want to compare multiple records in your custom tests. + - **datasets**: A list of {{< var vm.product >}} datasets — usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb)) + +:::: + diff --git a/site/about/glossary/key_concepts/_key-concepts.qmd b/site/about/glossary/key_concepts/_key-concepts.qmd index 4e2fe9c64d..73a4f68bab 100644 --- a/site/about/glossary/key_concepts/_key-concepts.qmd +++ b/site/about/glossary/key_concepts/_key-concepts.qmd @@ -6,20 +6,28 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> 1. Create a new file under the `about/glossary/key_concepts` folder with the following structure `_concept.qmd` (the `_` is mandatory for Quarto to retrieve the file as a single-source embed: https://quarto.org/docs/authoring/includes.html) 2. Include it below with the structure `{{< include /about/glossary/key_concepts/_concept.qmd >}}` -3. In the `about/glossary` folder, locate the correct section file it belongs to (e.g. `_ai.qmd`) and embed it there as well in ABC order with the structure `{{< include key_concepts/_concept.qmd >}}` +3. In the `about/glossary` folder, locate the correct section file it belongs to (e.g. `_ai.qmd`) and embed it there as well in ABC order with the structure `{{< include key_concepts/_concept.qmd >}}` These instructions update the key concept on anywhere the key concepts are reference as well as within the glossary. 
--> +{{< include /about/glossary/key_concepts/_records.qmd >}} + +{{< include /about/glossary/key_concepts/_models.qmd >}} + {{< include /about/glossary/key_concepts/_docs.qmd >}} {{< include /about/glossary/key_concepts/_report.qmd >}} +{{< include /about/glossary/monitoring/_ongoing-monitoring.qmd >}} + {{< include /about/glossary/key_concepts/_template.qmd >}} {{< include /about/glossary/key_concepts/_test.qmd >}} +{{< include /about/glossary/key_concepts/_test-suite.qmd >}} + {{< include /about/glossary/key_concepts/_metrics.qmd >}} {{< include /about/glossary/key_concepts/_inputs.qmd >}} @@ -28,4 +36,3 @@ These instructions update the key concept on anywhere the key concepts are refer {{< include /about/glossary/key_concepts/_outputs.qmd >}} -{{< include /about/glossary/key_concepts/_test-suite.qmd >}} \ No newline at end of file diff --git a/site/about/glossary/key_concepts/_metrics.qmd b/site/about/glossary/key_concepts/_metrics.qmd index 53dd1223f7..d95bb3588d 100644 --- a/site/about/glossary/key_concepts/_metrics.qmd +++ b/site/about/glossary/key_concepts/_metrics.qmd @@ -2,7 +2,16 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> +:::: {.content-visible when-format="html" when-meta="includes.glossary"} metrics, custom metrics -: Metrics are a subset of tests that do not have thresholds. Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered via the {{< var validmind.developer >}} to be used with the {{< var validmind.platform >}}. +: Metrics are a subset of tests that do not have thresholds. Custom metrics are functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered via the {{< var validmind.developer >}} to be used with the {{< var validmind.platform >}}. 
-In the context of {{< var vm.product >}}'s Jupyter Notebooks, metrics and tests can be thought of as interchangeable concepts. \ No newline at end of file +In the context of {{< var vm.product >}}'s Jupyter Notebooks, metrics and tests can be thought of as interchangeable concepts.^[**Refer also to:** [test](/about/glossary/glossary.qmd#tests)] +:::: + +:::: {.content-visible when-format="html" unless-meta="includes.glossary"} +metrics, custom metrics +: Metrics are a subset of tests that do not have thresholds. Custom metrics are functions that you define to evaluate your record (such as a model) or dataset. These functions can be registered via the {{< var validmind.developer >}} to be used with the {{< var validmind.platform >}}. + +In the context of {{< var vm.product >}}'s Jupyter Notebooks, metrics and tests can be thought of as interchangeable concepts. +:::: \ No newline at end of file diff --git a/site/about/glossary/key_concepts/_models.qmd b/site/about/glossary/key_concepts/_models.qmd new file mode 100644 index 0000000000..15bdf932df --- /dev/null +++ b/site/about/glossary/key_concepts/_models.qmd @@ -0,0 +1,17 @@ + + +:::: {.content-visible when-format="html" when-meta="includes.glossary"} +model +: SR 26-2^[[SR 26-2: Interagency Guidance on Model Risk Management for Banking Organizations](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm)] (which supersedes SR 11-7) defines a model as a "complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates." Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
+ +Within {{< var vm.product >}}, a model is a type of record tracked in the inventory.^[**Refer also to:** [record](/about/glossary/glossary.qmd#records)] +:::: + +:::: {.content-visible when-format="html" unless-meta="includes.glossary"} +model +: SR 26-2 (which supersedes SR 11-7) defines a model as a "complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates." Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. + +Within {{< var vm.product >}}, a model is a type of record tracked in the inventory. +:::: \ No newline at end of file diff --git a/site/about/glossary/key_concepts/_records.qmd b/site/about/glossary/key_concepts/_records.qmd new file mode 100644 index 0000000000..174f514997 --- /dev/null +++ b/site/about/glossary/key_concepts/_records.qmd @@ -0,0 +1,16 @@ + + +:::: {.content-visible when-format="html" when-meta="includes.glossary"} +record +: A tool tracked in the {{< var validmind.platform >}} inventory,^[**Refer to:** [model](/about/glossary/glossary.qmd#models)] such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management. + +:::: + +:::: {.content-visible when-format="html" unless-meta="includes.glossary"} +record +: A tool tracked in the {{< var validmind.platform >}} inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management. 
+:::: + + diff --git a/site/about/glossary/key_concepts/_report.qmd b/site/about/glossary/key_concepts/_report.qmd index 42157729e9..c5bf7b6960 100644 --- a/site/about/glossary/key_concepts/_report.qmd +++ b/site/about/glossary/key_concepts/_report.qmd @@ -5,4 +5,7 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> validation report : A formal document produced after a model validation process, outlining the artifacts, assessments, and recommendations related to a specific model's performance, appropriateness, and limitations. Provides a comprehensive review of the model's conceptual framework, data sources and integrity, calibration methods, and performance outcomes. -Within model risk management, the validation report is crucial for ensuring transparency, demonstrating regulatory compliance, and offering actionable insights for model refinement or adjustments. \ No newline at end of file +:::: {.content-visible when-format="html" when-meta="includes.glossary"} +Within model risk management, the validation report is crucial for ensuring transparency, demonstrating regulatory compliance, and offering actionable insights for model refinement or adjustments. + +:::: \ No newline at end of file diff --git a/site/about/glossary/key_concepts/_template.qmd b/site/about/glossary/key_concepts/_template.qmd index 6a47859219..7c035e586b 100644 --- a/site/about/glossary/key_concepts/_template.qmd +++ b/site/about/glossary/key_concepts/_template.qmd @@ -10,16 +10,18 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> :::: {.content-visible when-format="html" when-meta="includes.glossary"} document template -: Lays out the structure of model documents, segmented into various sections and sub-sections, and function as test suites to help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default {{< var vm.product >}} document types[^default-documents] as well as custom document types. +: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite to help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default {{< var vm.product >}} document types[^default-documents] as well as custom document types. -documentation template^[**Refer also to:** [Model documentation](/about/glossary/glossary.qmd#model-documentation)] -: A default {{< var vm.product >}} document type that serves as a standardized framework for developing and documenting models, including sections designated for model details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across model documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes. +documentation template^[**Refer also to:** [documentation](/about/glossary/glossary.qmd#documentation)] +: A default {{< var vm.product >}} document template that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, documentation templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes. 
-validation report template^[**Refer also to:** [Validation reports](/about/glossary/glossary.qmd#validation-reports)] -: A default {{< var vm.product >}} document type that serves as a standardized framework for conducting and documenting model validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes. +{{< var vm.product >}} documentation templates function as test suites by defining the structure of your documentation, specifying the tests that should be run, and how the results should be displayed. -monitoring template, monitoring report template^[**Refer also to:** [Ongoing monitoring](/about/glossary/glossary.qmd#ongoing-monitoring)] -: A default {{< var vm.product >}} document type that serves as a standardized framework for ongoing model monitoring, including sections designated for test results, performance metrics, and drift analyses. By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide model owners through a systematic monitoring process while promoting early detection of model performance degradation. +validation report template^[**Refer also to:** [validation reports](/about/glossary/glossary.qmd#validation-reports)] +: A default {{< var vm.product >}} document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). 
By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes. + +monitoring template, monitoring report template^[**Refer also to:** [ongoing monitoring](/about/glossary/glossary.qmd#ongoing-monitoring)] +: A default {{< var vm.product >}} document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation. :::: @@ -31,16 +33,16 @@ document template :::: {.content-visible when-format="html" unless-meta="includes.glossary"} document template -: Lays out the structure of model documents, segmented into various sections and sub-sections, and function as test suites to help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default {{< var vm.product >}} document types as well as custom document types. +: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite to help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default {{< var vm.product >}} document types as well as custom document types. 
-documentation template -: A default {{< var vm.product >}} document type that serves as a standardized framework for developing and documenting models, including sections designated for model details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across model documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes. +documentation template +: A default {{< var vm.product >}} document type that serves as a standardized framework for developing and documenting records (such as models), including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, documentation templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes. -validation report template -: A default {{< var vm.product >}} document type that serves as a standardized framework for conducting and documenting model validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes. +validation report template +: A default {{< var vm.product >}} document type that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). 
By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes. -monitoring template, monitoring report template -: A default {{< var vm.product >}} document type that serves as a standardized framework for ongoing model monitoring, including sections designated for test results, performance metrics, and drift analyses. By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide model owners through a systematic monitoring process while promoting early detection of model performance degradation. +monitoring template, monitoring report template +: A default {{< var vm.product >}} document type that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation. :::: diff --git a/site/about/glossary/key_concepts/_test-suite.qmd b/site/about/glossary/key_concepts/_test-suite.qmd index b1d2289365..59024193bc 100644 --- a/site/about/glossary/key_concepts/_test-suite.qmd +++ b/site/about/glossary/key_concepts/_test-suite.qmd @@ -2,7 +2,13 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -test suite -: A collection of tests which are run together to generate model documentation end-to-end for specific use cases. 
+:::: {.content-visible when-format="html" when-meta="includes.glossary"} + +test suite +: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases.^[**Learn more:** [test_suites](/validmind/validmind/test_suites.qmd)] +:::: -For example, the [`classifier_full_suite`](/validmind/validmind/test_suites/classifier.qmd#classifierfullsuite){target="_blank"} test suite runs tests from the [`tabular_dataset`](/validmind/validmind/test_suites/tabular_datasets.qmd){target="_blank"} and [`classifier`](/validmind/validmind/test_suites/classifier.qmd){target="_blank"} test suites to fully document the data and model sections for binary classification model use cases. \ No newline at end of file +:::: {.content-visible when-format="html" unless-meta="includes.glossary"} +test suite +: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [test_suites](/validmind/validmind/test_suites.qmd)) +:::: \ No newline at end of file diff --git a/site/about/glossary/key_concepts/_test.qmd b/site/about/glossary/key_concepts/_test.qmd index 6708c16a77..c4ead6e4b3 100644 --- a/site/about/glossary/key_concepts/_test.qmd +++ b/site/about/glossary/key_concepts/_test.qmd @@ -2,7 +2,7 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -test -: A function contained in the {{< var vm.developer >}}, designed to run a specific quantitative test on the dataset or model. Test results are sent to the {{< var validmind.platform >}} to generate the model documentation according to the template that is associated with the documentation. +test +: A function contained in the {{< var vm.developer >}}, designed to run a specific quantitative test on the dataset or record (such as a model). 
Test results are logged to the {{< var validmind.platform >}}, where they are attached to documents. -Tests are the building blocks of {{< var vm.product >}}, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template. \ No newline at end of file +Tests are the building blocks of {{< var vm.product >}}, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates. \ No newline at end of file diff --git a/site/about/glossary/model_documentation/_conceptual-soundness.qmd b/site/about/glossary/model_documentation/_conceptual-soundness.qmd deleted file mode 100644 index b173ee9f63..0000000000 --- a/site/about/glossary/model_documentation/_conceptual-soundness.qmd +++ /dev/null @@ -1,6 +0,0 @@ - - -conceptual soundness -: Establishes the foundation of a selected record such as a model, covering the overview, intended use and business use case, regulatory requirements, limitations, and the rationale behind selection. It emphasizes purpose, scope, and constraints, which are crucial for stakeholders to understand applicability and limitations. diff --git a/site/about/glossary/model_documentation/_monitoring-governance.qmd b/site/about/glossary/model_documentation/_monitoring-governance.qmd deleted file mode 100644 index 9d6426d097..0000000000 --- a/site/about/glossary/model_documentation/_monitoring-governance.qmd +++ /dev/null @@ -1,6 +0,0 @@ - - -monitoring and governance -: Focuses on the record such as a model’s ongoing monitoring plan, implementation, and governance plan. It outlines strategies for maintaining the performance over time and ensuring that it remains compliant with regulatory requirements and ethical standards. 
\ No newline at end of file diff --git a/site/about/glossary/monitoring/_backtesting.qmd b/site/about/glossary/monitoring/_backtesting.qmd index 77a7e5cfe5..e642f0cef0 100644 --- a/site/about/glossary/monitoring/_backtesting.qmd +++ b/site/about/glossary/monitoring/_backtesting.qmd @@ -2,5 +2,5 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -backtesting -: Comparing a model's predictions against actual outcomes to verify its predictive power and reliability. +backtesting +: Comparing a record's predictions against actual outcomes to verify its predictive power and reliability. diff --git a/site/about/glossary/monitoring/_compliance-and-regulatory-adherence.qmd b/site/about/glossary/monitoring/_compliance-and-regulatory-adherence.qmd index 2f1b50ff63..a0b9e89f71 100644 --- a/site/about/glossary/monitoring/_compliance-and-regulatory-adherence.qmd +++ b/site/about/glossary/monitoring/_compliance-and-regulatory-adherence.qmd @@ -3,4 +3,4 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> compliance and regulatory adherence -: Ensuring that the model continues to meet evolving regulatory requirements and standards. \ No newline at end of file +: Ensuring that the record (model) continues to meet evolving regulatory requirements and standards. \ No newline at end of file diff --git a/site/about/glossary/monitoring/_model-drift.qmd b/site/about/glossary/monitoring/_model-drift.qmd index 3023bc8ffc..84272e38da 100644 --- a/site/about/glossary/monitoring/_model-drift.qmd +++ b/site/about/glossary/monitoring/_model-drift.qmd @@ -2,5 +2,5 @@ Refer to the LICENSE file in the root of this repository for details. 
SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -model drift -: Changes in data patterns, input distributions, or model behavior that may indicate a degradation in model performance over time. +model drift, drift +: Changes in data patterns, input distributions, or record (such as a model) behavior that may indicate a degradation in performance over time. diff --git a/site/about/glossary/monitoring/_model-performance.qmd b/site/about/glossary/monitoring/_model-performance.qmd index 90d343f912..bf3dd4c6cb 100644 --- a/site/about/glossary/monitoring/_model-performance.qmd +++ b/site/about/glossary/monitoring/_model-performance.qmd @@ -2,5 +2,5 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -model performance -: The measure of a model's accuracy, stability, and robustness in achieving its intended outcomes, which is regularly evaluated through monitoring after deployment to ensure ongoing reliability. \ No newline at end of file +model performance, performance +: The measure of a record's accuracy, stability, and robustness in achieving its intended outcomes, which is regularly evaluated through monitoring after deployment to ensure ongoing reliability. \ No newline at end of file diff --git a/site/about/glossary/monitoring/_ongoing-monitoring.qmd b/site/about/glossary/monitoring/_ongoing-monitoring.qmd index 818f5ac5f9..da04e6c99d 100644 --- a/site/about/glossary/monitoring/_ongoing-monitoring.qmd +++ b/site/about/glossary/monitoring/_ongoing-monitoring.qmd @@ -2,5 +2,5 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -ongoing monitoring -: A periodic report assessing the tool such as a model's performance and compliance over time, ensuring it remains valid under changing conditions. 
\ No newline at end of file +ongoing monitoring, ongoing monitoring report, ongoing monitoring plan, monitoring plan +: A comprehensive and structured periodic report assessing the record's performance and compliance over time, ensuring it remains valid under changing conditions. Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment. \ No newline at end of file diff --git a/site/about/glossary/monitoring/_recalibrating-models.qmd b/site/about/glossary/monitoring/_recalibrating-models.qmd index 252a8ac653..9c1f2285d9 100644 --- a/site/about/glossary/monitoring/_recalibrating-models.qmd +++ b/site/about/glossary/monitoring/_recalibrating-models.qmd @@ -2,5 +2,5 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -recalibrating models -: The process of adjusting a model to account for detected drift or changes in the underlying data or environment. +recalibrating models, recalibrating +: The process of adjusting a record (such as a model) to account for detected drift or changes in the underlying data or environment. diff --git a/site/about/glossary/monitoring/_reporting-and-governance.qmd b/site/about/glossary/monitoring/_reporting-and-governance.qmd index e0908f6ce2..d0152b7056 100644 --- a/site/about/glossary/monitoring/_reporting-and-governance.qmd +++ b/site/about/glossary/monitoring/_reporting-and-governance.qmd @@ -2,5 +2,5 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -reporting and governance +reporting and governance : The documentation of monitoring artifacts and communication to stakeholders to support decision-making and maintain transparency. 
diff --git a/site/about/glossary/validation_reports/_artifacts.qmd b/site/about/glossary/validation_reports/_artifacts.qmd index fc3c7cfe8b..5f27365aef 100644 --- a/site/about/glossary/validation_reports/_artifacts.qmd +++ b/site/about/glossary/validation_reports/_artifacts.qmd @@ -3,4 +3,4 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> artifacts (previously findings) -: Observations or issues identified during validation, including any deviations from expected performance or standards. Artifacts are organized by type — default types include Validation Issue, Policy Exception, and Limitation. Custom artifact types such as Change Management Record can be created to track other categories relevant to your organization. \ No newline at end of file +: Observations or issues identified during validation, including any deviations from expected performance or standards. Artifacts are organized by type — default types provided by {{< var vm.product >}} include Validation Issue, Policy Exception, and Limitation. Custom artifact types can be created to track other categories relevant to your organization. \ No newline at end of file diff --git a/site/about/glossary/validation_reports/_report-intro.qmd b/site/about/glossary/validation_reports/_report-intro.qmd index ed19217e3a..3525a01e55 100644 --- a/site/about/glossary/validation_reports/_report-intro.qmd +++ b/site/about/glossary/validation_reports/_report-intro.qmd @@ -2,4 +2,4 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -A validation report is a comprehensive review that evaluates a record's accuracy, performance, and suitability for its intended purpose. It encompasses the process of risk assessment, identifying areas of potential error or risk within the record's components, such as data inputs and algorithms. 
The report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards. \ No newline at end of file +A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions. \ No newline at end of file diff --git a/site/about/glossary/validation_reports/_risk-areas.qmd b/site/about/glossary/validation_reports/_risk-areas.qmd index 34169742e5..323a642ca1 100644 --- a/site/about/glossary/validation_reports/_risk-areas.qmd +++ b/site/about/glossary/validation_reports/_risk-areas.qmd @@ -3,4 +3,4 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> model risk areas, risk areas -: Specific components or aspects of a record such as a model where risk might be present, such as data inputs, algorithms, or implementation. \ No newline at end of file +: Specific components or aspects of a record (such as a model) where risk might be present, such as data inputs, algorithms, or implementation. \ No newline at end of file diff --git a/site/about/glossary/validation_reports/_risk-assessment.qmd b/site/about/glossary/validation_reports/_risk-assessment.qmd index b862511921..39ec545370 100644 --- a/site/about/glossary/validation_reports/_risk-assessment.qmd +++ b/site/about/glossary/validation_reports/_risk-assessment.qmd @@ -3,4 +3,4 @@ Refer to the LICENSE file in the root of this repository for details. 
SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> model risk assessment, risk assessment -: The process of identifying and evaluating risks associated with the use and potential errors in a record such as a model. \ No newline at end of file +: The process of identifying and evaluating risks associated with the use and potential errors in a record (such as a model). \ No newline at end of file diff --git a/site/about/glossary/validation_reports/_validation-guidelines.qmd b/site/about/glossary/validation_reports/_validation-guidelines.qmd index 77930799af..2ac7b8baf0 100644 --- a/site/about/glossary/validation_reports/_validation-guidelines.qmd +++ b/site/about/glossary/validation_reports/_validation-guidelines.qmd @@ -3,4 +3,4 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> validation guidelines -: Established standards or procedures for conducting thorough and consistent validations, usually aligned with principles within specific tools such as models or AI risk frameworks. \ No newline at end of file +: Established standards or procedures for conducting thorough and consistent validations, usually aligned with principles within specific records (such as models) or AI risk frameworks. \ No newline at end of file diff --git a/site/about/library-and-platform.qmd b/site/about/library-and-platform.qmd index 7f776107c8..ed0bcf2cf9 100644 --- a/site/about/library-and-platform.qmd +++ b/site/about/library-and-platform.qmd @@ -12,7 +12,7 @@ listing: sort: false fields: [title, description] contents: - - overview-model-documentation.qmd + - overview-documentation.qmd - overview-llm-features.qmd - deployment-options.qmd - system-access-requirements.qmd @@ -24,21 +24,21 @@ listing: #### 1. 
{{< var validmind.developer >}} -The *{{< var validmind.developer >}}* is a Python library of tools and methods designed to automate generating model documentation and running validation tests. The {{< var vm.developer >}} is designed to be platform agnostic and integrates with your existing development environment. +The *{{< var validmind.developer >}}* is a Python library of tools and methods designed to automate generating documentation and running validation tests. The {{< var vm.developer >}} is designed to be platform agnostic and integrates with your existing development environment. For Python developers, a single installation command provides access to all the functions: - + ```python %pip install validmind ``` #### 2. {{< var validmind.platform >}} -The *{{< var validmind.platform >}}* is an easy-to-use web-based interface that enables you to track the model lifecycle: +The *{{< var validmind.platform >}}* is an easy-to-use web-based interface that enables you to track your risk management lifecycles: - Customize workflows to adhere to and oversee your governance processes. - Review and edit the documentation and test metrics generated by the {{< var vm.developer >}}. -- Collaborate with and capture feedback from model developers and model validators. +- Collaborate with and capture feedback from developers and validators. - Generate validation reports and approvals. ::: diff --git a/site/about/overview-model-documentation.qmd b/site/about/overview-documentation.qmd similarity index 65% rename from site/about/overview-model-documentation.qmd rename to site/about/overview-documentation.qmd index 2edacee030..7954b36a9e 100644 --- a/site/about/overview-model-documentation.qmd +++ b/site/about/overview-documentation.qmd @@ -2,10 +2,11 @@ # Copyright © 2023-2026 ValidMind Inc. All rights reserved. # Refer to the LICENSE file in the root of this repository for details. 
# SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial -title: "Automated model testing & documentation" +title: "Automated testing & documentation" date: last-modified aliases: - /guide/overview-model-documentation.html + - /about/overview-model-documentation.html listing: id: quickstart type: grid @@ -17,40 +18,40 @@ listing: - path: ../developer/validmind-library.qmd # INVISIBLE SPACE REQUIRED TO ENSURE THAT THE DESCRIPTION DOESN'T HAVE EXTRA PADDING DUE TO THE VARIABLE title: "{{< var validmind.developer >}}​" - description: "The {{< var validmind.developer >}} streamlines model development and validation by automating testing." + description: "The {{< var validmind.developer >}} streamlines development and validation by automating testing." fields: [title, description] --- -The {{< var validmind.developer >}} streamlines the process of documenting various types of models. {{< var vm.product >}} automates the documentation process, ensuring that your model documentation and testing aligns with regulatory and compliance standards. +The {{< var validmind.developer >}} streamlines the process of documenting various types of records, such as models. {{< var vm.product >}} automates the documentation process, ensuring that your documentation and testing aligns with regulatory and compliance standards. ::: {.attn} ## {{< fa code >}} The {{< var validmind.developer >}} -The {{< var validmind.developer >}} is a Python library and documentation engine designed to streamline the process of documenting various types of models, including traditional statistical models, legacy systems, artificial intelligence/machine learning models, and large language models (LLMs). 
+The {{< var validmind.developer >}} is a Python library and documentation engine designed to streamline the process of documenting various types of records, including traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and more. -It offers model developers a systematic approach to documenting and testing risk models with repeatability and consistency, ensuring alignment with regulatory and compliance standards. +It offers developers a systematic approach to documenting and testing with repeatability and consistency, ensuring alignment with regulatory and compliance standards. ![The two main components of {{< var vm.product >}}: the {{< var validmind.developer >}} that integrates with your existing developer environment, and the {{< var validmind.platform >}}](/about/deployment/validmind-architecture-overview.png){fig-alt="An image showing the two main components of ValidMind: the ValidMind Library that integrates with your existing developer environment, and the ValidMind Platform"} -The {{< var validmind.developer >}} consists of a client-side library, a {{< var vm.api >}} integration for models and testing, and validation tests that streamline the model development process. Implemented as a series of independent libraries in Python and R, our {{< var vm.developer >}} ensures compatibility and flexibility with diverse sets of developer environments and requirements. +The {{< var validmind.developer >}} consists of a client-side library, a {{< var vm.api >}} integration for records (models) and testing, and validation tests that streamline the development process. Implemented as a series of independent libraries in Python and R, our {{< var vm.developer >}} ensures compatibility and flexibility with diverse sets of developer environments and requirements. 
With the {{< var validmind.developer >}}, you can: -- **Automate documentation** — Add comprehensive documentation as metadata while you build models to be shared with model validators, streamlining and speeding up the process. +- **Automate documentation** — Add comprehensive documentation as metadata while you build records to be shared with validators, streamlining and speeding up the process. - **Run test suites** — Identify potential risks for a diverse range of statistical and AI/LLM/ML models by assessing data quality, model outcomes, robustness, and explainability. -- **Integrate with your development environment** — Seamlessly incorporate the {{< var validmind.developer >}} into your existing model development environment, connecting to your existing model code and data sets. -- **Upload documentation data** — Send qualitative and quantitative test data to the {{< var validmind.platform >}}[^1] to generate the model documentation for review and approval, fostering effective collaboration with model reviewers and validators. +- **Integrate with your development environment** — Seamlessly incorporate the {{< var validmind.developer >}} into your existing development environment, connecting to your existing code and data sets. +- **Upload documentation data** — Send qualitative and quantitative test data to the {{< var validmind.platform >}}[^1] to generate the documentation for review and approval, fostering effective collaboration with reviewers and validators. ::: ## Simple installation -Install the {{< var vm.developer >}} with: +Install the {{< var vm.developer >}} with: ```python %pip install validmind @@ -64,8 +65,8 @@ Install the {{< var vm.developer >}} with: What the {{< var validmind.developer >}} offers: -- Generates documentation artifacts utilizing the context of the model and dataset, the model's metadata, and the chosen documentation template. -- Can be easily imported into your local model development environment. 
The supported platforms include Python and R. +- Generates documentation artifacts utilizing the context of the record (such as a model) and dataset, the record's metadata, and the chosen documentation template. +- Can be easily imported into your local development environment. The supported platforms include Python and R. - Dual-licensed — The {{< var vm.developer >}} is available as open-source under AGPL v3 license and also with a commercial software license. ::: @@ -79,18 +80,18 @@ vm.init(model="MODEL_IDENTIFIER") ``` ```python -vm_dataset = vm. log_dataset( +vm_dataset = vm.log_dataset( df, "training", targets=targets, ) -vm. run_dataset_tests(df, vm_dataset=vm_dataset) +vm.run_dataset_tests(df, vm_dataset=vm_dataset) ``` ```python -vm. Log_model (model) -vm. log_training_metrics (model, x_train, y_train) -vm. run_model_tests (model, x_test, y_test) +vm.log_model (model) +vm.log_training_metrics (model, x_train, y_train) +vm.run_model_tests (model, x_test, y_test) ``` ::: @@ -100,12 +101,12 @@ vm. run_model_tests (model, x_test, y_test) How the {{< var validmind.developer >}} works: -- The tests and functions are executed automatically, following pre-configured templates tailored for specific model use cases. This ensures that minimum documentation requirements are consistently fulfilled. +- The tests and functions are executed automatically, following pre-configured templates tailored for specific use cases. This ensures that minimum documentation requirements are consistently fulfilled. - The {{< var vm.developer >}} integrates with ETL/data processing pipelines using connector interfaces. This enables the extraction of relationships between raw data sources and their corresponding post-processed datasets, such as those preloaded session instances received from platforms like Spark and Snowflake. 
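
The template-driven execution described in the first bullet can be pictured with a minimal sketch. It is illustrative only and does not use the library's actual API: the template layout, the `TEST_REGISTRY` mapping, the test names, and `run_template` are all hypothetical, assumed here just to show how a template that lists tests per section can double as a test suite.

```python
# Hypothetical sketch: none of these names come from the ValidMind Library.
from typing import Callable, Dict, List

# A "test" here is any function that takes rows of data and returns a result dict.
TestFn = Callable[[List[dict]], dict]


def missing_values(rows: List[dict]) -> dict:
    """Count fields whose value is None across all rows."""
    missing = sum(1 for row in rows for value in row.values() if value is None)
    return {"missing": missing, "passed": missing == 0}


def row_count(rows: List[dict]) -> dict:
    """Report the number of rows and require at least one."""
    return {"rows": len(rows), "passed": len(rows) > 0}


# Assumed registry mapping test identifiers to functions.
TEST_REGISTRY: Dict[str, TestFn] = {
    "missing_values": missing_values,
    "row_count": row_count,
}

# Assumed template shape: each section names the tests that populate it.
TEMPLATE = {
    "sections": [
        {"id": "data_description", "tests": ["row_count"]},
        {"id": "data_quality", "tests": ["missing_values"]},
    ]
}


def run_template(template: dict, rows: List[dict]) -> Dict[str, Dict[str, dict]]:
    """Run every test the template lists, grouping results by section."""
    results: Dict[str, Dict[str, dict]] = {}
    for section in template["sections"]:
        results[section["id"]] = {
            name: TEST_REGISTRY[name](rows) for name in section["tests"]
        }
    return results


if __name__ == "__main__":
    sample = [{"age": 41, "income": 52000}, {"age": None, "income": 48000}]
    for section_id, outcome in run_template(TEMPLATE, sample).items():
        print(section_id, outcome)
```

Because the template decides which tests run, every section it names receives a result. That is one way to read the claim that minimum documentation requirements are consistently fulfilled.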
## Extensible by design -{{< var vm.product >}} supports various model types, including:[^2] +{{< var vm.product >}} supports various record (model) types, including:[^2] - Traditional machine learning models (ML) such as tree-based models and neural network models. - Natural language processing models (NLP) for text analysis and understanding. @@ -114,25 +115,25 @@ How the {{< var validmind.developer >}} works: {{< var vm.product >}} is designed to be highly extensible to cater to our customers' specific requirements. You can expand its functionality in the following ways: -- You can easily add support for new models and data types by defining new classes within the {{< var validmind.developer >}}. We provide templates to guide you through this process.[^3] -- To include custom tests in the library, you can define new functions. We offer templates to help you create these custom tests.[^4] -- You have the flexibility to integrate third-party test libraries seamlessly. These libraries can be hosted either locally within your infrastructure or remotely, for example, on GitHub. Leverage additional testing capabilities and resources as needed.[^5] +- You can easily add support for new records and data types by defining new classes within the {{< var validmind.developer >}}. We provide templates to guide you through this process.[^3] +- To include custom tests in the library, you can define new functions. We offer templates to help you create these custom tests.[^4] +- You have the flexibility to integrate third-party test libraries seamlessly. These libraries can be hosted either locally within your infrastructure or remotely, for example, on GitHub. 
Leverage additional testing capabilities and resources as needed.[^5] ## {{< var validmind.api >}} integration {{< var vm.product >}} imports the following artifacts into the documentation via our {{< var validmind.api >}} integration: -- Metadata about datasets and models, used to lookup programmatic documentation content, such as the stored definition for _common logistic regression limitations_ when a logistic regression model has been passed to the {{< var vm.product >}} test plan to be run. -- Quality and performance metrics collected from datasets and models. -- Output from test and test suites that have been run. +- Metadata about datasets and records (models), used to look up programmatic documentation content, such as the stored definition for _common logistic regression limitations_ when a logistic regression model has been passed to the {{< var vm.product >}} test plan to be run. +- Quality and performance metrics collected from datasets and records. +- Output from tests and test suites that have been run. - Images, plots, visuals that were generated as part of extracting metrics and running tests. ![Artifacts imported into the documentation via our {{< var vm.api >}}](fine-print/overview-api-integration.jpg){width=90% fig-alt="A representation of artifacts imported into the documentation via our Python API"} ::: {.callout-important} -## {{< var vm.product >}} does NOT: -- Send any personal identifiable information (PII) when generating documentation reports. -- Store any customer datasets or models. +## {{< var vm.product >}} does NOT: +- Send any personally identifiable information (PII) when generating documentation reports. +- Store any customer datasets or records. ::: ## Ready to try out {{< var vm.product >}}? 
diff --git a/site/about/overview-llm-features.qmd b/site/about/overview-llm-features.qmd index 2dc6c1d19d..202c7e1217 100644 --- a/site/about/overview-llm-features.qmd +++ b/site/about/overview-llm-features.qmd @@ -6,7 +6,7 @@ title: "Large language model features" date: last-modified --- -{{< var vm.product >}} offers several specialized features that use large language models (LLMs) to streamline model risk management and ensure regulatory compliance. Here's how we approach these features and what you need to know. +{{< var vm.product >}} offers several specialized features that use large language models (LLMs) to streamline risk management and ensure regulatory compliance. Here's how we approach these features and what you need to know. ::: {.attn} ## {{< fa list-check >}} Our philosophy @@ -30,7 +30,7 @@ Our testing methodologies and philosophy around testing are readily available, a ## Our features -{{< var vm.product >}} enhances model documentation, testing, and compliance workflows, providing your team with tools for effective model governance. +{{< var vm.product >}} enhances documentation, testing, and compliance workflows, providing your team with tools for effective risk governance. ::: {.column-margin .pl3 .pt6} @@ -54,13 +54,13 @@ Why it matters ::: {.w-50-ns .pl2 .pr2} ### Qualitative checks -Leverages metadata from the model inventory, test outcomes, and additional data provided to create qualitative sections within model documentation. +Leverages metadata from the inventory, test outcomes, and additional data provided to create qualitative sections within documentation.

::: {.feature} Why it matters -: Qualitative checks ensure that essential contextual information is accurately documented and aligned with the model's purpose and scope. +: Qualitative checks ensure that essential contextual information is accurately documented and aligned with the record's purpose and scope. ::: ::: @@ -72,7 +72,7 @@ Why it matters ::: {.w-50-ns .pr2} ### Risk assessment -Using data from test results, generates a tailored risk assessment for each section of model documentation. This feature aids in identifying potential risks based on the model’s performance and results. +Using data from test results, generates a tailored risk assessment for each section of documentation. This feature aids in identifying potential risks based on the record's performance and results. ::: {.feature} Why it matters @@ -85,7 +85,7 @@ Why it matters ::: {.w-50-ns .pl2 .pr2} ### {{< var validmind.checker >}} -Reviews documents such as model documentation or validation reports to ensure documents aligns with relevant regulatory requirements. +Reviews documents such as documentation or validation reports to ensure documents align with relevant regulatory requirements.

@@ -97,7 +97,7 @@ Why it matters +Assesses each part of the documentation for adherence to internal guidelines and policies. This tool supports consistent documentation standards across the organization, promoting uniformity in compliance practices. --> ::: :::: @@ -120,7 +120,7 @@ These documents detail our [AI usage policy](https://validmind.com/about/legal/a ::: {.w-50-ns .pr3} ### Try it yourself -Discover how {{< var vm.product >}}’s LLM-powered platform, purpose-built for model risk management teams, enables streamlined and confident testing, documentation, validation, and governance of generative AI models and processes. +Discover how {{< var vm.product >}}’s LLM-powered platform, purpose-built for risk management teams, enables streamlined and confident testing, documentation, validation, and governance of generative AI systems and processes. [Request a Demo](https://validmind.com/request-demo/){.button .button-green} diff --git a/site/about/overview.qmd b/site/about/overview.qmd index 8b1ef72853..af7f860745 100644 --- a/site/about/overview.qmd +++ b/site/about/overview.qmd @@ -32,7 +32,7 @@ aliases: - /about.html --- -{{< var vm.product >}} is the system of record for AI governance. You use {{< var vm.product >}} to model the full lifecycle of AI systems, models, use cases, and tools, along with their dependencies, and automates the governance and documentation you build on top. +{{< var vm.product >}} is the system of record for AI governance. You use {{< var vm.product >}} to model the full lifecycle of AI systems, records (such as models), use cases, and tools, along with their dependencies, and to automate the governance and documentation you build on top. Flexible by design, the {{< var vm.platform >}} lets you define your own inventory hierarchy, dependencies, and governance rules, powered by {{< var vm.product >}}’s documentation automation, workflows, and analytics. 
@@ -40,7 +40,7 @@ Flexible by design, the {{< var vm.platform >}} lets you define your own invento ::: {.column-margin} ::: {.image-container} - + ![](/assets/img/admin-diagram.png) ![](/assets/img/developer-diagram.png) ![](/assets/img/validator-diagram.png) @@ -56,7 +56,8 @@ Flexible by design, the {{< var vm.platform >}} lets you define your own invento ## {{< fa hand-point-right >}} Ready to try out {{< var vm.product >}}? +:::{#validmind-next-steps} ::: -:::{#validmind-next-steps} ::: + diff --git a/site/about/using-the-documentation.qmd b/site/about/using-the-documentation.qmd index ca5dee30ad..39f59870e6 100644 --- a/site/about/using-the-documentation.qmd +++ b/site/about/using-the-documentation.qmd @@ -8,7 +8,7 @@ aliases: - /about/contributing/using-the-documentation.html --- -This documentation site helps you learn {{< var vm.product >}}, implement it in your organization, govern your AI/ML models, and operate the platform day to day. +This documentation site helps you learn {{< var vm.product >}}, implement it in your organization, govern your AI/ML records (models), and operate the platform day to day. ## How to use this site @@ -37,7 +37,7 @@ Introduces the platform, its use cases, and deployment options. Role-based quickstarts to help you begin using {{< var vm.product >}} quickly. -- [Developer quickstart](/get-started/developer/quickstart-developer.qmd) — Set up your environment and document your first model +- [Developer quickstart](/get-started/developer/quickstart-developer.qmd) — Set up your environment and document your first record (model) - [Validator quickstart](/get-started/validator/quickstart-validator.qmd) — Review documentation and prepare validation reports - [Administrator quickstart](/get-started/administrator/quickstart-administrator.qmd) — Configure users, roles, and organization settings @@ -51,14 +51,14 @@ Step-by-step instructions for platform tasks, organized by feature area. 
|---------|--------|---------------| | [Access](/guide/guides.qmd#access) | Signing up for and logging into {{< var vm.product >}} | Register, sign in via SSO, recover access | | [Configuration](/guide/guides.qmd#configuration) | Setting up your organization and users | Add users, create groups, assign roles and permissions | -| [Integrations](/guide/integrations/managing-integrations.qmd) | Connecting {{< var vm.product >}} to external systems | Manage secrets, configure connections, link external models | -| [Workflows](/guide/guides.qmd#workflows) | Automating model lifecycle processes | Configure workflow steps, manage transitions, set up approvals | -| [Inventory](/guide/guides.qmd#inventory) | Managing your model and record inventory | Register records, edit fields, configure interdependencies | +| [Integrations](/guide/integrations/managing-integrations.qmd) | Connecting {{< var vm.product >}} to external systems | Manage secrets, configure connections, link external records (models) | +| [Workflows](/guide/guides.qmd#workflows) | Automating lifecycle processes | Configure workflow steps, manage transitions, set up approvals | +| [Inventory](/guide/guides.qmd#inventory) | Managing your records (models) and record inventory | Register records, edit fields, configure interdependencies | | [Documents & templates](/guide/templates/working-with-documents.qmd) | Creating and customizing documentation | Manage document types, customize templates, use the text block library | -| [Model documentation](/guide/guides.qmd#model-documentation) | Authoring and collaborating on model docs | Edit content blocks, add test results, manage versions, submit for approval | -| [Model validation](/guide/guides.qmd#model-validation) | Reviewing and validating models | Review documentation, assess compliance, manage findings and artifacts | +| [Documentation](/guide/guides.qmd#documentation) | Authoring and collaborating on documents | Edit content blocks, add test results, manage 
versions, submit for approval | +| [Validation](/guide/guides.qmd#validation) | Reviewing and validating records (models) | Review documentation, assess compliance, manage findings and artifacts | | [Reporting](/guide/guides.qmd#reporting) | Analyzing and exporting data | View reports, create custom analytics, export inventory and documents | -| [Monitoring](/guide/guides.qmd#monitoring) | Tracking model performance over time | Enable monitoring, review results, set thresholds and alerts | +| [Monitoring](/guide/guides.qmd#monitoring) | Tracking record (model) performance over time | Enable monitoring, review results, set thresholds and alerts | | [Attestation](/guide/guides.qmd#attestation) | Managing formal attestations | Create, submit, review, and approve attestations | ### [{{< var validmind.developer >}}](/developer/validmind-library.qmd) diff --git a/site/developer/how-to/testing-overview.qmd b/site/developer/how-to/testing-overview.qmd index d21a1c3f5d..129c428de9 100644 --- a/site/developer/how-to/testing-overview.qmd +++ b/site/developer/how-to/testing-overview.qmd @@ -138,7 +138,7 @@ listing: ## Explore tests -Start by exploring the {{< var validmind.developer >}}'s available tests and tests suites: +Start by exploring the {{< var validmind.developer >}}'s available tests and test suites: ::: {.panel-tabset} diff --git a/site/faq/_faq-activity.qmd b/site/faq/_faq-activity.qmd index b950ebeccd..42dee4c34c 100644 --- a/site/faq/_faq-activity.qmd +++ b/site/faq/_faq-activity.qmd @@ -2,7 +2,7 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -## Is activity on models, documents, etc. logged? +## Is activity on records, documents, etc. logged? 
-- Yes, the {{< var validmind.platform >}}^[[Accessing {{< var vm.product >}}](/guide/access/accessing-validmind.qmd)] provides an audit trail functionality, enabling you to track or audit all the events associated with a specific model. -- You can review a full record of comments, workflow status changes, and any other updates made to the model, including modifications to documents or test results. \ No newline at end of file +- Yes, the {{< var validmind.platform >}}^[[Accessing {{< var vm.product >}}](/guide/access/accessing-validmind.qmd)] provides an audit trail functionality, enabling you to track or audit all the events associated with a specific record (such as a model). +- You can review a full record of comments, workflow status changes, and any other updates made to the record, including modifications to documents or test results. \ No newline at end of file diff --git a/site/faq/_faq-attachments.qmd b/site/faq/_faq-attachments.qmd index a37ad2d71b..625f325a92 100644 --- a/site/faq/_faq-attachments.qmd +++ b/site/faq/_faq-attachments.qmd @@ -2,11 +2,11 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -## Can we attach files to models, artifacts, or documents? +## Can we attach files to records, artifacts, or documents? -Yes, attachment type inventory fields are available for custom use.^[[Manage inventory fields](/guide/inventory/manage-inventory-fields.qmd)] Once created, attachment type fields allow you to upload supporting files to your model. +Yes, attachment type inventory fields are available for custom use.^[[Manage inventory fields](/guide/inventory/manage-inventory-fields.qmd)] Once created, attachment type fields allow you to upload supporting files to your record (model). -- Out-of-the-box functionality is included for attaching files to model artifacts. +- Out-of-the-box functionality is included for attaching files to artifacts. 
- You can also attach images to document content blocks and comments. -By default, the [{{< fa hand >}} Customer Admin]{.bubble} role has sufficient permissions to manage model inventory fields. \ No newline at end of file +By default, the [{{< fa hand >}} Customer Admin]{.bubble} role has sufficient permissions to manage inventory fields. \ No newline at end of file diff --git a/site/faq/_faq-explainability.qmd b/site/faq/_faq-explainability.qmd index 7381f1fd9e..f4fc4bcb68 100644 --- a/site/faq/_faq-explainability.qmd +++ b/site/faq/_faq-explainability.qmd @@ -2,11 +2,11 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -## Do you include explainability-related testing and documentation? - -Yes, {{< var vm.product >}} includes explainability-related testing and documentation as part of our offerings. Our approach incorporates a comprehensive suite of tests designed to evaluate model interpretability and identify potential risks, ensuring transparency and reliability in model outcomes. +## Do you include explainability-related testing and documentation? + +Yes, {{< var vm.product >}} includes explainability-related testing and documentation as part of our offerings. Our approach incorporates a comprehensive suite of tests designed to evaluate interpretability and identify potential risks, ensuring transparency and reliability in outcomes. -Below is an overview of our key explainability-related tests (browse names and descriptions in the [{{< var vm.product >}} test sandbox](/developer/how-to/test-sandbox.qmd)): +Below is an overview of our key explainability-related tests^[[{{< var vm.product >}} test sandbox](/developer/how-to/test-sandbox.qmd)] with models as an example: - **Features AUC** — Assesses the discriminatory power of individual features in binary classification models, providing insights into how well each feature differentiates between classes. 
This test supports explainability by isolating the contribution of each feature to the classification task. - **Feature Importance** — Generates feature importance scores to identify and compare impactful features across different models and datasets. By highlighting the relative significance of features, this test clarifies how inputs influence model predictions. diff --git a/site/faq/_faq-images.qmd b/site/faq/_faq-images.qmd index 2d36be4758..c84d874f14 100644 --- a/site/faq/_faq-images.qmd +++ b/site/faq/_faq-images.qmd @@ -2,7 +2,7 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -## Do you support including images in model documents? +## Do you support including images in documents? Yes, as long as you can produce the image with Python or open the image from a file, you can include it in your documents with {{< var vm.product >}}:^[[Implement custom tests](/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb#custom-test-images)] diff --git a/site/faq/_faq-model-updates.qmd b/site/faq/_faq-model-updates.qmd deleted file mode 100644 index a9a23c7325..0000000000 --- a/site/faq/_faq-model-updates.qmd +++ /dev/null @@ -1,9 +0,0 @@ - - -## How does {{< var vm.product >}} manage updates to models? - -1. {{< var vm.product >}} allows model developers to re-run documentation functions with the {{< var validmind.developer >}}^[[{{< var validmind.developer >}}](/developer/validmind-library.qmd)] to capture changes in the model, such as changes in the number of features or hyperparameters. -2. After a model developer has made a change in their development environment, such as to a Jupyter Notebook,^[[Code samples](/developer/samples-jupyter-notebooks.qmd)] they can execute the relevant {{< var vm.product >}} documentation function to update the corresponding documentation section. -3. 
{{< var vm.product >}} will then automatically recreate the relevant figures and tables and update them in the online documentation. \ No newline at end of file diff --git a/site/faq/_faq-monitoring.qmd b/site/faq/_faq-monitoring.qmd index 76d6b43827..372ac8f510 100644 --- a/site/faq/_faq-monitoring.qmd +++ b/site/faq/_faq-monitoring.qmd @@ -2,11 +2,11 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -## Does {{< var vm.product >}} support monitoring models after deployment? +## Does {{< var vm.product >}} support monitoring records after deployment? -Yes, {{< var vm.product >}} offers ongoing monitoring support to help you regularly assess a model’s accuracy, stability, and robustness to ensure it remains reliable after deployment: +Yes, {{< var vm.product >}} offers ongoing monitoring support to help you regularly assess a record’s accuracy, stability, and robustness to ensure it remains reliable after deployment: -- You can enable monitoring for both new and existing models.^[[Enable monitoring](/guide/monitoring/enable-monitoring.qmd)] -- You use the {{< var validmind.developer >}} to automatically populate the monitoring template for your model with data, providing a comprehensive view of your model’s performance over time. +- You can enable monitoring for both new and existing records.^[[Enable monitoring](/guide/monitoring/enable-monitoring.qmd)] +- You use the {{< var validmind.developer >}} to automatically populate the monitoring template for your record with data, providing a comprehensive view of your record’s performance over time. 
- You then access and examine these results within the {{< var validmind.platform >}}, allowing you to identify any deviations from expected performance and take corrective actions as needed.^[[Review monitoring results](/guide/monitoring/review-monitoring-results.qmd)] - Once generated via the {{< var validmind.developer >}}, view and add metrics over time to your ongoing monitoring reports in the {{< var validmind.platform >}}.^[[Work with metrics over time](/guide/monitoring/work-with-metrics-over-time.qmd)] \ No newline at end of file diff --git a/site/faq/_faq-progress-model.qmd b/site/faq/_faq-progress-workflow.qmd similarity index 53% rename from site/faq/_faq-progress-model.qmd rename to site/faq/_faq-progress-workflow.qmd index aebd3a98eb..dd9e0f62b3 100644 --- a/site/faq/_faq-progress-model.qmd +++ b/site/faq/_faq-progress-workflow.qmd @@ -2,8 +2,8 @@ Refer to the LICENSE file in the root of this repository for details. SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> -## How do I progress a model along in its lifecycle within {{< var vm.product >}}? +## How do I progress a record along in its lifecycle within {{< var vm.product >}}? -Transition a model along in the workflow, for example for review with the next set of stakeholders, by changing a model's status. +Transition a record (such as a model) along in the workflow, for example for review with the next set of stakeholders, by changing a record's status. {{< include /guide/workflows/_transition-workflow-examples.qmd >}} \ No newline at end of file diff --git a/site/faq/_faq-record-updates.qmd b/site/faq/_faq-record-updates.qmd new file mode 100644 index 0000000000..739040675e --- /dev/null +++ b/site/faq/_faq-record-updates.qmd @@ -0,0 +1,9 @@ + + +## How does {{< var vm.product >}} manage updates to records? + +1. 
{{< var vm.product >}} allows developers to re-run documentation functions with the {{< var validmind.developer >}}^[[{{< var validmind.developer >}}](/developer/validmind-library.qmd)] to capture changes in the record (for example, a model), such as changes in the number of features or hyperparameters. +2. After a developer has made a change in their development environment, such as to a Jupyter Notebook,^[[Code samples](/developer/samples-jupyter-notebooks.qmd)] they can execute the relevant {{< var vm.product >}} documentation function to update the corresponding documentation section. +3. {{< var vm.product >}} will then automatically recreate the relevant figures and tables and update them in the online documentation. \ No newline at end of file diff --git a/site/faq/_faq-tracking.qmd b/site/faq/_faq-tracking.qmd index 82ae4cc508..2a8c460248 100644 --- a/site/faq/_faq-tracking.qmd +++ b/site/faq/_faq-tracking.qmd @@ -4,5 +4,5 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> ## Can I use {{< var vm.product >}} to track milestone dates? -- Yes, the {{< var validmind.platform >}} includes support for custom inventory fields, including those for dates and date time — allowing you to track important dates throughout the model risk management lifecycle unique to your workflow. +- Yes, the {{< var validmind.platform >}} includes support for custom inventory fields, including those for dates and date time — allowing you to track important dates throughout the risk management lifecycle unique to your workflow. - In addition, calculation type custom inventory fields can draw upon date and date time values, allowing you to automatically calculate next review, revalidation, ongoing monitoring deadlines, or any other desired date. 
diff --git a/site/faq/faq-collaboration.qmd b/site/faq/faq-collaboration.qmd index c1901b8180..76e1dd16ac 100644 --- a/site/faq/faq-collaboration.qmd +++ b/site/faq/faq-collaboration.qmd @@ -4,8 +4,6 @@ # SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial title: "Collaboration" date: last-modified -aliases: - - /guide/faq-workflows.html listing: - id: faq-collaboration type: grid @@ -17,15 +15,15 @@ listing: - ../guide/inventory/view-record-activity.qmd - ../guide/documentation/collaborate-with-others.qmd - ../guide/workflows/working-with-workflows.qmd -categories: ["real-time collaboration", "documents","documentation", "record activity", "auditing", "workflows", "model lifecycle", "validmind platform"] +categories: ["real-time collaboration", "documents","documentation", "record activity", "auditing", "workflows", "lifecycle", "validmind platform"] --- {{< include _faq-activity.qmd >}} ## What real-time collaboration features does {{< var vm.product >}} offer? -- You can simultaneously edit model documents, leave and respond to comments or suggestions all within the {{< var validmind.platform >}}. -- You can also saved named versions of edits to retain specific revisions, and any changes to model documents are automatically logged on your model's activity feed. +- You can simultaneously edit documents, leave and respond to comments or suggestions all within the {{< var validmind.platform >}}. +- You can also save named versions of edits to retain specific revisions, and any changes to documents are automatically logged on your record's activity feed. ::: {.callout} ## Multiple users are able to simultaneously edit documents in the {{< var validmind.platform >}}. @@ -33,7 +31,7 @@ categories: ["real-time collaboration", "documents","documentation", "record act If two users are editing the same cell within {{< var vm.platform >}}, the most recently saved version of the content will prevail. 
::: -{{< include _faq-progress-model.qmd >}} +{{< include _faq-progress-workflow.qmd >}} @@ -45,6 +43,6 @@ If two users are editing the same cell within {{< var vm.platform >}}, the most ## Learn more -:::{#faq-validation} +:::{#faq-collaboration} ::: diff --git a/site/faq/faq-documentation.qmd b/site/faq/faq-documentation.qmd index 244940cf56..bd78fd0859 100644 --- a/site/faq/faq-documentation.qmd +++ b/site/faq/faq-documentation.qmd @@ -2,7 +2,7 @@ # Copyright © 2023-2026 ValidMind Inc. All rights reserved. # Refer to the LICENSE file in the root of this repository for details. # SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial -title: "Model documents and templates" +title: "Documents and templates" date: last-modified aliases: - /guide/faq-documentation.html @@ -23,7 +23,7 @@ categories: ["templates", "documents", "documentation","customization", "images" {{< var vm.product >}} provides the following default template types:[^1] -- Development (model documentation) +- Development (documentation) - Validation (validation reports) - Monitoring (ongoing monitoring reports) @@ -31,7 +31,7 @@ You can also create custom document types and associated templates to suit your ## Can templates be customized for our use cases? -Yes, the {{< var validmind.platform >}}[^2] allows you to configure versioned templates based on document requirements for each model or lifecycle use case. +Yes, the {{< var validmind.platform >}}[^2] allows you to configure versioned templates based on document requirements for each record (such as a model) or lifecycle use case. - {{< var vm.product >}}'s templates are fully customizable,[^3] and are complemented by the ability to manage validation guidelines. - You can swap between different versions of templates or apply another version of the current template.[^4] @@ -41,17 +41,17 @@ By default, the [{{< fa hand >}} Customer Admin]{.bubble} role[^5] has sufficien ## Can documents be created right in the {{< var validmind.platform >}}? 
-Yes, you can work with model documentation, validation reports, ongoing monitoring reports, or any other document type directly in the {{< var validmind.platform >}}, without having to first generate anything using the {{< var validmind.developer >}}.[^6] +Yes, you can work with documentation, validation reports, ongoing monitoring reports, or any other document type directly in the {{< var validmind.platform >}}, without having to first generate anything using the {{< var validmind.developer >}}.[^6] 1. Add and edit text on any document within the {{< var vm.platform >}} using our content editing toolbar.[^7] 2. Using the {{< var vm.developer >}}, execute test suites and generate the corresponding supporting results. These results can then be added to your documents within the {{< var vm.platform >}}.[^8] -## Can I run tests and log documentation without a model? - -Yes! If you do not have a model ready, or your model can't be loaded directly, or you only have access to model predictions, you can still run tests and log documentation using the {{< var validmind.developer >}} as long as you're able to load the model predictions. +## Can I run tests and log documentation without a record? + +Yes! If you do not have a record (such as a model) ready, or your record can't be loaded directly, or you only have access to predictions, you can still run tests and log documentation using the {{< var validmind.developer >}} as long as you're able to load the predictions. - Use `assign_predictions()`[^9] to load predictions from a separate file or a dataset with predictions. -- Call `init_model()`[^10] but instead of a trained model instance, pass an `input_id` and model metadata. `ModelMetadata()`[^11] will use the provided metadata instead of trying to calculate it from the model's library. +- Call `init_model()`[^10] to create a model object, but instead of a trained instance, pass an `input_id` and model metadata. 
`ModelMetadata()`[^11] will use the provided metadata instead of trying to calculate it from the model's library. ::: {.column-margin} @@ -65,8 +65,10 @@ Yes! If you do not have a model ready, or your model can't be loaded directly, o ``` ::: -::: {.callout title="If neither a trained model instance nor metadata is provided, `init_model()` will return an error. "} -However, tests that need a trained model will not work with "empty" models. +::: {.callout title="If neither a trained instance nor metadata is provided, `init_model()` will return an error."} +- However, tests that need a trained model will not work with "empty" models. +- Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with {{< var vm.product >}}. + ::: {{< include _faq-attachments.qmd >}} diff --git a/site/faq/faq-integrations.qmd b/site/faq/faq-integrations.qmd index dd1d65314c..e1be635668 100644 --- a/site/faq/faq-integrations.qmd +++ b/site/faq/faq-integrations.qmd @@ -22,7 +22,7 @@ categories: ["supported libraries", "supported languages", "integrations", "imag ## Which languages, libraries, and environments do you support? -- The {{< var validmind.developer >}}[^1] is designed to be platform-agnostic and compatible with most popular open-source programming languages and model development environments in Python and R,[^2] from XGBoost to more sophisticated libraries such as Pytorch and TensorFlow — and many more. +- The {{< var validmind.developer >}}[^1] is designed to be platform-agnostic and compatible with most popular open-source programming languages and development environments in Python and R,[^2] from XGBoost to more sophisticated libraries such as PyTorch and TensorFlow — and many more. 
- We directly support Matplotlib[^3] and Plotly[^4] plotting libraries for visual representations, and you're able to return images from other libraries as bytes-like objects.[^5] ::: {.callout} @@ -35,18 +35,18 @@ Support for commercial and closed-source programming languages such as SAS and M ## What test ingestion or modeling techniques are supported? - {{< var vm.product >}} supports ingesting test results from your training and evaluation pipeline, such as using batch prediction or online prediction mechanisms.[^6] -- We are also offer standard documentation via the {{< var vm.developer >}} for additional modeling techniques.[^7] +- We also offer standard documentation via the {{< var vm.developer >}} for additional modeling techniques.[^7] {{< include _faq-images.qmd >}} ## What large language model (LLM) features are offered? -{{< var vm.product >}} offers several specialized features that use large language models (LLMs) to streamline model risk management and ensure regulatory compliance: +{{< var vm.product >}} offers several specialized features that use large language models (LLMs) to streamline risk management and ensure regulatory compliance: - **Test interpretation** — Interprets results from tests run within {{< var vm.product >}}. -- **Qualitative checks** — Leverages metadata from the model inventory, test outcomes, and additional data provided to create qualitative sections within model documentation. -- **Risk assessment** — Using data from test results, generates a tailored risk assessment for each section of model documentation. -- **{{< var validmind.checker >}}**[^8] — Reviews documents such as model documentation or validation reports to ensure documents aligns with relevant regulatory requirements. +- **Qualitative checks** — Leverages metadata from the inventory, test outcomes, and additional data provided to create qualitative sections within documentation. 
+- **Risk assessment** — Using data from test results, generates a tailored risk assessment for each section of documentation. +- **{{< var validmind.checker >}}**[^8] — Reviews documents such as documentation or validation reports to ensure they align with relevant regulatory requirements. {{< include _faq-explainability.qmd >}} @@ -54,24 +54,6 @@ Support for commercial and closed-source programming languages such as SAS and M {{< include /about/deployment/_deployment-available-options.qmd >}} - - - - - - - - - - ## Learn more :::{#faq-integrations} @@ -88,10 +70,10 @@ We will be implementing connector interfaces allowing extraction of relationship [^4]: [Plotly](https://plotly.com/) -[^5]: [Do you support including images in model documents?](#images) +[^5]: [Do you support including images in documents?](#images) [^6]: [Load dataset predictions](/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb) -[^7]: [Do you include explainability-related testing and documentation?](#explanability) +[^7]: [Do you include explainability-related testing and documentation?](#explainability) [^8]: [Customize {{< var validmind.checker >}}](/guide/templates/customize-document-checker.qmd) \ No newline at end of file diff --git a/site/faq/faq-inventory.qmd b/site/faq/faq-inventory.qmd index 1aa5bf9ccc..3cb987a699 100644 --- a/site/faq/faq-inventory.qmd +++ b/site/faq/faq-inventory.qmd @@ -2,7 +2,7 @@ # Copyright © 2023-2026 ValidMind Inc. All rights reserved. # Refer to the LICENSE file in the root of this repository for details. 
# SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial -title: "Model inventory and activity" +title: "Inventory and activity" date: last-modified aliases: - /guide/faq-inventory.html @@ -17,47 +17,47 @@ listing: - ../guide/inventory/view-record-activity.qmd - ../guide/inventory/working-with-the-inventory.qmd - ../guide/inventory/managing-the-inventory.qmd -categories: ["record activity", "model registration", "model inventory", "customization", "model stages", "model interdependencies", "auditing", "exports", "validmind platform"] +categories: ["inventory", "inventory registration", "inventory customization", "inventory interdependencies", "activity", "auditing", "exports", "validmind platform"] --- {{< include _faq-activity.qmd >}} -## How do I register models with {{< var vm.product >}}? +## How do I register records with {{< var vm.product >}}? -- Register models within the {{< var validmind.platform >}} via the model inventory as the first step towards streamlining your model documentation and validation workflow. -- To add a model to the inventory, you fill out a customizable questionnaire capturing the required registration metadata.[^1] +- Register records (such as models) within the {{< var validmind.platform >}} via the inventory as the first step towards streamlining your documentation and validation workflow. +- To add a record to the inventory, you fill out a customizable questionnaire capturing the required registration metadata.[^1] -By default, the [{{< fa code >}} Developer]{.bubble} role[^2] has sufficient permissions to register models. +By default, the [{{< fa code >}} Developer]{.bubble} role[^2] has sufficient permissions to register records. -## Are model registration questionnaires customizable? +## Are registration questionnaires customizable? 
-- Yes, along with default fields provided by {{< var vm.product >}} as part of your basic model information, you can add additional model information and make these custom fields required when creating models.[^3] +- Yes, along with default fields provided by {{< var vm.product >}} as part of your basic record (model) information, you can add additional record information and make these custom fields required when creating records.[^3] - You can modify these custom fields as needed and on an ongoing basis. -By default, the [{{< fa hand >}} Customer Admin]{.bubble} role has sufficient permissions to manage model inventory fields. +By default, the [{{< fa hand >}} Customer Admin]{.bubble} role has sufficient permissions to manage inventory fields. -## Can the {{< var vm.product >}} model inventory be customized? +## Can the {{< var vm.product >}} inventory be customized? -- Yes, information that is displayed on the model inventory is configurable on a per user basis. -- You can also search, filter, and sort models to narrow down results. -- Fields that appear on all models for all users can also be customized. +- Yes, information that is displayed on the inventory is configurable on a per user basis. +- You can also search, filter, and sort records (such as models) to narrow down results. +- Fields that appear on all records for all users can also be customized. - By default, the [{{< fa hand >}} Customer Admin]{.bubble} role has sufficient permissions to manage model inventory fields. + By default, the [{{< fa hand >}} Customer Admin]{.bubble} role has sufficient permissions to manage inventory fields. -## Can I archive or delete models within {{< var vm.product >}}? +## Can I archive or delete records within {{< var vm.product >}}? 
-Yes, models can be archived within the {{< var validmind.platform >}} model inventory to keep your inventory accurate and up to date with your organization’s current resources.[^4] +Yes, records (such as models) can be archived within the {{< var validmind.platform >}} inventory to keep your inventory accurate and up to date with your organization’s current resources.[^4] -By default, the [{{< fa hand >}} Customer Admin]{.bubble} role has sufficient permissions to archive and delete models. +By default, the [{{< fa hand >}} Customer Admin]{.bubble} role has sufficient permissions to archive and delete records. -## Can I track model interdependencies within the model inventory? +## Can I track interdependencies within the inventory? -- Yes, the {{< var validmind.platform >}} allows you to connect two or more models together in your model inventory.[^5] -- You can note both upstream and downstream models. +- Yes, the {{< var validmind.platform >}} allows you to connect two or more records (models) together in your inventory.[^5] +- You can note both upstream and downstream records. -By default, the [{{< fa code >}} Developer]{.bubble} role has sufficient permissions to edit model interdependencies. +By default, the [{{< fa code >}} Developer]{.bubble} role has sufficient permissions to edit record interdependencies. 
-{{< include _faq-model-updates.qmd >}} +{{< include _faq-record-updates.qmd >}} {{< include _faq-tracking.qmd >}} @@ -69,12 +69,12 @@ By default, the [{{< fa code >}} Developer]{.bubble} role has sufficient permiss -[^1]: [Register records in the inventory](/guide/inventory/working-with-the-inventory.qmd) +[^1]: [Register records in the inventory](/guide/inventory/register-records-in-inventory.qmd) [^2]: [Manage permissions](/guide/configuration/manage-permissions.qmd) [^3]: [Manage inventory fields](/guide/inventory/manage-inventory-fields.qmd) -[^4]: [Archive and delete models](/guide/inventory/archive-delete-records.qmd) +[^4]: [Archive and delete records](/guide/inventory/archive-delete-records.qmd) -[^5]: [Configure model interdependencies](/guide/inventory/configure-record-interdependencies.qmd) \ No newline at end of file +[^5]: [Configure record interdependencies](/guide/inventory/configure-record-interdependencies.qmd) \ No newline at end of file diff --git a/site/faq/faq-organizations.qmd b/site/faq/faq-organizations.qmd index 4a6818d07f..4709d309a5 100644 --- a/site/faq/faq-organizations.qmd +++ b/site/faq/faq-organizations.qmd @@ -19,19 +19,19 @@ categories: ["access", "permissions", "organizations", "user registration", "val ## How do I get access to {{< var vm.product >}}? -#### 1. Register with ValidMind +#### 1. Register with {{< var vm.product >}} -- First register with our cloud-hosted {{< var validmind.platform >}},[^1] which enables you to work with model documentation or configure ValidMind for your organization. -- With an email address or a Google, GitHub, or Microsoft account, you can gain access to ValidMind either by signing up independently, or by accepting an invite from another member of your organization. +- First register with our cloud-hosted {{< var validmind.platform >}},[^1] which enables you to work with documentation or configure {{< var vm.product >}} for your organization. 
+- With an email address or a Google, GitHub, or Microsoft account, you can gain access to {{< var vm.product >}} either by signing up independently, or by accepting an invite from another member of your organization. -#### 2. Log in to ValidMind +#### 2. Log in to {{< var vm.product >}} - Once you've signed up or accepted an invite, log in to the {{< var validmind.platform >}}.[^2] -- {{< var vm.product >}} supports logging in via both the public interent and private network endpoints. +- {{< var vm.product >}} supports logging in via both the public internet and private network endpoints. ## What are organizations within {{< var vm.product >}}? -Access to the {{< var validmind.platform >}} where your model inventory is hosted is associated with an organization,[^3] which encompasses all your users, groups, and business units. +Access to the {{< var validmind.platform >}} where your inventory is hosted is associated with an organization,[^3] which encompasses all your users, groups, and business units. - As a user,[^4] you can belong to multiple organizations. - You will see the option to switch between organizations only if you belong to more than one organization. @@ -46,8 +46,8 @@ By default, the [{{< fa hand >}} Customer Admin]{.bubble} role[^5] has sufficien ## How do user roles, user groups, and access permissions work? - - Users belong to groups which determine which models they can see, and have roles with attached permissions which define the level of access they have to features. - - Groups are segments of users with the ability to view models associated with that group. Access to granular features in the {{< var vm.platform >}} within a group’s set of models is further defined by roles and permissions. + - Users belong to groups which determine which records (such as models) they can see, and have roles with attached permissions which define the level of access they have to features. 
+ - Groups are segments of users with the ability to view records associated with that group. Access to granular features in the {{< var vm.platform >}} within a group’s set of records is further defined by roles and permissions. - Roles are a named set of permissions that determine your users’ access to features within the {{< var vm.platform >}} based on your organization’s structure. - Permissions dictate user access controls within the {{< var vm.platform >}}, and are associated with specific roles. diff --git a/site/faq/faq-privacy.qmd b/site/faq/faq-privacy.qmd index 79644eb4e9..b2468c25e0 100644 --- a/site/faq/faq-privacy.qmd +++ b/site/faq/faq-privacy.qmd @@ -28,12 +28,12 @@ categories: ["data handling", "privacy", "confidentiality", "record activity", " Access to the {{< var validmind.platform >}} is facilitated through AWS PrivateLink, which provides private connectivity between {{< var vm.product >}} and your on-premises networks without exposing your traffic to the public internet.[^2] -## What model assets are automatically imported into {{< var vm.product >}}? +## What assets are automatically imported into {{< var vm.product >}}? 
{{< var vm.product >}} stores the following assets in documents via our {{< var validmind.api >}}: -- Dataset and model metadata which allow generating documentation snippets programmatically (example: stored definition for "common logistic regression limitations" when a logistic regression model has been passed to the {{< var vm.product >}} test suite execution) -- Quality and performance metrics collected from the dataset and model +- Dataset and model (any type of record) object metadata which allow generating documentation snippets programmatically (example: stored definition for "common logistic regression limitations" when a logistic regression model has been passed to the {{< var vm.product >}} test suite execution) +- Quality and performance metrics collected from the dataset and record - Outputs from executed test suites - Images, plots, and visuals generated as part of extracting metrics and running tests @@ -55,32 +55,7 @@ Furthermore, {{< var vm.product >}}'s data retention policy complies with the SO {{< include _faq-activity.qmd >}} -{{< include _faq-model-updates.qmd >}} - - - - - - - - - - - - +{{< include _faq-record-updates.qmd >}} ## Learn more diff --git a/site/faq/faq-reporting.qmd b/site/faq/faq-reporting.qmd index 0b9abd046a..c672652c5f 100644 --- a/site/faq/faq-reporting.qmd +++ b/site/faq/faq-reporting.qmd @@ -22,7 +22,7 @@ categories: ["exports", "analytics", "reports", "ongoing monitoring", "validmind ## What analytic features are offered by {{< var vm.product >}}? -- Out-of-the-box reports within the {{< var validmind.platform >}}[^1] are broken down by data on models and data on artifacts. +- Out-of-the-box reports within the {{< var validmind.platform >}}[^1] are broken down by data on records (such as models) and data on artifacts. 
- For each of the bar charts, you can hover for numerical breakdowns or click on individual bars to get a more detailed view.[^2] - You're also able to add custom report pages and analytic widgets to supplement the out-of-the-box reports provided.[^3] diff --git a/site/faq/faq-testing.qmd b/site/faq/faq-testing.qmd index 32ee6aa308..6f82ab1694 100644 --- a/site/faq/faq-testing.qmd +++ b/site/faq/faq-testing.qmd @@ -19,28 +19,28 @@ listing: description: "Tests that are available as part of the {{< var validmind.developer >}}, grouped by type of validation or monitoring test." path: ../developer/how-to/test-sandbox.qmd - ../guide/monitoring/ongoing-monitoring.qmd -categories: ["testing", "model documentation", "customization", "custom data", "explainability", "ongoing monitoring", "validmind library"] +categories: ["testing", "documentation", "customization", "custom data", "explainability", "ongoing monitoring", "validmind library"] --- ## How do the out-of-the-box tests developed by {{< var vm.product >}} work? All the default tests are developed using open-source Python and R libraries. -The {{< var validmind.developer >}}[^1] test interface is a light wrapper that defines utility functions to agnostically interact with different dataset and model backends, and contains functions to collect and post results to the {{< var validmind.platform >}}[^2] using a generic results schema. +The {{< var validmind.developer >}}[^1] test interface is a light wrapper that defines utility functions to agnostically interact with different dataset and record (model) backends, and contains functions to collect and post results to the {{< var validmind.platform >}}[^2] using a generic results schema. -## When do I use tests and tests suites? +## When do I use tests and test suites? While you have the flexibility to decide when to use which {{< var vm.product >}} tests, here are a few typical scenarios:[^3] - **Dataset testing** — To document and validate your dataset. 
-- **Model testing** — To document and validate your model. +- **Model testing** — To document and validate your record, such as a model. - **End-to-end testing** — To document a binary classification model and the relevant dataset end-to-end. ## Can we configure, customize, or add our own tests? Yes, {{< var vm.product >}} allows tests to be manipulated at several levels: -- You can configure which tests are required to run programmatically depending on the model use case.[^4] +- You can configure which tests are required to run programmatically depending on the record's use case.[^4] - You can change the thresholds and parameters for default tests already available in the {{< var vm.developer >}} — for instance, changing the threshold parameter for the class imbalance flag.[^5] - You can also connect your own custom tests with the {{< var validmind.developer >}}. These custom tests are configurable and are able to run programmatically, just like the rest of the {{< var vm.developer >}}.[^6] - Personalize tests further for your use case by using {{< var vm.product >}}'s `RawData` feature[^7] to customize the output of tests. @@ -51,12 +51,12 @@ In addition to custom tests, you can also add use case and test-specific context ## How do I log tests as a developer? 
-You use the {{< var validmind.developer >}} to run and log tests during model development, the results of which are then inserted your model documentation within the {{< var validmind.platform >}}.[^9] The {{< var vm.developer >}} also automatically generates draft test descriptions for your test results — generations that can be modified for your custom use cases.[^10] +You use the {{< var validmind.developer >}} to run and log tests during development, the results of which are then inserted into your documentation within the {{< var validmind.platform >}}.[^9] The {{< var vm.developer >}} also automatically generates draft test descriptions for your test results — generations that can be modified for your custom use cases.[^10] To log tests as a developer with the {{< var validmind.developer >}}: -- You must have the [{{< fa code >}} Developer]{.bubble} role[^11] or another role with sufficient permissions to create and own models, and to work with model documentation. -- You must be the model owner or model developer, but not the model validator,[^12] for the model you want to log tests and update documentation for. +- You must have the [{{< fa code >}} Developer]{.bubble} role[^11] or another role with sufficient permissions to create and own records (models), and to work with documentation. +- You must be the record owner or record developer, but not the record validator,[^12] for the record you want to log tests and update documentation for. ::: {.callout} ## Want to learn how to use {{< var vm.product >}} as a developer? @@ -66,12 +66,12 @@ Check out our introductory series — [**{{< var vm.product >}} for development* ## How do I log tests as a validator? 
-You use the {{< var validmind.developer >}} to run and log tests during model validation, the results of which are then inserted your validation report within the {{< var validmind.platform >}}.[^13] The {{< var vm.developer >}} also automatically generates draft test descriptions for your test results — generations that can be modified for your custom use cases.[^14] +You use the {{< var validmind.developer >}} to run and log tests during validation, the results of which are then inserted into your validation report within the {{< var validmind.platform >}}.[^13] The {{< var vm.developer >}} also automatically generates draft test descriptions for your test results — generations that can be modified for your custom use cases.[^14] To log tests as a validator with the {{< var validmind.developer >}}: -- You must have the [{{< fa circle-check >}} Validator]{.bubble} role[^15] or another role with sufficient permissions to access models for validation, to review model documentation, and to work with validation reports and model artifacts. -- You must be the model validator, but not the model owner or model developer,[^16] for the model you want to log tests and update documentation for. +- You must have the [{{< fa circle-check >}} Validator]{.bubble} role[^15] or another role with sufficient permissions to access records (models) for validation, to review documentation, and to work with validation reports and artifacts. +- You must be the record validator, but not the record owner or record developer,[^16] for the record you want to log tests and update documentation for. ::: {.callout} ## Want to learn how to use {{< var vm.product >}} as a validator? diff --git a/site/faq/faq-validation.qmd b/site/faq/faq-validation.qmd index ee56299e2a..741abe5beb 100644 --- a/site/faq/faq-validation.qmd +++ b/site/faq/faq-validation.qmd @@ -2,10 +2,8 @@ # Copyright © 2023-2026 ValidMind Inc. All rights reserved. 
# Refer to the LICENSE file in the root of this repository for details. # SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial -title: "Model validation and artifacts" +title: "Validation and artifacts" date: last-modified -aliases: - - /guide/faq-documentation.html listing: - id: faq-validation type: grid @@ -17,7 +15,7 @@ listing: - ../guide/validation/manage-validation-guidelines.qmd - ../guide/validation/preparing-validation-reports.qmd - ../guide/validation/working-with-artifacts.qmd -categories: ["model validation", "validation guidelines", "model artifacts", "findings", "model documentation", "templates", "compliance", "validmind platform"] +categories: ["validation", "validation guidelines", "artifacts", "findings", "documentation", "templates", "compliance", "validmind platform"] --- ## Can I set up custom validation guidelines for use in templates? @@ -29,9 +27,9 @@ Yes, {{< var vm.product >}} supports the implementation of custom validation gui By default, the [{{< fa hand >}} Customer Admin]{.bubble} role[^2] has sufficient permissions to manage validation guidelines. -## How does {{< var vm.product >}} help with assessing model compliance? +## How does {{< var vm.product >}} help with assessing compliance? -Use {{< var vm.product >}} to assess compliance of your models with guidelines based on analyzing evidence and artifacts (findings),[^3] providing an unbiased starting point that enables more efficient discussions between validators and developers. +Use {{< var vm.product >}} to assess compliance of your records (such as models) with guidelines based on analyzing evidence and artifacts (findings),[^3] providing an unbiased starting point that enables more efficient discussions between validators and developers. Without leaving the {{< var validmind.platform >}}, you're able to: @@ -39,19 +37,19 @@ Without leaving the {{< var validmind.platform >}}, you're able to: 2. Link tracked artifacts to validation reports 3. 
Provide granular compliance assessments for each section of your validation report -By default, the [{{< fa circle-check >}} Validator]{.bubble} role has sufficient permissions to assess model compliance on validation reports. +By default, the [{{< fa circle-check >}} Validator]{.bubble} role has sufficient permissions to assess compliance on validation reports. -## What support does {{< var vm.product >}} offer for model artifacts? +## What support does {{< var vm.product >}} offer for artifacts? -- Within the {{< var validmind.platform >}}, you're able to log artifacts at the model or the documentation section level.[^4] +- Within the {{< var validmind.platform >}}, you're able to log artifacts at the record (model) or the documentation section level.[^4] - On each artifact, you're able to outline proposed remediation plans, attach supporting documentation, track the artifact's status, attach the artifact to a risk area and/or documentation section, designate a due date, and assign a resolution owner. -- You can also access a complete list of filterable artifacts logged across all your models, or look at only artifacts linked to a specific model.[^5] +- You can also access a complete list of filterable artifacts logged across all your records, or look at only artifacts linked to a specific record.[^5] -By default, the [{{< fa circle-check >}} Validator]{.bubble} role has sufficient permissions to manage model artifacts. +By default, the [{{< fa circle-check >}} Validator]{.bubble} role has sufficient permissions to manage artifacts. ## Can I create custom artifact types? 
-Yes, you can create custom artifact types to track categories of observations beyond the default types (Validation Issue, Policy Exception, Model Limitation).[^6] +Yes, you can create custom artifact types to track categories of observations beyond the default types (Validation Issue, Policy Exception, Limitation).[^6] Common examples include: diff --git a/site/faq/faq-workflows.qmd b/site/faq/faq-workflows.qmd index a11bf82fb3..5b9a1424c6 100644 --- a/site/faq/faq-workflows.qmd +++ b/site/faq/faq-workflows.qmd @@ -9,7 +9,7 @@ aliases: listing: - id: faq-workflows type: grid - grid-columns: 3 + grid-columns: 2 max-description-length: 250 sort: false fields: [title, description] @@ -18,12 +18,12 @@ listing: - ../guide/workflows/manage-record-stages.qmd - ../guide/inventory/manage-inventory-fields.qmd - ../guide/attestation/working-with-attestations.qmd -categories: ["workflows", "model lifecycle", "lifecycle statuses", "attestations", "validmind platform", "validmind library"] +categories: ["workflows", "lifecycle", "lifecycle statuses", "attestations", "validmind platform", "validmind library"] --- ## Can I customize workflows within {{< var vm.product >}}? -- Yes, you can create custom workflows for the review and approval of models throughout their lifecycles with {{< var validmind.platform >}},[^1] enabling you to more easily oversee your organization's unique model risk management process. +- Yes, you can create custom workflows for the review and approval of records (such as models) throughout their lifecycles with {{< var validmind.platform >}},[^1] enabling you to more easily oversee your organization's unique risk management process. - For example, workflows can be configured to include any number of review stages involving different sets of stakeholders — at any point in the process. By default, the [{{< fa hand >}} Customer Admin]{.bubble} role[^2] has sufficient permissions to manage workflows. 
@@ -35,19 +35,19 @@ By default, the [{{< fa hand >}} Customer Admin]{.bubble} role[^2] has sufficien By default, the [{{< fa hand >}} Customer Admin]{.bubble} role has sufficient permissions to manage lifecycle statuses. -{{< include _faq-progress-model.qmd >}} +{{< include _faq-progress-workflow.qmd >}} ## Can we work with disconnected workflows? Yes, {{< var vm.product >}} supports disconnected workflows natively at the data-collection level since the {{< var validmind.developer >}}[^3] creates individual test runs every time a new test iteration is executed. - This allows for running parallel/disconnected tests that individually send results to the {{< var validmind.platform >}}. -- Visualizing the disconnected workflow in terms of model testing and documentation will depend on requirements at the use-case level. +- Visualizing the disconnected workflow in terms of testing and documentation will depend on requirements at the use-case level. ::: {.callout} -## You can also leverage the {{< var validmind.developer >}} once you are ready to document a specific model for review and validation. +## You can also leverage the {{< var validmind.developer >}} once you are ready to document a specific record (model) for review and validation. -You do not need to use the {{< var validmind.platform >}} while you are in the exploration or R&D phase of model development. +You do not need to use the {{< var validmind.platform >}} while you are in the exploration or R&D phase of development. 
::: {{< include _faq-attestations.qmd >}} diff --git a/site/faq/faq.qmd b/site/faq/faq.qmd index 74bf9a5488..50a718ba8e 100644 --- a/site/faq/faq.qmd +++ b/site/faq/faq.qmd @@ -18,11 +18,11 @@ listing: - path: faq-workflows.qmd title: "Workflows" - path: faq-inventory.qmd - title: "Model inventory and activity" + title: "Inventory and activity" - path: faq-documentation.qmd - title: "Model documents and templates" + title: "Documents and templates" - path: faq-validation.qmd - title: "Model validation and artifacts" + title: "Validation and artifacts" - path: faq-collaboration.qmd title: "Collaboration" - path: faq-reporting.qmd diff --git a/site/get-started/common-steps/_register-your-first-model.qmd b/site/get-started/common-steps/_register-your-first-model.qmd index 4e7562b6b4..332c9ca023 100644 --- a/site/get-started/common-steps/_register-your-first-model.qmd +++ b/site/get-started/common-steps/_register-your-first-model.qmd @@ -9,7 +9,7 @@ a. Under the [record type]{.smallcaps} drop-down, select `Model`.^[[Manage inven a. Click **{{< fa plus >}} Register Model**. -a. Enter in the **[model name]{.smallcaps}** and select any option for the following required fields: +a. Enter the **[model name]{.smallcaps}** and select any option for the following required fields: - **[business unit]{.smallcaps}** — For example: `Finance` - **[prelimnary risk tier]{.smallcaps}** — For example: `1` @@ -47,7 +47,7 @@ a. Under the [record type]{.smallcaps} drop-down, select `Model`.^[[Manage inven a. Click **{{< fa plus >}} Register Model**. -a. Enter in the **[model name]{.smallcaps}** and select any option for the following required fields: +a. 
Enter the **[model name]{.smallcaps}** and select any option for the following required fields: - **[business unit]{.smallcaps}** — For example: `Finance` - **[prelimnary risk tier]{.smallcaps}** — For example: `1` diff --git a/site/guide/configuration/_add-business-units.qmd b/site/guide/configuration/_add-business-units.qmd index d9fe0f3a9e..ddc3be0ff2 100644 --- a/site/guide/configuration/_add-business-units.qmd +++ b/site/guide/configuration/_add-business-units.qmd @@ -7,7 +7,7 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> 1. Click **{{< fa plus >}} Add Business Unit** under Business Units. -1. Enter in your **[business unit name]{.smallcaps}**. +1. Enter your **[business unit name]{.smallcaps}**. 1. Click **Add Business Unit** to save your changes. @@ -18,7 +18,7 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> :::: {.content-visible unless-format="revealjs" unless-meta="includes.quickstart"} a. Click **{{< fa plus >}} Add Business Unit** under Business Units. -a. Enter in your **[business unit name]{.smallcaps}**. +a. Enter your **[business unit name]{.smallcaps}**. a. Click **Add Business Unit** to save your changes. @@ -35,7 +35,7 @@ a. Under {{< fa building >}} Organization, select **Organization**. a. Click **{{< fa plus >}} Add Business Unit** under Business Units. -a. Enter in your **[business unit name]{.smallcaps}**. +a. Enter your **[business unit name]{.smallcaps}**. a. Click **Add Business Unit** to save your changes. diff --git a/site/guide/documentation/_locate-document-overview.qmd b/site/guide/documentation/_locate-document-overview.qmd index 59a5a4513d..af25cd9fec 100644 --- a/site/guide/documentation/_locate-document-overview.qmd +++ b/site/guide/documentation/_locate-document-overview.qmd @@ -18,7 +18,7 @@ To locate your document overview for a record: :::: {.content-hidden unless-format="revealjs"} 1. In the left sidebar, click **{{< fa cubes >}} Inventory**. -1. 
Select a record or [find your record by applying a filter or searching for it](/guide/inventory/working-with-the-inventory.qmd#search-filter-and-sort-records){target="blank"}. +1. Select a record or [find your record by applying a filter or searching for it](/guide/inventory/working-with-the-inventory.qmd#search-filter-and-sort-records){target="_blank"}. 1. In the left sidebar that appears for your record, click **{{< fa file >}} Documents** and select **Development**. diff --git a/site/guide/documentation/content-editing-toolbar.png b/site/guide/documentation/content-editing-toolbar.png index 68cf696412..9421913ad7 100644 Binary files a/site/guide/documentation/content-editing-toolbar.png and b/site/guide/documentation/content-editing-toolbar.png differ diff --git a/site/guide/documentation/content_blocks/_generate-with-ai.qmd b/site/guide/documentation/content_blocks/_generate-with-ai.qmd index 1852a374f2..f3ed00600d 100644 --- a/site/guide/documentation/content_blocks/_generate-with-ai.qmd +++ b/site/guide/documentation/content_blocks/_generate-with-ai.qmd @@ -5,7 +5,7 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> :::: {.content-visible unless-format="revealjs"} 1. Click **{{< fa ellipsis-vertical >}}** in the content editing toolbar and select **[{{< fa diamond >}} (Generate Text with AI)]{.pink}**. -1. Enter in a custom prompt and click **Send**, or click **Generate Content** to compose a draft for review. +1. Enter a custom prompt and click **Send**, or click **Generate Content** to compose a draft for review. @@ -33,7 +33,7 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> :::: {.content-hidden unless-format="revealjs"} 1. Click **{{< fa ellipsis-vertical >}}** in the content editing toolbar and select **[{{< fa diamond >}} (Generate Text with AI)]{.pink}**. -1. Enter in a custom prompt and click **Send**, or click **Generate Content** to compose a draft for review. +1. 
Enter a custom prompt and click **Send**, or click **Generate Content** to compose a draft for review. :::: {.content-visible when-format="revealjs" when-meta="includes.pdf-context"} You can also include [record attachments in `.pdf` format](/guide/inventory/edit-inventory-fields.qmd#manage-attachments){target="_blank"} as context documents: diff --git a/site/guide/documentation/work-with-content-blocks.qmd b/site/guide/documentation/work-with-content-blocks.qmd index 5d5e23d7b4..6ec599dbae 100644 --- a/site/guide/documentation/work-with-content-blocks.qmd +++ b/site/guide/documentation/work-with-content-blocks.qmd @@ -14,17 +14,17 @@ includes: Make edits to your documents by adding or removing content blocks directly in the online editor. -## What are content blocks? - -Content blocks provide you with sections that are part of a template, and are used in documents. - -- You can think of these sections as an empty canvas that you fill in with text and test results. -- Multiple sections are joined to create a longer document with a table of contents that has different heading and subheading levels, such as "1.," "1.1.," and so on. - ::: {.callout title="Static PDF documents cannot be edited."} You can only work with content blocks in PDFs converted to editable documents,[^1] documents generated by the {{< var validmind.developer >}}, or documents created in the {{< var validmind.platform >}}. ::: +## What are content blocks? + +Content blocks are modular document template components, used to populate your documents: + +- Content blocks are empty canvases that you fill in with text and test results. 
+- Content blocks are inserted into document templates that define the structure of your documents.[^2] + #### Content block types Content blocks can be new, blank blocks, or prepopulated via your library of logged test or metric results and text block templates: @@ -43,12 +43,12 @@ The content editing toolbar is a rich text editor that enables you to: - Undo or redo changes - Format your text, including adding hyperlinks and code blocks -- Reference related record and artifact field values with variables[^2] -- Insert LaTex formulas[^3] -- Reference other sections of the document[^4] +- Reference related record and artifact field values with variables[^3] +- Insert LaTeX formulas[^4] +- Reference other sections of the document[^5] - Attach images via upload or URL -You can also use the toolbar to suggest changes, save named versions of content, and leave comments.[^5] +You can also use the toolbar to suggest changes, save named versions of content, and leave comments.[^6] ::: {.callout title="Stuck on where to start?"} Use {{< var vm.product >}} to assist you with generating content via AI!^[[Generate content drafts with AI](#generate-content)] @@ -59,9 +59,9 @@ Use {{< var vm.product >}} to assist you with generating content via AI!^[[Gener ## Prerequisites - [x] {{< var link.login >}} -- [x] There are records registered in the inventory.[^6] -- [x] Documents exist and are completed or are in progress for your record.[^7] -- [x] You are a [{{< fa code >}} Developer]{.bubble} or [{{< fa circle-check >}} Validator]{.bubble}, or assigned another role with sufficient permissions to perform the tasks in this guide.[^8] +- [x] There are records registered in the inventory.[^7] +- [x] Documents exist and are completed or are in progress for your record.[^8] +- [x] You are a [{{< fa code >}} Developer]{.bubble} or [{{< fa circle-check >}} Validator]{.bubble}, or assigned another role with sufficient permissions to perform the tasks in this guide.[^9] ::: @@ -80,26 +80,26 
@@ Use {{< var vm.product >}} to assist you with generating content via AI!^[[Gener #### [from library]{.smallcaps} Text Block - : Inserts a text block from a template in your block library:[^9] + : Inserts a text block from a template in your block library:[^10] a. Select the reusable blocks you want to add. b. Click **Insert # Text Block(s) to Document**. - Test-Driven[^10] + Test-Driven[^11] : Adds a new section with logged test results. - Metric Over Time[^11] + Metric Over Time[^12] : Adds a new section with logged metric over time results. ::: -7. After adding the block to your document, click on the text to make changes or add comments.[^12] +7. After adding the block to your document, click on the text to make changes or add comments.[^13] ### Reference field values While editing a simple text block within documents, you can reference values in the form of variables from: ::: {.callout} -- To reference field values in the form of variables, your organization must not have tracked changes enabled by default.[^13] +- To reference field values in the form of variables, your organization must not have tracked changes enabled by default.[^14] - Note that while you are able to select fields with empty values within available record or artifact type fields, no value will be displayed in the content block until the field is populated. ::: @@ -119,8 +119,8 @@ While editing a simple text block within documents, you can insert math equation While editing a simple text block within documents, you can have {{< var vm.product >}} assist you with generating content drafts. 
::: {.callout title="How can generate content drafts with AI?"} -- To use the generate content drafts with AI feature, your organization must not have tracked changes enabled by default.[^14] -- Generating content drafts works best after you've logged tests with the {{< var validmind.developer >}},[^15] as existing test descriptions and results provide more context for the {{< var vm.product >}} AI Content Builder to draw upon. +- To use the generate content drafts with AI feature, your organization must not have tracked changes enabled by default.[^15] +- Generating content drafts works best after you've logged tests with the {{< var validmind.developer >}},[^16] as existing test descriptions and results provide more context for the {{< var vm.product >}} AI Content Builder to draw upon. ::: @@ -129,7 +129,7 @@ To generate content drafts: {{< include content_blocks/_generate-with-ai.qmd >}} -When generating content drafts with AI, accepted versions and edits are retained in your {{< fa wifi >}} Activity[^16] just like other updates to your documents. +When generating content drafts with AI, accepted versions and edits are retained in your {{< fa wifi >}} Activity[^17] just like other updates to your documents.
@@ -177,7 +177,7 @@ While editing a simple text block within documents, you can directly reference o ::: -Hyperlinks will also take you to the referenced section in PDF document exports.[^17] +Hyperlinks will also take you to the referenced section in PDF document exports.[^18] ## Remove content blocks @@ -187,11 +187,11 @@ Test-driven or metric over time blocks can be re-added later on but **text block 1. In the left sidebar, click **{{< fa cubes >}} Inventory**. -2. Select a record or find your record by applying a filter or searching for it.[^18] +2. Select a record or find your record by applying a filter or searching for it.[^19] -3. In the left sidebar that appears for your record, click **{{< fa file >}} Documents** and select the **Latest** tab.[^19] +3. In the left sidebar that appears for your record, click **{{< fa file >}} Documents** and select the **Latest** tab.[^20] -4. Click on the document file you want to remove a block from.[^20] +4. Click on the document file you want to remove a block from.[^21] 5. Click on a section header to expand that section and remove content. 
@@ -206,40 +206,42 @@ Test-driven or metric over time blocks can be re-added later on but **text block [^1]: [Manage documents](/guide/templates/manage-documents.qmd#add-record-documents) -[^2]: [Reference field values](#reference-field-values) +[^2]: [Working with templates](/guide/templates/working-with-document-templates.qmd) -[^3]: [Insert mathematical formulas](#insert-mathematical-formulas) +[^3]: [Reference field values](#reference-field-values) -[^4]: [Reference document sections](#reference-document-sections) +[^4]: [Insert mathematical formulas](#insert-mathematical-formulas) -[^5]: [Collaborate with others](collaborate-with-others.qmd) +[^5]: [Reference document sections](#reference-document-sections) -[^6]: [Register records in the inventory](/guide/inventory/register-records-in-inventory.qmd) +[^6]: [Collaborate with others](collaborate-with-others.qmd) -[^7]: [Working with documents](/guide/templates/working-with-documents.qmd) +[^7]: [Register records in the inventory](/guide/inventory/register-records-in-inventory.qmd) -[^8]: [Manage permissions](/guide/configuration/manage-permissions.qmd) +[^8]: [Working with documents](/guide/templates/working-with-documents.qmd) -[^9]: [Manage text block library](/guide/templates/manage-text-block-library.qmd) +[^9]: [Manage permissions](/guide/configuration/manage-permissions.qmd) -[^10]: [Work with test results](/guide/documentation/work-with-test-results.qmd) +[^10]: [Manage text block library](/guide/templates/manage-text-block-library.qmd) -[^11]: [Work with metrics over time](/guide/monitoring/work-with-metrics-over-time.qmd) +[^11]: [Work with test results](/guide/documentation/work-with-test-results.qmd) -[^12]: [Collaborate with others](/guide/documentation/collaborate-with-others.qmd) +[^12]: [Work with metrics over time](/guide/monitoring/work-with-metrics-over-time.qmd) -[^13]: [Managing your organization](/guide/configuration/managing-your-organization.qmd#manage-document-defaults) +[^13]: 
[Collaborate with others](/guide/documentation/collaborate-with-others.qmd) [^14]: [Managing your organization](/guide/configuration/managing-your-organization.qmd#manage-document-defaults) -[^15]: [Run tests and test suites](/developer/how-to/testing-overview.qmd) +[^15]: [Managing your organization](/guide/configuration/managing-your-organization.qmd#manage-document-defaults) + +[^16]: [Run tests and test suites](/developer/how-to/testing-overview.qmd) -[^16]: [View model activity](/guide/inventory/view-record-activity.qmd) +[^17]: [View record activity](/guide/inventory/view-record-activity.qmd) -[^17]: [Export documents](/guide/reporting/export-documents.qmd) +[^18]: [Export documents](/guide/reporting/export-documents.qmd) -[^18]: [Working with the model inventory](/guide/inventory/working-with-the-inventory.qmd#search-filter-and-sort-models) +[^19]: [Working with the inventory](/guide/inventory/working-with-the-inventory.qmd#search-filter-and-sort-records) -[^19]: [Work with document versions](/guide/documentation/work-with-document-versions.qmd) +[^20]: [Work with document versions](/guide/documentation/work-with-document-versions.qmd) -[^20]: [Working with documents](/guide/templates/working-with-documents.qmd) \ No newline at end of file +[^21]: [Working with documents](/guide/templates/working-with-documents.qmd) \ No newline at end of file diff --git a/site/guide/documentation/work-with-document-versions.qmd b/site/guide/documentation/work-with-document-versions.qmd index 8c1ad1cdae..15ee59055d 100644 --- a/site/guide/documentation/work-with-document-versions.qmd +++ b/site/guide/documentation/work-with-document-versions.qmd @@ -33,7 +33,7 @@ Save read-only versions of your documents in the {{< var validmind.platform >}} 5. On the document overview page, click **{{< fa bookmark >}} Save Version**. -6. Enter in your **[notes]{.smallcaps}** for the version. +6. Enter your **[notes]{.smallcaps}** for the version. 7. 
Click **Save Version** to create a read-only version of that document that captures the contents in the current state. diff --git a/site/guide/documentation/working-with-documentation.qmd b/site/guide/documentation/working-with-documentation.qmd index 566f2e7f94..d5ada74553 100644 --- a/site/guide/documentation/working-with-documentation.qmd +++ b/site/guide/documentation/working-with-documentation.qmd @@ -55,26 +55,26 @@ This section describes how to work with Development type documents[^1] (for exam ## Key concepts - + -{{< include /about/glossary/model_documentation/_doc-intro.qmd >}} +{{< include /about/glossary/documentation/_doc-intro.qmd >}} :::: {.flex .flex-wrap .justify-around} ::: {.w-50-ns} -{{< include /about/glossary/model_documentation/_conceptual-soundness.qmd >}} +{{< include /about/glossary/documentation/_conceptual-soundness.qmd >}} -{{< include /about/glossary/model_documentation/_data-preparation.qmd >}} +{{< include /about/glossary/documentation/_data-preparation.qmd >}} ::: ::: {.w-40-ns} -{{< include /about/glossary/model_documentation/_model-development.qmd >}} +{{< include /about/glossary/documentation/_model-development.qmd >}} -{{< include /about/glossary/model_documentation/_monitoring-governance.qmd >}} +{{< include /about/glossary/documentation/_monitoring-governance.qmd >}} ::: diff --git a/site/guide/integrations/configure-connections.qmd b/site/guide/integrations/configure-connections.qmd index 831f96e5aa..2bca5f107e 100644 --- a/site/guide/integrations/configure-connections.qmd +++ b/site/guide/integrations/configure-connections.qmd @@ -165,7 +165,7 @@ Required configuration details: 4. In the modal that opens, select one of the supported connections, such as MLflow or Jira. -5. Enter in the: +5. Enter the following: - **[integration name]{.smallcaps}** — How other admins can identify the connection. - **[description]{.smallcaps}** (optional) — The intended usage or additional details. 
diff --git a/site/guide/inventory/_add-edit-record-types.qmd b/site/guide/inventory/_add-edit-record-types.qmd index ff0ca938fa..21416b69e7 100644 --- a/site/guide/inventory/_add-edit-record-types.qmd +++ b/site/guide/inventory/_add-edit-record-types.qmd @@ -46,7 +46,7 @@ b. Make your desired changes to your record type, then click **Update**. :::: {.content-hidden unless-format="revealjs"} -- To add: Click **{{< fa plus >}} Add Inventory Record Type**, enter in the record type details, then click **Create**. +- To add: Click **{{< fa plus >}} Add Inventory Record Type**, enter the record type details, then click **Create**. - To edit: Click on an existing record type, update the record type details, then click **Update**. #### Record type details diff --git a/site/guide/inventory/_field-types.qmd b/site/guide/inventory/_field-types.qmd index f0779eaf23..77fabc701e 100644 --- a/site/guide/inventory/_field-types.qmd +++ b/site/guide/inventory/_field-types.qmd @@ -175,21 +175,47 @@ def formula(params): :::: + + ::: {.panel-tabset} #### Params dictionary In addition to custom field keys you add from the **available fields** drop-down, formulas can read built-in keys on the `params` dictionary: + + +:::: {.content-visible when-format="html" when-meta="includes.inventory"} | Param | Applies to | Availability | Description | |---|---|---|---| -| `params[""]` | Inventory records and artifacts | Add the field from the **available fields** drop-down | Current value of another custom field on the same record or artifact. | +| `params[""]` | Inventory records & artifacts | Add the field from the **available fields** drop-down | Current value of another custom field on the same record or artifact. | | `params["model_stage"]` | Inventory records | Add **Model Stage** from the **available fields** drop-down | Current model stage name as a string (for example, `"Production"`). Returns an empty string when no stage is assigned. 
Compare directly: `params["model_stage"] == "Production"`. | | `params["stakeholders"]` | Inventory records | Add a stakeholder role from the **available fields** drop-down (for example, **Stakeholders — Owners**) | Assigned users by stakeholder role. Default role keys are `owners`, `developers`, and `validators`. Custom stakeholder types use `custom_stakeholder_`. Each entry includes `name`, `email`, and `title`. | -| `params["integrations"]` | Inventory records and artifacts | Reference `params["integrations"]` in the formula source | Linked external integration data keyed by service (for example, `params["integrations"]["mlflow"]`). Empty when integrations are not configured or not referenced. | +| `params["integrations"]` | Inventory records & artifacts | Reference `params["integrations"]` in the formula source | Linked external integration data keyed by service (for example, `params["integrations"]["mlflow"]`). Empty when integrations are not configured or not referenced. | + +:::: + + + +:::: {.content-visible when-format="html" unless-meta="includes.inventory"} +| Param | Applies to | Availability | Description | +|---|---|---|---| +| `params[""]` | Inventory records & artifacts | Add the field from the **available fields** drop-down | Current value of another custom field on the same record or artifact. | +| `params["integrations"]` | Inventory records & artifacts | Reference `params["integrations"]` in the formula source | Linked external integration data keyed by service (for example, `params["integrations"]["mlflow"]`). Empty when integrations are not configured or not referenced. | | `params["finding_type"]` | Artifacts | Always available on artifact formulas | Artifact type metadata with `["tag"]` (technical identifier) and `["name"]` (display name). | | `params["model"]` | Artifacts | Always available when the artifact is linked to an inventory record | Parent inventory record custom field values (for example, `params["model"]["criticality_level"]`). 
| +:::: + + #### Available helpers Reference these helpers in your formulas — they cover the date, number, and list operations the engine does not expose directly: @@ -212,6 +238,7 @@ Reference these helpers in your formulas — they cover the date, number, and li ::: + Checkbox : A `true`/`false` value set by a toggle. @@ -283,7 +310,7 @@ Calculation 1. Select from the drop-down of **[available record fields]{.smallcaps}**, or **[available artifact fields]{.smallcaps}** and **[record fields available via]{.smallcaps} `params["model"]`** (artifact fields) to allow your formula access to the field's values. 2. Replace the demonstration formula with your own in the code box provided. 4. Click **Test Calculation {{< fa angle-down >}}** to open the testing area. - 5. Enter in sample values in the testing area then click **{{< fa play >}} Test Calculation** to validate your formula. + 5. Enter sample values in the testing area then click **{{< fa play >}} Test Calculation** to validate your formula. Checkbox : A `true`/`false` value set by a toggle. diff --git a/site/guide/inventory/_rename-field-keys.qmd b/site/guide/inventory/_rename-field-keys.qmd index 487373b087..c72fd1ca25 100644 --- a/site/guide/inventory/_rename-field-keys.qmd +++ b/site/guide/inventory/_rename-field-keys.qmd @@ -6,7 +6,7 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> 1. When the **{{< fa ellipsis-vertical >}}** appears, click on it and select **{{< fa pen-to-square >}} Rename Key**. -1. Enter in the [new key]{.smallcaps} and click **Check Availability**. +1. Enter the [new key]{.smallcaps} and click **Check Availability**. 1. If the key is not already in use, you'll be presented with a list of dependencies to review. 
diff --git a/site/guide/inventory/manage-inventory-fields.qmd b/site/guide/inventory/manage-inventory-fields.qmd index e9541ad0da..b810431c5c 100644 --- a/site/guide/inventory/manage-inventory-fields.qmd +++ b/site/guide/inventory/manage-inventory-fields.qmd @@ -101,7 +101,7 @@ To group inventory fields, first create an inventory field group: 2. Under {{< fa cube >}} Inventory, select **Inventory Record Fields**. -3. Click **{{< fa folder-plus >}} Add Group** and enter in a **[name]{.smallcaps}** and a optional **[description]{.smallcaps}** for the group. +3. Click **{{< fa folder-plus >}} Add Group** and enter a **[name]{.smallcaps}** and an optional **[description]{.smallcaps}** for the group. 4. Click **Create Group** to add the new group. diff --git a/site/guide/reporting/_create-a-visualization.qmd b/site/guide/reporting/_create-a-visualization.qmd index 2735588020..a485657a72 100644 --- a/site/guide/reporting/_create-a-visualization.qmd +++ b/site/guide/reporting/_create-a-visualization.qmd @@ -13,7 +13,7 @@ a. Click **{{< fa pencil >}} Edit Dashboard**. a. Select **{{< fa plus >}} Add Widget** then **{{< fa plus >}} Add Visualization**. -a. On the Add Visualization panel, enter in your **[title]{.smallcaps}**. +a. On the Add Visualization panel, enter your **[title]{.smallcaps}**. a. Select a **[visualization type]{.smallcaps}**: @@ -37,7 +37,7 @@ a. When you are done configuring your dataset, click **Add Visualization** to in :::: {.content-hidden unless-format="revealjs"} a. Select **{{< fa plus >}} Add Widget** then **{{< fa plus >}} Add Visualization**. -a. On the Add Visualization panel, enter in your **[title]{.smallcaps}**. +a. On the Add Visualization panel, enter your **[title]{.smallcaps}**. a. 
Select a **[visualization type]{.smallcaps}**: diff --git a/site/guide/reporting/_create-an-analytics-page.qmd b/site/guide/reporting/_create-an-analytics-page.qmd index 039b26e78d..5a5ee949ac 100644 --- a/site/guide/reporting/_create-an-analytics-page.qmd +++ b/site/guide/reporting/_create-an-analytics-page.qmd @@ -6,7 +6,7 @@ a. In the left sidebar, click **{{< fa square-poll-vertical >}} Analytics**. a. Click **{{< fa plus >}} Add Page**. -a. On the Add New Page module, enter in the: +a. On the Add New Page module, enter the following: - **[page name]{.smallcaps}** - **[description]{.smallcaps}** (optional) diff --git a/site/guide/shared/manage-views/_personal-views.qmd b/site/guide/shared/manage-views/_personal-views.qmd index b4dc370bfb..3d81b810e4 100644 --- a/site/guide/shared/manage-views/_personal-views.qmd +++ b/site/guide/shared/manage-views/_personal-views.qmd @@ -29,7 +29,7 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> 3. Select **My Views (#)** and click on **{{< fa bookmark >}} Save New View** to create your saved view. -4. Enter in the **[view name]{.smallcaps}** and **[description]{.smallcaps}** for your saved view. +4. Enter the **[view name]{.smallcaps}** and **[description]{.smallcaps}** for your saved view. 5. Click **Add New View** to save your saved view. diff --git a/site/guide/templates/_add-assessment-questions.qmd b/site/guide/templates/_add-assessment-questions.qmd index 8838206a42..87d2b971c6 100644 --- a/site/guide/templates/_add-assessment-questions.qmd +++ b/site/guide/templates/_add-assessment-questions.qmd @@ -9,7 +9,7 @@ Manually add assessment questions, or generate questions from a PDF upload for D a. Click **{{< fa plus >}} Add Question** to create a new question. -a. Enter in the **[questions]{.smallcaps}**: +a. Enter the **[questions]{.smallcaps}**: - Each line without a break (enter) is considered one complete question.^[Empty lines will be ignored.] 
- To add a separate question, press enter to start a new line. @@ -44,7 +44,7 @@ Select the assessment you want to add questions to by clicking on it: 1. Click **{{< fa plus >}} Add Question** to create a new question. -1. Enter in the **[questions]{.smallcaps}**: +1. Enter the **[questions]{.smallcaps}**: - Each line without a break (enter) is considered one complete question. Empty lines will be ignored. - To add a separate question, press enter to start a new line. diff --git a/site/guide/templates/_customize-document-templates.qmd b/site/guide/templates/_customize-document-templates.qmd index e94cd2314e..1f37129c2e 100644 --- a/site/guide/templates/_customize-document-templates.qmd +++ b/site/guide/templates/_customize-document-templates.qmd @@ -96,7 +96,7 @@ Customize {{< var vm.product >}}'s templates for documents to fit your specific 1. Under {{< fa file >}} Documents, select **Templates**. -1. Select one of the tabs for the [type of template you want to edit](/guide/templates/manage-document-types.qmd){target="blank"}. +1. Select one of the tabs for the [type of template you want to edit](/guide/templates/manage-document-types.qmd){target="_blank"}. 1. Click the template to edit and on the template details page, select **{{< fa pencil >}} Edit Outline**. diff --git a/site/guide/templates/_duplicate-template.qmd b/site/guide/templates/_duplicate-template.qmd index 41aa0abacc..f2f3c0420f 100644 --- a/site/guide/templates/_duplicate-template.qmd +++ b/site/guide/templates/_duplicate-template.qmd @@ -34,7 +34,7 @@ To duplicate an existing template and start with version one of that new templat 1. Under {{< fa file >}} Documents, select **Templates**. -1. Select one of the tabs for the [type of template you want to duplicate](/guide/templates/working-with-documents.qmd){target="blank"}. +1. Select one of the tabs for the [type of template you want to duplicate](/guide/templates/working-with-documents.qmd){target="_blank"}. 1. 
Click on the template to duplicate and on the template details page, select **{{< fa copy >}} Duplicate Template**. diff --git a/site/guide/templates/_template-schema-generated.qmd b/site/guide/templates/_template-schema-generated.qmd index 6408b92014..3256ea27cc 100644 --- a/site/guide/templates/_template-schema-generated.qmd +++ b/site/guide/templates/_template-schema-generated.qmd @@ -1,6 +1,12 @@ +SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial + +This file is auto-generated by scripts/generate_template_schema_docs.py +Do not edit directly. Re-run the script to update. + +Source: backend/src/backend/templates/documentation/model_documentation/mdd_template_schema_v5_ui.json +--> ```{=html} @@ -1131,7 +1137,7 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> ``` diff --git a/site/guide/templates/_view-document-templates.qmd b/site/guide/templates/_view-document-templates.qmd index f374270c08..a2bc6d93ea 100644 --- a/site/guide/templates/_view-document-templates.qmd +++ b/site/guide/templates/_view-document-templates.qmd @@ -65,7 +65,7 @@ To review the existing templates available to your organization: 1. Under {{< fa file >}} Documents, select **Templates**. -1. Select one of the tabs for [the document type with the templates you want to view](/guide/templates/working-with-documents.qmd){target="blank"}. +1. Select one of the tabs for [the document type with the templates you want to view](/guide/templates/working-with-documents.qmd){target="_blank"}. 1. Select one of the available templates to view detailed information about the template. diff --git a/site/guide/templates/customize-document-checker.qmd b/site/guide/templates/customize-document-checker.qmd index 2ff9e2bccc..535c206cf0 100644 --- a/site/guide/templates/customize-document-checker.qmd +++ b/site/guide/templates/customize-document-checker.qmd @@ -139,7 +139,7 @@ d. 
Once cloned, add or edit the assessment questions.[^3] ### Edit assessment questions ::: {.callout title="How do I locate a specific question?"} -Click **{{}} Search** to enter in your keywords, then press **Search** to narrow down results. +Click **{{}} Search** to enter your keywords, then press **Search** to narrow down results. ::: a. Hover over the question you want to edit. diff --git a/site/guide/templates/manage-document-types.qmd b/site/guide/templates/manage-document-types.qmd index d8def7e441..dfc5752195 100644 --- a/site/guide/templates/manage-document-types.qmd +++ b/site/guide/templates/manage-document-types.qmd @@ -34,7 +34,7 @@ These stock document types cannot be deleted, only edited: 3. Click **{{< fa plus >}} Add Document Type**. -4. Enter in the document type details: +4. Enter the document type details: - Provide a **[name]{.smallcaps}** and an optional **[description]{.smallcaps}**. - Toggle whether this document type should be automatically created **[when record is registered]{.smallcaps}**. diff --git a/site/guide/templates/manage-documents.qmd b/site/guide/templates/manage-documents.qmd index 001220bb4a..d06e02b407 100644 --- a/site/guide/templates/manage-documents.qmd +++ b/site/guide/templates/manage-documents.qmd @@ -31,7 +31,7 @@ Add or delete documents available on individual records in your inventory. 4. Click **{{< fa plus >}} Create Document**. -5. Enter in the document details: +5. Enter the document details: - **[document title]{.smallcaps}** — Title of your document.[^7] - **[document type]{.smallcaps}**[^8] — The type of your document. diff --git a/site/guide/validation/_add-edit-artifact-statuses.qmd b/site/guide/validation/_add-edit-artifact-statuses.qmd index 79fb4e0dd4..25797b00e9 100644 --- a/site/guide/validation/_add-edit-artifact-statuses.qmd +++ b/site/guide/validation/_add-edit-artifact-statuses.qmd @@ -8,7 +8,7 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> a. Click **{{< fa plus >}} Add Status**. 
-b. Enter in a **[status name]{.smallcaps}** and assign a **[color]{.smallcaps}** to your artifact status. +b. Enter a **[status name]{.smallcaps}** and assign a **[color]{.smallcaps}** to your artifact status. c. When you are done, click **Add Status** to create your new status. @@ -36,7 +36,7 @@ Artifact statuses cannot be deleted if in use on an artifact. Ensure that the st a. Click **{{< fa plus >}} Add Status**. -b. Enter in a **[status name]{.smallcaps}** and assign a **[color]{.smallcaps}** to your artifact status. +b. Enter a **[status name]{.smallcaps}** and assign a **[color]{.smallcaps}** to your artifact status. c. When you are done, click **Add Status** to create your new status. diff --git a/site/guide/validation/_add-edit-artifact-types.qmd b/site/guide/validation/_add-edit-artifact-types.qmd index 7f52899cd1..de814dd5b1 100644 --- a/site/guide/validation/_add-edit-artifact-types.qmd +++ b/site/guide/validation/_add-edit-artifact-types.qmd @@ -11,13 +11,13 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> 3. Click **{{< fa plus >}} Add Artifact Type**. -4. Enter in a [name]{.smallcaps} and an optional [description]{.smallcaps} for your artifact type. +4. Enter a [name]{.smallcaps} and an optional [description]{.smallcaps} for your artifact type. 5. Click **Create** to create your new artifact type. 6. Click on your newly created artifact type to edit its details and permissions. -7. Enter in the artifact type's details: +7. 
Enter the artifact type's details: - **[fields]{.smallcaps}** — Select the default artifact fields that should appear on this type of artifact and click **Save Fields** to apply your changes.[^add-fields-callout] - **[record fields display]{.smallcaps}**[^add-record-fields-config] — Select which upstream record fields^[[Edit inventory fields](/guide/inventory/edit-inventory-fields.qmd)] to display as read-only under the Inventory Record Information section on artifacts of this type and click **Save Record Fields Configuration** to apply your changes. @@ -95,7 +95,7 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> a. Click **{{< fa plus >}} Add Artifact Type**. -b. Enter in a [name]{.smallcaps} and an optional [description]{.smallcaps} for your artifact type. +b. Enter a [name]{.smallcaps} and an optional [description]{.smallcaps} for your artifact type. c. Click **Create** to create your new artifact type. diff --git a/site/guide/validation/_link-artifacts-to-reports.qmd b/site/guide/validation/_link-artifacts-to-reports.qmd index bdbc706c92..5020d2b7b7 100644 --- a/site/guide/validation/_link-artifacts-to-reports.qmd +++ b/site/guide/validation/_link-artifacts-to-reports.qmd @@ -27,7 +27,7 @@ d. Click **Update Linked Artifacts**. [^create-artifact]: 1. Click **{{< fa plus >}} Add {Artifact Type}** where `{Artifact Type}` is the artifact type you want to add to create a new artifact. - 2. Enter in the details for your artifact. + 2. Enter the details for your artifact. 3. Click **Add {Artifact Type}** to submit the artifact. :::: {.content-hidden unless-format="revealjs"} @@ -41,7 +41,7 @@ d. Click **Update Linked Artifacts**. 1. Click **{{< fa link >}} Link Artifact** and select **Validation Issue** as the [type of artifact](/guide/validation/manage-artifact-types.qmd){target="_blank"}. -1. Click **{{< fa plus >}} Add Validation Issue** and enter in the details for your validation issue, for example: +1. 
Click **{{< fa plus >}} Add Validation Issue** and enter the details for your validation issue, for example: - **[title]{.smallcaps}** — Champion Logistic Regression Model Fails Minimum Accuracy Threshold - **[risk area]{.smallcaps}** — Model Performance @@ -54,7 +54,7 @@ d. Click **Update Linked Artifacts**. 1. Click **Update Linked Artifacts** to insert your validation issue. -1. Confirm that validation issue you inserted has been correctly inserted into section **2.2.2. Model Performance** of the report. +1. Confirm that the validation issue you inserted has been correctly inserted into section **2.2.2. Model Performance** of the report. 1. Click on the validation issue to expand the issue, where you can adjust details such as severity, owner, due date, status, etc. as well as include proposed remediation plans or supporting documentation as attachments. diff --git a/site/guide/validation/assess-compliance.qmd b/site/guide/validation/assess-compliance.qmd index b06fa534f0..b882ad7b74 100644 --- a/site/guide/validation/assess-compliance.qmd +++ b/site/guide/validation/assess-compliance.qmd @@ -113,7 +113,7 @@ Once you have linked evidence to a section of your report, assess the linked evi 1. In any section of the report where evidence has been linked, click **Evidence Assessment** to expand the evidence assessment panel. -2. Click **Add your assessment** to use the content editing toolbar[^13] to enter in your assessment notes. +2. Click **Add your assessment** to use the content editing toolbar[^13] to enter your assessment notes. 3. (Optional) When you are finished editing your assessment notes, hover over the content block and click **{{< fa unlock >}} Lock Assessment** to prevent additional changes. 
diff --git a/site/guide/validation/manage-artifact-fields.qmd b/site/guide/validation/manage-artifact-fields.qmd index dd9f290c76..8d20312dd7 100644 --- a/site/guide/validation/manage-artifact-fields.qmd +++ b/site/guide/validation/manage-artifact-fields.qmd @@ -99,7 +99,7 @@ To group artifact fields, first create an artifact field group: 2. Under {{< fa expand >}} Artifacts, select **Artifact Fields**. -3. Click **{{< fa plus >}} Add Group** and enter in a **[name]{.smallcaps}** and a **[description]{.smallcaps}** for the group. +3. Click **{{< fa plus >}} Add Group** and enter a **[name]{.smallcaps}** and a **[description]{.smallcaps}** for the group. 4. Click **Create Group** to add the new group. diff --git a/site/guide/workflows/_add-new-workflows.qmd b/site/guide/workflows/_add-new-workflows.qmd index 5f51e45f9a..50f82770a1 100644 --- a/site/guide/workflows/_add-new-workflows.qmd +++ b/site/guide/workflows/_add-new-workflows.qmd @@ -20,7 +20,7 @@ d. Select the [workflow target]{.smallcaps} type to add: #### Add record workflows -i. Enter in a **[title]{.smallcaps}** and a **[description]{.smallcaps}** the workflow. +i. Enter a **[title]{.smallcaps}** and a **[description]{.smallcaps}** for the workflow. ii. Select the **[record type]{.smallcaps}**^[[Manage record types](/guide/inventory/manage-inventory-record-types.qmd)] this workflow applies to. @@ -37,14 +37,14 @@ v. Click **Save Draft** to save your blank workflow, and then [configure your wo #### Add artifact workflows -i. Enter in a **[title]{.smallcaps}** and a **[description]{.smallcaps}** the workflow. +i. Enter a **[title]{.smallcaps}** and a **[description]{.smallcaps}** for the workflow. ii. Select the **[artifact type]{.smallcaps}**^[[Manage artifact types](/guide/validation/manage-artifact-types.qmd)] this workflow applies to. iii. 
Under **[workflow start]{.smallcaps}**, select when the workflow should be initiated: - **Manually** — Start this workflow manually.^[[Initiate workflows](/guide/workflows/manage-workflows.qmd#initiate-workflows)] -- **On Artifact Registration** — Start this workflow when a artifact is logged on a record.[^on-artifact-registration] +- **On Artifact Registration** — Start this workflow when an artifact is logged on a record.[^on-artifact-registration] - **Via Webhook** — Start this workflow when a webhook event is received. iv. Select the **[artifact type]{.smallcaps}**^[[Manage artifact types](/guide/validation/manage-artifact-types.qmd)] this workflow applies to. @@ -97,7 +97,7 @@ d. Select the [workflow target]{.smallcaps} type to add: #### Add record workflows -i. Enter in a **[title]{.smallcaps}** and a **[description]{.smallcaps}** the workflow. +i. Enter a **[title]{.smallcaps}** and a **[description]{.smallcaps}** for the workflow. ii. Select the [**[record type]{.smallcaps}**](/guide/inventory/manage-inventory-record-types.qmd){target="_blank"} this workflow applies to. @@ -114,14 +114,14 @@ v. Click **Save Draft** to save your blank workflow, and then configure your wor #### Add artifact workflows -i. Enter in a **[title]{.smallcaps}** and a **[description]{.smallcaps}** the workflow. +i. Enter a **[title]{.smallcaps}** and a **[description]{.smallcaps}** for the workflow. ii. Select the [**[artifact type]{.smallcaps}**](/guide/validation/manage-artifact-types.qmd){target="_blank"} this workflow applies to. iii. Under **[workflow start]{.smallcaps}**, select when the workflow should be initiated: - **Manually** — Start this workflow manually. -- **On Artifact Registration** — Start this workflow when a artifact is logged on a record. +- **On Artifact Registration** — Start this workflow when an artifact is logged on a record. iv. 
Select the **[artifact type]{.smallcaps}** diff --git a/site/guide/workflows/_conditional-requirements.qmd b/site/guide/workflows/_conditional-requirements.qmd index 179079df7d..73e8597962 100644 --- a/site/guide/workflows/_conditional-requirements.qmd +++ b/site/guide/workflows/_conditional-requirements.qmd @@ -46,7 +46,7 @@ Conditional requirements are required or optional for the following step types:^ - (Optional) Present these requested fields to users in steps: i. Click **{{< fa plus >}} Add Step** to add a step. - ii. On the Add Step modal that appears, enter in the step **[title]{.smallcaps}** and provide an optional **[description]{.smallcaps}**. + ii. On the Add Step modal that appears, enter the step **[title]{.smallcaps}** and provide an optional **[description]{.smallcaps}**. iii. Click **Save** to insert the step. iv. Drag request fields into a step to add them to that step. @@ -122,7 +122,7 @@ a. Set the **[record field]{.smallcaps}**^[[Manage inventory fields](/guide/inve b. Select a **[time delta direction]{.smallcaps}** relative to your selected field. -c. Enter in a **[wait duration]{.smallcaps}** in minutes, hours, days, or months for the delta. +c. Enter a **[wait duration]{.smallcaps}** in minutes, hours, days, or months for the delta. ::: @@ -134,7 +134,7 @@ c. Enter in a **[wait duration]{.smallcaps}** in minutes, hours, days, or months #### Request timeout -- Enter in a request timeout in seconds under **[timeout (seconds)]{.smallcaps}**. +- Enter a request timeout in seconds under **[timeout (seconds)]{.smallcaps}**. - Max 300 seconds, or enter `0` to disable timeout. @@ -152,7 +152,7 @@ Toggle the following request options on or off: #### Request headers -Enter in optional headers to include with your request: +Enter optional headers to include with your request: 1. Click **{{< fa plus >}} Add Header** under [headers]{.smallcaps} to enter a header. 
diff --git a/site/guide/workflows/_workflow-states.qmd b/site/guide/workflows/_workflow-states.qmd index aba2c05568..1604c17eaf 100644 --- a/site/guide/workflows/_workflow-states.qmd +++ b/site/guide/workflows/_workflow-states.qmd @@ -5,7 +5,7 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> :::: {.content-visible unless-format="revealjs"} ### Add workflow states -Add workflow states by either while configuring a **{{< fa wifi >}} Workflow State Change** step,^[[Configure workflow steps](/guide/workflows/configure-workflows.qmd#configure-workflow-steps)] or via the **{{< fa gear >}} Settings** page: +Add workflow states either while configuring a **{{< fa wifi >}} Workflow State Change** step,^[[Configure workflow steps](/guide/workflows/configure-workflows.qmd#configure-workflow-steps)] or via the **{{< fa gear >}} Settings** page: 1. In the left sidebar, click **{{< fa gear >}} Settings**. @@ -20,7 +20,7 @@ Add workflow states by either while configuring a **{{< fa wifi >}} Workflow Sta 1. Click **{{< fa plus >}} Add Workflow State**. -1. Enter in a **[name]{.smallcaps}** and a **[description]{.smallcaps}**, then select a **[color]{.smallcaps}** for your workflow state. +1. Enter a **[name]{.smallcaps}** and a **[description]{.smallcaps}**, then select a **[color]{.smallcaps}** for your workflow state. 1. Click **Save** to create the workflow state. @@ -74,6 +74,6 @@ If a state is or was previously in use on a workflow within a {{< fa wifi >}} W :::: {.content-hidden unless-format="revealjs"} Workflow states are statuses unique to a specific workflow, discrete from record stages. -Add workflow states by either while configuring a **{{< fa wifi >}} Workflow State Change** step, or [via the **{{< fa gear >}} Settings** page](/guide/workflows/workflow-states.qmd){target="blank"}. 
+Add workflow states either while configuring a **{{< fa wifi >}} Workflow State Change** step, or [via the **{{< fa gear >}} Settings** page](/guide/workflows/workflow-states.qmd){target="_blank"}. :::: \ No newline at end of file diff --git a/site/guide/workflows/_workflow-step-types.qmd b/site/guide/workflows/_workflow-step-types.qmd index 5672da6414..dd5cffe0f8 100644 --- a/site/guide/workflows/_workflow-step-types.qmd +++ b/site/guide/workflows/_workflow-step-types.qmd @@ -65,7 +65,7 @@ Display a preconfigured message to users in the selected **[approval group]{.sma |---:|---| | [approval group]{.smallcaps} | Select the roles, stakeholders, or user fields responsible for approval. | | [fields to review]{.smallcaps} (optional) | Select the record or artifact fields shown to approvers for read-only review.[^approval-2] | -| Approval Message | When your workflow reaches this {{< fa users >}} Approval step, the selected [approval group]{.smallcaps} is shown this message.^[Enter in a **[title]{.smallcaps}** and a **[message]{.smallcaps}** to display.] | +| Approval Message | When your workflow reaches this {{< fa users >}} Approval step, the selected [approval group]{.smallcaps} is shown this message.^[Enter a **[title]{.smallcaps}** and a **[message]{.smallcaps}** to display.] | : **{{< fa users >}} Approval** step configuration {.hover tbl-colwidths="[35,65]"} ### {{< fa bullhorn >}} Broadcast @@ -94,7 +94,7 @@ Workflows cannot be saved until condition branches are connected to other steps. To configure a condition branch: 1. On the Configure Condition Branch modal, click **{{< fa plus >}} Add Branch**. -2. Enter in the **[path name]{.smallcaps}** and designate the **[conditions]{.smallcaps}**^[[Conditional step requirements](/guide/workflows/conditional-step-requirements.qmd#condition-branch-required)] that apply to this path. +2. 
Enter the **[path name]{.smallcaps}** and designate the **[conditions]{.smallcaps}**^[[Conditional step requirements](/guide/workflows/conditional-step-requirements.qmd#condition-branch-required)] that apply to this path. 3. Continue with steps 1 and 2 until your conditional branch logic is complete. To remove a path, click **{{< fa ellipsis-vertical >}}** and select **{{< fa trash-can >}} Remove Path**. @@ -166,9 +166,9 @@ Sends a HTTP request with optional additional conditions.^[[Conditional step req | Field | Description | |---:|---| -| [url]{.smallcaps} | Enter in the URL to send the HTTP request to. | +| [url]{.smallcaps} | Enter the URL to send the HTTP request to. | | [method]{.smallcaps} | Select the HTTP request method: `GET`, `POST`, `PUT`, `DELETE`[^request-types] | -| [timeout (seconds)]{.smallcaps} | Enter in a request timeout in seconds.[^request-timeout] | +| [timeout (seconds)]{.smallcaps} | Enter a request timeout in seconds.[^request-timeout] | | [fail on non-]{.smallcaps}[2]{.smallercaps}[xx]{.smallcaps} (optional) | Toggle whether or not the request will be considered failed if the response status code is not in the `2xx` range. | | [allow invalid certificates]{.smallcaps} (optional) | Toggle whether or not the request will be allowed to use invalid certificates. | | [follow redirects]{.smallcaps} (optional) | Toggle whether or not the request will follow redirects. | @@ -218,7 +218,7 @@ Sends a HTTP request with optional additional conditions.^[[Conditional step req [^request-types]: `PUT` and `POST` requests have additional configuration fields: - **[body type]{.smallcaps}** — Select whether the body is `JSON` or `Text`. - - **[body]{.smallcaps}** — Enter in your payload. + - **[body]{.smallcaps}** — Enter your payload. [^request-timeout]: Max 300 seconds, or enter `0` to disable timeout. 
diff --git a/site/guide/workflows/configure-workflows.qmd b/site/guide/workflows/configure-workflows.qmd index 30f1aef081..6ef4eb217b 100644 --- a/site/guide/workflows/configure-workflows.qmd +++ b/site/guide/workflows/configure-workflows.qmd @@ -120,7 +120,7 @@ a. Make your desired changes to step configuration[^9] and step relationships[^1 b. When you are finished, click **Save New Version** to apply your changes. -c. Enter in your **[version notes]{.smallcaps}** to describe your changes. +c. Enter your **[version notes]{.smallcaps}** to describe your changes. ### Delete workflow steps diff --git a/site/guide/workflows/manage-workflows.qmd b/site/guide/workflows/manage-workflows.qmd index 5c40547464..026d027a1c 100644 --- a/site/guide/workflows/manage-workflows.qmd +++ b/site/guide/workflows/manage-workflows.qmd @@ -97,7 +97,7 @@ To adjust the expected end date for a workflow: 5. On the workflow's detail modal, click on the **{{< fa ellipsis-vertical >}}** in the top-right hand corner and select **{{< fa calendar >}} Edit Expected End Date**. -6. Enter in the new [expected end date]{.smallcaps} for the workflow. +6. Enter the new [expected end date]{.smallcaps} for the workflow. 7. Click **Save Expected End Date** to apply the new date. @@ -109,7 +109,7 @@ To adjust the expected end date for a workflow: 3. On the workflow's detail modal, click on the **{{< fa ellipsis-vertical >}}** in the top-right hand corner and select **{{< fa calendar >}} Edit Expected End Date**. -4. Enter in the new [expected end date]{.smallcaps} for the workflow. +4. Enter the new [expected end date]{.smallcaps} for the workflow. 5. Click **Save Expected End Date** to apply the new date. 
diff --git a/site/guide/workflows/workflow-configuration-examples.qmd b/site/guide/workflows/workflow-configuration-examples.qmd index 8eb0a26993..770d0c5128 100644 --- a/site/guide/workflows/workflow-configuration-examples.qmd +++ b/site/guide/workflows/workflow-configuration-examples.qmd @@ -44,7 +44,7 @@ This workflow is initiated manually[^4] — in this case via {{< fa arrow-right- This workflow is initiated when a field is populated — in this case, when the model is slated for deployment by entering value into the [deployment scheduled]{.smallcaps} date time field.[^8] -- The workflow will wait until the timestamp indicated in the scheduled deployment date before revealing the next available action in the workflow — in this case, the option to deploy the model and enter in a concrete date the model was initially pushed to production. +- The workflow will wait until the timestamp indicated in the scheduled deployment date before revealing the next available action in the workflow — in this case, the option to deploy the model and enter a concrete date the model was initially pushed to production. - After a model is deployed via this workflow, an email notification is sent to users notifying them of the completed implementation. 
- Actions on this workflow are linked both to a transition in model stage,[^9] as well as workflow state.[^10] diff --git a/site/llm/chatbot-product-map.md b/site/llm/chatbot-product-map.md index d1361fcfac..c2abf3e499 100644 --- a/site/llm/chatbot-product-map.md +++ b/site/llm/chatbot-product-map.md @@ -452,7 +452,7 @@ **Docs (related):** - `/faq/faq-workflows.html` - - Sections: Can I customize workflows within }?; What statuses are available for use in workflows?; Can we work with disconnected workflows?; You can also leverage the } once you are ready to document a specific model for review and validation.; Learn more + - Sections: Can I customize workflows within }?; What statuses are available for use in workflows?; Can we work with disconnected workflows?; You can also leverage the } once you are ready to document a specific record (model) for review and validation.; Learn more - `/guide/integrations/integrations-examples/use-webhooks-with-workflows.html` - Sections: Prerequisites; Start a workflow via webhook; a. Configure workflow in }; b. Start workflow from external system; Trigger a paused workflow to continue; a. Configure workflow in }; b. 
Trigger workflow to continue from external system - `/guide/workflows/conditional-step-requirements.html` @@ -474,7 +474,7 @@ **Docs (related):** - `/faq/faq-workflows.html` - - Sections: Can I customize workflows within }?; What statuses are available for use in workflows?; Can we work with disconnected workflows?; You can also leverage the } once you are ready to document a specific model for review and validation.; Learn more + - Sections: Can I customize workflows within }?; What statuses are available for use in workflows?; Can we work with disconnected workflows?; You can also leverage the } once you are ready to document a specific record (model) for review and validation.; Learn more - `/guide/integrations/integrations-examples/use-webhooks-with-workflows.html` - Sections: Prerequisites; Start a workflow via webhook; a. Configure workflow in }; b. Start workflow from external system; Trigger a paused workflow to continue; a. Configure workflow in }; b. Trigger workflow to continue from external system - `/guide/workflows/conditional-step-requirements.html` @@ -561,7 +561,7 @@ **Docs (related):** - `/faq/faq-workflows.html` - - Sections: Can I customize workflows within }?; What statuses are available for use in workflows?; Can we work with disconnected workflows?; You can also leverage the } once you are ready to document a specific model for review and validation.; Learn more + - Sections: Can I customize workflows within }?; What statuses are available for use in workflows?; Can we work with disconnected workflows?; You can also leverage the } once you are ready to document a specific record (model) for review and validation.; Learn more - `/guide/integrations/integrations-examples/use-webhooks-with-workflows.html` - Sections: Prerequisites; Start a workflow via webhook; a. Configure workflow in }; b. Start workflow from external system; Trigger a paused workflow to continue; a. Configure workflow in }; b. 
Trigger workflow to continue from external system - `/guide/workflows/conditional-step-requirements.html` diff --git a/site/notebooks.zip b/site/notebooks.zip index 3706a851af..f505f25d90 100644 Binary files a/site/notebooks.zip and b/site/notebooks.zip differ diff --git a/site/notebooks/EXECUTED/development/1-set_up_validmind.ipynb b/site/notebooks/EXECUTED/development/1-set_up_validmind.ipynb index 0c8316c27d..9ba5431049 100644 --- a/site/notebooks/EXECUTED/development/1-set_up_validmind.ipynb +++ b/site/notebooks/EXECUTED/development/1-set_up_validmind.ipynb @@ -1,477 +1,481 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "3bd9bc41", - "metadata": {}, - "source": [ - "# ValidMind for development 1 — Set up the ValidMind Library\n", - "\n", - "Learn how to use ValidMind for your end-to-end documentation process based on common development scenarios with our series of four introductory notebooks. This first notebook walks you through the initial setup of the ValidMind Library.\n", - "\n", - "These notebooks use a binary classification model as an example, but the same principles shown here apply to other record (model) types.\n", - "\n", - "
Learn by doing\n", - "

\n", - "Our course tailor-made for developers new to ValidMind combines this series of notebooks with more a more in-depth introduction to the ValidMind Platform — Developer Fundamentals
" - ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# ValidMind for development 1 — Set up the ValidMind Library\n", + "\n", + "Learn how to use ValidMind for your end-to-end documentation process based on common development scenarios with our series of four introductory notebooks. This first notebook walks you through the initial setup of the ValidMind Library.\n", + "\n", + "These notebooks use a binary classification model as an example, but the same principles shown here apply to other record (model) types.\n", + "\n", + "
Learn by doing\n", + "

\n", + "Our course tailor-made for developers new to ValidMind combines this series of notebooks with a more in-depth introduction to the ValidMind Platform — Developer Fundamentals
" + ], + "id": "3bd9bc41" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [Introduction](#toc1__) \n", + "- [About ValidMind](#toc2__) \n", + " - [Before you begin](#toc2_1__) \n", + " - [New to ValidMind?](#toc2_2__) \n", + " - [Key concepts](#toc2_3__) \n", + "- [Setting up](#toc3__) \n", + " - [Install the ValidMind Library](#toc3_1__) \n", + " - [Initialize the ValidMind Library](#toc3_2__) \n", + " - [Register sample model](#toc3_2_1__) \n", + " - [Apply documentation template](#toc3_2_2__) \n", + " - [Get your code snippet](#toc3_2_3__) \n", + "- [Getting to know ValidMind](#toc4__) \n", + " - [Preview the documentation template](#toc4_1__) \n", + " - [View documentation in the ValidMind Platform](#toc4_1_1__) \n", + " - [Explore available tests](#toc4_2__) \n", + "- [Upgrade ValidMind](#toc5__) \n", + "- [In summary](#toc6__) \n", + "- [Next steps](#toc7__) \n", + " - [Start the model development process](#toc7_1__) \n", + "\n", + ":::\n", + "\n", + "" + ], + "id": "b4b7c002" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Introduction\n", + "\n", + "Development aims to produce a fit-for-purpose *champion* by conducting thorough testing and analysis, supporting the capabilities of the champion with evidence in the form of documentation and test results. 
Documentation should be clear and comprehensive, ideally following a structure or template covering all aspects of compliance with risk regulation.\n", + "\n", + "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", + "\n", + "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", + "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." + ], + "id": "7b7de259" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." + ], + "id": "b68b9958" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. 
For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "3b520a7e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", + "

\n", + "Register with ValidMind
" + ], + "id": "9b3108db" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ], + "id": "f97d4266" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Setting up" + ], + "id": "bf5cd6c2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "
Recommended Python versions\n", + "

\n", + "Python 3.8 <= x <= 3.14
\n", + "\n", + "To install the library:" + ], + "id": "95bf9e4b" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "827eb6bd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the ValidMind Library\n", + "\n", + "The ValidMind Library provides a rich collection of documentation tools and test suites, from documenting descriptions of datasets to validation and testing using a variety of open-source testing frameworks." + ], + "id": "ad74254d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ], + "id": "a48cd34d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "8ad7e39a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "3339f683" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "a58d951f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Getting to know ValidMind" + ], + "id": "61a021f3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ], + "id": "852db20d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "819a40bc" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### View documentation in the ValidMind Platform\n", + "\n", + "Next, let's head to the ValidMind Platform to see the template in action:\n", + "\n", + "1. 
In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this \"ValidMind for development\" series of notebooks.\n", + "\n", + "3. Click **Development** under Documents for your model and note how the structure of the documentation matches our preview above." + ], + "id": "65ed2873" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Explore available tests\n", + "\n", + "Next, let's explore the list of all available tests in the ValidMind Library with [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) — we'll learn how to run tests shortly. \n", + "\n", + "You can see that the documentation template for this model has references to some of the **test `ID`s used to run tests listed below:**" + ], + "id": "cdbb94d2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests()" + ], + "execution_count": null, + "outputs": [], + "id": "7ccc7776" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "786f0d9c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "f5d3216d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "d2010ad4" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after upgrading the package for changes to be applied." + ], + "id": "b637c5c6" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## In summary\n", + "\n", + "In this first notebook, you learned how to:\n", + "\n", + "- [x] Register a record (model) within the ValidMind Platform\n", + "- [x] Install and initialize the ValidMind Library\n", + "- [x] Preview the documentation template for your model\n", + "- [x] Explore the available tests offered by the ValidMind Library" + ], + "id": "dfef8925" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Next steps" + ], + "id": "186bee4f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Start the development process\n", + "\n", + "Now that the ValidMind Library is connected to your model in the ValidMind Platform with the correct template applied, we can go ahead and start the development process: **[2 — Start the development process](2-start_development_process.ipynb)**" + ], + "id": "7dbb07a1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc.
All rights reserved.
\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" + ], + "id": "copyright-63fcb66be39b42d38ad874a72a66581b" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "name": "python", + "version": "3.10.13" + } }, - { - "cell_type": "markdown", - "id": "b4b7c002", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [Introduction](#toc1__) \n", - "- [About ValidMind](#toc2__) \n", - " - [Before you begin](#toc2_1__) \n", - " - [New to ValidMind?](#toc2_2__) \n", - " - [Key concepts](#toc2_3__) \n", - "- [Setting up](#toc3__) \n", - " - [Install the ValidMind Library](#toc3_1__) \n", - " - [Initialize the ValidMind Library](#toc3_2__) \n", - " - [Register sample model](#toc3_2_1__) \n", - " - [Apply documentation template](#toc3_2_2__) \n", - " - [Get your code snippet](#toc3_2_3__) \n", - "- [Getting to know ValidMind](#toc4__) \n", - " - [Preview the documentation template](#toc4_1__) \n", - " - [View documentation in the ValidMind Platform](#toc4_1_1__) \n", - " - [Explore available tests](#toc4_2__) \n", - "- [Upgrade ValidMind](#toc5__) \n", - "- [In summary](#toc6__) \n", - "- [Next steps](#toc7__) \n", - " - [Start the model development process](#toc7_1__) \n", - "\n", - ":::\n", - "\n", - "" - ] - }, - { - "cell_type": "markdown", - "id": "7b7de259", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Introduction\n", - "\n", - "Development aims to produce a fit-for-purpose *champion* by conducting thorough testing and analysis, supporting the capabilities of the champion with evidence in the form of documentation and test results. 
Documentation should be clear and comprehensive, ideally following a structure or template covering all aspects of compliance with risk regulation.\n", - "\n", - "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", - "\n", - "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", - "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." - ] - }, - { - "cell_type": "markdown", - "id": "b68b9958", - "metadata": {}, - "source": [ - "\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ] - }, - { - "cell_type": "markdown", - "id": "3b520a7e", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. 
For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "9b3108db", - "metadata": {}, - "source": [ - "\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", - "

\n", - "Register with ValidMind
" - ] - }, - { - "cell_type": "markdown", - "id": "f97d4266", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." 
- ] - }, - { - "cell_type": "markdown", - "id": "bf5cd6c2", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "95bf9e4b", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "
Recommended Python versions\n", - "

\n", - "Python 3.8 <= x <= 3.14
\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "827eb6bd", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "ad74254d", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the ValidMind Library\n", - "\n", - "The ValidMind Library provides a rich collection of documentation tools and test suites, from documenting descriptions of datasets to validation and testing using a variety of open-source testing frameworks." - ] - }, - { - "cell_type": "markdown", - "id": "a48cd34d", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "8ad7e39a", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "id": "3339f683", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a58d951f", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "61a021f3", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Getting to know ValidMind" - ] - }, - { - "cell_type": "markdown", - "id": "852db20d", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "819a40bc", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "65ed2873", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### View documentation in the ValidMind Platform\n", - "\n", - "Next, let's head to the ValidMind Platform to see the template in action:\n", - "\n", - "1. 
In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this \"ValidMind for development\" series of notebooks.\n", - "\n", - "3. Click **Development** under Documents for your model and note how the structure of the documentation matches our preview above." - ] - }, - { - "cell_type": "markdown", - "id": "cdbb94d2", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Explore available tests\n", - "\n", - "Next, let's explore the list of all available tests in the ValidMind Library with [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) — we'll learn how to run tests shortly. \n", - "\n", - "You can see that the documentation template for this model has references to some of the **test `ID`s used to run tests listed below:**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7ccc7776", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests()" - ] - }, - { - "cell_type": "markdown", - "id": "786f0d9c", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f5d3216d", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "d2010ad4", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "b637c5c6", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "dfef8925", - "metadata": {}, - "source": [ - "\n", - "\n", - "## In summary\n", - "\n", - "In this first notebook, you learned how to:\n", - "\n", - "- [x] Register a record (model) within the ValidMind Platform\n", - "- [x] Install and initialize the ValidMind Library\n", - "- [x] Preview the documentation template for your model\n", - "- [x] Explore the available tests offered by the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "186bee4f", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Next steps" - ] - }, - { - "cell_type": "markdown", - "id": "7dbb07a1", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Start the development process\n", - "\n", - "Now that the ValidMind Library is connected to your model in the ValidMind Library with the correct template applied, we can go ahead and start the development process: **[2 — Start the development process](2-start_development_process.ipynb)**" - ] - }, - { - "cell_type": "markdown", - "id": "copyright-63fcb66be39b42d38ad874a72a66581b", - "metadata": {}, - "source": [ - "\n", - "\n", - "\n", - "\n", - "***\n", - "\n", - 
"Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "name": "python", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/site/notebooks/EXECUTED/development/2-start_development_process.ipynb b/site/notebooks/EXECUTED/development/2-start_development_process.ipynb index 51fd1724ab..4016e2a97a 100644 --- a/site/notebooks/EXECUTED/development/2-start_development_process.ipynb +++ b/site/notebooks/EXECUTED/development/2-start_development_process.ipynb @@ -10,7 +10,7 @@ "\n", "You'll become familiar with the individual tests available in ValidMind, as well as how to run them and change parameters as necessary. Using ValidMind's repository of individual tests as building blocks helps you ensure that a record (model) is being built appropriately.\n", "\n", - "**For a full list of out-of-the-box tests and descriptions,** use the interactive [Test sandbox](https://docs.validmind.ai/developer/how-to/test-sandbox.html).\n", + "**For a full list of out-of-the-box tests and descriptions,** use the interactive [ValidMind test sandbox](https://docs.validmind.ai/developer/how-to/test-sandbox.html).\n", "\n", "
Learn by doing\n", "

\n", @@ -329,7 +329,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The output above shows that the class imbalance test did not pass according to the value we set for `min_percent_threshold`.\n", + "The output above shows that the `validmind.data_validation.ClassImbalance` test did not pass according to the value we set for `min_percent_threshold`.\n", "\n", "To address this issue, we'll re-run the test on some processed data. In this case let's apply a very simple rebalancing technique to the dataset:\n" ] diff --git a/site/notebooks/EXECUTED/validation/1-set_up_validmind_for_validation.ipynb b/site/notebooks/EXECUTED/validation/1-set_up_validmind_for_validation.ipynb index 6f8d378ceb..feda59a354 100644 --- a/site/notebooks/EXECUTED/validation/1-set_up_validmind_for_validation.ipynb +++ b/site/notebooks/EXECUTED/validation/1-set_up_validmind_for_validation.ipynb @@ -1,523 +1,533 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "821a881e", - "metadata": {}, - "source": [ - "# ValidMind for validation 1 — Set up the ValidMind Library for validation\n", - "\n", - "Learn how to use ValidMind for your end-to-end validation process based on common scenarios with our series of four introductory notebooks. In this first notebook, set up the ValidMind Library in preparation for validating a champion.\n", - "\n", - "These notebooks use a binary classification model as an example, but the same principles shown here apply to other record (model) types.\n", - "\n", - "
Learn by doing\n", - "

\n", - "Our course tailor-made for validators new to ValidMind combines this series of notebooks with more a more in-depth introduction to the ValidMind Platform — Validator Fundamentals
" - ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# ValidMind for validation 1 — Set up the ValidMind Library for validation\n", + "\n", + "Learn how to use ValidMind for your end-to-end validation process based on common scenarios with our series of four introductory notebooks. In this first notebook, set up the ValidMind Library in preparation for validating a champion.\n", + "\n", + "These notebooks use a binary classification model as an example, but the same principles shown here apply to other record (model) types.\n", + "\n", + "
Learn by doing\n", + "

\n", + "Our course tailor-made for validators new to ValidMind combines this series of notebooks with a more in-depth introduction to the ValidMind Platform — Validator Fundamentals
" + ], + "id": "821a881e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [Introduction](#toc1__) \n", + "- [About ValidMind](#toc2__) \n", + " - [Before you begin](#toc2_1__) \n", + " - [New to ValidMind?](#toc2_2__) \n", + " - [Key concepts](#toc2_3__) \n", + "- [Setting up](#toc3__) \n", + " - [Register a sample model](#toc3_1__) \n", + " - [Assign validator credentials](#toc3_1_1__) \n", + " - [Apply documentation template](#toc3_1_2__) \n", + " - [Apply validation report template](#toc3_1_3__) \n", + " - [Install the ValidMind Library](#toc3_2__) \n", + " - [Initialize the ValidMind Library](#toc3_3__) \n", + " - [Get your code snippet](#toc3_3_1__) \n", + "- [Getting to know ValidMind](#toc4__) \n", + " - [Preview the validation report template](#toc4_1__) \n", + " - [View validation report in the ValidMind Platform](#toc4_1_1__) \n", + " - [Explore available tests](#toc4_2__) \n", + "- [Upgrade ValidMind](#toc5__) \n", + "- [In summary](#toc6__) \n", + "- [Next steps](#toc7__) \n", + " - [Start the validation process](#toc7_1__) \n", + "\n", + ":::\n", + "\n", + "" + ], + "id": "19ea797c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Introduction\n", + "\n", + "Validation aims to independently assess the compliance of *champions* created by developers with regulatory guidance by conducting thorough testing and analysis, potentially including the use of challengers to benchmark performance. 
Assessments, presented in the form of a validation report, typically include *artifacts (findings)* and recommendations to address those issues.\n", + "\n", + "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", + "\n", + "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", + "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." + ], + "id": "d624f88d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate comparison and other validation tests, and then use the ValidMind Platform to submit compliance assessments of champions via comprehensive validation reports. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and developers." + ], + "id": "4fb1ef5a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. 
\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "594f9fd4" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", + "

\n", + "Register with ValidMind
" + ], + "id": "262ed111" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", + "\n", + "**artifacts (findings)**: Observations or issues identified during validation, including any deviations from expected performance or standards. Artifacts are organized by type — default types provided by ValidMind include Validation Issue, Policy Exception, and Limitation. Custom artifact types can be created to track other categories relevant to your organization.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. 
These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ], + "id": "0eb67fe9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Setting up" + ], + "id": "e0e1cf3d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Register a sample model\n", + "\n", + "In a usual lifecycle, a champion will have been independently registered in your inventory and submitted to you for validation by your development team as part of the effective challenge process. (**Learn more:** [Submit documents](https://docs.validmind.ai/guide/documentation/submit-documents.html))\n", + "\n", + "For this notebook, we'll have you register a dummy record (model) in the ValidMind Platform inventory and assign yourself as the validator to familiarize you with the ValidMind interface and circumvent the need for an existing model:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down — don’t worry, we’ll adjust these permissions next for validation.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ], + "id": "609fe59b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Assign validator credentials\n", + "\n", + "In order to log tests as a validator instead of as a developer, on the details page that appears after you've successfully registered your sample model:\n", + "\n", + "1. 
Remove yourself as an owner:\n", + "\n", + " - Click on the **OWNERS** tile.\n", + " - Click the **x** next to your name to remove yourself from that model's role.\n", + " - Click **Save** to apply your changes to that role.\n", + "\n", + "2. Remove yourself as a developer:\n", + "\n", + " - Click on the **DEVELOPERS** tile.\n", + " - Click the **x** next to your name to remove yourself from that model's role.\n", + " - Click **Save** to apply your changes to that role.\n", + "\n", + "3. Add yourself as a validator:\n", + "\n", + " - Click on the **VALIDATORS** tile.\n", + " - Select your name from the drop-down menu.\n", + " - Click **Save** to apply your changes to that role." + ], + "id": "58e552bb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier for developers.\n", + "\n", + "We'll need this documentation template later for reference as we draft our validation report:\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Documentation**.\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "84251589" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Apply validation report template\n", + "\n", + "Next, let's select a validation report template. A template predefines sections for your report and provides a general outline to follow, making the validation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Validation**.\n", + "\n", + " If you cannot locate your Validation document, make sure Validation type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Generic Validation Report`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "fdfb5dc5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "
Recommended Python versions\n", + "

\n", + "Python 3.8 <= x <= 3.14
\n", + "\n", + "To install the library:" + ], + "id": "f656d0d6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "931d8f7f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "1435fd5b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Validation` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "b375b341" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"validation-report\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "d5d87e2d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Getting to know ValidMind" + ], + "id": "331e1c07" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Preview the validation 
report template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will attach evidence to this template in the form of risk assessment notes, artifacts, and test results later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library:" + ], + "id": "f6331a98" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "13d34bbb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### View validation report in the ValidMind Platform\n", + "\n", + "Next, let's head to the ValidMind Platform to see the template in action:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this \"ValidMind for validation\" series of notebooks.\n", + "\n", + "3. Click **Validation** under Documents for your model and note:\n", + "\n", + " - [x] The risk assessment compliance summary at the top of the report (screenshot below)\n", + " - [x] How the structure of the validation report reflects the previewed template\n", + "\n", + " \"Screenshot\n", + "

" + ], + "id": "20717133" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Explore available tests\n", + "\n", + "Next, let's explore the list of all available tests in the ValidMind Library with [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) — we'll later narrow down the tests we want to run from this list when we learn to run tests." + ], + "id": "f5d0aaab" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests()" + ], + "execution_count": null, + "outputs": [], + "id": "de6abc2a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "dce47e40" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "10272aa9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "7a0c3cc2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade command for the changes to be applied." + ], + "id": "2dac11d5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## In summary\n", + "\n", + "In this first notebook, you learned how to:\n", + "\n", + "- [x] Register a record (model) within the ValidMind Platform and assign yourself as the validator\n", + "- [x] Install and initialize the ValidMind Library\n", + "- [x] Preview the validation report template for your model\n", + "- [x] Explore the available tests offered by the ValidMind Library" + ], + "id": "174d2c8d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Next steps\n", + "\n", + "\n", + "\n", + "### Start the validation process\n", + "\n", + "Now that the ValidMind Library is connected to your model in the ValidMind Platform with the correct template applied, we can go ahead and start the validation process: **[2 — Start the validation process](2-start_validation_process.ipynb)**" + ], + "id": "d8ffdcf7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" + ], + "id": "copyright-5d7a1c159e4840fca79011d1c0380725" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "name": "python", + "version": "3.10.13" + } }, - { - "cell_type": "markdown", - "id": "19ea797c", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [Introduction](#toc1__) \n", - "- [About ValidMind](#toc2__) \n", - " - [Before you begin](#toc2_1__) \n", - " - [New to ValidMind?](#toc2_2__) \n", - " - [Key concepts](#toc2_3__) \n", - "- [Setting up](#toc3__) \n", - " - [Register a sample model](#toc3_1__) \n", - " - [Assign validator credentials](#toc3_1_1__) \n", - " - [Apply documentation template](#toc3_1_2__) \n", - " - [Apply validation report template](#toc3_1_3__) \n", - " - [Install the ValidMind Library](#toc3_2__) \n", - " - [Initialize the ValidMind Library](#toc3_3__) \n", - " - [Get your code snippet](#toc3_3_1__) \n", - "- [Getting to know ValidMind](#toc4__) \n", - " - [Preview the validation report template](#toc4_1__) \n", - " - [View validation report in the ValidMind Platform](#toc4_1_1__) \n", - " - [Explore available tests](#toc4_2__) \n", - "- [Upgrade ValidMind](#toc5__) \n", - "- [In summary](#toc6__) \n", - "- [Next steps](#toc7__) \n", - " - [Start the validation process](#toc7_1__) \n", - "\n", - ":::\n", - "\n", - "" - ] - }, - { - "cell_type": "markdown", - "id": "d624f88d", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Introduction\n", - "\n", - "Validation aims to independently assess the compliance of *champions* created by developers with regulatory guidance by conducting thorough testing and analysis, potentially including the use of challengers to benchmark performance. 
Assessments, presented in the form of a validation report, typically include *artifacts (findings)* and recommendations to address those issues.\n", - "\n", - "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", - "\n", - "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", - "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." - ] - }, - { - "cell_type": "markdown", - "id": "4fb1ef5a", - "metadata": {}, - "source": [ - "\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate comparison and other validation tests, and then use the ValidMind Platform to submit compliance assessments of champions via comprehensive validation reports. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and developers." - ] - }, - { - "cell_type": "markdown", - "id": "594f9fd4", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. 
\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "262ed111", - "metadata": {}, - "source": [ - "\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", - "

\n", - "Register with ValidMind
" - ] - }, - { - "cell_type": "markdown", - "id": "0eb67fe9", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Key concepts\n", - "\n", - "**Validation report**: A comprehensive and structured assessment of a model’s development and performance, focusing on verifying its integrity, appropriateness, and alignment with its intended use. It includes analyses of model assumptions, data quality, performance metrics, outcomes of testing procedures, and risk considerations. The validation report supports transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", - "\n", - "**Validation report template**: Serves as a standardized framework for conducting and documenting model validation activities. It outlines the required sections, recommended analyses, and expected validation tests, ensuring consistency and completeness across validation reports. The template helps guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." - ] - }, - { - "cell_type": "markdown", - "id": "e0e1cf3d", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "609fe59b", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Register a sample model\n", - "\n", - "In a usual lifecycle, a champion will have been independently registered in your inventory and submitted to you for validation by your development team as part of the effective challenge process. 
(**Learn more:** [Submit documents](https://docs.validmind.ai/guide/documentation/submit-documents.html))\n", - "\n", - "For this notebook, we'll have you register a dummy record (model) in the ValidMind Platform inventory and assign yourself as the validator to familiarize you with the ValidMind interface and circumvent the need for an existing model:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down — don’t worry, we’ll adjust these permissions next for validation.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "58e552bb", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Assign validator credentials\n", - "\n", - "In order to log tests as a validator instead of as a developer, on the details page that appears after you've successfully registered your sample model:\n", - "\n", - "1. Remove yourself as an owner:\n", - "\n", - " - Click on the **OWNERS** tile.\n", - " - Click the **x** next to your name to remove yourself from that model's role.\n", - " - Click **Save** to apply your changes to that role.\n", - "\n", - "2. Remove yourself as a developer:\n", - "\n", - " - Click on the **DEVELOPERS** tile.\n", - " - Click the **x** next to your name to remove yourself from that model's role.\n", - " - Click **Save** to apply your changes to that role.\n", - "\n", - "3. 
Add yourself as a validator:\n", - "\n", - " - Click on the **VALIDATORS** tile.\n", - " - Select your name from the drop-down menu.\n", - " - Click **Save** to apply your changes to that role." - ] - }, - { - "cell_type": "markdown", - "id": "84251589", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier for developers.\n", - "\n", - "We'll need this documentation template later for reference as we draft our validation report:\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Documentation**.\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "id": "fdfb5dc5", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Apply validation report template\n", - "\n", - "Next, let's select a validation report template. A template predefines sections for your report and provides a general outline to follow, making the validation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Validation**.\n", - "\n", - " If you cannot locate your Validation document, make sure Validation type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Generic Validation Report`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "id": "f656d0d6", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "
Recommended Python versions\n", - "

\n", - "Python 3.8 <= x <= 3.14
\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "931d8f7f", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "1435fd5b", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "b375b341", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Validation` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d5d87e2d", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"validation-report\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "331e1c07", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Getting to know ValidMind" - ] - }, - { - "cell_type": "markdown", - "id": "f6331a98", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Preview the validation 
report template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will attach evidence to this template in the form of risk assessment notes, artifacts, and test results later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "13d34bbb", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "20717133", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### View validation report in the ValidMind Platform\n", - "\n", - "Next, let's head to the ValidMind Platform to see the template in action:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this \"ValidMind for validation\" series of notebooks.\n", - "\n", - "3. Click **Validation** under Documents for your model and note:\n", - "\n", - " - [x] The risk assessment compliance summary at the top of the report (screenshot below)\n", - " - [x] How the structure of the validation report reflects the previewed template\n", - "\n", - " \"Screenshot\n", - "

" - ] - }, - { - "cell_type": "markdown", - "id": "f5d0aaab", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Explore available tests\n", - "\n", - "Next, let's explore the list of all available tests in the ValidMind Library with [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) — we'll later narrow down the tests we want to run from this list when we learn to run tests." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "de6abc2a", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests()" - ] - }, - { - "cell_type": "markdown", - "id": "dce47e40", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "10272aa9", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "7a0c3cc2", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "2dac11d5", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "174d2c8d", - "metadata": {}, - "source": [ - "\n", - "\n", - "## In summary\n", - "\n", - "In this first notebook, you learned how to:\n", - "\n", - "- [x] Register a record (model) within the ValidMind Platform and assign yourself as the validator\n", - "- [x] Install and initialize the ValidMind Library\n", - "- [x] Preview the validation report template for your model\n", - "- [x] Explore the available tests offered by the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "d8ffdcf7", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Next steps\n", - "\n", - "\n", - "\n", - "### Start the validation process\n", - "\n", - "Now that the ValidMind Library is connected to your model in the ValidMind Library with the correct template applied, we can go ahead and start the validation process: **[2 — Start the validation process](2-start_validation_process.ipynb)**" - ] - }, - { - "cell_type": "markdown", - "id": "copyright-5d7a1c159e4840fca79011d1c0380725", - "metadata": {}, - "source": [ - "\n", - "\n", - "\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.
\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "name": "python", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/site/notebooks/EXECUTED/validation/2-start_validation_process.ipynb b/site/notebooks/EXECUTED/validation/2-start_validation_process.ipynb index 9547a1367e..a0d4440e6c 100644 --- a/site/notebooks/EXECUTED/validation/2-start_validation_process.ipynb +++ b/site/notebooks/EXECUTED/validation/2-start_validation_process.ipynb @@ -15,7 +15,7 @@ "- Ensuring that data used for training and testing the champion is of appropriate data quality\n", "- Ensuring that the raw data has been preprocessed appropriately and that the resulting final datasets reflects this\n", "\n", - "**For a full list of out-of-the-box tests and descriptions,** use the interactive [Test sandbox](https://docs.validmind.ai/developer/how-to/test-sandbox.html).\n", + "**For a full list of out-of-the-box tests and descriptions,** use the interactive [ValidMind test sandbox](https://docs.validmind.ai/developer/how-to/test-sandbox.html).\n", "\n", "
Learn by doing\n", "

\n", @@ -295,7 +295,7 @@ "\n", "#### Run tabular data tests\n", "\n", - "The inputs expected by a test can also be found in the test definition — let's take [`validmind.data_validation.DescriptiveStatistics`](https://docs.validmind.ai/tests/data_validation/DescriptiveStatistics.html) as an example.\n", + "The inputs expected by a test can also be found in the test definition — let's take `validmind.data_validation.DescriptiveStatistics` as an example.\n", "\n", "Note that the output of the [`describe_test()` function](https://docs.validmind.ai/validmind/validmind/tests.html#describe_test) below shows that this test expects a `dataset` as input:" ] @@ -333,7 +333,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The output above shows that [the class imbalance test](https://docs.validmind.ai/tests/data_validation/ClassImbalance.html) did not pass according to the value we set for `min_percent_threshold` — great, this matches what was reported by the development team.\n", + "The output above shows that the `validmind.data_validation.ClassImbalance` test did not pass according to the value we set for `min_percent_threshold` — great, this matches what was reported by the development team.\n", "\n", "To address this issue, we'll re-run the test on some processed data. 
In this case let's apply a very simple rebalancing technique to the dataset:" ] @@ -405,7 +405,7 @@ "\n", "You can utilize the output from a ValidMind test for further use — in this below example, to retrieve the list of features with the highest correlation coefficients and use them to reduce the final list of features for modeling.\n", "\n", - "First, we'll run [`validmind.data_validation.HighPearsonCorrelation`](https://docs.validmind.ai/tests/data_validation/HighPearsonCorrelation.html) with the `balanced_raw_dataset` we initialized previously as input as is for comparison with later runs:" + "First, we'll run `validmind.data_validation.HighPearsonCorrelation` with the `balanced_raw_dataset` we initialized previously as input as is for comparison with later runs:\n" ] }, { diff --git a/site/notebooks/EXECUTED/validation/3-developing_potential_challenger.ipynb b/site/notebooks/EXECUTED/validation/3-developing_potential_challenger.ipynb index 3c5db2507d..2ed29a195f 100644 --- a/site/notebooks/EXECUTED/validation/3-developing_potential_challenger.ipynb +++ b/site/notebooks/EXECUTED/validation/3-developing_potential_challenger.ipynb @@ -544,11 +544,11 @@ "source": [ "We'll isolate the specific tests we want to run in `mpt`:\n", "\n", - "- [`ClassifierPerformance`](https://docs.validmind.ai/tests/model_validation/sklearn/ClassifierPerformance.html)\n", - "- [`ConfusionMatrix`](https://docs.validmind.ai/tests/model_validation/sklearn/ConfusionMatrix.html)\n", - "- [`MinimumAccuracy`](https://docs.validmind.ai/tests/model_validation/sklearn/MinimumAccuracy.html)\n", - "- [`MinimumF1Score`](https://docs.validmind.ai/tests/model_validation/sklearn/MinimumF1Score.html)\n", - "- [`ROCCurve`](https://docs.validmind.ai/tests/model_validation/sklearn/ROCCurve.html)\n", + "- `model_validation.sklearn.ClassifierPerformance`\n", + "- `model_validation.sklearn.ConfusionMatrix`\n", + "- `model_validation.sklearn.MinimumAccuracy`\n", + "- 
`model_validation.sklearn.MinimumF1Score`\n", + "- `model_validation.sklearn.ROCCurve`\n", "\n", "As we learned in the previous notebook [2 — Start the model validation process](2-start_validation_process.ipynb), you can use a custom `result_id` to tag the individual result with a unique identifier by appending this `result_id` to the `test_id` with a `:` separator. We'll append an identifier for our champion model here:" ] @@ -639,7 +639,7 @@ "\n", "9. Click **Update Linked Artifacts** to insert your validation issue.\n", "\n", - "10. Confirm that validation issue you inserted has been correctly inserted into section 2.2.2. Model Performance of the report.\n", + "10. Confirm that the validation issue you inserted has been correctly inserted into section 2.2.2. Model Performance of the report.\n", "\n", "11. Click on the validation issue to expand the issue, where you can adjust details such as severity, owner, due date, status, etc. as well as include proposed remediation plans or supporting documentation as attachments." 
] @@ -729,7 +729,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Let’s now assess the models for potential signs of *overfitting* and identify any sub-segments where performance may inconsistent with the [`OverfitDiagnosis` test](https://docs.validmind.ai/tests/model_validation/sklearn/OverfitDiagnosis.html).\n", + "Let’s now assess the models for potential signs of *overfitting* and identify any sub-segments where performance may be inconsistent with the `model_validation.sklearn.OverfitDiagnosis` test.\n", "\n", "Overfitting occurs when a model learns the training data too well, capturing not only the true pattern but noise and random fluctuations resulting in excellent performance on the training dataset but poor generalization to new, unseen data:\n", "\n", @@ -756,7 +756,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Let's also conduct *robustness* and *stability* testing of the two models with the [`RobustnessDiagnosis` test](https://docs.validmind.ai/tests/model_validation/sklearn/RobustnessDiagnosis.html).
Robustness refers to a model's ability to maintain consistent performance, and stability refers to a model's ability to produce consistent outputs over time across different data subsets.\n", + "Let's also conduct *robustness* and *stability* testing of the two models with the `model_validation.sklearn.RobustnessDiagnosis` test.\n", + "\n", + "Robustness refers to a model's ability to maintain consistent performance, and stability refers to a model's ability to produce consistent outputs over time across different data subsets.\n", "\n", "Again, we'll use both the training and testing datasets to establish baseline performance and to simulate real-world generalization:" ] diff --git a/site/notebooks/EXECUTED/validation/4-finalize_validation_reporting.ipynb b/site/notebooks/EXECUTED/validation/4-finalize_validation_reporting.ipynb index 768c569b26..32d46c6e2d 100644 --- a/site/notebooks/EXECUTED/validation/4-finalize_validation_reporting.ipynb +++ b/site/notebooks/EXECUTED/validation/4-finalize_validation_reporting.ipynb @@ -121,7 +121,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Make sure the ValidMind Library is installed\n", "\n", @@ -143,9 +145,7 @@ " # model=\"...\",\n", " document=\"validation-report\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -160,7 +160,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Load the sample dataset\n", "from validmind.datasets.classification import customer_churn as demo_dataset\n", @@ -170,13 +172,13 @@ ")\n", "\n", "raw_df = demo_dataset.load_data()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Initialize the raw dataset for use in ValidMind tests\n", "vm_raw_dataset = vm.init_dataset(\n", @@ -184,13 +186,13 @@ " input_id=\"raw_dataset\",\n", " 
target_column=\"Exited\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "import pandas as pd\n", "\n", @@ -202,9 +204,7 @@ "\n", "balanced_raw_df = pd.concat([exited_df, not_exited_df])\n", "balanced_raw_df = balanced_raw_df.sample(frac=1, random_state=42)" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -215,7 +215,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Register new data and now 'balanced_raw_dataset' is the new dataset object of interest\n", "vm_balanced_raw_dataset = vm.init_dataset(\n", @@ -223,13 +225,13 @@ " input_id=\"balanced_raw_dataset\",\n", " target_column=\"Exited\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Run HighPearsonCorrelation test with our balanced dataset as input and return a result object\n", "corr_result = vm.tests.run_test(\n", @@ -237,46 +239,46 @@ " params={\"max_threshold\": 0.3},\n", " inputs={\"dataset\": vm_balanced_raw_dataset},\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# From result object, extract table from `corr_result.tables`\n", "features_df = corr_result.tables[0].data\n", "features_df" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Extract list of features that failed the test\n", "high_correlation_features = features_df[features_df[\"Pass/Fail\"] == \"Fail\"][\"Columns\"].tolist()\n", "high_correlation_features" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Extract 
feature names from the list of strings\n", "high_correlation_features = [feature.split(\",\")[0].strip(\"()\") for feature in high_correlation_features]\n", "high_correlation_features" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Remove the highly correlated features from the dataset\n", "balanced_raw_no_age_df = balanced_raw_df.drop(columns=high_correlation_features)\n", @@ -287,13 +289,13 @@ " input_id=\"raw_dataset_preprocessed\",\n", " target_column=\"Exited\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Re-run the test with the reduced feature set\n", "corr_result = vm.tests.run_test(\n", @@ -301,9 +303,7 @@ " params={\"max_threshold\": 0.3},\n", " inputs={\"dataset\": vm_raw_dataset_preprocessed},\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -318,20 +318,22 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Encode categorical features in the dataset\n", "balanced_raw_no_age_df = pd.get_dummies(\n", " balanced_raw_no_age_df, columns=[\"Geography\", \"Gender\"], drop_first=True\n", ")\n", "balanced_raw_no_age_df.head()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "from sklearn.model_selection import train_test_split\n", "\n", @@ -342,13 +344,13 @@ "y_train = train_df[\"Exited\"]\n", "X_test = test_df.drop(\"Exited\", axis=1)\n", "y_test = test_df[\"Exited\"]" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Initialize the split datasets\n", "vm_train_ds = vm.init_dataset(\n", @@ -362,9 +364,7 @@ " dataset=test_df,\n", " 
target_column=\"Exited\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -379,16 +379,16 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Import the champion model\n", "import pickle as pkl\n", "\n", "with open(\"lr_model_champion.pkl\", \"rb\") as f:\n", " log_reg = pkl.load(f)" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -403,7 +403,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Import the Random Forest Classification model\n", "from sklearn.ensemble import RandomForestClassifier\n", @@ -416,9 +418,7 @@ "\n", "# Train the model\n", "rf_model.fit(X_train, y_train)" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -433,7 +433,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Initialize the champion logistic regression model\n", "vm_log_model = vm.init_model(\n", @@ -446,13 +448,13 @@ " rf_model,\n", " input_id=\"rf_model\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Assign predictions to Champion — Logistic regression model\n", "vm_train_ds.assign_predictions(model=vm_log_model)\n", @@ -461,9 +463,7 @@ "# Assign predictions to Challenger — Random forest classification model\n", "vm_train_ds.assign_predictions(model=vm_rf_model)\n", "vm_test_ds.assign_predictions(model=vm_rf_model)" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -509,7 +509,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "from sklearn import metrics\n", @@ -523,9 +525,7 @@ " confusion_matrix=confusion_matrix, display_labels=[False, True]\n", ")\n", "cm_display.plot()" 
- ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -544,7 +544,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "@vm.test(\"my_custom_tests.ConfusionMatrix\")\n", "def confusion_matrix(dataset, model):\n", @@ -572,9 +574,7 @@ " plt.close() # close the plot to avoid displaying it\n", "\n", " return cm_display.figure_ # return the figure object itself" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -585,7 +585,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Champion train and test\n", "vm.tests.run_test(\n", @@ -595,13 +597,13 @@ " \"model\" : [vm_log_model]\n", " }\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Challenger train and test\n", "vm.tests.run_test(\n", @@ -611,9 +613,7 @@ " \"model\" : [vm_rf_model]\n", " }\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -637,7 +637,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "@vm.test(\"my_custom_tests.ConfusionMatrix\")\n", "def confusion_matrix(dataset, model, normalize=False):\n", @@ -668,9 +670,7 @@ " plt.close() # close the plot to avoid displaying it\n", "\n", " return cm_display.figure_ # return the figure object itself" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -690,7 +690,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Champion with test dataset and normalize=True\n", "vm.tests.run_test(\n", @@ -701,13 +703,13 @@ " },\n", " params={\"normalize\": True}\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], 
"source": [ "# Challenger with test dataset and normalize=True\n", "vm.tests.run_test(\n", @@ -718,9 +720,7 @@ " },\n", " params={\"normalize\": True}\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -756,7 +756,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "tests_folder = \"my_tests\"\n", "\n", @@ -770,9 +772,7 @@ " # remove files and pycache\n", " if f.endswith(\".py\") or f == \"__pycache__\":\n", " os.system(f\"rm -rf {tests_folder}/{f}\")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -809,16 +809,16 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "confusion_matrix.save(\n", " # Save it to the custom tests folder we created\n", " tests_folder,\n", " imports=[\"import matplotlib.pyplot as plt\", \"from sklearn import metrics\"],\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -873,7 +873,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "from validmind.tests import LocalTestProvider\n", "\n", @@ -886,9 +888,7 @@ ")\n", "# `my_test_provider.load_test()` will be called for any test ID that starts with `my_test_provider`\n", "# e.g. 
`my_test_provider.ConfusionMatrix` will look for a function named `ConfusionMatrix` in `my_tests/ConfusionMatrix.py` file" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -906,7 +906,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Champion with test dataset and test provider custom test\n", "vm.tests.run_test(\n", @@ -916,13 +918,13 @@ " \"model\" : [vm_log_model]\n", " }\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Challenger with test dataset and test provider custom test\n", "vm.tests.run_test(\n", @@ -932,9 +934,7 @@ " \"model\" : [vm_rf_model]\n", " }\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -951,7 +951,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "test_config = {\n", " # Run with the raw dataset\n", @@ -1061,9 +1063,7 @@ " 'params': {'min_threshold': 0.5}\n", " }\n", "}" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -1074,7 +1074,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "for t in test_config:\n", " print(t)\n", @@ -1094,9 +1096,7 @@ " vm.tests.run_test(t, inputs=test_config[t]['inputs']).log()\n", " except Exception as e:\n", " print(f\"Error running test {t}: {str(e)}\")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -1141,7 +1141,7 @@ "\n", "Now that you've logged all your test results and verified the work done by the development team, head to the ValidMind Platform to wrap up your validation report. Continue to work on your validation report by:\n", "\n", - "- **Inserting additional test results:** Click **Link Evidence to Report** under any section of 2. 
Validation in your validation report. (Learn more: [Link evidence to reports](https://docs.validmind.ai/guide/validation/assess-compliance.html#link-evidence-to-reports))\n", + "- **Inserting additional test results:** Click **Link Evidence** under any Evidence panel of 2. Validation in your validation report. (Learn more: [Link evidence to reports](https://docs.validmind.ai/guide/validation/assess-compliance.html#link-evidence-to-reports))\n", "\n", "- **Making qualitative edits to your test descriptions:** Expand any linked evidence under Validator Evidence and click **See evidence details** to review and edit the ValidMind-generated test descriptions for quality and accuracy. (Learn more: [Preparing validation reports](https://docs.validmind.ai/guide/validation/preparing-validation-reports.html#validation-overview))\n", "\n", @@ -1149,7 +1149,7 @@ "\n", "- **Adding risk assessment notes:** Click under **Risk Assessment Notes** in any validation report section to access the text editor and content editing toolbar, including an option to generate a draft with AI. Once generated, edit your ValidMind-generated test descriptions to adhere to your organization's requirements. (Learn more: [Work with content blocks](https://docs.validmind.ai/guide/documentation/work-with-content-blocks.html#content-editing-toolbar))\n", "\n", - "- **Assessing compliance:** Under the Guideline for any validation report section, click **ASSESSMENT** and select the compliance status from the drop-down menu. (Learn more: [Provide compliance assessments](https://docs.validmind.ai/guide/validation/assess-compliance.html#provide-compliance-assessments))\n", + "- **Assessing compliance:** Under the Guideline for any validation report section, click **ASSESSMENT** and select the compliance status from the drop-down menu. 
(Learn more: [Assign compliance assessments](https://docs.validmind.ai/guide/validation/assess-compliance.html#assign-compliance-assessments))\n", "\n", "- **Collaborate with other stakeholders:** Use the ValidMind Platform's real-time collaborative features to work seamlessly together with the rest of your organization, including developers. Propose suggested changes in the documentation, work with versioned history, and use comments to discuss specific portions of the documentation. (Learn more: [Collaborate with others](https://docs.validmind.ai/guide/documentation/collaborate-with-others.html))\n", "\n", diff --git a/site/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb b/site/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb index 7abcc885d4..1b9ab41dab 100644 --- a/site/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb +++ b/site/notebooks/how_to/data_and_datasets/dataset_inputs/configure_dataset_features.ipynb @@ -1,478 +1,484 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Configure dataset features\n", - "\n", - "When initializing a ValidMind dataset object, you can pass in a list of features to use instead of utilizing all dataset columns when running tests.\n", - "\n", - "This notebook shows how to use custom feature columns with `init_dataset`. The default behavior of `init_dataset` is to utilize all dataset columns when running tests. It is also possible to pass in a list of features to use and thus restrict computations to only those features." - ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Configure dataset features\n", + "\n", + "When initializing a ValidMind dataset object, you can pass in a list of features to use instead of utilizing all dataset columns when running tests.\n", + "\n", + "This notebook shows how to use custom feature columns with `init_dataset`. 
The default behavior of `init_dataset` is to utilize all dataset columns when running tests. It is also possible to pass in a list of features to use and thus restrict computations to only those features." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + "- [Load the sample dataset](#toc3__) \n", + " - [Initialize the training and test datasets](#toc3_1__) \n", + " - [Defining custom features](#toc3_2__) \n", + "- [Next steps](#toc4__) \n", + " - [Work with your model documentation](#toc4_1__) \n", + " - [Discover more learning resources](#toc4_2__) \n", + "- [Upgrade ValidMind](#toc5__) \n", + "\n", + ":::\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. 
If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", + "

\n", + "Register with ValidMind
\n", + "\n", + "\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%pip install -q validmind" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. 
In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. 
In the left sidebar that appears for your model, select **Getting Started**, then select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Load the sample dataset" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%matplotlib inline\n", + "\n", + "# Import the sample dataset from the library\n", + "\n", + "from validmind.datasets.classification import customer_churn as demo_dataset\n", + "\n", + "# You can also try a different dataset with:\n", + "# from validmind.datasets.classification import taiwan_credit as demo_dataset\n", + "\n", + "df = demo_dataset.load_data()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the training and test datasets\n", + "\n", + "Before you can run a test suite, which is just a collection of tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- 
`dataset` — the raw dataset that you want to analyze\n", + "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", + "- `target_column` — the name of the target column in the dataset\n", + "- `feature_columns` - the names of the feature columns in the dataset" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "feature_columns = [\n", + " \"CreditScore\",\n", + " \"Age\",\n", + " \"Tenure\",\n", + " \"Balance\",\n", + " \"NumOfProducts\",\n", + " \"HasCrCard\",\n", + " \"IsActiveMember\",\n", + " \"EstimatedSalary\",\n", + "]\n", + "\n", + "vm_dataset = vm.init_dataset(\n", + " dataset=df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=demo_dataset.target_column,\n", + " feature_columns=feature_columns,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Defining custom features\n", + "\n", + "This section shows how we can define a subset of features to use when running dataset tests. Any feature that is not included in the `feature_columns` argument is omitted from the computation of the `DescriptiveStatistics` test in the examples below." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the following example we use the `DescriptiveStatistics` test to show how the output changes when customizing features." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "1. Running a test with all the features." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "vm_dataset = vm.init_dataset(\n", + " dataset=df,\n", + " input_id=\"raw_dataset_all_features\",\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "\n", + "test = vm.tests.run_test(\n", + " test_id=\"validmind.data_validation.DescriptiveStatistics\",\n", + " inputs={\"dataset\": vm_dataset},\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "2. Running a test with a subset of features." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "vm_dataset = vm.init_dataset(\n", + " dataset=df,\n", + " input_id=\"raw_dataset_subset\",\n", + " target_column=demo_dataset.target_column,\n", + " feature_columns=[\"CreditScore\", \"Age\", \"Balance\", \"Geography\"],\n", + ")\n", + "\n", + "test = vm.tests.run_test(\n", + " test_id=\"validmind.data_validation.DescriptiveStatistics\",\n", + " inputs={\"dataset\": vm_dataset},\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. 
From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%pip show validmind" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after upgrading the package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "id": "copyright-32870f8bce7f4ed0903136a69d02b421", + "metadata": {}, + "source": [ + "\n", + "\n", + "\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.13" + } }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - "- [Load the sample dataset](#toc3__) \n", - " - [Initialize the training and test datasets](#toc3_1__) \n", - " - [Defining custom features](#toc3_2__) \n", - "- [Next steps](#toc4__) \n", - " - [Work with your model documentation](#toc4_1__) \n", - " - [Discover more learning resources](#toc4_2__) \n", - "- [Upgrade ValidMind](#toc5__) \n", - "\n", - ":::\n", - "\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", - "

\n", - "Register with ValidMind
\n", - "\n", - "\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Load the sample dataset" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%matplotlib inline\n", - "\n", - "# Import the sample dataset from the library\n", - "\n", - "from validmind.datasets.classification import customer_churn as demo_dataset\n", - "\n", - "# You can also try a different dataset with:\n", - "# from validmind.datasets.classification import taiwan_credit as demo_dataset\n", - "\n", - "df = demo_dataset.load_data()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the training and test datasets\n", - "\n", - "Before you can run a test suite, which are just a collection of tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset` — the raw dataset that you want to analyze\n", - "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "- `target_column` — the name of the 
target column in the dataset\n", - "- `feature_columns` - the names of the feature columns in the dataset" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "feature_columns = [\n", - " \"CreditScore\",\n", - " \"Age\",\n", - " \"Tenure\",\n", - " \"Balance\",\n", - " \"NumOfProducts\",\n", - " \"HasCrCard\",\n", - " \"IsActiveMember\",\n", - " \"EstimatedSalary\",\n", - "]\n", - "\n", - "vm_dataset = vm.init_dataset(\n", - " dataset=df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=demo_dataset.target_column,\n", - " feature_columns=feature_columns,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Defining custom features\n", - "\n", - "This section shows how we can define a subset of features to use when running dataset tests. Any feature that is not included in the `feature_columns` argument is omitted from the computation of the `DescriptiveStatistics` test in the examples below." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In the following example we use the `DescriptiveStatistics` test to show how the output changes when customizing features." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "1. Running a test with all the features." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_dataset = vm.init_dataset(\n", - " dataset=df,\n", - " input_id=\"raw_dataset_all_features\",\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "\n", - "test = vm.tests.run_test(\n", - " test_id=\"validmind.data_validation.DescriptiveStatistics\",\n", - " inputs={\"dataset\": vm_dataset},\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "2. Running a test with a subset of features." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_dataset = vm.init_dataset(\n", - " dataset=df,\n", - " input_id=\"raw_dataset_subset\",\n", - " target_column=demo_dataset.target_column,\n", - " feature_columns=[\"CreditScore\", \"Age\", \"Balance\", \"Geography\"],\n", - ")\n", - "\n", - "test = vm.tests.run_test(\n", - " test_id=\"validmind.data_validation.DescriptiveStatistics\",\n", - " inputs={\"dataset\": vm_dataset},\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-32870f8bce7f4ed0903136a69d02b421", - "metadata": {}, - "source": [ - "\n", - "\n", - "\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": ".venv", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/site/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb b/site/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb index 222c984313..a98ff348bc 100644 --- a/site/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb +++ b/site/notebooks/how_to/data_and_datasets/dataset_inputs/load_datasets_predictions.ipynb @@ -1,1067 +1,1073 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Load dataset predictions\n", - "\n", - "To enable tests to make use of predictions, you can load predictions in ValidMind dataset objects in multiple different ways.\n", - "\n", - "This interactive notebook includes the code required to load the demo dataset, preprocess the raw dataset and train a model for testing, and initialize ValidMind objects. Additionally, it offers options for loading predictions using the `assign_predictions()` function, such as loading predictions from a file, linking an existing prediction column in the dataset with a model, or allowing the ValidMind Library to run and link predictions to a model." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the documentation template](#toc2_3__) \n", - "- [Load the sample dataset](#toc3__) \n", - "- [Prepocess the raw dataset](#toc4__) \n", - "- [Train models for testing](#toc5__) \n", - "- [Initialize ValidMind objects](#toc6__) \n", - " - [Initialize the ValidMind models](#toc6_1__) \n", - " - [Initialize the ValidMind datasets](#toc6_2__) \n", - "- [Options to load predictions using the ValidMind Library](#toc7__) \n", - " - [Load predictions from a file](#toc7_1__) \n", - " - [Predictions calculated outside of VM](#toc7_2__) \n", - " - [Assign predictions to the training dataset](#toc7_3__) \n", - " - [Run an example test](#toc7_4__) \n", - " - [Link an existing prediction column in the dataset with a model](#toc7_5__) \n", - " - [Link prediction column to a specific model](#toc7_5_1__) \n", - " - [Link an existing prediction column in the dataset with a model](#toc7_6__) \n", - " - [Pass `` in dataset interface](#toc7_6_1__) \n", - " - [Through `assign_predictions` interface](#toc7_6_2__) \n", - " - [Run an example test](#toc7_7__) \n", - " - [Using `predict_fn` to store multiple columns](#toc7_8__) \n", - " - [Create enhanced predict function](#toc7_8_1__) \n", - " - [Initialize model with predict function](#toc7_8_2__) \n", - " - [Assign predictions with multiple columns](#toc7_8_3__) \n", - " - [Verify multiple columns in dataset](#toc7_8_4__) \n", - "- [Next 
steps](#toc8__) \n", - " - [Work with your model documentation](#toc8_1__) \n", - " - [Discover more learning resources](#toc8_2__) \n", - "- [Upgrade ValidMind](#toc9__) \n", - "\n", - ":::\n", - "\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", - "

\n", - "Register with ValidMind
\n", - "\n", - "\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. 
To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Import the sample dataset from the library\n", - "\n", - "from validmind.datasets.classification import customer_churn as demo_dataset\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{demo_dataset.target_column}' \\n\\t• Class labels: {demo_dataset.class_labels}\"\n", - ")\n", - "\n", - "raw_df = demo_dataset.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Preprocess the raw dataset\n", - "\n", - "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", - "\n", - "- Preprocess the data: Splits the DataFrame (`raw_df`) into multiple datasets (`train_df`, `validation_df`, and `test_df`) using `demo_dataset.preprocess` to simplify preprocessing.\n", - "- Separate features and targets: Drops the target column to create feature sets (`x_train`, `x_val`) and target sets (`y_train`, `y_val`)."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)\n", - "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", - "y_train = train_df[demo_dataset.target_column]\n", - "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", - "y_val = validation_df[demo_dataset.target_column]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Train models for testing\n", - "\n", - "- Initialize XGBoost and Logistic Regression Classifiers" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.linear_model import LogisticRegression\n", - "import xgboost\n", - "\n", - "%matplotlib inline\n", - "\n", - "xgb = xgboost.XGBClassifier(early_stopping_rounds=10)\n", - "xgb.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "xgb.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")\n", - "\n", - "lr = LogisticRegression(random_state=0)\n", - "lr.fit(\n", - " x_train,\n", - " y_train,\n", - ")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Initialize ValidMind objects\n", - "\n", - "\n", - "\n", - "### Initialize the ValidMind models" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_model_xgb = vm.init_model(\n", - " xgb,\n", - " input_id=\"xgb\",\n", - ")\n", - "vm_model_lr = vm.init_model(\n", - " lr,\n", - " input_id=\"lr\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the 
[`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset` — the raw dataset that you want to provide as input to tests\n", - "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "- `target_column` — a required argument if tests require access to true values. This is the name of the target column in the dataset\n", - "- `class_labels` — an optional value to map predicted classes to class labels\n", - "\n", - "With all datasets ready, you can now initialize the raw, training and test datasets (`raw_df`, `train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_ds = vm.init_dataset(\n", - " input_id=\"raw_dataset\",\n", - " dataset=raw_df,\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " input_id=\"train_dataset\",\n", - " dataset=train_df,\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "vm_test_ds = vm.init_dataset(\n", - " input_id=\"test_dataset\", dataset=test_df, target_column=demo_dataset.target_column\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Options to load predictions using the ValidMind Library\n", - "\n", - "\n", - "\n", - "### Load predictions from a file\n", - "\n", - "This creates a new column called `_prediction` in the dataset and assigns metadata to track that the `_prediction` column is linked to the model ``" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Predictions calculated outside of VM" - ] - }, - { - "cell_type": 
"code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "\n", - "train_xgb_prediction = pd.DataFrame(xgb.predict(x_train), columns=[\"xgb_prediction\"])\n", - "test__xgb_prediction = pd.DataFrame(xgb.predict(x_val), columns=[\"xgb_prediction\"])\n", - "\n", - "train_lr_prediction = pd.DataFrame(lr.predict(x_train), columns=[\"lr_prediction\"])\n", - "test_lr_prediction = pd.DataFrame(lr.predict(x_val), columns=[\"lr_prediction\"])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Assign predictions to the training dataset\n", - "\n", - "We can now use the `assign_predictions()` method from the `Dataset` object to link existing predictions to any model:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model_xgb, prediction_values=train_xgb_prediction.xgb_prediction.values\n", - ")\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_model_lr, prediction_values=train_lr_prediction.lr_prediction.values\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Run an example test\n", - "\n", - "Now, let's run an example test such as `MinimumAccuracy` twice to show how we're able to load the correct model predictions by using the `model` input parameter, even though we're passing the same `train_ds` dataset instance to the test:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", - " inputs={\"dataset\": vm_train_ds, \"model\": vm_model_xgb},\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.tests.run_test(\n", - " 
\"validmind.model_validation.sklearn.MinimumAccuracy\",\n", - " inputs={\n", - " \"dataset\": vm_train_ds,\n", - " \"model\": vm_model_lr,\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Link an existing prediction column in the dataset with a model\n", - "\n", - "This approach allows loading datasets that already have prediction columns in addition to feature and target columns. The ValidMind Library assigns metadata to track the predictions column that are linked to a given `` model." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df2 = train_df.copy()\n", - "train_df2[\"xgb_prediction\"] = train_xgb_prediction.xgb_prediction.values\n", - "train_df2[\"lr_prediction\"] = train_lr_prediction.lr_prediction.values\n", - "train_df2.head(5)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "feature_columns = [\n", - " \"CreditScore\",\n", - " \"Gender\",\n", - " \"Age\",\n", - " \"Tenure\",\n", - " \"Balance\",\n", - " \"NumOfProducts\",\n", - " \"HasCrCard\",\n", - " \"IsActiveMember\",\n", - " \"EstimatedSalary\",\n", - " \"Geography_France\",\n", - " \"Geography_Germany\",\n", - " \"Geography_Spain\",\n", - "]\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df2,\n", - " input_id=\"train_dataset\",\n", - " target_column=demo_dataset.target_column,\n", - " feature_columns=feature_columns,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Link prediction column to a specific model\n", - "\n", - "The `prediction_column` parameter informs the `Dataset` object about the model that should be linked to that column." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(model=vm_model_xgb, prediction_column=\"xgb_prediction\")\n", - "vm_train_ds.assign_predictions(model=vm_model_lr, prediction_column=\"lr_prediction\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "wE0OckXjSPc7" - }, - "outputs": [], - "source": [ - "full_suite = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", - " inputs={\"dataset\": vm_train_ds, \"model\": vm_model_xgb},\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", - " inputs={\"dataset\": vm_train_ds, \"model\": vm_model_lr},\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Link an existing prediction column in the dataset with a model\n", - "\n", - "This lets the ValidMind Library run model predictions, create a new column called `_prediction`, and assign metadata to track that the `_prediction` column is linked to the `` model.\n", - "\n", - "There are two ways to run and assign model predictions with the ValidMind Library:\n", - "\n", - "- When initializing a `Dataset` with `init_dataset()`. This is the most straightforward method to assign predictions for a single model.\n", - "- Using `dataset.assign_predictions()`. This allows assigning predictions to a dataset for one or more models."
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Pass `` in dataset interface" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "feature_columns = [\n", - " \"CreditScore\",\n", - " \"Gender\",\n", - " \"Age\",\n", - " \"Tenure\",\n", - " \"Balance\",\n", - " \"NumOfProducts\",\n", - " \"HasCrCard\",\n", - " \"IsActiveMember\",\n", - " \"EstimatedSalary\",\n", - " \"Geography_France\",\n", - " \"Geography_Germany\",\n", - " \"Geography_Spain\",\n", - "]\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " model=vm_model_xgb,\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=demo_dataset.target_column,\n", - " feature_columns=feature_columns,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Through `assign_predictions` interface" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=demo_dataset.target_column,\n", - " feature_columns=feature_columns,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "##### Perform predictions using the same `assign_predictions` interface" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(model=vm_model_xgb)\n", - "vm_train_ds.assign_predictions(model=vm_model_lr)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Run an example test\n", - "\n", - "Now, let's run an example test such as `MinimumAccuracy` twice to show how we're able to load the correct model predictions by using the `model` input parameter, even though we're passing the same `train_ds` dataset instance to the test:" - ] - 
}, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", - " inputs={\"dataset\": vm_train_ds, \"model\": vm_model_xgb},\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", - " inputs={\n", - " \"dataset\": vm_train_ds,\n", - " \"model\": vm_model_lr,\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Using `predict_fn` to store multiple columns\n", - "\n", - "The `predict_fn` parameter in `vm.init_model()` allows you to create models that return multiple pieces of information when making predictions. This is particularly useful when you want to capture additional metadata, confidence scores, feature importance, or any other model-related information alongside the main prediction.\n", - "\n", - "By returning a dictionary from your predict function, ValidMind automatically creates separate columns for each key when you run `assign_predictions()`." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Create enhanced predict function\n", - "\n", - "Let's create a predict function that wraps our XGBoost model and returns multiple pieces of information:\n", - "- **prediction**: The main class prediction\n", - "- **prediction_proba**: The prediction probabilities for both classes\n", - "- **confidence**: The maximum probability as a confidence score\n", - "- **model_info**: Metadata about the model used" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import pandas as pd\n", - "\n", - "def enhanced_xgb_predict_fn(input_data):\n", - " \"\"\"\n", - " Enhanced predict function that returns multiple pieces of information.\n", - " \n", - " Args:\n", - " input_data: Input features for prediction (single row as dictionary when called by ValidMind)\n", - " \n", - " Returns:\n", - " dict: Dictionary containing prediction, probabilities, confidence, and model info\n", - " \"\"\"\n", - " # Define the feature columns that the model was trained on\n", - " # These are the same columns from x_train (excluding the target column 'Exited')\n", - " training_features = [\n", - " 'CreditScore', 'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts',\n", - " 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Geography_France',\n", - " 'Geography_Germany', 'Geography_Spain'\n", - " ]\n", - " \n", - " # Convert dictionary input to DataFrame for model prediction\n", - " # When called by ValidMind, input_data is a single row dictionary\n", - " if isinstance(input_data, dict):\n", - " # Filter to only include training features and convert to DataFrame\n", - " filtered_data = {key: value for key, value in input_data.items() if key in training_features}\n", - " input_df = pd.DataFrame([filtered_data])\n", - " \n", - " # Ensure all training features are present (in case some are missing)\n", - " for feature in 
training_features:\n", - " if feature not in input_df.columns:\n", - " input_df[feature] = 0 # Default value for missing features\n", - " \n", - " # Reorder columns to match training order\n", - " input_df = input_df[training_features]\n", - " else:\n", - " # Handle other input types (DataFrame, array, etc.)\n", - " input_df = pd.DataFrame(input_data) if not isinstance(input_data, pd.DataFrame) else input_data\n", - " # Filter to training features if it's a DataFrame\n", - " if isinstance(input_df, pd.DataFrame):\n", - " input_df = input_df[training_features]\n", - " \n", - " # Make predictions\n", - " prediction = xgb.predict(input_df)\n", - " prediction_proba = xgb.predict_proba(input_df)\n", - " \n", - " # Since we're processing one row at a time, extract the single values\n", - " single_prediction = prediction[0] if len(prediction) > 0 else None\n", - " single_proba = prediction_proba[0] if len(prediction_proba) > 0 else None\n", - " \n", - " # Calculate confidence as the maximum probability for this prediction\n", - " confidence = np.max(single_proba) if single_proba is not None else None\n", - " \n", - " # Create model metadata\n", - " model_info = {\n", - " \"model_type\": \"XGBClassifier\",\n", - " \"n_estimators\": xgb.n_estimators,\n", - " \"max_depth\": xgb.max_depth,\n", - " \"feature_count\": len(training_features),\n", - " \"features_used\": training_features\n", - " }\n", - " \n", - " return {\n", - " \"prediction\": single_prediction,\n", - " \"prediction_proba\": single_proba.tolist() if single_proba is not None else None,\n", - " \"confidence\": confidence,\n", - " \"model_info\": model_info\n", - " }\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Initialize model with predict function\n", - "\n", - "Now we'll create a ValidMind model using the `predict_fn` parameter. 
This tells ValidMind to use our enhanced function instead of the model's default `predict()` method:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize ValidMind model with the enhanced predict function\n", - "vm_model_enhanced_xgb = vm.init_model(\n", - " model=xgb,\n", - " input_id=\"enhanced_xgb\",\n", - " predict_fn=enhanced_xgb_predict_fn \n", - ")\n", - "\n", - "print(f\"Enhanced XGBoost model initialized with input_id: {vm_model_enhanced_xgb.input_id}\")\n", - "print(\"This model now uses the predict function that handles dictionary inputs correctly\")\n", - "print(\"It will return multiple columns when predictions are assigned to datasets\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Assign predictions with multiple columns\n", - "\n", - "When we use `assign_predictions()` with our enhanced model, ValidMind will automatically create separate columns for each key returned by our predict function. Let's assign predictions to our test dataset:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create a fresh dataset for this demonstration\n", - "vm_test_ds_enhanced = vm.init_dataset(\n", - " input_id=\"test_dataset_enhanced\",\n", - " dataset=test_df,\n", - " target_column=demo_dataset.target_column\n", - ")\n", - "\n", - "# This will create multiple columns based on the keys returned by our predict function\n", - "vm_test_ds_enhanced.assign_predictions(model=vm_model_enhanced_xgb)\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Verify multiple columns in dataset\n", - "\n", - "Let's examine the dataset to see all the columns that were created by our enhanced predict function. 
Each key from the returned dictionary becomes a separate column with the model's `input_id` as a prefix:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_ds_enhanced._df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-76fcd2c215674068b812492b7c639056", - "metadata": {}, - "source": [ - "\n", - "\n", - "\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" - ] - } - ], - "metadata": { - "colab": { - "provenance": [] - }, - "gpuClass": "standard", - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.9" - } - }, - "nbformat": 4, - "nbformat_minor": 0 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Load dataset predictions\n", + "\n", + "To enable tests to make use of predictions, you can load predictions in ValidMind dataset objects in multiple different ways.\n", + "\n", + "This interactive notebook includes the code required to load the demo dataset, preprocess the raw dataset and train a model for testing, and initialize ValidMind objects. Additionally, it offers options for loading predictions using the `assign_predictions()` function, such as loading predictions from a file, linking an existing prediction column in the dataset with a model, or allowing the ValidMind Library to run and link predictions to a model." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the documentation template](#toc2_3__) \n", + "- [Load the sample dataset](#toc3__) \n", + "- [Preprocess the raw dataset](#toc4__) \n", + "- [Train models for testing](#toc5__) \n", + "- [Initialize ValidMind objects](#toc6__) \n", + " - [Initialize the ValidMind models](#toc6_1__) \n", + " - [Initialize the ValidMind datasets](#toc6_2__) \n", + "- [Options to load predictions using the ValidMind Library](#toc7__) \n", + " - [Load predictions from a file](#toc7_1__) \n", + " - [Predictions calculated outside of VM](#toc7_2__) \n", + " - [Assign predictions to the training dataset](#toc7_3__) \n", + " - [Run an example test](#toc7_4__) \n", + " - [Link an existing prediction column in the dataset with a model](#toc7_5__) \n", + " - [Link prediction column to a specific model](#toc7_5_1__) \n", + " - [Link an existing prediction column in the dataset with a model](#toc7_6__) \n", + " - [Pass `` in dataset interface](#toc7_6_1__) \n", + " - [Through `assign_predictions` interface](#toc7_6_2__) \n", + " - [Run an example test](#toc7_7__) \n", + " - [Using `predict_fn` to store multiple columns](#toc7_8__) \n", + " - [Create enhanced predict function](#toc7_8_1__) \n", + " - [Initialize model with predict function](#toc7_8_2__) \n", + " - [Assign predictions with multiple columns](#toc7_8_3__) \n", + " - [Verify multiple columns in dataset](#toc7_8_4__) \n", + "- [Next
steps](#toc8__) \n", + " - [Work with your model documentation](#toc8_1__) \n", + " - [Discover more learning resources](#toc8_2__) \n", + "- [Upgrade ValidMind](#toc9__) \n", + "\n", + ":::\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", + "

\n", + "Register with ValidMind
\n", + "\n", + "\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. 
In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. 
On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind library. 
To use it, import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the sample dataset from the library\n", + "\n", + "from validmind.datasets.classification import customer_churn as demo_dataset\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{demo_dataset.target_column}' \\n\\t• Class labels: {demo_dataset.class_labels}\"\n", + ")\n", + "\n", + "raw_df = demo_dataset.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Preprocess the raw dataset\n", + "\n", + "Preprocessing prepares the data for the subsequent steps in two operations:\n", + "\n", + "- Preprocess the data: Splits the raw DataFrame (`raw_df`) into multiple datasets (`train_df`, `validation_df`, and `test_df`) using `demo_dataset.preprocess`.\n", + "- Separate features and targets: Drops the target column to create feature sets (`x_train`, `x_val`) and target sets (`y_train`, `y_val`)."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)\n", + "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", + "y_train = train_df[demo_dataset.target_column]\n", + "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", + "y_val = validation_df[demo_dataset.target_column]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Train models for testing\n", + "\n", + "- Initialize XGBoost and Logistic Regression Classifiers" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from sklearn.linear_model import LogisticRegression\n", + "import xgboost\n", + "\n", + "%matplotlib inline\n", + "\n", + "xgb = xgboost.XGBClassifier(early_stopping_rounds=10)\n", + "xgb.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "xgb.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")\n", + "\n", + "lr = LogisticRegression(random_state=0)\n", + "lr.fit(\n", + " x_train,\n", + " y_train,\n", + ")\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Initialize ValidMind objects\n", + "\n", + "\n", + "\n", + "### Initialize the ValidMind models" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_model_xgb = vm.init_model(\n", + " xgb,\n", + " input_id=\"xgb\",\n", + ")\n", + "vm_model_lr = vm.init_model(\n", + " lr,\n", + " input_id=\"lr\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the 
[`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset` — the raw dataset that you want to provide as input to tests\n", + "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", + "- `target_column` — a required argument if tests require access to true values. This is the name of the target column in the dataset\n", + "- `class_labels` — an optional value to map predicted classes to class labels\n", + "\n", + "With all datasets ready, you can now initialize the raw, training and test datasets (`raw_df`, `train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_ds = vm.init_dataset(\n", + " input_id=\"raw_dataset\",\n", + " dataset=raw_df,\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " input_id=\"train_dataset\",\n", + " dataset=train_df,\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "vm_test_ds = vm.init_dataset(\n", + " input_id=\"test_dataset\", dataset=test_df, target_column=demo_dataset.target_column\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Options to load predictions using the ValidMind Library\n", + "\n", + "\n", + "\n", + "### Load predictions from a file\n", + "\n", + "This creates a new column called `_prediction` in the dataset and assigns metadata to track that the `_prediction` column is linked to the model ``" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Predictions calculated outside of VM" + ] + }, + { + "cell_type": 
"code", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "\n", + "train_xgb_prediction = pd.DataFrame(xgb.predict(x_train), columns=[\"xgb_prediction\"])\n", + "test_xgb_prediction = pd.DataFrame(xgb.predict(x_val), columns=[\"xgb_prediction\"])\n", + "\n", + "train_lr_prediction = pd.DataFrame(lr.predict(x_train), columns=[\"lr_prediction\"])\n", + "test_lr_prediction = pd.DataFrame(lr.predict(x_val), columns=[\"lr_prediction\"])" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Assign predictions to the training dataset\n", + "\n", + "We can now use the `assign_predictions()` method from the `Dataset` object to link existing predictions to any model:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model_xgb, prediction_values=train_xgb_prediction.xgb_prediction.values\n", + ")\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_model_lr, prediction_values=train_lr_prediction.lr_prediction.values\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Run an example test\n", + "\n", + "Now, let's run an example test such as `MinimumAccuracy` twice. The `model` input selects which model's predictions are loaded, even though we pass the same `vm_train_ds` dataset instance to the test both times:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", + " inputs={\"dataset\": vm_train_ds, \"model\": vm_model_xgb},\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", + " inputs={\n", + "
\"dataset\": vm_train_ds,\n", + " \"model\": vm_model_lr,\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Link an existing prediction column in the dataset with a model\n", + "\n", + "This approach lets you load datasets that already contain prediction columns in addition to feature and target columns. The ValidMind Library assigns metadata to track which prediction column is linked to a given `` model." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df2 = train_df.copy()\n", + "train_df2[\"xgb_prediction\"] = train_xgb_prediction.xgb_prediction.values\n", + "train_df2[\"lr_prediction\"] = train_lr_prediction.lr_prediction.values\n", + "train_df2.head(5)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "feature_columns = [\n", + " \"CreditScore\",\n", + " \"Gender\",\n", + " \"Age\",\n", + " \"Tenure\",\n", + " \"Balance\",\n", + " \"NumOfProducts\",\n", + " \"HasCrCard\",\n", + " \"IsActiveMember\",\n", + " \"EstimatedSalary\",\n", + " \"Geography_France\",\n", + " \"Geography_Germany\",\n", + " \"Geography_Spain\",\n", + "]\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df2,\n", + " input_id=\"train_dataset\",\n", + " target_column=demo_dataset.target_column,\n", + " feature_columns=feature_columns,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Link prediction column to a specific model\n", + "\n", + "The `prediction_column` parameter of `assign_predictions()` tells the `Dataset` object which existing column holds the predictions for the given model."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(model=vm_model_xgb, prediction_column=\"xgb_prediction\")\n", + "vm_train_ds.assign_predictions(model=vm_model_lr, prediction_column=\"lr_prediction\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": { + "id": "wE0OckXjSPc7" + }, + "source": [ + "full_suite = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", + " inputs={\"dataset\": vm_train_ds, \"model\": vm_model_xgb},\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", + " inputs={\"dataset\": vm_train_ds, \"model\": vm_model_lr},\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Link an existing prediction column in the dataset with a model\n", + "\n", + "Alternatively, you can let the ValidMind Library run the model predictions itself. The library creates a new column called `_prediction` and assigns metadata to track that the `_prediction` column is linked to the `` model.\n", + "\n", + "There are two ways to run and assign model predictions with the ValidMind Library:\n", + "\n", + "- When initializing a `Dataset` with `init_dataset()`. This is the most straightforward method to assign predictions for a single model.\n", + "- Using `dataset.assign_predictions()`. This allows assigning predictions to a dataset for one or more models."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Pass `` in dataset interface" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "feature_columns = [\n", + " \"CreditScore\",\n", + " \"Gender\",\n", + " \"Age\",\n", + " \"Tenure\",\n", + " \"Balance\",\n", + " \"NumOfProducts\",\n", + " \"HasCrCard\",\n", + " \"IsActiveMember\",\n", + " \"EstimatedSalary\",\n", + " \"Geography_France\",\n", + " \"Geography_Germany\",\n", + " \"Geography_Spain\",\n", + "]\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " model=vm_model_xgb,\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=demo_dataset.target_column,\n", + " feature_columns=feature_columns,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Through `assign_predictions` interface" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=demo_dataset.target_column,\n", + " feature_columns=feature_columns,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Perform predictions using the same `assign_predictions` interface" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(model=vm_model_xgb)\n", + "vm_train_ds.assign_predictions(model=vm_model_lr)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Run an example test\n", + "\n", + "Now, let's run an example test such as `MinimumAccuracy` twice. The `model` input selects which model's predictions are loaded, even though we pass the same `vm_train_ds` dataset instance to the test both times:" + ] +
}, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", + " inputs={\"dataset\": vm_train_ds, \"model\": vm_model_xgb},\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", + " inputs={\n", + " \"dataset\": vm_train_ds,\n", + " \"model\": vm_model_lr,\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Using `predict_fn` to store multiple columns\n", + "\n", + "The `predict_fn` parameter in `vm.init_model()` allows you to create models that return multiple pieces of information when making predictions. This is particularly useful when you want to capture additional metadata, confidence scores, feature importance, or any other model-related information alongside the main prediction.\n", + "\n", + "By returning a dictionary from your predict function, ValidMind automatically creates separate columns for each key when you run `assign_predictions()`." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Create enhanced predict function\n", + "\n", + "Let's create a predict function that wraps our XGBoost model and returns multiple pieces of information:\n", + "- **prediction**: The main class prediction\n", + "- **prediction_proba**: The prediction probabilities for both classes\n", + "- **confidence**: The maximum probability as a confidence score\n", + "- **model_info**: Metadata about the model used" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "\n", + "def enhanced_xgb_predict_fn(input_data):\n", + " \"\"\"\n", + " Enhanced predict function that returns multiple pieces of information.\n", + " \n", + " Args:\n", + " input_data: Input features for prediction (single row as dictionary when called by ValidMind)\n", + " \n", + " Returns:\n", + " dict: Dictionary containing prediction, probabilities, confidence, and model info\n", + " \"\"\"\n", + " # Define the feature columns that the model was trained on\n", + " # These are the same columns from x_train (excluding the target column 'Exited')\n", + " training_features = [\n", + " 'CreditScore', 'Gender', 'Age', 'Tenure', 'Balance', 'NumOfProducts',\n", + " 'HasCrCard', 'IsActiveMember', 'EstimatedSalary', 'Geography_France',\n", + " 'Geography_Germany', 'Geography_Spain'\n", + " ]\n", + " \n", + " # Convert dictionary input to DataFrame for model prediction\n", + " # When called by ValidMind, input_data is a single row dictionary\n", + " if isinstance(input_data, dict):\n", + " # Filter to only include training features and convert to DataFrame\n", + " filtered_data = {key: value for key, value in input_data.items() if key in training_features}\n", + " input_df = pd.DataFrame([filtered_data])\n", + " \n", + " # Ensure all training features are present (in case some are missing)\n", + " for feature in training_features:\n", + " if feature not 
in input_df.columns:\n", + " input_df[feature] = 0 # Default value for missing features\n", + " \n", + " # Reorder columns to match training order\n", + " input_df = input_df[training_features]\n", + " else:\n", + " # Handle other input types (DataFrame, array, etc.)\n", + " input_df = pd.DataFrame(input_data) if not isinstance(input_data, pd.DataFrame) else input_data\n", + " # Filter to training features if it's a DataFrame\n", + " if isinstance(input_df, pd.DataFrame):\n", + " input_df = input_df[training_features]\n", + " \n", + " # Make predictions\n", + " prediction = xgb.predict(input_df)\n", + " prediction_proba = xgb.predict_proba(input_df)\n", + " \n", + " # Since we're processing one row at a time, extract the single values\n", + " single_prediction = prediction[0] if len(prediction) > 0 else None\n", + " single_proba = prediction_proba[0] if len(prediction_proba) > 0 else None\n", + " \n", + " # Calculate confidence as the maximum probability for this prediction\n", + " confidence = np.max(single_proba) if single_proba is not None else None\n", + " \n", + " # Create model metadata\n", + " model_info = {\n", + " \"model_type\": \"XGBClassifier\",\n", + " \"n_estimators\": xgb.n_estimators,\n", + " \"max_depth\": xgb.max_depth,\n", + " \"feature_count\": len(training_features),\n", + " \"features_used\": training_features\n", + " }\n", + " \n", + " return {\n", + " \"prediction\": single_prediction,\n", + " \"prediction_proba\": single_proba.tolist() if single_proba is not None else None,\n", + " \"confidence\": confidence,\n", + " \"model_info\": model_info\n", + " }\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Initialize model with predict function\n", + "\n", + "Now we'll create a ValidMind model using the `predict_fn` parameter. 
This tells ValidMind to use our enhanced function instead of the model's default `predict()` method:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize ValidMind model with the enhanced predict function\n", + "vm_model_enhanced_xgb = vm.init_model(\n", + " model=xgb,\n", + " input_id=\"enhanced_xgb\",\n", + " predict_fn=enhanced_xgb_predict_fn \n", + ")\n", + "\n", + "print(f\"Enhanced XGBoost model initialized with input_id: {vm_model_enhanced_xgb.input_id}\")\n", + "print(\"This model now uses the predict function that handles dictionary inputs correctly\")\n", + "print(\"It will return multiple columns when predictions are assigned to datasets\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Assign predictions with multiple columns\n", + "\n", + "When we use `assign_predictions()` with our enhanced model, ValidMind will automatically create separate columns for each key returned by our predict function. Let's assign predictions to our test dataset:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Create a fresh dataset for this demonstration\n", + "vm_test_ds_enhanced = vm.init_dataset(\n", + " input_id=\"test_dataset_enhanced\",\n", + " dataset=test_df,\n", + " target_column=demo_dataset.target_column\n", + ")\n", + "\n", + "# This will create multiple columns based on the keys returned by our predict function\n", + "vm_test_ds_enhanced.assign_predictions(model=vm_model_enhanced_xgb)\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Verify multiple columns in dataset\n", + "\n", + "Let's examine the dataset to see all the columns that were created by our enhanced predict function. 
Each key from the returned dictionary becomes a separate column with the model's `input_id` as a prefix:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_ds_enhanced._df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after upgrading the package for the changes to take effect." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" + ], + "id": "copyright-76fcd2c215674068b812492b7c639056" + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "gpuClass": "standard", + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.9" + } + }, + "nbformat": 4, + "nbformat_minor": 0 } diff --git a/site/notebooks/how_to/data_and_datasets/use_dataset_model_objects.ipynb b/site/notebooks/how_to/data_and_datasets/use_dataset_model_objects.ipynb index f727d405d5..7102ad6de1 100644 --- a/site/notebooks/how_to/data_and_datasets/use_dataset_model_objects.ipynb +++ b/site/notebooks/how_to/data_and_datasets/use_dataset_model_objects.ipynb @@ -1,997 +1,999 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Introduction to ValidMind Dataset and Model Objects\n", - "\n", - "When writing custom tests, it is essential to be aware of the interfaces of the ValidMind Dataset and ValidMind Model, which are used as input arguments.\n", - "\n", - "As a model developer, writing custom tests is beneficial when the ValidMind library lacks a built-in test for your specific needs. For example, a model might require new tests to evaluate specific aspects of the model or dataset based on a particular use case.\n", - "\n", - "This interactive notebook offers a detailed understanding of ValidMind objects and their use in writing custom tests. It introduces various interfaces provided by these objects and demonstrates how they can be leveraged to implement tests effortlessly." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - "- [Load the demo dataset](#toc3__) \n", - " - [Prepocess the raw dataset](#toc3_1__) \n", - "- [Train a model for testing](#toc4__) \n", - "- [Explore basic components of the ValidMind library](#toc5__) \n", - " - [VMDataset Object](#toc5_1__) \n", - " - [Initialize the ValidMind datasets](#toc5_1_1__) \n", - " - [ Interfaces of the dataset object](#toc5_1_2__) \n", - " - [Using VM Dataset object as arguments in custom tests](#toc5_2__) \n", - " - [Run the test](#toc5_2_1__) \n", - " - [Using VM Dataset object and parameters as arguments in custom tests](#toc5_3__) \n", - " - [VMModel Object](#toc5_4__) \n", - " - [Initialize ValidMind model object](#toc5_5__) \n", - " - [Assign predictions to the datasets](#toc5_6__) \n", - " - [Using VM Model and Dataset objects as arguments in Custom tests](#toc5_7__) \n", - " - [Log the test results](#toc5_8__) \n", - "- [In summary](#toc6__) \n", - "- [Discover more learning resources](#toc7__) \n", - "- [Upgrade ValidMind](#toc8__) \n", - "\n", - ":::\n", - "\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", - "

\n", - "Register with ValidMind
\n", - "\n", - "\n", - "\n", - "### Key concepts\n", - "\n", - "Here, we will focus on ValidMind dataset, ValidMind model and tests to use these objects to generate artefacts for the documentation.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - "- **model**: A single ValidMind model object that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - "- **dataset**: Single ValidMind dataset object that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - "- **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - "- **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Tests can return elements like tables or plots. 
Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Dataset based Test**\n", - "\n", - "![Dataset based test architecture](./dataset_image.png)\n", - "The dataset based tests take VM dataset object(s) as inputs, test configuration as test parameters to produce `Outputs` as mentioned above.\n", - "\n", - "**Model based Test**\n", - "\n", - "![Model based test architecture](./model_image.png)\n", - "Similar to datasest based tests, the model based tests as an additional input that is VM model object. It allows to identify prediction values of a specific model in the dataset object. " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "Please note the following recommended Python versions to use:\n", - "\n", - "- Python 3.7 > x <= 3.11\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "metadata": {} - }, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%matplotlib inline\n", - "\n", - "import xgboost as xgb" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Load the demo dataset" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.classification import customer_churn as demo_dataset\n", - "\n", - "raw_df = demo_dataset.load_data()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Prepocess the raw dataset" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Train a model for testing\n", - "\n", - "We train a simple customer churn model for our test." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", - "y_train = train_df[demo_dataset.target_column]\n", - "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", - "y_val = validation_df[demo_dataset.target_column]\n", - "\n", - "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", - "model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "model.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Explore basic components of the ValidMind library\n", - "\n", - "In this section, you will learn about the basic objects of the ValidMind library that are necessary to implement both custom and built-in tests. As explained above, these objects are:\n", - "* VMDataset: [The high level APIs can be found here](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMDataset)\n", - "* VMModel: [The high level APIs can be found here](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMModel)\n", - "\n", - "Let's understand these objects and their interfaces step by step: " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### VMDataset Object" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Initialize the ValidMind datasets\n", - "\n", - "You can initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "The function wraps the dataset to create a ValidMind `Dataset` object so that you can write tests effectively using the common interface provided by the VM objects. 
This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind. You only need to do it one time per dataset.\n", - "\n", - "This function takes a number of arguments. Some of the arguments are:\n", - "\n", - "- `dataset` — the raw dataset that you want to provide as input to tests\n", - "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "- `target_column` — a required argument if tests require access to true values. This is the name of the target column in the dataset\n", - "\n", - "The detailed list of the arguments can be found [here](https://docs.validmind.ai/validmind/validmind.html#init_dataset) " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# vm_raw_dataset is now a VMDataset object that you can pass to any ValidMind test\n", - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=raw_df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=\"Exited\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Once you have a ValidMind dataset object (VMDataset), you can inspect its attributes and methods using the inspect_obj utility module. This module provides a list of available attributes and interfaces for use in tests. Understanding how to use VMDatasets is crucial for comprehending how a custom test functions." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.utils import inspect_obj\n", - "inspect_obj(vm_raw_dataset)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Interfaces of the dataset object\n", - "\n", - "**DataFrame**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset.df" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Feature columns**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset.feature_columns" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Target column**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset.target_column" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Features values**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset.x_df()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Target value**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset.y_df()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Numeric feature columns** " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset.feature_columns_numeric" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Categorical feature columns** " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset.feature_columns_categorical" - ] - }, - { - "cell_type": "markdown", 
- "metadata": {}, - "source": [ - "Similarly, you can use all other interfaces of the [VMDataset objects](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMDataset) " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Using VM Dataset object as arguments in custom tests\n", - "\n", - "A custom test is simply a Python function that takes two types of arguments: `inputs` and `params`. The `inputs` are ValidMind objects (`VMDataset`, `VMModel`), and the `params` are additional parameters required for the underlying computation of the test. We will discuss both types of arguments in the following sections.\n", - "\n", - "Let's start with a custom test that requires only a ValidMind dataset object. In this example, we will check the balance of classes in the target column of the dataset:\n", - "\n", - "- The custom test below requires a single argument of type `VMDataset` (dataset).\n", - "- The `my_custom_tests.ClassImbalance` is a unique test identifier that can be assigned using the `vm.test` decorator functionality. This unique test ID will be used in the platform to load test results in the documentation.\n", - "- The `dataset.target_column` and `dataset.df` attributes of the `VMDataset` object are used in the test.\n", - "\n", - "Other high-level APIs (attributes and methods) of the dataset object are listed [here](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMDataset).\n", - "\n", - "If you've gone through the [Implement custom tests notebook](../tests/custom_tests/implement_custom_tests.ipynb), you should have a good understanding of how custom tests are implemented in details. If you haven't, we recommend going through that notebook first." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.vm_models.dataset.dataset import VMDataset\n", - "import pandas as pd\n", - "\n", - "@vm.test(\"my_custom_tests.ClassImbalance\")\n", - "def class_imbalance(dataset):\n", - " # Can only run this test if we have a Dataset object\n", - " if not isinstance(dataset, VMDataset):\n", - " raise ValueError(\"ClassImbalance requires a validmind Dataset object\")\n", - "\n", - " if dataset.target_column is None:\n", - " print(\"Skipping class_imbalance test because no target column is defined\")\n", - " return\n", - "\n", - " # VMDataset object provides target_column attribute\n", - " target_column = dataset.target_column\n", - " # we can access pandas DataFrame using df attribute\n", - " imbalance_percentages = dataset.df[target_column].value_counts(\n", - " normalize=True\n", - " )\n", - " classes = list(imbalance_percentages.index) \n", - " percentages = list(imbalance_percentages.values * 100)\n", - "\n", - " return pd.DataFrame({\"Classes\":classes, \"Percentage\": percentages})" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Run the test\n", - "\n", - "Let's run the test using the `run_test` method, which is part of the `validmind.tests` module. Here, we pass the `dataset` through the `inputs`. Similarly, you can pass `datasets`, `model`, or `models` as inputs if your custom test requires them. In this example below, we run the custom test `my_custom_tests.ClassImbalance` by passing the `dataset` through the `inputs`. 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.tests import run_test\n", - "result = run_test(\n", - " test_id=\"my_custom_tests.ClassImbalance\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset\n", - " }\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can move custom tests into separate modules in a folder. It allows you to take one-off tests and move them into an organized structure that makes it easier to manage, maintain and share them. We have provided a seperate notebook with detailed explaination [here](../tests/custom_tests/integrate_external_test_providers.ipynb) " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Using VM Dataset object and parameters as arguments in custom tests\n", - "\n", - "Simlilar to `inputs`, you can pass `params` to a custom test by providing a dictionary of parameters to the `run_test()` function. The parameters will override any default parameters set in the custom test definition. Note that the `dataset` is still passed as `inputs`. \n", - "Let's modify the class imbalance test so that it provides flexibility to `normalize` the results." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.vm_models.dataset.dataset import VMDataset\n", - "import pandas as pd\n", - "\n", - "@vm.test(\"my_custom_tests.ClassImbalance\")\n", - "def class_imbalance(dataset, normalize=True):\n", - " # Can only run this test if we have a Dataset object\n", - " if not isinstance(dataset, VMDataset):\n", - " raise ValueError(\"ClassImbalance requires a validmind Dataset object\")\n", - "\n", - " if dataset.target_column is None:\n", - " print(\"Skipping class_imbalance test because no target column is defined\")\n", - " return\n", - "\n", - " # VMDataset object provides target_column attribute\n", - " target_column = dataset.target_column\n", - " # we can access pandas DataFrame using df attribute\n", - " imbalance_percentages = dataset.df[target_column].value_counts(\n", - " normalize=normalize\n", - " )\n", - " classes = list(imbalance_percentages.index) \n", - " if normalize: \n", - " result = pd.DataFrame({\"Classes\":classes, \"Percentage\": list(imbalance_percentages.values*100)})\n", - " else:\n", - " result = pd.DataFrame({\"Classes\":classes, \"Count\": list(imbalance_percentages.values)})\n", - " return result" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In this example, the `normalize` parameter is set to `False`, so the class counts will not be normalized. You can change the value to `True` if you want the counts to be normalized. The results of the test will reflect this flexibility, allowing for different outputs based on the parameter passed.\n", - "\n", - "Here, we have passed the `dataset` through the `inputs` and the `normalize` parameter using the `params`." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.tests import run_test\n", - "result = run_test(\n", - " test_id = \"my_custom_tests.ClassImbalance\",\n", - " inputs={\"dataset\": vm_raw_dataset},\n", - " params={\"normalize\": True},\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### VMModel Object" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize ValidMind model object\n", - "\n", - "Similar to ValidMind `Dataset` object, you can initialize a ValidMind Model object using the [`init_model`](https://docs.validmind.ai/validmind/validmind.html#init_model) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments. Some of the arguments are:\n", - "\n", - "- `model` — the raw model that you want evaluate\n", - "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "\n", - "The detailed list of the arguments can be found [here](https://docs.validmind.ai/validmind/validmind.html#init_model) " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "vm_model = vm.init_model(\n", - " model=model,\n", - " input_id=\"xgb_model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's inspect the methods and attributes of the model now:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "inspect_obj(vm_model)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Assign predictions to the datasets\n", - "\n", - "We can now use the `assign_predictions()` method from the `Dataset` object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds = vm.init_dataset(\n", - " input_id=\"train_dataset\",\n", - " dataset=train_df,\n", - " type=\"generic\",\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "\n", - "vm_train_ds.assign_predictions(model=vm_model)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can see below, the extra prediction column (`xgb_model_prediction`) for the model (`xgb_model`) has been added in the dataset." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print(vm_train_ds)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Using VM Model and Dataset objects as arguments in Custom tests\n", - "\n", - "We will now create a `@vm.test` wrapper that will allow you to create a reusable test. Note the following changes in the code below:\n", - "\n", - "- The function `confusion_matrix` takes two arguments `dataset` and `model`. This is a `VMDataset` and `VMModel` object respectively.\n", - " - `VMDataset` objects allow you to access the dataset's true (target) values by accessing the `.y` attribute.\n", - " - `VMDataset` objects allow you to access the predictions for a given record (model) by accessing the `.y_pred()` method.\n", - "- The function docstring provides a description of what the test does. 
This will be displayed along with the result in this notebook as well as in the ValidMind Platform.\n", - "- The function body calculates the confusion matrix using the `sklearn.tests.confusion_matrix` function as we just did above.\n", - "- The function then returns the `ConfusionMatrixDisplay.figure_` object - this is important as the ValidMind Library expects the output of the custom test to be a plot or a table.\n", - "- The `@vm.test` decorator is doing the work of creating a wrapper around the function that will allow it to be run by the ValidMind Library. It also registers the test so it can be found by the ID `my_custom_tests.ConfusionMatrix` (see the section below on how test IDs work in ValidMind and why this format is important)\n", - "\n", - "Similarly, you can use the functinality provided by `VMDataset` and `VMModel` objects. You can refer our documentation page for all the avalialble APIs [here](https://docs.validmind.ai/validmind/validmind.html#init_dataset)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn import metrics\n", - "import matplotlib.pyplot as plt\n", - "@vm.test(\"my_custom_tests.ConfusionMatrix\")\n", - "def confusion_matrix(dataset, model):\n", - " \"\"\"The confusion matrix is a table that is often used to describe the performance of a classification model on a set of data for which the true values are known.\n", - "\n", - " The confusion matrix is a 2x2 table that contains 4 values:\n", - "\n", - " - True Positive (TP): the number of correct positive predictions\n", - " - True Negative (TN): the number of correct negative predictions\n", - " - False Positive (FP): the number of incorrect positive predictions\n", - " - False Negative (FN): the number of incorrect negative predictions\n", - "\n", - " The confusion matrix can be used to assess the holistic performance of a classification model by showing the accuracy, precision, recall, and F1 score of the 
model on a single figure.\n", - " \"\"\"\n", - " # we can retrieve traget value from dataset which is y attribute\n", - " y_true = dataset.y\n", - " # The prediction value of a specific model using y_pred method \n", - " y_pred = dataset.y_pred(model=model)\n", - "\n", - " confusion_matrix = metrics.confusion_matrix(y_true, y_pred)\n", - "\n", - " cm_display = metrics.ConfusionMatrixDisplay(\n", - " confusion_matrix=confusion_matrix, display_labels=[False, True]\n", - " )\n", - " cm_display.plot()\n", - " plt.close()\n", - "\n", - " return cm_display.figure_ # return the figure object itself" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Here, we run test using two inputs; `dataset` and `model`. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.tests import run_test\n", - "result = run_test(\n", - " test_id = \"my_custom_tests.ConfusionMatrix\",\n", - " inputs={\n", - " \"dataset\": vm_train_ds,\n", - " \"model\": vm_model,\n", - " }\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Log the test results\n", - "\n", - "You can log any test result to the ValidMind Platform with the `.log()` method of the result object. This will allow you to add the result to the documentation.\n", - "\n", - "You can now do the same for the confusion matrix results." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## In summary\n", - "\n", - "In this notebook you have learned the end-to-end process to document a model with the ValidMind Library, running through some very common scenarios in a typical model development setting:\n", - "\n", - "- Running out-of-the-box tests\n", - "- Documenting your model by adding evidence to model documentation\n", - "- Extending the capabilities of the ValidMind Library by implementing custom tests\n", - "- Ensuring that the documentation is complete by running all tests in the documentation template" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-9be1890525a54c10be782f80fe33833f", - "metadata": {}, - "source": [ - "\n", - "\n", - "\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.14" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Introduction to ValidMind Dataset and Model Objects\n", + "\n", + "When writing custom tests, it is essential to be aware of the interfaces of the ValidMind Dataset and ValidMind Model, which are used as input arguments.\n", + "\n", + "As a model developer, writing custom tests is beneficial when the ValidMind library lacks a built-in test for your specific needs. For example, a model might require new tests to evaluate specific aspects of the model or dataset based on a particular use case.\n", + "\n", + "This interactive notebook offers a detailed understanding of ValidMind objects and their use in writing custom tests. It introduces various interfaces provided by these objects and demonstrates how they can be leveraged to implement tests effortlessly." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + "- [Load the demo dataset](#toc3__) \n", + " - [Prepocess the raw dataset](#toc3_1__) \n", + "- [Train a model for testing](#toc4__) \n", + "- [Explore basic components of the ValidMind library](#toc5__) \n", + " - [VMDataset Object](#toc5_1__) \n", + " - [Initialize the ValidMind datasets](#toc5_1_1__) \n", + " - [ Interfaces of the dataset object](#toc5_1_2__) \n", + " - [Using VM Dataset object as arguments in custom tests](#toc5_2__) \n", + " - [Run the test](#toc5_2_1__) \n", + " - [Using VM Dataset object and parameters as arguments in custom tests](#toc5_3__) \n", + " - [VMModel Object](#toc5_4__) \n", + " - [Initialize ValidMind model object](#toc5_5__) \n", + " - [Assign predictions to the datasets](#toc5_6__) \n", + " - [Using VM Model and Dataset objects as arguments in Custom tests](#toc5_7__) \n", + " - [Log the test results](#toc5_8__) \n", + "- [In summary](#toc6__) \n", + "- [Discover more learning resources](#toc7__) \n", + "- [Upgrade ValidMind](#toc8__) \n", + "\n", + ":::\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", + "

\n", + "Register with ValidMind
\n", + "\n", + "\n", + "\n", + "### Key concepts\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + " - **dataset-based test**\n", + "\n", + " ![Dataset based test architecture](./dataset_image.png)\n", + " Dataset-based tests take VM dataset objects as inputs, can be configured with values passed in as parameters, and return outputs such as tables, plots, or images.\n", + "\n", + " - **model-based test**:\n", + "\n", + " ![Model based test architecture](./model_image.png)\n", + " Similar to dataset-based tests, model-based tests take additional VM model objects as inputs alongside VM dataset objects. The VM model object can wrap any type of record and is used to obtain prediction values for entries in the dataset.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [test_suites](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "Please note the following recommended Python versions to use:\n", + "\n", + "- Python 3.7 > x <= 3.11\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": { + "metadata": {} + }, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%matplotlib inline\n", + "\n", + "import xgboost as xgb" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Load the demo dataset" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.classification import customer_churn as demo_dataset\n", + "\n", + "raw_df = demo_dataset.load_data()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Preprocess the raw dataset" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Train a model for testing\n", + "\n", + "We train a simple customer churn model for our test." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", + "y_train = train_df[demo_dataset.target_column]\n", + "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", + "y_val = validation_df[demo_dataset.target_column]\n", + "\n", + "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", + "model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Explore basic components of the ValidMind library\n", + "\n", + "In this section, you will learn about the basic objects of the ValidMind library that are necessary to implement both custom and built-in tests. As explained above, these objects are:\n", + "* VMDataset: [The high level APIs can be found here](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMDataset)\n", + "* VMModel: [The high level APIs can be found here](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMModel)\n", + "\n", + "Let's understand these objects and their interfaces step by step: " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### VMDataset Object" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Initialize the ValidMind datasets\n", + "\n", + "You can initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "The function wraps the dataset to create a ValidMind `Dataset` object so that you can write tests effectively using the common interface provided by the VM objects. 
This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind. You only need to do it one time per dataset.\n", + "\n", + "This function takes a number of arguments. Some of the arguments are:\n", + "\n", + "- `dataset` — the raw dataset that you want to provide as input to tests\n", + "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", + "- `target_column` — a required argument if tests require access to true values. This is the name of the target column in the dataset\n", + "\n", + "The detailed list of the arguments can be found [here](https://docs.validmind.ai/validmind/validmind.html#init_dataset) " + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# vm_raw_dataset is now a VMDataset object that you can pass to any ValidMind test\n", + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=raw_df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=\"Exited\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Once you have a ValidMind dataset object (VMDataset), you can inspect its attributes and methods using the inspect_obj utility module. This module provides a list of available attributes and interfaces for use in tests. Understanding how to use VMDatasets is crucial for comprehending how a custom test functions." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.utils import inspect_obj\n", + "inspect_obj(vm_raw_dataset)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Interfaces of the dataset object\n", + "\n", + "**DataFrame**" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset.df" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Feature columns**" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset.feature_columns" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Target column**" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset.target_column" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Features values**" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset.x_df()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Target value**" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset.y_df()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Numeric feature columns** " + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset.feature_columns_numeric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Categorical feature columns** " + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset.feature_columns_categorical" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", 
+ "metadata": {}, + "source": [ + "Similarly, you can use all other interfaces of the [VMDataset objects](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMDataset) " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Using VM Dataset object as arguments in custom tests\n", + "\n", + "A custom test is simply a Python function that takes two types of arguments: `inputs` and `params`. The `inputs` are ValidMind objects (`VMDataset`, `VMModel`), and the `params` are additional parameters required for the underlying computation of the test. We will discuss both types of arguments in the following sections.\n", + "\n", + "Let's start with a custom test that requires only a ValidMind dataset object. In this example, we will check the balance of classes in the target column of the dataset:\n", + "\n", + "- The custom test below requires a single argument of type `VMDataset` (dataset).\n", + "- The `my_custom_tests.ClassImbalance` is a unique test identifier that can be assigned using the `vm.test` decorator functionality. This unique test ID will be used in the platform to load test results in the documentation.\n", + "- The `dataset.target_column` and `dataset.df` attributes of the `VMDataset` object are used in the test.\n", + "\n", + "Other high-level APIs (attributes and methods) of the dataset object are listed [here](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMDataset).\n", + "\n", + "If you've gone through the [Implement custom tests notebook](../tests/custom_tests/implement_custom_tests.ipynb), you should have a good understanding of how custom tests are implemented in detail. If you haven't, we recommend going through that notebook first." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.vm_models.dataset.dataset import VMDataset\n", + "import pandas as pd\n", + "\n", + "@vm.test(\"my_custom_tests.ClassImbalance\")\n", + "def class_imbalance(dataset):\n", + " # Can only run this test if we have a Dataset object\n", + " if not isinstance(dataset, VMDataset):\n", + " raise ValueError(\"ClassImbalance requires a validmind Dataset object\")\n", + "\n", + " if dataset.target_column is None:\n", + " print(\"Skipping class_imbalance test because no target column is defined\")\n", + " return\n", + "\n", + " # VMDataset object provides target_column attribute\n", + " target_column = dataset.target_column\n", + " # we can access pandas DataFrame using df attribute\n", + " imbalance_percentages = dataset.df[target_column].value_counts(\n", + " normalize=True\n", + " )\n", + " classes = list(imbalance_percentages.index) \n", + " percentages = list(imbalance_percentages.values * 100)\n", + "\n", + " return pd.DataFrame({\"Classes\":classes, \"Percentage\": percentages})" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Run the test\n", + "\n", + "Let's run the test using the `run_test` function, which is part of the `validmind.tests` module. Here, we pass the `dataset` through the `inputs`. Similarly, you can pass `datasets`, `model`, or `models` as inputs if your custom test requires them. In the example below, we run the custom test `my_custom_tests.ClassImbalance` by passing the `dataset` through the `inputs`. 
" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.tests import run_test\n", + "result = run_test(\n", + " test_id=\"my_custom_tests.ClassImbalance\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset\n", + " }\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can move custom tests into separate modules in a folder. This allows you to take one-off tests and move them into an organized structure that makes it easier to manage, maintain and share them. We have provided a separate notebook with a detailed explanation [here](../tests/custom_tests/integrate_external_test_providers.ipynb) " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Using VM Dataset object and parameters as arguments in custom tests\n", + "\n", + "Similar to `inputs`, you can pass `params` to a custom test by providing a dictionary of parameters to the `run_test()` function. The parameters will override any default parameters set in the custom test definition. Note that the `dataset` is still passed as `inputs`. \n", + "Let's modify the class imbalance test so that it provides flexibility to `normalize` the results." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.vm_models.dataset.dataset import VMDataset\n", + "import pandas as pd\n", + "\n", + "@vm.test(\"my_custom_tests.ClassImbalance\")\n", + "def class_imbalance(dataset, normalize=True):\n", + " # Can only run this test if we have a Dataset object\n", + " if not isinstance(dataset, VMDataset):\n", + " raise ValueError(\"ClassImbalance requires a validmind Dataset object\")\n", + "\n", + " if dataset.target_column is None:\n", + " print(\"Skipping class_imbalance test because no target column is defined\")\n", + " return\n", + "\n", + " # VMDataset object provides target_column attribute\n", + " target_column = dataset.target_column\n", + " # we can access pandas DataFrame using df attribute\n", + " imbalance_percentages = dataset.df[target_column].value_counts(\n", + " normalize=normalize\n", + " )\n", + " classes = list(imbalance_percentages.index) \n", + " if normalize: \n", + " result = pd.DataFrame({\"Classes\":classes, \"Percentage\": list(imbalance_percentages.values*100)})\n", + " else:\n", + " result = pd.DataFrame({\"Classes\":classes, \"Count\": list(imbalance_percentages.values)})\n", + " return result" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this example, the `normalize` parameter is set to `True`, so the class counts are reported as percentages. You can change the value to `False` if you want the raw counts instead. The test output changes accordingly: a `Percentage` column when `normalize` is `True`, and a `Count` column otherwise.\n", + "\n", + "Here, we have passed the `dataset` through the `inputs` and the `normalize` parameter using the `params`." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.tests import run_test\n", + "result = run_test(\n", + " test_id = \"my_custom_tests.ClassImbalance\",\n", + " inputs={\"dataset\": vm_raw_dataset},\n", + " params={\"normalize\": True},\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### VMModel Object" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize ValidMind model object\n", + "\n", + "Similar to the ValidMind `Dataset` object, you can initialize a ValidMind Model object using the [`init_model`](https://docs.validmind.ai/validmind/validmind.html#init_model) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments. Some of the arguments are:\n", + "\n", + "- `model` — the raw model that you want to evaluate\n", + "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", + "\n", + "The detailed list of the arguments can be found [here](https://docs.validmind.ai/validmind/validmind.html#init_model) " + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "vm_model = vm.init_model(\n", + " model=model,\n", + " input_id=\"xgb_model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's inspect the methods and attributes of the model now:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "inspect_obj(vm_model)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Assign predictions to the datasets\n", + "\n", + "We can now use the `assign_predictions()` method from the `Dataset` object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds = vm.init_dataset(\n", + " input_id=\"train_dataset\",\n", + " dataset=train_df,\n", + " type=\"generic\",\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "\n", + "vm_train_ds.assign_predictions(model=vm_model)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can see below, the extra prediction column (`xgb_model_prediction`) for the model (`xgb_model`) has been added in the dataset." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(vm_train_ds)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Using VM Model and Dataset objects as arguments in Custom tests\n", + "\n", + "We will now create a `@vm.test` wrapper that will allow you to create a reusable test. Note the following changes in the code below:\n", + "\n", + "- The function `confusion_matrix` takes two arguments `dataset` and `model`. This is a `VMDataset` and `VMModel` object respectively.\n", + " - `VMDataset` objects allow you to access the dataset's true (target) values by accessing the `.y` attribute.\n", + " - `VMDataset` objects allow you to access the predictions for a given record (model) by accessing the `.y_pred()` method.\n", + "- The function docstring provides a description of what the test does. 
This will be displayed along with the result in this notebook as well as in the ValidMind Platform.\n", + "- The function body calculates the confusion matrix using the `sklearn.metrics.confusion_matrix` function.\n", + "- The function then returns the `ConfusionMatrixDisplay.figure_` object - this is important as the ValidMind Library expects the output of the custom test to be a plot or a table.\n", + "- The `@vm.test` decorator is doing the work of creating a wrapper around the function that will allow it to be run by the ValidMind Library. It also registers the test so it can be found by the ID `my_custom_tests.ConfusionMatrix` (see the section below on how test IDs work in ValidMind and why this format is important)\n", + "\n", + "Similarly, you can use the functionality provided by `VMDataset` and `VMModel` objects. You can refer to our documentation page for all the available APIs [here](https://docs.validmind.ai/validmind/validmind.html#init_dataset)" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from sklearn import metrics\n", + "import matplotlib.pyplot as plt\n", + "@vm.test(\"my_custom_tests.ConfusionMatrix\")\n", + "def confusion_matrix(dataset, model):\n", + " \"\"\"The confusion matrix is a table that is often used to describe the performance of a classification model on a set of data for which the true values are known.\n", + "\n", + " The confusion matrix is a 2x2 table that contains 4 values:\n", + "\n", + " - True Positive (TP): the number of correct positive predictions\n", + " - True Negative (TN): the number of correct negative predictions\n", + " - False Positive (FP): the number of incorrect positive predictions\n", + " - False Negative (FN): the number of incorrect negative predictions\n", + "\n", + " The confusion matrix can be used to assess the holistic performance of a classification model by showing the accuracy, precision, recall, and F1 score of the model on a single figure.\n", + " \"\"\"\n", 
+ " # retrieve the true (target) values from the dataset via the y attribute\n", + " y_true = dataset.y\n", + " # retrieve the predictions of a specific model via the y_pred method \n", + " y_pred = dataset.y_pred(model=model)\n", + "\n", + " confusion_matrix = metrics.confusion_matrix(y_true, y_pred)\n", + "\n", + " cm_display = metrics.ConfusionMatrixDisplay(\n", + " confusion_matrix=confusion_matrix, display_labels=[False, True]\n", + " )\n", + " cm_display.plot()\n", + " plt.close()\n", + "\n", + " return cm_display.figure_ # return the figure object itself" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here, we run the test using two inputs: `dataset` and `model`. " + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.tests import run_test\n", + "result = run_test(\n", + " test_id = \"my_custom_tests.ConfusionMatrix\",\n", + " inputs={\n", + " \"dataset\": vm_train_ds,\n", + " \"model\": vm_model,\n", + " }\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Log the test results\n", + "\n", + "You can log any test result to the ValidMind Platform with the `.log()` method of the result object. This will allow you to add the result to the documentation.\n", + "\n", + "You can now do the same for the confusion matrix results." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## In summary\n", + "\n", + "In this notebook you have learned the end-to-end process to document a model with the ValidMind Library, running through some very common scenarios in a typical model development setting:\n", + "\n", + "- Running out-of-the-box tests\n", + "- Documenting your model by adding evidence to model documentation\n", + "- Extending the capabilities of the ValidMind Library by implementing custom tests\n", + "- Ensuring that the documentation is complete by running all tests in the documentation template" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" + ], + "id": "copyright-9be1890525a54c10be782f80fe33833f" + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.14" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/site/notebooks/how_to/metrics/log_metrics_over_time.ipynb b/site/notebooks/how_to/metrics/log_metrics_over_time.ipynb index 271b987276..7e8d1faef8 100644 --- a/site/notebooks/how_to/metrics/log_metrics_over_time.ipynb +++ b/site/notebooks/how_to/metrics/log_metrics_over_time.ipynb @@ -1,969 +1,975 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Log metrics over time\n", - "\n", - "Learn how to track and visualize the temporal evolution of key record (model) performance metrics with ValidMind.\n", - "\n", - "While this notebook uses a traditional binary classification model to demonstrate, the same principles apply to logging performance metrics over time for any record (model) type registered with ValidMind — including agentic AI systems, generative LLM applications, and beyond. 
For example:\n", - "\n", - "- Key model performance metrics such as AUC, F1 score, precision, recall, and accuracy are useful for analyzing the stability and trends in model performance indicators, helping to identify potential degradation or unexpected fluctuations in model behavior over time.\n", - "- By monitoring these metrics systematically, teams can detect early warning signs of model drift and take proactive measures to maintain model reliability.\n", - "- Unit metrics in ValidMind provide a standardized way to compute and track individual performance measures, making it easy to monitor specific aspects of model behavior.\n", - "\n", - "Log metrics over time with the ValidMind Library's [`log_metric()`](https://docs.validmind.ai/validmind/validmind.html#log_metric) function and visualize them in your documentation using the *Metric Over Time* block within the ValidMind Platform. This integration enables seamless tracking of record performance, supporting custom thresholds and facilitating the automation of alerts based on logged metrics.\n", - "\n", - "
Metrics over time are most commonly associated with the continued monitoring of a record's performance once it is deployed.\n", - "

\n", - "While you are able to add Metric Over Time blocks to documentation, we recommend first enabling ongoing monitoring for your record to maximize the potential of your performance data.
" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - "- [Load demo model](#toc3__) \n", - "- [Logging metrics](#toc4__) \n", - " - [Run unit metrics](#toc4_1__) \n", - " - [Log unit metrics over time](#toc4_2__) \n", - " - [Pass thresholds](#toc4_3__) \n", - " - [Log multiple metrics with custom thresholds](#toc4_4__) \n", - " - [Add acceptable performance flag](#toc4_5__) \n", - "- [Next steps](#toc5__) \n", - " - [Work with your model documentation](#toc5_1__) \n", - " - [Discover more learning resources](#toc5_2__) \n", - "- [Upgrade ValidMind](#toc6__) \n", - "\n", - ":::\n", - "\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## About ValidMind\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "\n", - "\n", - "### Before you begin\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. 
If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "\n", - "\n", - "### New to ValidMind?\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", - "

\n", - "Register with ValidMind
" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", - "\n", - "- **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - "- **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - "- **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - "- **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: The [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "import numpy as np\n", - "\n", - "from datetime import datetime, timedelta\n", - "\n", - "from validmind.unit_metrics import list_metrics, describe_metric, run_metric\n", - "from validmind.api_client import log_metric\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Load demo model\n", - "\n", - "We'll use a classification model trained on customer churn data to demonstrate ValidMind's metric logging capabilities.\n", - "\n", - "- We'll employ a built-in classification dataset, process it through train-validation-test splits, and train an XGBoost classifier.\n", - "- The trained model and datasets are then initialized in ValidMind's framework, enabling us to track and monitor various performance metrics in the following sections." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Import the sample dataset from the library\n", - "\n", - "from validmind.datasets.classification import customer_churn\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", - ")\n", - "\n", - "raw_df = customer_churn.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)\n", - "\n", - "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", - "y_train = train_df[customer_churn.target_column]\n", - "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", - "y_val = validation_df[customer_churn.target_column]\n", - "\n", - "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", - "model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "model.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Once the datasets and model are prepared for validation, let's initialize the ValidMind `dataset` and `model`, specifying the feature and target columns.\n", - "\n", - "- The property `input_id` allows users to uniquely identify each dataset and model.\n", - "- This allows for the creation of multiple versions of datasets and models, enabling us to compute metrics by specifying which versions we want to use as inputs." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=raw_df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=customer_churn.target_column,\n", - " class_labels=customer_churn.class_labels,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df, input_id=\"test_dataset\", target_column=customer_churn.target_column\n", - ")\n", - "\n", - "# Initialize the ValidMind model object wrapper so that it can be passed as input to tests or test suites\n", - "# ValidMind model objects can be any type of record you want to test, document, validate, or monitor\n", - "vm_model = vm.init_model(\n", - " model,\n", - " input_id=\"model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We can now use the `assign_predictions()` method from the Dataset object to link existing predictions to any model. \n", - "\n", - "If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Logging metrics\n", - "\n", - "Next, we'll use ValidMind to track the temporal evolution of key model performance metrics.\n", - "\n", - "We'll set appropriate thresholds for each metric, enable automated alerting when performance drifts beyond acceptable boundaries, and demonstrate how these thresholds can be customized based on business requirements and risk tolerance levels." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "metrics = [metric for metric in list_metrics() if \"classification\" in metric]\n", - "\n", - "for metric_id in metrics:\n", - " describe_metric(metric_id)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Run unit metrics\n", - "\n", - "Compute individual metrics using ValidMind's *unit metrics* — single-value metrics that can be computed on a dataset and model. Use the `run_metric()` function from the `validmind.unit_metrics` module to calculate these metrics.\n", - "\n", - "The `run_metric()` function has a signature similar to `run_test()` from the `validmind.tests` module, but is specifically designed for unit metrics and takes the following arguments:\n", - "\n", - "- **`metric_id`:** The unique identifier for the metric (for example, `validmind.unit_metrics.classification.ROC_AUC`)\n", - "- **`inputs`:** A dictionary containing the input dataset and model or their respective input IDs\n", - "- **`params`:** A dictionary containing keyword arguments for the unit metric (optional, accepts any `kwargs` from the underlying sklearn implementation)\n", - "\n", - "`run_metric()` returns and displays a result object similar to a regular ValidMind test, but only shows the unit metric value. While this result object has a `.log()` method for logging to the ValidMind Platform, in this use case we'll use unit metrics to compute performance metrics and then log them over time using the `log_metric()` function from the `validmind.api_client` module." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = run_metric(\n", - " \"validmind.unit_metrics.classification.ROC_AUC\",\n", - " inputs={\n", - " \"model\": vm_model,\n", - " \"dataset\": vm_test_ds,\n", - " },\n", - ")\n", - "auc = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = run_metric(\n", - " \"validmind.unit_metrics.classification.Accuracy\",\n", - " inputs={\n", - " \"model\": vm_model,\n", - " \"dataset\": vm_test_ds,\n", - " },\n", - ")\n", - "accuracy = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = run_metric(\n", - " \"validmind.unit_metrics.classification.Recall\",\n", - " inputs={\n", - " \"model\": vm_model,\n", - " \"dataset\": vm_test_ds,\n", - " },\n", - ")\n", - "recall = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = run_metric(\n", - " \"validmind.unit_metrics.classification.F1\",\n", - " inputs={\n", - " \"model\": vm_model,\n", - " \"dataset\": vm_test_ds,\n", - " },\n", - ")\n", - "f1 = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = run_metric(\n", - " \"validmind.unit_metrics.classification.Precision\",\n", - " inputs={\n", - " \"model\": vm_model,\n", - " \"dataset\": vm_test_ds,\n", - " },\n", - ")\n", - "precision = result.metric" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Log unit metrics over time\n", - "\n", - "Using the `log_metric()` function from the `validmind.api_client` module, let's log the unit metrics over time. 
This function takes the following arguments:\n", - "\n", - "- **`key`:** The name of the metric to log\n", - "- **`value`:** The value of the metric to log\n", - "- **`recorded_at`:** The timestamp of the metric to log — useful for logging historic predictions\n", - "- **`thresholds`:** A dictionary containing the thresholds for the metric to log\n", - "- **`params`:** A dictionary containing the keyword arguments for the unit metric (in this case, none are required, but we can pass any `kwargs` that the underlying sklearn implementation accepts)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "log_metric(\n", - " key=\"AUC Score\",\n", - " value=auc,\n", - " # If `recorded_at` is not included, the time at function run is logged\n", - " recorded_at=datetime(2024, 1, 1), \n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To visualize the logged metric, we'll use the **[Metrics Over Time block](https://docs.validmind.ai/guide/monitoring/work-with-metrics-over-time.html)** in the ValidMind Platform:\n", - "\n", - "- After adding this visualization block to your documentation or ongoing monitoring report (as shown in the image below), you'll be able to review your logged metrics plotted over time.\n", - "- In this example, since we've only logged a single data point, the visualization shows just one measurement.\n", - "- As you continue logging metrics, the graph will populate with more points, enabling you to track trends and patterns.\n", - "\n", - "![Metric Over Time block](./add_metric_over_time_block.png)\n", - "![AUC Score](./log_metric_auc_1.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Pass thresholds\n", - "\n", - "We can pass *thresholds* to the `log_metric()` function to enhance the metric over time: \n", - "\n", - "- This is useful for visualizing the metric over time and identifying potential issues. 
\n", - "- The metric visualization component provides a dynamic way to monitor and contextualize metric values through customizable thresholds. \n", - "- These thresholds appear as horizontal reference lines on the chart. \n", - "- The system always displays the most recent threshold configuration, meaning that if you update threshold values in your client application, the visualization will reflect these changes immediately. \n", - "\n", - "When a metric is logged without thresholds or with an empty threshold dictionary, the reference lines gracefully disappear from the chart, though the metric line itself remains visible. \n", - "\n", - "Thresholds are highly flexible in their implementation. You can define them with any meaningful key names (such as `low_risk`, `maximum`, `target`, or `acceptable_range`) in your metric data, and the visualization will adapt accordingly. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "log_metric(\n", - " key=\"AUC Score\",\n", - " value=auc,\n", - " recorded_at=datetime(2024, 1, 1),\n", - " thresholds={\n", - " \"min_auc\": 0.7,\n", - " }\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![AUC Score](./log_metric_auc_2.png)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "log_metric(\n", - " key=\"AUC Score\",\n", - " value=auc,\n", - " recorded_at=datetime(2024, 1, 1),\n", - " thresholds={\n", - " \"high_risk\": 0.6,\n", - " \"medium_risk\": 0.7,\n", - " \"low_risk\": 0.8,\n", - " }\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![AUC Score](./log_metric_auc_3.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Log multiple metrics with custom thresholds\n", - "\n", - "The following code snippet shows an example of how to set up and log multiple performance metrics with 
custom thresholds for each metric:\n", - "\n", - "- Using AUC, F1, Precision, Recall, and Accuracy scores as examples, it demonstrates how to define different risk levels (high, medium, low) appropriate for each metric's expected range.\n", - "- The code simulates 10 days of metric history by applying a gradual decay and random noise to help visualize how metrics might drift over time in a production environment." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "NUM_DAYS = 10\n", - "REFERENCE_DATE = datetime(2024, 1, 1) # Fixed date: January 1st, 2024\n", - "base_date = REFERENCE_DATE - timedelta(days=NUM_DAYS)\n", - "\n", - "# Initial values with their specific thresholds\n", - "performance_metrics = {\n", - " \"AUC Score\": {\n", - " \"value\": auc,\n", - " \"thresholds\": {\n", - " \"high_risk\": 0.7,\n", - " \"medium_risk\": 0.8,\n", - " \"low_risk\": 0.9,\n", - " }\n", - " },\n", - " \"F1 Score\": {\n", - " \"value\": f1,\n", - " \"thresholds\": {\n", - " \"high_risk\": 0.5,\n", - " \"medium_risk\": 0.6,\n", - " \"low_risk\": 0.7,\n", - " }\n", - " },\n", - " \"Precision Score\": {\n", - " \"value\": precision,\n", - " \"thresholds\": {\n", - " \"high_risk\": 0.6,\n", - " \"medium_risk\": 0.7,\n", - " \"low_risk\": 0.8,\n", - " }\n", - " },\n", - " \"Recall Score\": {\n", - " \"value\": recall,\n", - " \"thresholds\": {\n", - " \"high_risk\": 0.4,\n", - " \"medium_risk\": 0.5,\n", - " \"low_risk\": 0.6,\n", - " }\n", - " },\n", - " \"Accuracy Score\": {\n", - " \"value\": accuracy,\n", - " \"thresholds\": {\n", - " \"high_risk\": 0.75,\n", - " \"medium_risk\": 0.8,\n", - " \"low_risk\": 0.85,\n", - " }\n", - " }\n", - "}\n", - "\n", - "# Trend parameters\n", - "trend_factor = 0.98 # Slight downward trend\n", - "noise_scale = 0.02 # Random fluctuation of ±2%\n", - "\n", - "for i in range(NUM_DAYS):\n", - " recorded_at = base_date + timedelta(days=i)\n", - " print(f\"\\nrecorded_at: 
{recorded_at}\")\n", - "\n", - " # Log each metric with trend and noise\n", - " for metric_name, metric_info in performance_metrics.items():\n", - " base_value = metric_info[\"value\"]\n", - " thresholds = metric_info[\"thresholds\"]\n", - " \n", - " # Apply trend and add random noise\n", - " trend = base_value * (trend_factor ** i)\n", - " noise = np.random.normal(0, noise_scale * base_value)\n", - " value = max(0, min(1, trend + noise)) # Ensure value stays between 0 and 1\n", - " \n", - " log_metric(\n", - " key=metric_name,\n", - " value=value,\n", - " recorded_at=recorded_at.isoformat(),\n", - " thresholds=thresholds\n", - " )\n", - " \n", - " print(f\"{metric_name:<15}: {value:.4f} (Thresholds: {thresholds})\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![AUC Score](./log_metric_auc_4.png)\n", - "![Accuracy Score](./log_metric_accuracy.png)\n", - "![Precision Score](./log_metric_precision.png)\n", - "![Recall Score](./log_metric_recall.png)\n", - "![F1 Score](./log_metric_f1.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Add acceptable performance flag\n", - "\n", - "The `passed` parameter in the `log_metric()` function allows you to explicitly mark whether a specific metric value should be considered \"Satisfactory\" or \"Requires Attention\":\n", - " - When `passed=True`: A green \"Satisfactory\" badge appears on the chart, indicating the metric value meets your acceptance criteria.\n", - " - When `passed=False`: A yellow \"Requires Attention\" badge appears, highlighting potential concerns that may require investigation." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In the example below, the `passed=True` parameter adds a green \"Satisfactory\" badge to the GINI Score metric visualization, instantly indicating that the 0.75 value meets acceptable performance standards by being above the `medium_risk` threshold of 0.6:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "log_metric(\n", - " key=\"GINI Score\",\n", - " value=0.75,\n", - " recorded_at=datetime(2025, 6, 7),\n", - " thresholds = {\n", - " \"high_risk\": 0.5,\n", - " \"medium_risk\": 0.6,\n", - " \"low_risk\": 0.8,\n", - " },\n", - " passed=True\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![GINI Score](./log_metric_satisfactory.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In this example, the `passed=False` parameter adds a yellow \"Requires Attention\" badge to the GINI Score metric visualization, immediately highlighting that the value of 0.5 fails to meet acceptable performance standards by not exceeding the `medium_risk` threshold of 0.6:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "log_metric(\n", - " key=\"GINI Score\",\n", - " value=0.5,\n", - " recorded_at=datetime(2025, 6, 9),\n", - " thresholds = {\n", - " \"high_risk\": 0.5,\n", - " \"medium_risk\": 0.6,\n", - " \"low_risk\": 0.8,\n", - " },\n", - " passed=False\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![GINI Score](./log_metric_attention.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Here, a custom function `passed_fn` determines the badge status automatically, displaying a green \"Satisfactory\" badge for the 0.65 GINI Score because it exceeds the `medium_risk` threshold of 0.6, enabling programmatic evaluation of metric performance based on 
predefined business rules:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "gini = 0.65\n", - "\n", - "thresholds = {\n", - " \"high_risk\": 0.5,\n", - " \"medium_risk\": 0.6,\n", - " \"low_risk\": 0.8,\n", - "}\n", - "\n", - "def passed_fn(value):\n", - " return value > thresholds[\"medium_risk\"]\n", - "\n", - "log_metric(\n", - " key=\"GINI Score\",\n", - " value=gini, \n", - " recorded_at=datetime(2025, 6, 10),\n", - " thresholds=thresholds,\n", - " passed=passed_fn(gini)\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![GINI Score](./log_metric_satisfactory_2.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation.\n", - "\n", - "\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-584966fafc334aec9585d8f880ddba0c", - "metadata": {}, - "source": [ - "\n", - "\n", - "\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Log metrics over time\n", + "\n", + "Learn how to track and visualize the temporal evolution of key record (model) performance metrics with ValidMind.\n", + "\n", + "While this notebook uses a traditional binary classification model to demonstrate, the same principles apply to logging performance metrics over time for any record (model) type registered with ValidMind — including agentic AI systems, generative LLM applications, and beyond. For example:\n", + "\n", + "- Key model performance metrics such as AUC, F1 score, precision, recall, and accuracy, are useful for analyzing the stability and trends in model performance indicators, helping to identify potential degradation or unexpected fluctuations in model behavior over time.\n", + "- By monitoring these metrics systematically, teams can detect early warning signs of model drift and take proactive measures to maintain model reliability.\n", + "- Unit metrics in ValidMind provide a standardized way to compute and track individual performance measures, making it easy to monitor specific aspects of model behavior.\n", + "\n", + "Log metrics over time with the ValidMind Library's [`log_metric()`](https://docs.validmind.ai/validmind/validmind.html#log_metric) function and visualize them in your documentation using the *Metric Over Time* block within the ValidMind Platform. 
This integration enables seamless tracking of record performance, supporting custom thresholds and facilitating the automation of alerts based on logged metrics.\n", + "\n", + "
Metrics over time are most commonly associated with the continued monitoring of a record's performance once it is deployed.\n", + "

\n", + "While you are able to add Metric Over Time blocks to documentation, we recommend first enabling ongoing monitoring for your record to maximize the potential of your performance data.
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + "- [Load demo model](#toc3__) \n", + "- [Logging metrics](#toc4__) \n", + " - [Run unit metrics](#toc4_1__) \n", + " - [Log unit metrics over time](#toc4_2__) \n", + " - [Pass thresholds](#toc4_3__) \n", + " - [Log multiple metrics with custom thresholds](#toc4_4__) \n", + " - [Add acceptable performance flag](#toc4_5__) \n", + "- [Next steps](#toc5__) \n", + " - [Work with your model documentation](#toc5_1__) \n", + " - [Discover more learning resources](#toc5_2__) \n", + "- [Upgrade ValidMind](#toc6__) \n", + "\n", + ":::\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## About ValidMind\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "\n", + "\n", + "### Before you begin\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. 
If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "\n", + "\n", + "### New to ValidMind?\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", + "

\n", + "Register with ValidMind
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. 
In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. 
On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "import numpy as np\n", + "\n", + "from datetime import datetime, timedelta\n", + "\n", + "from validmind.unit_metrics import list_metrics, describe_metric, run_metric\n", + "from validmind.api_client import log_metric\n", + "\n", + "%matplotlib inline" + ], + "execution_count": 3, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Load demo model\n", + "\n", + "We'll use a classification model trained on customer churn data to demonstrate ValidMind's metric logging capabilities.\n", + "\n", + "- We'll employ a built-in classification dataset, process it through train-validation-test splits, and train an XGBoost classifier.\n", + "- The trained model and datasets are then 
initialized in ValidMind's framework, enabling us to track and monitor various performance metrics in the following sections." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the sample dataset from the library\n", + "\n", + "from validmind.datasets.classification import customer_churn\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", + ")\n", + "\n", + "raw_df = customer_churn.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)\n", + "\n", + "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", + "y_train = train_df[customer_churn.target_column]\n", + "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", + "y_val = validation_df[customer_churn.target_column]\n", + "\n", + "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", + "model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Once the datasets and model are prepared for validation, let's initialize the ValidMind `dataset` and `model`, specifying the feature and target columns.\n", + "\n", + "- The property `input_id` allows users to uniquely identify each dataset and model.\n", + "- This allows for the creation of multiple versions of datasets and models, enabling us to compute metrics by specifying which versions we want to use as inputs." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=raw_df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=customer_churn.target_column,\n", + " class_labels=customer_churn.class_labels,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df, input_id=\"test_dataset\", target_column=customer_churn.target_column\n", + ")\n", + "\n", + "# Initialize the ValidMind model object wrapper so that it can be passed as input to tests or test suites\n", + "# ValidMind model objects can be any type of record you want to test, document, validate, or monitor\n", + "vm_model = vm.init_model(\n", + " model,\n", + " input_id=\"model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can now use the `assign_predictions()` method from the Dataset object to link existing predictions to any model. \n", + "\n", + "If no prediction values are passed, the method will compute predictions automatically:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Logging metrics\n", + "\n", + "Next, we'll use ValidMind to track the temporal evolution of key model performance metrics.\n", + "\n", + "We'll set appropriate thresholds for each metric, enable automated alerting when performance drifts beyond acceptable boundaries, and demonstrate how these thresholds can be customized based on business requirements and risk tolerance levels." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "metrics = [metric for metric in list_metrics() if \"classification\" in metric]\n", + "\n", + "for metric_id in metrics:\n", + " describe_metric(metric_id)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Run unit metrics\n", + "\n", + "Compute individual metrics using ValidMind's *unit metrics* — single-value metrics that can be computed on a dataset and model. Use the `run_metric()` function from the `validmind.unit_metrics` module to calculate these metrics.\n", + "\n", + "The `run_metric()` function has a signature similar to `run_test()` from the `validmind.tests` module, but is specifically designed for unit metrics and takes the following arguments:\n", + "\n", + "- **`metric_id`:** The unique identifier for the metric (for example, `validmind.unit_metrics.classification.ROC_AUC`)\n", + "- **`inputs`:** A dictionary containing the input dataset and model or their respective input IDs\n", + "- **`params`:** A dictionary containing keyword arguments for the unit metric (optional, accepts any `kwargs` from the underlying sklearn implementation)\n", + "\n", + "`run_metric()` returns and displays a result object similar to a regular ValidMind test, but only shows the unit metric value. While this result object has a `.log()` method for logging to the ValidMind Platform, in this use case we'll use unit metrics to compute performance metrics and then log them over time using the `log_metric()` function from the `validmind.api_client` module." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.ROC_AUC\",\n", + " inputs={\n", + " \"model\": vm_model,\n", + " \"dataset\": vm_test_ds,\n", + " },\n", + ")\n", + "auc = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.Accuracy\",\n", + " inputs={\n", + " \"model\": vm_model,\n", + " \"dataset\": vm_test_ds,\n", + " },\n", + ")\n", + "accuracy = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.Recall\",\n", + " inputs={\n", + " \"model\": vm_model,\n", + " \"dataset\": vm_test_ds,\n", + " },\n", + ")\n", + "recall = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.F1\",\n", + " inputs={\n", + " \"model\": vm_model,\n", + " \"dataset\": vm_test_ds,\n", + " },\n", + ")\n", + "f1 = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.Precision\",\n", + " inputs={\n", + " \"model\": vm_model,\n", + " \"dataset\": vm_test_ds,\n", + " },\n", + ")\n", + "precision = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Log unit metrics over time\n", + "\n", + "Using the `log_metric()` function from the `validmind.api_client` module, let's log the unit metrics over time. 
This function takes the following arguments:\n", + "\n", + "- **`key`:** The name of the metric to log\n", + "- **`value`:** The value of the metric to log\n", + "- **`recorded_at`:** The timestamp of the metric to log — useful for logging historic predictions\n", + "- **`thresholds`:** A dictionary containing the thresholds for the metric to log\n", + "- **`params`:** A dictionary containing the keyword arguments for the unit metric (in this case, none are required, but we can pass any `kwargs` that the underlying sklearn implementation accepts)" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "log_metric(\n", + " key=\"AUC Score\",\n", + " value=auc,\n", + " # If `recorded_at` is not included, the time at function run is logged\n", + " recorded_at=datetime(2024, 1, 1), \n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To visualize the logged metric, we'll use the **[Metrics Over Time block](https://docs.validmind.ai/guide/monitoring/work-with-metrics-over-time.html)** in the ValidMind Platform:\n", + "\n", + "- After adding this visualization block to your documentation or ongoing monitoring report (as shown in the image below), you'll be able to review your logged metrics plotted over time.\n", + "- In this example, since we've only logged a single data point, the visualization shows just one measurement.\n", + "- As you continue logging metrics, the graph will populate with more points, enabling you to track trends and patterns.\n", + "\n", + "![Metric Over Time block](./add_metric_over_time_block.png)\n", + "![AUC Score](./log_metric_auc_1.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Pass thresholds\n", + "\n", + "We can pass *thresholds* to the `log_metric()` function to enhance the metric over time: \n", + "\n", + "- This is useful for visualizing the metric over time and identifying potential issues. 
\n", + "- The metric visualization component provides a dynamic way to monitor and contextualize metric values through customizable thresholds. \n", + "- These thresholds appear as horizontal reference lines on the chart. \n", + "- The system always displays the most recent threshold configuration, meaning that if you update threshold values in your client application, the visualization will reflect these changes immediately. \n", + "\n", + "When a metric is logged without thresholds or with an empty threshold dictionary, the reference lines gracefully disappear from the chart, though the metric line itself remains visible. \n", + "\n", + "Thresholds are highly flexible in their implementation. You can define them with any meaningful key names (such as `low_risk`, `maximum`, `target`, or `acceptable_range`) in your metric data, and the visualization will adapt accordingly. " + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "log_metric(\n", + " key=\"AUC Score\",\n", + " value=auc,\n", + " recorded_at=datetime(2024, 1, 1),\n", + " thresholds={\n", + " \"min_auc\": 0.7,\n", + " }\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![AUC Score](./log_metric_auc_2.png)" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "log_metric(\n", + " key=\"AUC Score\",\n", + " value=auc,\n", + " recorded_at=datetime(2024, 1, 1),\n", + " thresholds={\n", + " \"high_risk\": 0.6,\n", + " \"medium_risk\": 0.7,\n", + " \"low_risk\": 0.8,\n", + " }\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![AUC Score](./log_metric_auc_3.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Log multiple metrics with custom thresholds\n", + "\n", + "The following code snippet shows an example of how to set up and log multiple performance metrics with 
custom thresholds for each metric:\n", + "\n", + "- Using AUC, F1, Precision, Recall, and Accuracy scores as examples, it demonstrates how to define different risk levels (high, medium, low) appropriate for each metric's expected range.\n", + "- The code simulates 10 days of metric history by applying a gradual decay and random noise to help visualize how metrics might drift over time in a production environment." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "NUM_DAYS = 10\n", + "REFERENCE_DATE = datetime(2024, 1, 1) # Fixed date: January 1st, 2024\n", + "base_date = REFERENCE_DATE - timedelta(days=NUM_DAYS)\n", + "\n", + "# Initial values with their specific thresholds\n", + "performance_metrics = {\n", + " \"AUC Score\": {\n", + " \"value\": auc,\n", + " \"thresholds\": {\n", + " \"high_risk\": 0.7,\n", + " \"medium_risk\": 0.8,\n", + " \"low_risk\": 0.9,\n", + " }\n", + " },\n", + " \"F1 Score\": {\n", + " \"value\": f1,\n", + " \"thresholds\": {\n", + " \"high_risk\": 0.5,\n", + " \"medium_risk\": 0.6,\n", + " \"low_risk\": 0.7,\n", + " }\n", + " },\n", + " \"Precision Score\": {\n", + " \"value\": precision,\n", + " \"thresholds\": {\n", + " \"high_risk\": 0.6,\n", + " \"medium_risk\": 0.7,\n", + " \"low_risk\": 0.8,\n", + " }\n", + " },\n", + " \"Recall Score\": {\n", + " \"value\": recall,\n", + " \"thresholds\": {\n", + " \"high_risk\": 0.4,\n", + " \"medium_risk\": 0.5,\n", + " \"low_risk\": 0.6,\n", + " }\n", + " },\n", + " \"Accuracy Score\": {\n", + " \"value\": accuracy,\n", + " \"thresholds\": {\n", + " \"high_risk\": 0.75,\n", + " \"medium_risk\": 0.8,\n", + " \"low_risk\": 0.85,\n", + " }\n", + " }\n", + "}\n", + "\n", + "# Trend parameters\n", + "trend_factor = 0.98 # Slight downward trend\n", + "noise_scale = 0.02 # Random fluctuation of ±2%\n", + "\n", + "for i in range(NUM_DAYS):\n", + " recorded_at = base_date + timedelta(days=i)\n", + " print(f\"\\nrecorded_at: {recorded_at}\")\n", + "\n", + " # Log each metric with 
trend and noise\n", + " for metric_name, metric_info in performance_metrics.items():\n", + " base_value = metric_info[\"value\"]\n", + " thresholds = metric_info[\"thresholds\"]\n", + " \n", + " # Apply trend and add random noise\n", + " trend = base_value * (trend_factor ** i)\n", + " noise = np.random.normal(0, noise_scale * base_value)\n", + " value = max(0, min(1, trend + noise)) # Ensure value stays between 0 and 1\n", + " \n", + " log_metric(\n", + " key=metric_name,\n", + " value=value,\n", + " recorded_at=recorded_at.isoformat(),\n", + " thresholds=thresholds\n", + " )\n", + " \n", + " print(f\"{metric_name:<15}: {value:.4f} (Thresholds: {thresholds})\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![AUC Score](./log_metric_auc_4.png)\n", + "![Accuracy Score](./log_metric_accuracy.png)\n", + "![Precision Score](./log_metric_precision.png)\n", + "![Recall Score](./log_metric_recall.png)\n", + "![F1 Score](./log_metric_f1.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Add acceptable performance flag\n", + "\n", + "The `passed` parameter in the `log_metric()` function allows you to explicitly mark whether a specific metric value should be considered \"Satisfactory\" or \"Requires Attention\":\n", + " - When `passed=True`: A green \"Satisfactory\" badge appears on the chart, indicating the metric value meets your acceptance criteria.\n", + " - When `passed=False`: A yellow \"Requires Attention\" badge appears, highlighting potential concerns that may require investigation." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the example below, the `passed=True` parameter adds a green \"Satisfactory\" badge to the GINI Score metric visualization, instantly indicating that the 0.75 value meets acceptable performance standards by being above the `medium_risk` threshold of 0.6:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "log_metric(\n", + " key=\"GINI Score\",\n", + " value=0.75,\n", + " recorded_at=datetime(2025, 6, 7),\n", + " thresholds = {\n", + " \"high_risk\": 0.5,\n", + " \"medium_risk\": 0.6,\n", + " \"low_risk\": 0.8,\n", + " },\n", + " passed=True\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![GINI Score](./log_metric_satisfactory.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In this example, the `passed=False` parameter adds a yellow \"Requires Attention\" badge to the GINI Score metric visualization, immediately highlighting that the value of 0.5 fails to meet acceptable performance standards by not exceeding the `medium_risk` threshold of 0.6:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "log_metric(\n", + " key=\"GINI Score\",\n", + " value=0.5,\n", + " recorded_at=datetime(2025, 6, 9),\n", + " thresholds = {\n", + " \"high_risk\": 0.5,\n", + " \"medium_risk\": 0.6,\n", + " \"low_risk\": 0.8,\n", + " },\n", + " passed=False\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![GINI Score](./log_metric_attention.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here, a custom function `passed_fn` determines the badge status automatically, displaying a green \"Satisfactory\" badge for the 0.65 GINI Score because it exceeds the `medium_risk` threshold of 0.6, enabling programmatic evaluation of metric performance based on 
predefined business rules:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "gini = 0.65\n", + "\n", + "thresholds = {\n", + " \"high_risk\": 0.5,\n", + " \"medium_risk\": 0.6,\n", + " \"low_risk\": 0.8,\n", + "}\n", + "\n", + "def passed_fn(value):\n", + " return value > thresholds[\"medium_risk\"]\n", + "\n", + "log_metric(\n", + " key=\"GINI Score\",\n", + " value=gini, \n", + " recorded_at=datetime(2025, 6, 10),\n", + " thresholds=thresholds,\n", + " passed=passed_fn(gini)\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![GINI Score](./log_metric_satisfactory_2.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation.\n", + "\n", + "\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" + ], + "id": "copyright-584966fafc334aec9585d8f880ddba0c" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/site/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb b/site/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb index a827605971..f2c72ce7b3 100644 --- a/site/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb +++ b/site/notebooks/how_to/qualitative_text/qualitative_text_generation.ipynb @@ -1,962 +1,970 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "9a900020", - "metadata": {}, - "source": [ - "# Generate qualitative text with the ValidMind library\n", - "\n", - "This notebook shows how to generate qualitative documentation content directly from the ValidMind library using both `vm.run_text_generation()` and `vm.generate_documentation_text()`. Instead of switching to the UI to write text manually or trigger generation one section at a time, you can generate content for documentation text blocks programmatically from within a notebook and log it back to the corresponding sections of the model document.\n", - "\n", - "After building an example model and documenting its quantitative results, we’ll show how to generate text for individual content blocks, customize the output with prompts, control the context used for generation, and use a configuration-driven workflow to populate multiple qualitative sections across the document. 
By the end, you’ll have an end-to-end example of how quantitative test results and AI-generated qualitative content can work together to populate a full model document from Python, giving you a more automated documentation workflow directly in the library." - ] - }, - { - "cell_type": "markdown", - "id": "cd48db57", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - "- [Getting to know ValidMind](#toc3__) \n", - " - [Preview the documentation template](#toc3_1__) \n", - " - [View model documentation in the ValidMind Platform](#toc3_2__) \n", - "- [Build the example model](#toc4__) \n", - " - [Import the sample dataset](#toc4_1__) \n", - " - [Preprocessing the raw dataset](#toc4_2__) \n", - " - [Training an XGBoost classifier model](#toc4_3__) \n", - "- [Initialize the ValidMind inputs](#toc5__) \n", - "- [Document test results](#toc6__) \n", - "- [Document qualitative sections](#toc7__) \n", - " - [Generate text for a single content block](#toc7_1__) \n", - " - [Customize the prompt](#toc7_2__) \n", - " - [Pass section-specific context](#toc7_3__) \n", - " - [Append a new text block to a section](#toc7_4__) \n", - " - [Generate text across the document](#toc7_5__) \n", - "- [In summary](#toc8__) \n", - "- [Next steps](#toc9__) \n", - " - [Work with your model documentation](#toc9_1__) \n", - " - [Discover more learning resources](#toc9_2__) \n", - "- [Upgrade ValidMind](#toc10__) \n", - "\n", - ":::\n", - "\n", - 
"" - ] - }, - { - "cell_type": "markdown", - "id": "a67217b3", - "metadata": {}, - "source": [ - "\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ] - }, - { - "cell_type": "markdown", - "id": "281cfb86", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "51c11b52", - "metadata": {}, - "source": [ - "\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", - "

\n", - "Register with ValidMind
" - ] - }, - { - "cell_type": "markdown", - "id": "9103cd45", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Key concepts\n", - "\n", - "**Validation report**: A comprehensive and structured assessment of a model’s development and performance, focusing on verifying its integrity, appropriateness, and alignment with its intended use. It includes analyses of model assumptions, data quality, performance metrics, outcomes of testing procedures, and risk considerations. The validation report supports transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", - "\n", - "**Validation report template**: Serves as a standardized framework for conducting and documenting model validation activities. It outlines the required sections, recommended analyses, and expected validation tests, ensuring consistency and completeness across validation reports. The template helps guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." - ] - }, - { - "cell_type": "markdown", - "id": "23020a1b", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "6202d6dc", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "
Recommended Python versions\n", - "

\n", - "Python 3.8 <= x <= 3.14
\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "045b05a6", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "b3231d8e", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "56592217", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "43ed3d0c", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. 
(**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "id": "9b9203be", - "metadata": {}, - "source": [ - "\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "690dc368", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " api_host=\"http://localhost:5000/api/v1/tracking\",\n", - " api_key=\"..\",\n", - " api_secret=\"..\",\n", - " document=\"documentation\", # requires library >=2.12.0\n", - " model=\"..\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "a68f6031", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Then, let's import the necessary libraries and set up your Python environment for data analysis:\n", - "\n", - "- Import **Extreme Gradient Boosting** (XGBoost) 
with an alias so that we can reference its functions in later calls. XGBoost is a powerful machine learning library designed for speed and performance, especially in handling structured or tabular data.\n", - "- Enable **`matplotlib`**, a plotting library used for visualizing data. Ensures that any plots you generate will render inline in our notebook output rather than opening in a separate window." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3fa2d9de", - "metadata": {}, - "outputs": [], - "source": [ - "%matplotlib inline\n", - "\n", - "import xgboost as xgb" - ] - }, - { - "cell_type": "markdown", - "id": "69a37995", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Getting to know ValidMind" - ] - }, - { - "cell_type": "markdown", - "id": "40c9eb24", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "62842e84", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "6fab1c1c", - "metadata": {}, - "source": [ - "\n", - "\n", - "### View model documentation in the ValidMind Platform\n", - "\n", - "Next, let's head to the ValidMind Platform to see the template in action:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. 
In the left sidebar, navigate to **Inventory** and select the model you registered for this notebook.\n", - "\n", - "3. Click **Development** under Documents for your model and note how the structure of the documentation matches our preview above." - ] - }, - { - "cell_type": "markdown", - "id": "606d932b", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Build the example model" - ] - }, - { - "cell_type": "markdown", - "id": "3d7ad25a", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Import the sample dataset\n", - "\n", - "First, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle so that we have something to work with.\n", - "\n", - "In our below example, note that: \n", - "\n", - "- The target column, `Exited` has a value of `1` when a customer has churned and `0` otherwise.\n", - "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. A Pandas Dataframe is a two-dimensional tabular data structure that makes use of rows and columns." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8ea8188e", - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.classification import customer_churn\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", - ")\n", - "\n", - "raw_df = customer_churn.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "id": "a5ceef72", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Preprocessing the raw dataset\n", - "\n", - "In this section, we preprocess the raw dataset so it is ready for model training and validation. 
This includes splitting the data into training, validation, and test subsets to support both model fitting and evaluation on unseen data, and then separating each subset into input features and target labels so the model can learn from customer attributes and predict whether a customer churned." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9d2bec58", - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)\n", - "\n", - "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", - "y_train = train_df[customer_churn.target_column]\n", - "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", - "y_val = validation_df[customer_churn.target_column]" - ] - }, - { - "cell_type": "markdown", - "id": "3b9edacf", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Training an XGBoost classifier model\n", - "\n", - "In this section, we train an XGBoost classifier to predict customer churn, using early stopping to halt training if performance does not improve after 10 rounds and reduce unnecessary fitting. We configure the model to evaluate performance with three complementary metrics: error for incorrect predictions, logloss for prediction confidence, and auc for class separation. The model is trained on the training split and evaluated against the validation split during fitting, while verbose=False keeps the training output concise." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "658447fc", - "metadata": {}, - "outputs": [], - "source": [ - "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", - "\n", - "model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "\n", - "model.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "c2a6b492", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Initialize the ValidMind inputs\n", - "\n", - "We begin by registering the datasets and trained model as ValidMind inputs so they can be referenced consistently throughout the documentation workflow. For the datasets, this means creating ValidMind Dataset objects for the raw, training, and testing data, each with a unique `input_id` for traceability. Where needed, we also provide supporting metadata such as the target column and class labels so tests can interpret the data correctly." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "081548ae", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the raw dataset\n", - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=raw_df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=customer_churn.target_column,\n", - " class_labels=customer_churn.class_labels,\n", - ")\n", - "\n", - "# Initialize the training dataset\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "\n", - "# Initialize the testing dataset\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"test_dataset\",\n", - " target_column=customer_churn.target_column\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "1ebfda19", - "metadata": {}, - "source": [ - "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6cc5aff8", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the model\n", - "vm_model = vm.init_model(\n", - " model,\n", - " input_id=\"model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "48d23cf8", - "metadata": {}, - 
"source": [ - "Finally, we assign predictions from the trained model to the training and testing datasets. The `assign_predictions()` method links predicted classes and probabilities to each dataset, and can also compute predictions automatically if they are not passed explicitly. This step is what allows ValidMind to run performance and diagnostic tests using the model outputs." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "922baa9d", - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model,\n", - ")\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "7c9a174d", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Document test results\n", - "\n", - "In this section, we run the documentation tests defined by the applied template to populate the quantitative parts of the model documentation. The `vm.run_documentation_tests()` function discovers each test-driven block in the template, executes the corresponding tests, and uploads the resulting artifacts to the ValidMind Platform.\n", - "\n", - "To run the full suite successfully, ValidMind needs to know which model and dataset inputs should be used for each test. This can be done with a shared `inputs` argument when all tests use the same objects, or with a `config` dictionary when individual tests require specific inputs or parameters. In this example, we use the default test parameters and provide the input configuration needed for the demo model." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "47f7e709", - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.utils import preview_test_config\n", - "\n", - "test_config = customer_churn.get_demo_test_config()\n", - "preview_test_config(test_config)" - ] - }, - { - "cell_type": "markdown", - "id": "3f22d37b", - "metadata": {}, - "source": [ - "Once the configuration is prepared, we pass it to `vm.run_documentation_tests()` and execute the full suite. The returned `full_suite` object contains the test results and represents the quantitative documentation that has been generated for the model." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "999be7fe", - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.run_documentation_tests(config=test_config)" - ] - }, - { - "cell_type": "markdown", - "id": "5d531744", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Document qualitative sections\n", - "\n", - "In addition to documenting quantitative results through tests, ValidMind now supports programmatic generation of qualitative content for the text blocks in a model documentation template through `vm.run_text_generation()`. This function allows you to generate AI-assisted text for a specific content block directly from a notebook and then log it back to the corresponding section of the document. As a result, you can populate qualitative sections without switching to the UI to write text manually or trigger generation one section at a time.\n", - "\n", - "In the next sections, we’ll walk through the main ways to use this functionality. We’ll start by generating text for a single content block with the default behavior, then show how to customize the output with a prompt, how to control the context used for generation by selecting specific sections, and finally how to scale the same pattern across all text blocks in the document." 
- ] - }, - { - "cell_type": "markdown", - "id": "899c8553", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Generate text for a single content block\n", - "\n", - "First, we’ll use `vm.run_text_generation()` to generate qualitative text for a single documentation block. By providing a `content_id`, you can target the exact text placeholder you want to populate and let ValidMind generate content using the current document context. The helper `vm.get_content_ids()` is useful for inspecting which content blocks are available in the active template, making it easier to identify the IDs you can use when generating and logging text programmatically." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "85cc552f", - "metadata": {}, - "outputs": [], - "source": [ - "vm.get_content_ids()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "26fcddf9", - "metadata": {}, - "outputs": [], - "source": [ - "vm.run_text_generation(\n", - " content_id=\"dataset_summary_text\",\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "caff6490", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Customize the prompt\n", - "\n", - "Next, we’ll customize the generated output by passing a `prompt` to `vm.run_text_generation()`. This makes it possible to guide not just the subject of the generated text, but also its structure, tone, level of detail, and presentation format. In practice, this allows you to tailor the output for different documentation needs, such as producing a short narrative summary, a more structured section, or content written for a specific audience, while still relying on the same underlying document context for generation." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "52165b98", - "metadata": {}, - "outputs": [], - "source": [ - "prompt = \"\"\"\n", - "Use exactly this structure:\n", - "\n", - "

Dataset Overview

\n", - "

Explain in 1-2 sentences what the dataset contains and what it is used for.

\n", - "\n", - "

Dataset Summary

\n", - "

Summarize the dataset structure, target outcome, and the main types of input features in 2-3 sentences.

\n", - "\n", - "

Key Characteristics

\n", - "
    \n", - "
  • Include 2-3 concise points about the most important characteristics of the dataset.
  • \n", - "
\n", - "\n", - "

Data Quality and Considerations

\n", - "
    \n", - "
  • Include 2-3 concise points about important quality observations, limitations, or considerations relevant to the dataset.
  • \n", - "
\n", - "\n", - "

Overall Assessment

\n", - "

End with a short balanced conclusion on the dataset's suitability for model development and evaluation.

\n", - "\"\"\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "fbf10ad9", - "metadata": {}, - "outputs": [], - "source": [ - "vm.run_text_generation(\n", - " content_id=\"dataset_summary_text\",\n", - " prompt=prompt,\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "99a0740e", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Pass section-specific context\n", - "\n", - "Then, we’ll control the `context` used for generation by passing a selected set of content IDs to `vm.run_text_generation()`. Rather than relying on the full document, this lets you focus the model on the most relevant parts of the documentation for a given text block. In practice, that means you can generate more targeted qualitative content by choosing which existing test and text blocks should inform the output." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "43cf0e7d", - "metadata": {}, - "outputs": [], - "source": [ - "vm.get_content_ids(\"data_description\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1e1a919e", - "metadata": {}, - "outputs": [], - "source": [ - "vm.run_text_generation(\n", - " content_id=\"dataset_summary_text\",\n", - " context={\"content_ids\": vm.get_content_ids(\"data_description\")},\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "701a0323", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Append a new text block to a section\n", - "\n", - "Sometimes you may want to generate text for a `content_id` that is not already defined in the template. In that case, you can still generate the text with `vm.run_text_generation()` and then use `.log(section_id=...)` to tell ValidMind where that new text block should be placed in the document. 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6a9ba924", - "metadata": {}, - "outputs": [], - "source": [ - "vm.run_text_generation(\n", - " content_id=\"intended_use\",\n", - " section_id=\"intended_use\",\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "6e032b79", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Generate text across the document\n", - "\n", - "At this stage, instead of generating one block at a time, we can populate multiple qualitative sections in a single pass.\n", - "\n", - "The [`vm.generate_documentation_text`](https://docs.validmind.ai/validmind/validmind.html#generate_documentation_text) function reads a configuration dictionary, generates content for each target block, logs the generated text to the ValidMind Platform, and returns a notebook summary grouped by section.\n", - "\n", - "- The function uses a `config` argument to describe which text blocks to generate and how each one should be handled.\n", - "- The `config` parameter is a dictionary with the following structure:\n", - "\n", - " ```python\n", - " config = {\n", - " \"\": {\n", - " \"section_id\": \"\",\n", - " \"prompt\": \"Optional custom prompt\",\n", - " \"context\": {\n", - " \"content_ids\": [\"\", \"\"]\n", - " }\n", - " },\n", - " ...\n", - " }\n", - " ```\n", - "\n", - " Each `` represents a documentation text block to populate. Use `section_id` when the block should be inserted into a specific section, `prompt` when you want to shape the output more explicitly, and `context.content_ids` when you want the generation step to focus on selected parts of the document. In this notebook, `text_config` comes from `customer_churn.get_demo_text_config()`, which provides the demo setup for the customer churn example." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a97bb129", - "metadata": {}, - "outputs": [], - "source": [ - "text_config = customer_churn.get_demo_text_config()\n", - "preview_test_config(text_config)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "aff42702", - "metadata": {}, - "outputs": [], - "source": [ - "results = vm.generate_documentation_text(config=text_config)" - ] - }, - { - "cell_type": "markdown", - "id": "03b6b875", - "metadata": {}, - "source": [ - "\n", - "\n", - "## In summary\n", - "\n", - "In this notebook, you learned how to:\n", - "\n", - "- [x] Build and document an example customer churn model with ValidMind\n", - "- [x] Run documentation tests to populate the quantitative sections of a model document\n", - "- [x] Generate qualitative text for a single documentation content block with `vm.run_text_generation()`\n", - "- [x] Customize generated output by passing a prompt\n", - "- [x] Control generation context by selecting specific sections of the document\n", - "- [x] Use a configuration-driven workflow to generate qualitative content across the document with `vm.generate_documentation_text()`" - ] - }, - { - "cell_type": "markdown", - "id": "3db3c328", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation." - ] - }, - { - "cell_type": "markdown", - "id": "d7bd8df8", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. 
In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))" - ] - }, - { - "cell_type": "markdown", - "id": "c0951457", - "metadata": {}, - "source": [ - "\n", - "\n", - "### Discover more learning resources\n", - "\n", - "For a more in-depth introduction to using the ValidMind Library for development, check out our introductory development series and the accompanying interactive training:\n", - "\n", - "- **[ValidMind for development](https://docs.validmind.ai/developer/validmind-library.html#development)**\n", - "- **[Developer Fundamentals](https://docs.validmind.ai/training/developer-fundamentals/developer-fundamentals-register.html)**\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "id": "24532182", - "metadata": {}, - "source": [ - "\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2e796c43", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "713a6722", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "84a65def", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-18d82030e09942c4953248e9bf432249", - "metadata": {}, - "source": [ - "\n", - "\n", - "\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.11" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "cells": [ + { + "cell_type": "markdown", + "id": "9a900020", + "metadata": {}, + "source": [ + "# Generate qualitative text with the ValidMind library\n", + "\n", + "This notebook shows how to generate qualitative documentation content directly from the ValidMind library using both `vm.run_text_generation()` and `vm.generate_documentation_text()`. Instead of switching to the UI to write text manually or trigger generation one section at a time, you can generate content for documentation text blocks programmatically from within a notebook and log it back to the corresponding sections of the model document.\n", + "\n", + "After building an example model and documenting its quantitative results, we’ll show how to generate text for individual content blocks, customize the output with prompts, control the context used for generation, and use a configuration-driven workflow to populate multiple qualitative sections across the document. By the end, you’ll have an end-to-end example of how quantitative test results and AI-generated qualitative content can work together to populate a full model document from Python, giving you a more automated documentation workflow directly in the library." 
+ ] + }, + { + "cell_type": "markdown", + "id": "cd48db57", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + "- [Getting to know ValidMind](#toc3__) \n", + " - [Preview the documentation template](#toc3_1__) \n", + " - [View model documentation in the ValidMind Platform](#toc3_2__) \n", + "- [Build the example model](#toc4__) \n", + " - [Import the sample dataset](#toc4_1__) \n", + " - [Preprocessing the raw dataset](#toc4_2__) \n", + " - [Training an XGBoost classifier model](#toc4_3__) \n", + "- [Initialize the ValidMind inputs](#toc5__) \n", + "- [Document test results](#toc6__) \n", + "- [Document qualitative sections](#toc7__) \n", + " - [Generate text for a single content block](#toc7_1__) \n", + " - [Customize the prompt](#toc7_2__) \n", + " - [Pass section-specific context](#toc7_3__) \n", + " - [Append a new text block to a section](#toc7_4__) \n", + " - [Generate text across the document](#toc7_5__) \n", + "- [In summary](#toc8__) \n", + "- [Next steps](#toc9__) \n", + " - [Work with your model documentation](#toc9_1__) \n", + " - [Discover more learning resources](#toc9_2__) \n", + "- [Upgrade ValidMind](#toc10__) \n", + "\n", + ":::\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "id": "a67217b3", + "metadata": {}, + "source": [ + "\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. 
\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between you and validators." + ] + }, + { + "cell_type": "markdown", + "id": "281cfb86", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook, but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ] + }, + { + "cell_type": "markdown", + "id": "51c11b52", + "metadata": {}, + "source": [ + "\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", + "

\n", + "Register with ValidMind
" + ] + }, + { + "cell_type": "markdown", + "id": "9103cd45", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ] + }, + { + "cell_type": "markdown", + "id": "23020a1b", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "id": "6202d6dc", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "
Recommended Python versions\n", + "

\n", + "Python 3.8 <= x <= 3.14
\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "045b05a6", + "metadata": {}, + "outputs": [], + "source": [ + "%pip install -q validmind" + ] + }, + { + "cell_type": "markdown", + "id": "b3231d8e", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "id": "56592217", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "id": "43ed3d0c", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. 
(**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "id": "9b9203be", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "690dc368", + "metadata": {}, + "outputs": [], + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " api_host=\"http://localhost:5000/api/v1/tracking\",\n", + " api_key=\"..\",\n", + " api_secret=\"..\",\n", + " document=\"documentation\", # requires library >=2.12.0\n", + " model=\"..\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "a68f6031", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Then, let's import the necessary libraries and set up your Python environment for data analysis:\n", + "\n", + "- Import **Extreme Gradient Boosting** (XGBoost) 
with an alias so that we can reference its functions in later calls. XGBoost is a powerful machine learning library designed for speed and performance, especially in handling structured or tabular data.\n", + "- Enable **`matplotlib`**, a plotting library used for visualizing data. This ensures that any plots you generate will render inline in the notebook output rather than opening in a separate window." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3fa2d9de", + "metadata": {}, + "outputs": [], + "source": [ + "%matplotlib inline\n", + "\n", + "import xgboost as xgb" + ] + }, + { + "cell_type": "markdown", + "id": "69a37995", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Getting to know ValidMind" + ] + }, + { + "cell_type": "markdown", + "id": "40c9eb24", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "62842e84", + "metadata": {}, + "outputs": [], + "source": [ + "vm.preview_template()" + ] + }, + { + "cell_type": "markdown", + "id": "6fab1c1c", + "metadata": {}, + "source": [ + "\n", + "\n", + "### View model documentation in the ValidMind Platform\n", + "\n", + "Next, let's head to the ValidMind Platform to see the template in action:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. 
In the left sidebar, navigate to **Inventory** and select the model you registered for this notebook.\n", + "\n", + "3. Click **Development** under Documents for your model and note how the structure of the documentation matches our preview above." + ] + }, + { + "cell_type": "markdown", + "id": "606d932b", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Build the example model" + ] + }, + { + "cell_type": "markdown", + "id": "3d7ad25a", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Import the sample dataset\n", + "\n", + "First, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle so that we have something to work with.\n", + "\n", + "In the example below, note that:\n", + "\n", + "- The target column, `Exited`, has a value of `1` when a customer has churned and `0` otherwise.\n", + "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. A Pandas DataFrame is a two-dimensional tabular data structure that makes use of rows and columns." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8ea8188e", + "metadata": {}, + "outputs": [], + "source": [ + "from validmind.datasets.classification import customer_churn\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", + ")\n", + "\n", + "raw_df = customer_churn.load_data()\n", + "raw_df.head()" + ] + }, + { + "cell_type": "markdown", + "id": "a5ceef72", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Preprocessing the raw dataset\n", + "\n", + "In this section, we preprocess the raw dataset so it is ready for model training and validation. 
This includes splitting the data into training, validation, and test subsets to support both model fitting and evaluation on unseen data, and then separating the training and validation subsets into input features and target labels so the model can learn from customer attributes and predict whether a customer churned." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9d2bec58", + "metadata": {}, + "outputs": [], + "source": [ + "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)\n", + "\n", + "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", + "y_train = train_df[customer_churn.target_column]\n", + "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", + "y_val = validation_df[customer_churn.target_column]" + ] + }, + { + "cell_type": "markdown", + "id": "3b9edacf", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Training an XGBoost classifier model\n", + "\n", + "In this section, we train an XGBoost classifier to predict customer churn, using early stopping to halt training if validation performance does not improve for 10 consecutive rounds, which avoids unnecessary fitting. We configure the model to evaluate performance with three complementary metrics: `error` for incorrect predictions, `logloss` for prediction confidence, and `auc` for class separation. The model is trained on the training split and evaluated against the validation split during fitting, while `verbose=False` keeps the training output concise." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "658447fc", + "metadata": {}, + "outputs": [], + "source": [ + "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", + "\n", + "model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "\n", + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "c2a6b492", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Initialize the ValidMind inputs\n", + "\n", + "We begin by registering the datasets and trained model as ValidMind inputs so they can be referenced consistently throughout the documentation workflow. For the datasets, this means creating ValidMind Dataset objects for the raw, training, and testing data, each with a unique `input_id` for traceability. Where needed, we also provide supporting metadata such as the target column and class labels so tests can interpret the data correctly." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "081548ae", + "metadata": {}, + "outputs": [], + "source": [ + "# Initialize the raw dataset\n", + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=raw_df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=customer_churn.target_column,\n", + " class_labels=customer_churn.class_labels,\n", + ")\n", + "\n", + "# Initialize the training dataset\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "\n", + "# Initialize the testing dataset\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " target_column=customer_churn.target_column\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "1ebfda19", + "metadata": {}, + "source": [ + "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6cc5aff8", + "metadata": {}, + "outputs": [], + "source": [ + "# Initialize the model\n", + "vm_model = vm.init_model(\n", + " model,\n", + " input_id=\"model\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "48d23cf8", + "metadata": {}, + 
"source": [ + "Finally, we assign predictions from the trained model to the training and testing datasets. The `assign_predictions()` method links predicted classes and probabilities to each dataset, and can also compute predictions automatically if they are not passed explicitly. This step is what allows ValidMind to run performance and diagnostic tests using the model outputs." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "922baa9d", + "metadata": {}, + "outputs": [], + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model,\n", + ")\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "7c9a174d", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Document test results\n", + "\n", + "In this section, we run the documentation tests defined by the applied template to populate the quantitative parts of the model documentation. The `vm.run_documentation_tests()` function discovers each test-driven block in the template, executes the corresponding tests, and uploads the resulting artifacts to the ValidMind Platform.\n", + "\n", + "To run the full suite successfully, ValidMind needs to know which model and dataset inputs should be used for each test. This can be done with a shared `inputs` argument when all tests use the same objects, or with a `config` dictionary when individual tests require specific inputs or parameters. In this example, we use the default test parameters and provide the input configuration needed for the demo model." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "47f7e709", + "metadata": {}, + "outputs": [], + "source": [ + "from validmind.utils import preview_test_config\n", + "\n", + "test_config = customer_churn.get_demo_test_config()\n", + "preview_test_config(test_config)" + ] + }, + { + "cell_type": "markdown", + "id": "3f22d37b", + "metadata": {}, + "source": [ + "Once the configuration is prepared, we pass it to `vm.run_documentation_tests()` and execute the full suite. The returned `full_suite` object contains the test results and represents the quantitative documentation that has been generated for the model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "999be7fe", + "metadata": {}, + "outputs": [], + "source": [ + "full_suite = vm.run_documentation_tests(config=test_config)" + ] + }, + { + "cell_type": "markdown", + "id": "5d531744", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Document qualitative sections\n", + "\n", + "In addition to documenting quantitative results through tests, ValidMind now supports programmatic generation of qualitative content for the text blocks in a model documentation template through `vm.run_text_generation()`. This function allows you to generate AI-assisted text for a specific content block directly from a notebook and then log it back to the corresponding section of the document. As a result, you can populate qualitative sections without switching to the UI to write text manually or trigger generation one section at a time.\n", + "\n", + "In the next sections, we’ll walk through the main ways to use this functionality. We’ll start by generating text for a single content block with the default behavior, then show how to customize the output with a prompt, how to control the context used for generation by selecting specific sections, and finally how to scale the same pattern across all text blocks in the document." 
+ ] + }, + { + "cell_type": "markdown", + "id": "899c8553", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Generate text for a single content block\n", + "\n", + "First, we’ll use `vm.run_text_generation()` to generate qualitative text for a single documentation block. By providing a `content_id`, you can target the exact text placeholder you want to populate and let ValidMind generate content using the current document context. The helper `vm.get_content_ids()` is useful for inspecting which content blocks are available in the active template, making it easier to identify the IDs you can use when generating and logging text programmatically." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "85cc552f", + "metadata": {}, + "outputs": [], + "source": [ + "vm.get_content_ids()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "26fcddf9", + "metadata": {}, + "outputs": [], + "source": [ + "vm.run_text_generation(\n", + " content_id=\"dataset_summary_text\",\n", + ").log()" + ] + }, + { + "cell_type": "markdown", + "id": "caff6490", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Customize the prompt\n", + "\n", + "Next, we’ll customize the generated output by passing a `prompt` to `vm.run_text_generation()`. This makes it possible to guide not just the subject of the generated text, but also its structure, tone, level of detail, and presentation format. In practice, this allows you to tailor the output for different documentation needs, such as producing a short narrative summary, a more structured section, or content written for a specific audience, while still relying on the same underlying document context for generation." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "52165b98", + "metadata": {}, + "outputs": [], + "source": [ + "prompt = \"\"\"\n", + "Use exactly this structure:\n", + "\n", + "

Dataset Overview

\n", + "

Explain in 1-2 sentences what the dataset contains and what it is used for.

\n", + "\n", + "

Dataset Summary

\n", + "

Summarize the dataset structure, target outcome, and the main types of input features in 2-3 sentences.

\n", + "\n", + "

Key Characteristics

\n", + "
    \n", + "
  • Include 2-3 concise points about the most important characteristics of the dataset.
  • \n", + "
\n", + "\n", + "

Data Quality and Considerations

\n", + "
    \n", + "
  • Include 2-3 concise points about important quality observations, limitations, or considerations relevant to the dataset.
  • \n", + "
\n", + "\n", + "

Overall Assessment

\n", + "

End with a short balanced conclusion on the dataset's suitability for model development and evaluation.

\n", + "\"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fbf10ad9", + "metadata": {}, + "outputs": [], + "source": [ + "vm.run_text_generation(\n", + " content_id=\"dataset_summary_text\",\n", + " prompt=prompt,\n", + ").log()" + ] + }, + { + "cell_type": "markdown", + "id": "99a0740e", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Pass section-specific context\n", + "\n", + "Then, we’ll control the `context` used for generation by passing a selected set of content IDs to `vm.run_text_generation()`. Rather than relying on the full document, this lets you focus the model on the most relevant parts of the documentation for a given text block. In practice, that means you can generate more targeted qualitative content by choosing which existing test and text blocks should inform the output." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "43cf0e7d", + "metadata": {}, + "outputs": [], + "source": [ + "vm.get_content_ids(\"data_description\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1e1a919e", + "metadata": {}, + "outputs": [], + "source": [ + "vm.run_text_generation(\n", + " content_id=\"dataset_summary_text\",\n", + " context={\"content_ids\": vm.get_content_ids(\"data_description\")},\n", + ").log()" + ] + }, + { + "cell_type": "markdown", + "id": "701a0323", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Append a new text block to a section\n", + "\n", + "Sometimes you may want to generate text for a `content_id` that is not already defined in the template. In that case, you can still generate the text with `vm.run_text_generation()` and then use `.log(section_id=...)` to tell ValidMind where that new text block should be placed in the document. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6a9ba924", + "metadata": {}, + "outputs": [], + "source": [ + "vm.run_text_generation(\n", + " content_id=\"intended_use\",\n", + " section_id=\"intended_use\",\n", + ").log()" + ] + }, + { + "cell_type": "markdown", + "id": "6e032b79", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Generate text across the document\n", + "\n", + "At this stage, instead of generating one block at a time, we can populate multiple qualitative sections in a single pass.\n", + "\n", + "The [`vm.generate_documentation_text`](https://docs.validmind.ai/validmind/validmind.html#generate_documentation_text) function reads a configuration dictionary, generates content for each target block, logs the generated text to the ValidMind Platform, and returns a notebook summary grouped by section.\n", + "\n", + "- The function uses a `config` argument to describe which text blocks to generate and how each one should be handled.\n", + "- The `config` parameter is a dictionary with the following structure:\n", + "\n", + " ```python\n", + " config = {\n", + " \"\": {\n", + " \"section_id\": \"\",\n", + " \"prompt\": \"Optional custom prompt\",\n", + " \"context\": {\n", + " \"content_ids\": [\"\", \"\"]\n", + " }\n", + " },\n", + " ...\n", + " }\n", + " ```\n", + "\n", + " Each `` represents a documentation text block to populate. Use `section_id` when the block should be inserted into a specific section, `prompt` when you want to shape the output more explicitly, and `context.content_ids` when you want the generation step to focus on selected parts of the document. In this notebook, `text_config` comes from `customer_churn.get_demo_text_config()`, which provides the demo setup for the customer churn example." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a97bb129", + "metadata": {}, + "outputs": [], + "source": [ + "text_config = customer_churn.get_demo_text_config()\n", + "preview_test_config(text_config)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "aff42702", + "metadata": {}, + "outputs": [], + "source": [ + "results = vm.generate_documentation_text(config=text_config)" + ] + }, + { + "cell_type": "markdown", + "id": "03b6b875", + "metadata": {}, + "source": [ + "\n", + "\n", + "## In summary\n", + "\n", + "In this notebook, you learned how to:\n", + "\n", + "- [x] Build and document an example customer churn model with ValidMind\n", + "- [x] Run documentation tests to populate the quantitative sections of a model document\n", + "- [x] Generate qualitative text for a single documentation content block with `vm.run_text_generation()`\n", + "- [x] Customize generated output by passing a prompt\n", + "- [x] Control generation context by selecting specific sections of the document\n", + "- [x] Use a configuration-driven workflow to generate qualitative content across the document with `vm.generate_documentation_text()`" + ] + }, + { + "cell_type": "markdown", + "id": "3db3c328", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation." + ] + }, + { + "cell_type": "markdown", + "id": "d7bd8df8", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. 
In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))" + ] + }, + { + "cell_type": "markdown", + "id": "c0951457", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Discover more learning resources\n", + "\n", + "For a more in-depth introduction to using the ValidMind Library for development, check out our introductory development series and the accompanying interactive training:\n", + "\n", + "- **[ValidMind for development](https://docs.validmind.ai/developer/validmind-library.html#development)**\n", + "- **[Developer Fundamentals](https://docs.validmind.ai/training/developer-fundamentals/developer-fundamentals-register.html)**\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "id": "24532182", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "
After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.
\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2e796c43", + "metadata": {}, + "outputs": [], + "source": [ + "%pip show validmind" + ] + }, + { + "cell_type": "markdown", + "id": "713a6722", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "84a65def", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "id": "copyright-18d82030e09942c4953248e9bf432249", + "metadata": {}, + "source": [ + "\n", + "\n", + "\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.
\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.
\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial
" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.11" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/site/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb b/site/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb index cd8af2d278..38ad4a3086 100644 --- a/site/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb +++ b/site/notebooks/how_to/tests/custom_tests/implement_custom_tests.ipynb @@ -1,1107 +1,1113 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Implement custom tests\n", - "\n", - "Custom tests extend the functionality of ValidMind, allowing you to document any model or use case with added flexibility.\n", - "\n", - "ValidMind provides a comprehensive set of tests out-of-the-box to evaluate and document your models and datasets. We recognize there will be cases where the default tests do not support a model or dataset, or specific documentation is needed. In these cases, you can create and use your own custom code to accomplish what you need. To streamline custom code integration, we support the creation of custom test functions.\n", - "\n", - "This interactive notebook provides a step-by-step guide for implementing and registering custom tests with ValidMind, running them individually, viewing the results on the ValidMind Platform, and incorporating them into your model documentation template." 
- ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Implement custom tests\n", + "\n", + "Custom tests extend the functionality of ValidMind, allowing you to document any model or use case with added flexibility.\n", + "\n", + "ValidMind provides a comprehensive set of tests out-of-the-box to evaluate and document your models and datasets. We recognize there will be cases where the default tests do not support a model or dataset, or specific documentation is needed. In these cases, you can create and use your own custom code to accomplish what you need. To streamline custom code integration, we support the creation of custom test functions.\n", + "\n", + "This interactive notebook provides a step-by-step guide for implementing and registering custom tests with ValidMind, running them individually, viewing the results on the ValidMind Platform, and incorporating them into your model documentation template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + "- [Implement a Custom Test](#toc3__) \n", + "- [Run the Custom Test](#toc4__) \n", + " - [Setup the Model and Dataset](#toc4_1__) \n", + " - [Run the Custom Test](#toc4_2__) \n", + "- [Adding Custom Test to Model Documentation](#toc5__) \n", + "- [Some More Custom Tests](#toc6__) \n", + " - [Custom Test: Table of Model Hyperparameters](#toc6_1__) \n", + " - [Custom Test: External API Call](#toc6_2__) \n", + " - [Custom Test: Passing 
Parameters](#toc6_3__) \n", + " - [Custom Test: Multiple Tables and Plots in a Single Test](#toc6_4__) \n", + " - [Custom Test: Images](#toc6_5__) \n", + " - [Custom Test: Description](#toc6_6__) \n", + "- [Conclusion](#toc7__) \n", + "- [Next steps](#toc8__) \n", + " - [Work with your model documentation](#toc8_1__) \n", + " - [Discover more learning resources](#toc8_2__) \n", + "- [Upgrade ValidMind](#toc9__) \n", + "\n", + ":::\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "
For access to all features available in this notebook, you'll need access to a ValidMind account.\n", + "

\n", + "Register with ValidMind
\n", + "\n", + "\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. 
In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. 
On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Implement a Custom Test\n", + "\n", + "Let's start off by creating a simple custom test that creates a Confusion Matrix for a binary classification model. We will use the `sklearn.metrics.confusion_matrix` function to calculate the confusion matrix and then display it as a heatmap using scikit-learn's `ConfusionMatrixDisplay`, which plots with `matplotlib`. 
(This is already a built-in test in ValidMind, but we will use it as an example to demonstrate how to create custom tests.)" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import matplotlib.pyplot as plt\n", + "from sklearn import metrics\n", + "\n", + "\n", + "@vm.test(\"my_custom_tests.ConfusionMatrix\")\n", + "def confusion_matrix(dataset, model):\n", + " \"\"\"The confusion matrix is a table that is often used to describe the performance of a classification model on a set of data for which the true values are known.\n", + "\n", + " The confusion matrix is a 2x2 table that contains 4 values:\n", + "\n", + " - True Positive (TP): the number of correct positive predictions\n", + " - True Negative (TN): the number of correct negative predictions\n", + " - False Positive (FP): the number of incorrect positive predictions\n", + " - False Negative (FN): the number of incorrect negative predictions\n", + "\n", + " The confusion matrix can be used to assess the holistic performance of a classification model by showing the accuracy, precision, recall, and F1 score of the model on a single figure.\n", + " \"\"\"\n", + " y_true = dataset.y\n", + " y_pred = dataset.y_pred(model)\n", + "\n", + " confusion_matrix = metrics.confusion_matrix(y_true, y_pred)\n", + "\n", + " cm_display = metrics.ConfusionMatrixDisplay(\n", + " confusion_matrix=confusion_matrix, display_labels=[False, True]\n", + " )\n", + " cm_display.plot()\n", + "\n", + " plt.close() # close the plot to avoid displaying it\n", + "\n", + " return cm_display.figure_ # return the figure object itself" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "That's our custom test defined and ready to go. Let's take a look at what's going on here:\n", + "\n", + "- The function `confusion_matrix` takes two arguments, `dataset` and `model`. 
These are `VMDataset` and `VMModel` objects, respectively.\n", + "- The function docstring provides a description of what the test does. This will be displayed along with the result in this notebook as well as in the ValidMind Platform.\n", + "- The function body calculates the confusion matrix using the `sklearn.metrics.confusion_matrix` function and then plots it using `sklearn.metrics.ConfusionMatrixDisplay`.\n", + "- The function then returns the `ConfusionMatrixDisplay.figure_` object. This is important because the ValidMind Library expects the output of the custom test to be a plot or a table.\n", + "- The `@vm.test` decorator creates a wrapper around the function that allows it to be run by the ValidMind Library. It also registers the test so it can be found by the ID `my_custom_tests.ConfusionMatrix` (see the section below on how test IDs work in ValidMind and why this format is important)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Run the Custom Test\n", + "\n", + "Now that we have defined and registered our custom test, let's see how we can run it and properly use it in the ValidMind Platform." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Setup the Model and Dataset\n", + "\n", + "First, let's set up an example model and dataset to run our custom test against. Since this is a Confusion Matrix, we will use the Customer Churn dataset that ValidMind provides and train a simple XGBoost model."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "from validmind.datasets.classification import customer_churn\n", + "\n", + "raw_df = customer_churn.load_data()\n", + "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)\n", + "\n", + "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", + "y_train = train_df[customer_churn.target_column]\n", + "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", + "y_val = validation_df[customer_churn.target_column]\n", + "\n", + "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", + "model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Easy enough! Now we have a model and dataset setup and trained. One last thing to do is bring the dataset and model into the ValidMind Library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# for now, we'll just use the test dataset\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " target_column=customer_churn.target_column,\n", + " input_id=\"test_dataset\",\n", + ")\n", + "\n", + "vm_model = vm.init_model(model, input_id=\"model\")\n", + "\n", + "# link the model to the dataset\n", + "vm_test_ds.assign_predictions(model=vm_model)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Run the Custom Test\n", + "\n", + "Now that we have our model and dataset setup, we have everything we need to run our custom test. 
We can do this by importing the `run_test` function from the `validmind.tests` module and passing in the test ID of our custom test along with the model and dataset we want to run it against.\n", + "\n", + ">Notice how the `inputs` dictionary is used to map an `input_id` which we set above to the `model` and `dataset` keys that are expected by our custom test function. This is how the ValidMind Library knows which inputs to pass to different tests and is key when using many different datasets and models." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.tests import run_test\n", + "\n", + "result = run_test(\n", + " \"my_custom_tests.ConfusionMatrix\",\n", + " inputs={\"model\": \"model\", \"dataset\": \"test_dataset\"},\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You'll notice that the docstring becomes a markdown description of the test. The figure is then displayed as the test result. What you see above is how it will look in the ValidMind Platform as well. Let's go ahead and log the result to see how that works." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Adding Custom Test to Model Documentation\n", + "\n", + "To do this, go to the documentation page of the model you registered above and navigate to the `Model Development` -> `Model Evaluation` section. Then hover between any existing content block to reveal the `+` button as shown in the screenshot below.\n", + "\n", + "![screenshot showing insert button for test-driven blocks](./insert-test-driven-block.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now click on the `+` button and select the `Test-Driven Block` option. 
This will open a dialog where you can select `My Custom Tests Confusion Matrix` from the list of available tests. You can preview the result and then click `Insert Block` to add it to the documentation.\n", + "\n", + "![screenshot showing how to insert a test-driven block](./insert-test-driven-block-custom.png)\n", + "\n", + "The test should match the result you see above. It is now part of your documentation and will run every time you call `vm.run_documentation_tests()` for your model. Let's do that now." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.reload()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If you preview the template, it should show the custom test in the `Model Development`->`Model Evaluation` section:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Just so we can run all of the tests in the template, let's initialize the training and raw datasets.\n", + "\n", + "(Refer to [**Quickstart for documentation**](../../../quickstart/quickstart_documentation.ipynb) and the ValidMind docs for more information on what we are doing here)" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=raw_df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=customer_churn.target_column,\n", + " class_labels=customer_churn.class_labels,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "vm_train_ds.assign_predictions(model=vm_model)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To run all the tests in the template, you can use 
`vm.run_documentation_tests()` and pass the inputs we initialized above and the demo config from our `customer_churn` module. We will have to add a section to the config for our new test to tell it which inputs it should receive. This is done by adding a new element to the config dictionary, where the key is the ID of the test and the value is a dictionary with the following structure:\n", + "```python\n", + "{\n", + " \"inputs\": {\n", + " \"dataset\": \"test_dataset\",\n", + " \"model\": \"model\",\n", + " }\n", + "}\n", + "```" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.utils import preview_test_config\n", + "\n", + "test_config = customer_churn.get_demo_test_config()\n", + "test_config[\"my_custom_tests.ConfusionMatrix\"] = {\n", + " \"inputs\": {\n", + " \"dataset\": \"test_dataset\",\n", + " \"model\": \"model\",\n", + " }\n", + "}\n", + "preview_test_config(test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.run_documentation_tests(config=test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "## Some More Custom Tests\n", + "\n", + "Now that you understand the entire process of creating custom tests and using them in your documentation, let's create a few more to see different ways you can utilize custom tests."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Custom Test: Table of Model Hyperparameters\n", + "\n", + "This custom test will display a table of the hyperparameters used in the model:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.Hyperparameters\")\n", + "def hyperparameters(model):\n", + " \"\"\"The hyperparameters of a machine learning model are the settings that control the learning process.\n", + " These settings are specified before the learning process begins and can have a significant impact on the\n", + " performance of the model.\n", + "\n", + " The hyperparameters of a model can be used to tune the model to achieve the best possible performance\n", + " on a given dataset. By examining the hyperparameters of a model, you can gain insight into how the model\n", + " was trained and how it might be improved.\n", + " \"\"\"\n", + " hyperparameters = model.model.get_xgb_params() # dictionary of hyperparameters\n", + "\n", + " # turn the dictionary into a table where each row contains a hyperparameter and its value\n", + " return [{\"Hyperparam\": k, \"Value\": v} for k, v in hyperparameters.items() if v]\n", + "\n", + "\n", + "result = run_test(\"my_custom_tests.Hyperparameters\", inputs={\"model\": \"model\"})\n", + "result.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since the test has been run and logged, you can add it to your documentation using the same process as above. It should look like this:\n", + "\n", + "![screenshot showing hyperparameters test](./hyperparameters-custom-metric.png)\n", + "\n", + "For our simple toy model, there aren't really any proper hyperparameters, but you can see how this could be useful for more complex models that have gone through hyperparameter tuning."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Custom Test: External API Call\n", + "\n", + "This custom test will make an external API call to get a list of fake users and display it as a set of tables. This demonstrates how you might integrate external data sources into your model documentation in a programmatic way. You could, for instance, set up a pipeline that runs a test like this every day to keep your model documentation in sync with an external system." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import requests\n", + "import random\n", + "\n", + "\n", + "@vm.test(\"my_custom_tests.ExternalAPI\")\n", + "def external_api():\n", + " \"\"\"This test calls an external API to get a list of fake users. It then creates\n", + " a table with the relevant data so it can be displayed in the documentation.\n", + "\n", + " The purpose of this test is to demonstrate how to call an external API and use the\n", + " data in a test. 
A test like this could even be set up to run in a scheduled\n", + " pipeline to keep your documentation in sync with an external data source.\n", + " \"\"\"\n", + " url = \"https://jsonplaceholder.typicode.com/users\"\n", + " response = requests.get(url)\n", + " data = response.json()\n", + "\n", + " # group the fake users into owner, developer, and validator tables\n", + " return {\n", + " \"Model Owners/Stakeholders\": [\n", + " {\n", + " \"Name\": user[\"name\"],\n", + " \"Role\": random.choice([\"Owner\", \"Stakeholder\"]),\n", + " \"Email\": user[\"email\"],\n", + " \"Phone\": user[\"phone\"],\n", + " \"Slack Handle\": f\"@{user['name'].lower().replace(' ', '.')}\",\n", + " }\n", + " for user in data[:3]\n", + " ],\n", + " \"Model Developers\": [\n", + " {\n", + " \"Name\": user[\"name\"],\n", + " \"Role\": \"Developer\",\n", + " \"Email\": user[\"email\"],\n", + " }\n", + " for user in data[3:7]\n", + " ],\n", + " \"Model Validators\": [\n", + " {\n", + " \"Name\": user[\"name\"],\n", + " \"Role\": \"Validator\",\n", + " \"Email\": user[\"email\"],\n", + " }\n", + " for user in data[7:]\n", + " ],\n", + " }\n", + "\n", + "\n", + "result = run_test(\"my_custom_tests.ExternalAPI\")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Again, you can add this to your documentation to see how it looks:\n", + "\n", + "![screenshot showing external API test](./external-data-custom-test.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Custom Test: Passing Parameters\n", + "\n", + "Custom test functions, as stated earlier, can take both inputs and params. When you define your function, there is no need to distinguish between the two; the ValidMind Library will handle that for you. 
You simply need to add both to the function as arguments and the library will pass in the correct values.\n", + "\n", + "So for instance, if you wanted to parameterize the first custom test we created, the confusion matrix, you could do so like this:\n", + "\n", + "```python\n", + "def confusion_matrix(dataset: VMDataset, model: VMModel, my_param: str = \"Default Value\"):\n", + " pass\n", + "```\n", + "\n", + "And then when you run the test, you can pass in the parameter like this:\n", + "\n", + "```python\n", + "run_test(\n", + " \"my_custom_tests.ConfusionMatrix\",\n", + " inputs={\"model\": \"model\", \"dataset\": \"test_dataset\"},\n", + " params={\"my_param\": \"My Value\"},\n", + ")\n", + "```\n", + "\n", + "Or if you are running the entire documentation template, you would update the config like this:\n", + "\n", + "```python\n", + "test_config[\"my_custom_tests.ConfusionMatrix\"] = {\n", + " \"inputs\": {\n", + " \"dataset\": \"test_dataset\",\n", + " \"model\": \"model\",\n", + " },\n", + " \"params\": {\n", + " \"my_param\": \"My Value\",\n", + " },\n", + "}\n", + "```\n", + "\n", + "Let's go ahead and create a toy test that takes a parameter and uses it in the result:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import plotly.express as px\n", + "\n", + "\n", + "@vm.test(\"my_custom_tests.ParameterExample\")\n", + "def parameter_example(\n", + " plot_title=\"Default Plot Title\", x_col=\"sepal_width\", y_col=\"sepal_length\"\n", + "):\n", + " \"\"\"This test takes three parameters and creates a scatter plot based on them.\n", + "\n", + " The purpose of this test is to demonstrate how to create a test that takes\n", + " parameters and uses them to generate a plot. 
This can be useful for creating\n", + " tests that are more flexible and can be used in a variety of scenarios.\n", + " \"\"\"\n", + " # return px.scatter(px.data.iris(), x=x_col, y=y_col, color=\"species\")\n", + " return px.scatter(\n", + " px.data.iris(), x=x_col, y=y_col, color=\"species\", title=plot_title\n", + " )\n", + "\n", + "\n", + "result = run_test(\n", + " \"my_custom_tests.ParameterExample\",\n", + " params={\n", + " \"plot_title\": \"My Cool Plot\",\n", + " \"x_col\": \"sepal_width\",\n", + " \"y_col\": \"sepal_length\",\n", + " },\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Play around with this and see how you can use parameters, default values and other features to make your custom tests more flexible and useful.\n", + "\n", + "Here's how this one looks in the documentation:\n", + "![screenshot showing parameterized test](./parameterized-custom-metric.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "\n", + "### Custom Test: Multiple Tables and Plots in a Single Test\n", + "\n", + "Custom test functions, as stated earlier, can return more than just one table or plot. In fact, any number of tables and plots can be returned. 
Let's see an example of this:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import numpy as np\n", + "import plotly.express as px\n", + "\n", + "\n", + "@vm.test(\"my_custom_tests.ComplexOutput\")\n", + "def complex_output():\n", + " \"\"\"This test demonstrates how to return many tables and figures in a single test\"\"\"\n", + " # create a couple tables\n", + " table = [{\"A\": 1, \"B\": 2}, {\"A\": 3, \"B\": 4}]\n", + " table2 = [{\"C\": 5, \"D\": 6}, {\"C\": 7, \"D\": 8}]\n", + "\n", + " # create a few figures showing some random data\n", + " fig1 = px.line(x=np.arange(10), y=np.random.rand(10), title=\"Random Line Plot\")\n", + " fig2 = px.bar(x=[\"A\", \"B\", \"C\"], y=np.random.rand(3), title=\"Random Bar Plot\")\n", + " fig3 = px.scatter(\n", + " x=np.random.rand(10), y=np.random.rand(10), title=\"Random Scatter Plot\"\n", + " )\n", + "\n", + " return (\n", + " {\n", + " \"My Cool Table\": table,\n", + " \"Another Table\": table2,\n", + " },\n", + " {\n", + " # Figures support the same dict-of-titles convention as tables.\n", + " # These titles flow into the document media registry as\n", + " # \"Figure N. \" alongside table captions.\n", + " \"Random Line Plot\": fig1,\n", + " \"Random Bar Plot\": fig2,\n", + " \"Random Scatter Plot\": fig3,\n", + " },\n", + " )\n", + "\n", + "\n", + "result = run_test(\"my_custom_tests.ComplexOutput\")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Notice how you can return the tables as a dictionary where the key is the title of the table and the value is the table itself. The same convention works for **figures** — wrap them in a dict whose keys are the titles you want shown in the document media registry (e.g. *Figure 7. Random Line Plot*). 
You could also just return the figures by themselves but this way you can give them a title to more easily identify them in the result.\n", + "\n", + "![screenshot showing multiple tables and plots](./multiple-tables-plots-custom-metric.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_5__'></a>\n", + "\n", + "### Custom Test: Images\n", + "\n", + "If you are using a plotting library that isn't supported by ValidMind (i.e. not `matplotlib` or `plotly`), you can still return the image directly as a bytes-like object. This could also be used to bring any type of image into your documentation in a programmatic way. For instance, you may want to include a diagram of your model architecture or a screenshot of a dashboard that your model is integrated with. As long as you can produce the image with Python or open it from a file, you can include it in your documentation." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import io\n", + "import matplotlib.pyplot as plt\n", + "\n", + "\n", + "@vm.test(\"my_custom_tests.Image\")\n", + "def image():\n", + " \"\"\"This test demonstrates how to return an image in a test\"\"\"\n", + "\n", + " # create a simple plot\n", + " fig, ax = plt.subplots()\n", + " ax.plot([1, 2, 3, 4])\n", + " ax.set_title(\"Simple Line Plot\")\n", + "\n", + " # save the plot as a PNG image (in-memory buffer)\n", + " img_data = io.BytesIO()\n", + " fig.savefig(img_data, format=\"png\")\n", + " img_data.seek(0)\n", + "\n", + " plt.close() # close the plot to avoid displaying it\n", + "\n", + " return img_data.read()\n", + "\n", + "\n", + "result = run_test(\"my_custom_tests.Image\")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Adding this custom test to your documentation will display the image:\n", + "\n", + "![screenshot showing image custom test](./image-in-custom-metric.png)" + ] + }, + { + 
"cell_type": "markdown", + "metadata": {}, + "source": [ + "If you want to log an image as a test result, you can do so by passing the path to the image as a parameter to the custom test and then opening the file in the test function. Here's an example:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.MyPNGCorrelationMatrix\")\n", + "def Image(path: str):\n", + " \"\"\"Opens a png image file and logs it as a test result to ValidMind\"\"\"\n", + " if not path.endswith(\".png\"):\n", + " raise ValueError(\"Image must be a PNG file\")\n", + "\n", + " # return raw image bytes\n", + " with open(path, \"rb\") as f:\n", + " return f.read()\n", + " \n", + "run_test(\n", + " \"my_custom_tests.MyPNGCorrelationMatrix\",\n", + " params={\"path\": \"./pearson-correlation-matrix.png\"},\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The image is displayed in the test result:\n", + "\n", + "![screenshot showing image from file](./pearson-correlation-matrix-test-output.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_6__'></a>\n", + "\n", + "### Custom Test: Description\n", + "\n", + "If you want to write a custom test description for your custom test instead of it is interpreted through llm, you can do so by returning string in your test." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "\n", + "@vm.test(\"my_custom_tests.MyCustomTest\")\n", + "def my_custom_test(dataset, model):\n", + " \"\"\"\n", + " This custom test computes the confusion matrix for a binary classification model and returns a string as the test description.\n", + " \"\"\"\n", + " y_true = dataset.y\n", + " y_pred = dataset.y_pred(model)\n", + "\n", + " confusion_matrix = metrics.confusion_matrix(y_true, y_pred)\n", + "\n", + " cm_display = metrics.ConfusionMatrixDisplay(\n", + " confusion_matrix=confusion_matrix, display_labels=[False, True]\n", + " )\n", + " cm_display.plot()\n", + "\n", + " plt.close() # close the plot to avoid displaying it\n", + "\n", + " return cm_display.figure_, \"Test Description - Confusion Matrix\", pd.DataFrame({\"Value\": [1, 2, 3]}) # return the figure, a description string, and a table\n", + "\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "When you run the test below, you can see that the result description has been customized. The same description will be displayed in the UI." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.MyCustomTest\",\n", + " inputs={\"model\": \"model\", \"dataset\": \"test_dataset\"},\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Conclusion\n", + "\n", + "In this notebook, we have demonstrated how to create custom tests in ValidMind. We have shown how to define custom test functions, register them with the ValidMind Library, run them against models and datasets, and add them to model documentation templates. We have also shown how to return tables and plots from custom tests and how to use them in the ValidMind Platform. 
We hope this tutorial has been helpful in understanding how to create and use custom tests in ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc8_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc8_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-bcdac57ebb8d440f86ba120ee6511db3" + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.5" + } }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - "- [Implement a Custom Test](#toc3__) \n", - "- [Run the Custom Test](#toc4__) \n", - " - [Setup the Model and Dataset](#toc4_1__) \n", - " - [Run the Custom Test](#toc4_2__) \n", - "- [Adding Custom Test to Model Documentation](#toc5__) \n", - "- [Some More Custom Tests](#toc6__) \n", - " - [Custom Test: Table of Model Hyperparameters](#toc6_1__) \n", - " - [Custom Test: External API Call](#toc6_2__) \n", - " - [Custom Test: Passing Parameters](#toc6_3__) \n", - " - [Custom Test: Multiple Tables and Plots in a Single Test](#toc6_4__) \n", - " - [Custom Test: Images](#toc6_5__) \n", - " - [Custom Test: Description](#toc6_6__) \n", - "- [Conclusion](#toc7__) \n", - "- [Next steps](#toc8__) \n", - " - [Work with your model 
documentation](#toc8_1__) \n", - " - [Discover more learning resources](#toc8_2__) \n", - "- [Upgrade ValidMind](#toc9__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model\u2019s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. 
These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom test can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Implement a Custom Test\n", - "\n", - "Let's start off by creating a simple custom test that creates a Confusion Matrix for a binary classification model. We will use the `sklearn.metrics.confusion_matrix` function to calculate the confusion matrix and then display it as a heatmap using `sklearn.metrics.ConfusionMatrixDisplay`, which plots with `matplotlib`. 
(This is already a built-in test in ValidMind, but we will use it as an example to demonstrate how to create custom tests.)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import matplotlib.pyplot as plt\n", - "from sklearn import metrics\n", - "\n", - "\n", - "@vm.test(\"my_custom_tests.ConfusionMatrix\")\n", - "def confusion_matrix(dataset, model):\n", - " \"\"\"The confusion matrix is a table that is often used to describe the performance of a classification model on a set of data for which the true values are known.\n", - "\n", - " The confusion matrix is a 2x2 table that contains 4 values:\n", - "\n", - " - True Positive (TP): the number of correct positive predictions\n", - " - True Negative (TN): the number of correct negative predictions\n", - " - False Positive (FP): the number of incorrect positive predictions\n", - " - False Negative (FN): the number of incorrect negative predictions\n", - "\n", - " The confusion matrix can be used to assess the holistic performance of a classification model by showing the accuracy, precision, recall, and F1 score of the model on a single figure.\n", - " \"\"\"\n", - " y_true = dataset.y\n", - " y_pred = dataset.y_pred(model)\n", - "\n", - " confusion_matrix = metrics.confusion_matrix(y_true, y_pred)\n", - "\n", - " cm_display = metrics.ConfusionMatrixDisplay(\n", - " confusion_matrix=confusion_matrix, display_labels=[False, True]\n", - " )\n", - " cm_display.plot()\n", - "\n", - " plt.close() # close the plot to avoid displaying it\n", - "\n", - " return cm_display.figure_ # return the figure object itself" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Thats our custom test defined and ready to go... Let's take a look at whats going on here:\n", - "\n", - "- The function `confusion_matrix` takes two arguments `dataset` and `model`. 
These are `VMDataset` and `VMModel` objects, respectively.\n", - "- The function docstring provides a description of what the test does. This will be displayed along with the result in this notebook as well as in the ValidMind Platform.\n", - "- The function body calculates the confusion matrix using the `sklearn.metrics.confusion_matrix` function and then plots it using `sklearn.metrics.ConfusionMatrixDisplay`.\n", - "- The function then returns the `ConfusionMatrixDisplay.figure_` object - this is important as the ValidMind Library expects the output of the custom test to be a plot or a table.\n", - "- The `@vm.test` decorator creates a wrapper around the function that allows it to be run by the ValidMind Library. It also registers the test so it can be found by the ID `my_custom_tests.ConfusionMatrix` (see the section below on how test IDs work in ValidMind and why this format is important)." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Run the Custom Test\n", - "\n", - "Now that we have defined and registered our custom test, let's see how we can run it and properly use it in the ValidMind Platform." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Setup the Model and Dataset\n", - "\n", - "First, let's set up an example model and dataset to run our custom test against. Since this is a Confusion Matrix, we will use the Customer Churn dataset that ValidMind provides and train a simple XGBoost model." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "from validmind.datasets.classification import customer_churn\n", - "\n", - "raw_df = customer_churn.load_data()\n", - "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)\n", - "\n", - "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", - "y_train = train_df[customer_churn.target_column]\n", - "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", - "y_val = validation_df[customer_churn.target_column]\n", - "\n", - "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", - "model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "model.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Easy enough! Now we have a model and dataset setup and trained. One last thing to do is bring the dataset and model into the ValidMind Library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# for now, we'll just use the test dataset\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " target_column=customer_churn.target_column,\n", - " input_id=\"test_dataset\",\n", - ")\n", - "\n", - "vm_model = vm.init_model(model, input_id=\"model\")\n", - "\n", - "# link the model to the dataset\n", - "vm_test_ds.assign_predictions(model=vm_model)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Run the Custom Test\n", - "\n", - "Now that we have our model and dataset setup, we have everything we need to run our custom test. 
We can do this by importing the `run_test` function from the `validmind.tests` module and passing in the test ID of our custom test along with the model and dataset we want to run it against.\n", - "\n", - ">Notice how the `inputs` dictionary is used to map an `input_id` which we set above to the `model` and `dataset` keys that are expected by our custom test function. This is how the ValidMind Library knows which inputs to pass to different tests and is key when using many different datasets and models." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.tests import run_test\n", - "\n", - "result = run_test(\n", - " \"my_custom_tests.ConfusionMatrix\",\n", - " inputs={\"model\": \"model\", \"dataset\": \"test_dataset\"},\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You'll notice that the docstring becomes a markdown description of the test. The figure is then displayed as the test result. What you see above is how it will look in the ValidMind Platform as well. Let's go ahead and log the result to see how that works." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Adding Custom Test to Model Documentation\n", - "\n", - "To do this, go to the documentation page of the model you registered above and navigate to the `Model Development` -> `Model Evaluation` section. Then hover between any existing content block to reveal the `+` button as shown in the screenshot below.\n", - "\n", - "![screenshot showing insert button for test-driven blocks](./insert-test-driven-block.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now click on the `+` button and select the `Test-Driven Block` option. 
This will open a dialog where you can select `My Custom Tests Confusion Matrix` from the list of available tests. You can preview the result and then click `Insert Block` to add it to the documentation.\n", - "\n", - "![screenshot showing how to insert a test-driven block](./insert-test-driven-block-custom.png)\n", - "\n", - "The test should match the result you see above. It is now part of your documentation and will run every time you run `vm.run_documentation_tests()` for your model. Let's do that now." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.reload()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If you preview the template, it should show the custom test in the `Model Development`->`Model Evaluation` section:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Just so we can run all of the tests in the template, let's initialize the training and raw datasets.\n", - "\n", - "(Refer to [**Quickstart for documentation**](../../../quickstart/quickstart_documentation.ipynb) and the ValidMind docs for more information on what we are doing here.)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=raw_df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=customer_churn.target_column,\n", - " class_labels=customer_churn.class_labels,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "vm_train_ds.assign_predictions(model=vm_model)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To run all the tests in the template, you can use 
the `vm.run_documentation_tests()` function and pass the inputs we initialized above and the demo config from our `customer_churn` module. We will have to add a section to the config for our new test to tell it which inputs it should receive. This is done by adding a new element to the config dictionary where the key is the ID of the test and the value is a dictionary with the following structure:\n", - "```python\n", - "{\n", - " \"inputs\": {\n", - " \"dataset\": \"test_dataset\",\n", - " \"model\": \"model\",\n", - " }\n", - "}\n", - "```" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.utils import preview_test_config\n", - "\n", - "test_config = customer_churn.get_demo_test_config()\n", - "test_config[\"my_custom_tests.ConfusionMatrix\"] = {\n", - " \"inputs\": {\n", - " \"dataset\": \"test_dataset\",\n", - " \"model\": \"model\",\n", - " }\n", - "}\n", - "preview_test_config(test_config)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.run_documentation_tests(config=test_config)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Some More Custom Tests\n", - "\n", - "Now that you understand the entire process of creating custom tests and using them in your documentation, let's create a few more to see different ways you can utilize custom tests." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Custom Test: Table of Model Hyperparameters\n", - "\n", - "This custom test will display a table of the hyperparameters used in the model:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.Hyperparameters\")\n", - "def hyperparameters(model):\n", - " \"\"\"The hyperparameters of a machine learning model are the settings that control the learning process.\n", - " These settings are specified before the learning process begins and can have a significant impact on the\n", - " performance of the model.\n", - "\n", - " The hyperparameters of a model can be used to tune the model to achieve the best possible performance\n", - " on a given dataset. By examining the hyperparameters of a model, you can gain insight into how the model\n", - " was trained and how it might be improved.\n", - " \"\"\"\n", - " hyperparameters = model.model.get_xgb_params() # dictionary of hyperparameters\n", - "\n", - " # turn the dictionary into a table where each row contains a hyperparameter and its value\n", - " return [{\"Hyperparam\": k, \"Value\": v} for k, v in hyperparameters.items() if v]\n", - "\n", - "\n", - "result = run_test(\"my_custom_tests.Hyperparameters\", inputs={\"model\": \"model\"})\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Since the test has been run and logged, you can add it to your documentation using the same process as above. It should look like this:\n", - "\n", - "![screenshot showing hyperparameters test](./hyperparameters-custom-metric.png)\n", - "\n", - "For our simple toy model, there aren't really any proper hyperparameters, but you can see how this could be useful for more complex models that have gone through hyperparameter tuning." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Custom Test: External API Call\n", - "\n", - "This custom test will make an external API call to get a list of fake users and display it as a set of tables. This demonstrates how you might integrate external data sources into your model documentation in a programmatic way. You could, for instance, set up a pipeline that runs a test like this every day to keep your model documentation in sync with an external system." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import requests\n", - "import random\n", - "\n", - "\n", - "@vm.test(\"my_custom_tests.ExternalAPI\")\n", - "def external_api():\n", - " \"\"\"This test calls an external API to get a list of fake users. It then creates\n", - " a table with the relevant data so it can be displayed in the documentation.\n", - "\n", - " The purpose of this test is to demonstrate how to call an external API and use the\n", - " data in a test. 
A test like this could even be set up to run in a scheduled\n", - " pipeline to keep your documentation in sync with an external data source.\n", - " \"\"\"\n", - " url = \"https://jsonplaceholder.typicode.com/users\"\n", - " response = requests.get(url)\n", - " data = response.json()\n", - "\n", - " # build one table per group from the returned user records\n", - " return {\n", - " \"Model Owners/Stakeholders\": [\n", - " {\n", - " \"Name\": user[\"name\"],\n", - " \"Role\": random.choice([\"Owner\", \"Stakeholder\"]),\n", - " \"Email\": user[\"email\"],\n", - " \"Phone\": user[\"phone\"],\n", - " \"Slack Handle\": f\"@{user['name'].lower().replace(' ', '.')}\",\n", - " }\n", - " for user in data[:3]\n", - " ],\n", - " \"Model Developers\": [\n", - " {\n", - " \"Name\": user[\"name\"],\n", - " \"Role\": \"Developer\",\n", - " \"Email\": user[\"email\"],\n", - " }\n", - " for user in data[3:7]\n", - " ],\n", - " \"Model Validators\": [\n", - " {\n", - " \"Name\": user[\"name\"],\n", - " \"Role\": \"Validator\",\n", - " \"Email\": user[\"email\"],\n", - " }\n", - " for user in data[7:]\n", - " ],\n", - " }\n", - "\n", - "\n", - "result = run_test(\"my_custom_tests.ExternalAPI\")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Again, you can add this to your documentation to see how it looks:\n", - "\n", - "![screenshot showing external data custom test](./external-data-custom-test.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_3__'></a>\n", - "\n", - "### Custom Test: Passing Parameters\n", - "\n", - "Custom test functions, as stated earlier, can take both inputs and params. When you define your function, there is no need to distinguish between the two; the ValidMind Library will handle that for you. 
You simply need to add both to the function as arguments and the library will pass in the correct values.\n", - "\n", - "So for instance, if you wanted to parameterize the first custom test we created, the confusion matrix, you could do so like this:\n", - "\n", - "```python\n", - "def confusion_matrix(dataset: VMDataset, model: VMModel, my_param: str = \"Default Value\"):\n", - " pass\n", - "```\n", - "\n", - "And then when you run the test, you can pass in the parameter like this:\n", - "\n", - "```python\n", - "vm.run_test(\n", - " \"my_custom_tests.ConfusionMatrix\",\n", - " inputs={\"model\": \"model\", \"dataset\": \"test_dataset\"},\n", - " params={\"my_param\": \"My Value\"},\n", - ")\n", - "```\n", - "\n", - "Or if you are running the entire documentation template, you would update the config like this:\n", - "\n", - "```python\n", - "test_config[\"my_custom_tests.ConfusionMatrix\"] = {\n", - " \"inputs\": {\n", - " \"dataset\": \"test_dataset\",\n", - " \"model\": \"model\",\n", - " },\n", - " \"params\": {\n", - " \"my_param\": \"My Value\",\n", - " },\n", - "}\n", - "```\n", - "\n", - "Let's go ahead and create a toy test that takes a parameter and uses it in the result:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import plotly.express as px\n", - "\n", - "\n", - "@vm.test(\"my_custom_tests.ParameterExample\")\n", - "def parameter_example(\n", - " plot_title=\"Default Plot Title\", x_col=\"sepal_width\", y_col=\"sepal_length\"\n", - "):\n", - " \"\"\"This test takes two parameters and creates a scatter plot based on them.\n", - "\n", - " The purpose of this test is to demonstrate how to create a test that takes\n", - " parameters and uses them to generate a plot. 
This can be useful for creating\n", - " tests that are more flexible and can be used in a variety of scenarios.\n", - " \"\"\"\n", - " # return px.scatter(px.data.iris(), x=x_col, y=y_col, color=\"species\")\n", - " return px.scatter(\n", - " px.data.iris(), x=x_col, y=y_col, color=\"species\", title=plot_title\n", - " )\n", - "\n", - "\n", - "result = run_test(\n", - " \"my_custom_tests.ParameterExample\",\n", - " params={\n", - " \"plot_title\": \"My Cool Plot\",\n", - " \"x_col\": \"sepal_width\",\n", - " \"y_col\": \"sepal_length\",\n", - " },\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Play around with this and see how you can use parameters, default values and other features to make your custom tests more flexible and useful.\n", - "\n", - "Here's how this one looks in the documentation:\n", - "![screenshot showing parameterized test](./parameterized-custom-metric.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4__'></a>\n", - "\n", - "### Custom Test: Multiple Tables and Plots in a Single Test\n", - "\n", - "Custom test functions, as stated earlier, can return more than just one table or plot. In fact, any number of tables and plots can be returned. 
Let's see an example of this:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import plotly.express as px\n", - "\n", - "\n", - "@vm.test(\"my_custom_tests.ComplexOutput\")\n", - "def complex_output():\n", - " \"\"\"This test demonstrates how to return many tables and figures in a single test\"\"\"\n", - " # create a couple tables\n", - " table = [{\"A\": 1, \"B\": 2}, {\"A\": 3, \"B\": 4}]\n", - " table2 = [{\"C\": 5, \"D\": 6}, {\"C\": 7, \"D\": 8}]\n", - "\n", - " # create a few figures showing some random data\n", - " fig1 = px.line(x=np.arange(10), y=np.random.rand(10), title=\"Random Line Plot\")\n", - " fig2 = px.bar(x=[\"A\", \"B\", \"C\"], y=np.random.rand(3), title=\"Random Bar Plot\")\n", - " fig3 = px.scatter(\n", - " x=np.random.rand(10), y=np.random.rand(10), title=\"Random Scatter Plot\"\n", - " )\n", - "\n", - " return (\n", - " {\n", - " \"My Cool Table\": table,\n", - " \"Another Table\": table2,\n", - " },\n", - " {\n", - " # Figures support the same dict-of-titles convention as tables.\n", - " # These titles flow into the document media registry as\n", - " # \"Figure N. <title>\" alongside table captions.\n", - " \"Random Line Plot\": fig1,\n", - " \"Random Bar Plot\": fig2,\n", - " \"Random Scatter Plot\": fig3,\n", - " },\n", - " )\n", - "\n", - "\n", - "result = run_test(\"my_custom_tests.ComplexOutput\")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Notice how you can return the tables as a dictionary where the key is the title of the table and the value is the table itself. The same convention works for **figures** \u2014 wrap them in a dict whose keys are the titles you want shown in the document media registry (e.g. *Figure 7. Random Line Plot*). 
You could also just return the figures by themselves but this way you can give them a title to more easily identify them in the result.\n", - "\n", - "![screenshot showing multiple tables and plots](./multiple-tables-plots-custom-metric.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_5__'></a>\n", - "\n", - "### Custom Test: Images\n", - "\n", - "If you are using a plotting library that isn't supported by ValidMind (i.e. not `matplotlib` or `plotly`), you can still return the image directly as a bytes-like object. This could also be used to bring any type of image into your documentation in a programmatic way. For instance, you may want to include a diagram of your model architecture or a screenshot of a dashboard that your model is integrated with. As long as you can produce the image with Python or open it from a file, you can include it in your documentation." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import io\n", - "import matplotlib.pyplot as plt\n", - "\n", - "\n", - "@vm.test(\"my_custom_tests.Image\")\n", - "def image():\n", - " \"\"\"This test demonstrates how to return an image in a test\"\"\"\n", - "\n", - " # create a simple plot\n", - " fig, ax = plt.subplots()\n", - " ax.plot([1, 2, 3, 4])\n", - " ax.set_title(\"Simple Line Plot\")\n", - "\n", - " # save the plot as a PNG image (in-memory buffer)\n", - " img_data = io.BytesIO()\n", - " fig.savefig(img_data, format=\"png\")\n", - " img_data.seek(0)\n", - "\n", - " plt.close() # close the plot to avoid displaying it\n", - "\n", - " return img_data.read()\n", - "\n", - "\n", - "result = run_test(\"my_custom_tests.Image\")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Adding this custom test to your documentation will display the image:\n", - "\n", - "![screenshot showing image custom test](./image-in-custom-metric.png)" - ] - }, - { - 
"cell_type": "markdown", - "metadata": {}, - "source": [ - "If you want to log an existing image file as a test result, you can do so by passing the path to the image as a parameter to the custom test and then opening the file in the test function. Here's an example:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.MyPNGCorrelationMatrix\")\n", - "def Image(path: str):\n", - " \"\"\"Opens a PNG image file and logs it as a test result to ValidMind\"\"\"\n", - " if not path.endswith(\".png\"):\n", - " raise ValueError(\"Image must be a PNG file\")\n", - "\n", - " # return raw image bytes\n", - " with open(path, \"rb\") as f:\n", - " return f.read()\n", - " \n", - "run_test(\n", - " \"my_custom_tests.MyPNGCorrelationMatrix\",\n", - " params={\"path\": \"./pearson-correlation-matrix.png\"},\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The image is displayed in the test result:\n", - "\n", - "![screenshot showing image from file](./pearson-correlation-matrix-test-output.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_6__'></a>\n", - "\n", - "### Custom Test: Description\n", - "\n", - "If you want to write your own description for a custom test instead of having it generated by an LLM, you can do so by returning a string from your test." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "\n", - "@vm.test(\"my_custom_tests.MyCustomTest\")\n", - "def my_custom_test(dataset, model):\n", - " \"\"\"\n", - " This custom test computes the confusion matrix for a binary classification model and returns a string as the test description.\n", - " \"\"\"\n", - " y_true = dataset.y\n", - " y_pred = dataset.y_pred(model)\n", - "\n", - " confusion_matrix = metrics.confusion_matrix(y_true, y_pred)\n", - "\n", - " cm_display = metrics.ConfusionMatrixDisplay(\n", - " confusion_matrix=confusion_matrix, display_labels=[False, True]\n", - " )\n", - " cm_display.plot()\n", - "\n", - " plt.close() # close the plot to avoid displaying it\n", - "\n", - " return cm_display.figure_, \"Test Description - Confusion Matrix\", pd.DataFrame({\"Value\": [1, 2, 3]}) # return the figure, a custom description string, and a table\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "When you run the test, you can see that the test result description has been customized. The same result description will be displayed in the UI." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.MyCustomTest\",\n", - " inputs={\"model\": \"model\", \"dataset\": \"test_dataset\"},\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Conclusion\n", - "\n", - "In this notebook, we have demonstrated how to create custom tests in ValidMind. We have shown how to define custom test functions, register them with the ValidMind Library, run them against models and datasets, and add them to model documentation templates. We have also shown how to return tables and plots from custom tests and how to use them in the ValidMind Platform. 
We hope this tutorial has been helpful in understanding how to create and use custom tests in ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way \u2014 use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc8_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc8_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you\u2019ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-bcdac57ebb8d440f86ba120ee6511db3", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright \u00a9 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.5" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} + "nbformat": 4, + "nbformat_minor": 4 +} \ No newline at end of file diff --git a/site/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb b/site/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb index e3a7a3b944..2191dbd98e 100644 --- a/site/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb +++ b/site/notebooks/how_to/tests/explore_tests/explore_test_suites.ipynb @@ -1,931 +1,932 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Explore test suites\n", - "\n", - "Explore ValidMind test suites, pre-built collections of related tests used to evaluate specific aspects of your model. Retrieve available test suites and details for tests within a suite to understand their functionality, allowing you to select the appropriate test suites for your use cases." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Install the ValidMind Library](#toc2__) \n", - "- [List available test suites](#toc3__) \n", - "- [View test suite details](#toc4__) \n", - " - [View test details](#toc4_1__) \n", - "- [Next steps](#toc5__) \n", - " - [Discover more learning resources](#toc5_1__) \n", - "- [Upgrade ValidMind](#toc6__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. 
For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. 
Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## List available test suites\n", - "After we import the ValidMind Library, we'll call [test_suites.list_suites()](https://docs.validmind.ai/validmind/validmind/test_suites.html#list_suites) to retrieve a structured list of all available test suites, that includes each suite's name, description, and associated tests:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Explore test suites\n", + "\n", + 
"Explore ValidMind test suites, pre-built collections of related tests used to evaluate specific aspects of your model. Retrieve available test suites and details for tests within a suite to understand their functionality, allowing you to select the appropriate test suites for your use cases." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Install the ValidMind Library](#toc2__) \n", + "- [List available test suites](#toc3__) \n", + "- [View test suite details](#toc4__) \n", + " - [View test details](#toc4_1__) \n", + "- [Next steps](#toc5__) \n", + " - [Discover more learning resources](#toc5_1__) \n", + "- [Upgrade ValidMind](#toc6__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. 
If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, { - "data": { - "text/html": [ - "<style type=\"text/css\">\n", - "#T_9e889 th {\n", - " text-align: left;\n", - "}\n", - "#T_9e889_row0_col0, #T_9e889_row0_col1, #T_9e889_row0_col2, #T_9e889_row0_col3, #T_9e889_row1_col0, #T_9e889_row1_col1, #T_9e889_row1_col2, #T_9e889_row1_col3, #T_9e889_row2_col0, #T_9e889_row2_col1, #T_9e889_row2_col2, #T_9e889_row2_col3, #T_9e889_row3_col0, #T_9e889_row3_col1, #T_9e889_row3_col2, #T_9e889_row3_col3, #T_9e889_row4_col0, #T_9e889_row4_col1, #T_9e889_row4_col2, #T_9e889_row4_col3, #T_9e889_row5_col0, #T_9e889_row5_col1, #T_9e889_row5_col2, #T_9e889_row5_col3, #T_9e889_row6_col0, #T_9e889_row6_col1, #T_9e889_row6_col2, #T_9e889_row6_col3, #T_9e889_row7_col0, #T_9e889_row7_col1, #T_9e889_row7_col2, #T_9e889_row7_col3, #T_9e889_row8_col0, #T_9e889_row8_col1, #T_9e889_row8_col2, #T_9e889_row8_col3, #T_9e889_row9_col0, #T_9e889_row9_col1, #T_9e889_row9_col2, #T_9e889_row9_col3, #T_9e889_row10_col0, #T_9e889_row10_col1, #T_9e889_row10_col2, #T_9e889_row10_col3, #T_9e889_row11_col0, #T_9e889_row11_col1, #T_9e889_row11_col2, #T_9e889_row11_col3, #T_9e889_row12_col0, #T_9e889_row12_col1, #T_9e889_row12_col2, #T_9e889_row12_col3, #T_9e889_row13_col0, #T_9e889_row13_col1, #T_9e889_row13_col2, #T_9e889_row13_col3, #T_9e889_row14_col0, #T_9e889_row14_col1, #T_9e889_row14_col2, #T_9e889_row14_col3, #T_9e889_row15_col0, #T_9e889_row15_col1, #T_9e889_row15_col2, #T_9e889_row15_col3, #T_9e889_row16_col0, #T_9e889_row16_col1, #T_9e889_row16_col2, #T_9e889_row16_col3, #T_9e889_row17_col0, #T_9e889_row17_col1, #T_9e889_row17_col2, #T_9e889_row17_col3, #T_9e889_row18_col0, #T_9e889_row18_col1, #T_9e889_row18_col2, #T_9e889_row18_col3, #T_9e889_row19_col0, #T_9e889_row19_col1, #T_9e889_row19_col2, #T_9e889_row19_col3, #T_9e889_row20_col0, #T_9e889_row20_col1, #T_9e889_row20_col2, #T_9e889_row20_col3, #T_9e889_row21_col0, #T_9e889_row21_col1, #T_9e889_row21_col2, #T_9e889_row21_col3, #T_9e889_row22_col0, #T_9e889_row22_col1, 
#T_9e889_row22_col2, #T_9e889_row22_col3, #T_9e889_row23_col0, #T_9e889_row23_col1, #T_9e889_row23_col2, #T_9e889_row23_col3, #T_9e889_row24_col0, #T_9e889_row24_col1, #T_9e889_row24_col2, #T_9e889_row24_col3, #T_9e889_row25_col0, #T_9e889_row25_col1, #T_9e889_row25_col2, #T_9e889_row25_col3, #T_9e889_row26_col0, #T_9e889_row26_col1, #T_9e889_row26_col2, #T_9e889_row26_col3, #T_9e889_row27_col0, #T_9e889_row27_col1, #T_9e889_row27_col2, #T_9e889_row27_col3, #T_9e889_row28_col0, #T_9e889_row28_col1, #T_9e889_row28_col2, #T_9e889_row28_col3, #T_9e889_row29_col0, #T_9e889_row29_col1, #T_9e889_row29_col2, #T_9e889_row29_col3 {\n", - " text-align: left;\n", - "}\n", - "</style>\n", - "<table id=\"T_9e889\">\n", - " <thead>\n", - " <tr>\n", - " <th id=\"T_9e889_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", - " <th id=\"T_9e889_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", - " <th id=\"T_9e889_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", - " <th id=\"T_9e889_level0_col3\" class=\"col_heading level0 col3\" >Tests</th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " <tr>\n", - " <td id=\"T_9e889_row0_col0\" class=\"data row0 col0\" >classifier_model_diagnosis</td>\n", - " <td id=\"T_9e889_row0_col1\" class=\"data row0 col1\" >ClassifierDiagnosis</td>\n", - " <td id=\"T_9e889_row0_col2\" class=\"data row0 col2\" >Test suite for sklearn classifier model diagnosis tests</td>\n", - " <td id=\"T_9e889_row0_col3\" class=\"data row0 col3\" >validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row1_col0\" class=\"data row1 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_9e889_row1_col1\" class=\"data row1 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_9e889_row1_col2\" class=\"data row1 col2\" >Full test suite for binary classification 
models.</td>\n", - " <td id=\"T_9e889_row1_col3\" class=\"data row1 col3\" >validmind.data_validation.DatasetDescription, validmind.data_validation.DescriptiveStatistics, validmind.data_validation.PearsonCorrelationMatrix, validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.HighCardinality, validmind.data_validation.HighPearsonCorrelation, validmind.data_validation.MissingValues, validmind.data_validation.Skewness, validmind.data_validation.UniqueRows, validmind.data_validation.TooManyZeroValues, validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance, validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison, validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row2_col0\" class=\"data row2 col0\" >classifier_metrics</td>\n", - " <td id=\"T_9e889_row2_col1\" class=\"data row2 col1\" >ClassifierMetrics</td>\n", - " <td id=\"T_9e889_row2_col2\" class=\"data row2 col2\" >Test suite for sklearn classifier metrics</td>\n", - " <td id=\"T_9e889_row2_col3\" class=\"data row2 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, 
validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row3_col0\" class=\"data row3 col0\" >classifier_model_validation</td>\n", - " <td id=\"T_9e889_row3_col1\" class=\"data row3 col1\" >ClassifierModelValidation</td>\n", - " <td id=\"T_9e889_row3_col2\" class=\"data row3 col2\" >Test suite for binary classification models.</td>\n", - " <td id=\"T_9e889_row3_col3\" class=\"data row3 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance, validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison, validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row4_col0\" class=\"data row4 col0\" >classifier_validation</td>\n", - " <td id=\"T_9e889_row4_col1\" class=\"data row4 col1\" >ClassifierPerformance</td>\n", - " <td id=\"T_9e889_row4_col2\" class=\"data row4 col2\" >Test suite for sklearn classifier models</td>\n", - " <td 
id=\"T_9e889_row4_col3\" class=\"data row4 col3\" >validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row5_col0\" class=\"data row5 col0\" >cluster_full_suite</td>\n", - " <td id=\"T_9e889_row5_col1\" class=\"data row5 col1\" >ClusterFullSuite</td>\n", - " <td id=\"T_9e889_row5_col2\" class=\"data row5 col2\" >Full test suite for clustering models.</td>\n", - " <td id=\"T_9e889_row5_col3\" class=\"data row5 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.HomogeneityScore, validmind.model_validation.sklearn.CompletenessScore, validmind.model_validation.sklearn.VMeasure, validmind.model_validation.sklearn.AdjustedRandIndex, validmind.model_validation.sklearn.AdjustedMutualInformation, validmind.model_validation.sklearn.FowlkesMallowsScore, validmind.model_validation.sklearn.ClusterPerformanceMetrics, validmind.model_validation.sklearn.ClusterCosineSimilarity, validmind.model_validation.sklearn.SilhouettePlot, validmind.model_validation.ClusterSizeDistribution, validmind.model_validation.sklearn.HyperParametersTuning, validmind.model_validation.sklearn.KMeansClustersOptimization</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row6_col0\" class=\"data row6 col0\" >cluster_metrics</td>\n", - " <td id=\"T_9e889_row6_col1\" class=\"data row6 col1\" >ClusterMetrics</td>\n", - " <td id=\"T_9e889_row6_col2\" class=\"data row6 col2\" >Test suite for sklearn clustering metrics</td>\n", - " <td id=\"T_9e889_row6_col3\" class=\"data row6 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.HomogeneityScore, 
validmind.model_validation.sklearn.CompletenessScore, validmind.model_validation.sklearn.VMeasure, validmind.model_validation.sklearn.AdjustedRandIndex, validmind.model_validation.sklearn.AdjustedMutualInformation, validmind.model_validation.sklearn.FowlkesMallowsScore, validmind.model_validation.sklearn.ClusterPerformanceMetrics, validmind.model_validation.sklearn.ClusterCosineSimilarity, validmind.model_validation.sklearn.SilhouettePlot</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row7_col0\" class=\"data row7 col0\" >cluster_performance</td>\n", - " <td id=\"T_9e889_row7_col1\" class=\"data row7 col1\" >ClusterPerformance</td>\n", - " <td id=\"T_9e889_row7_col2\" class=\"data row7 col2\" >Test suite for sklearn cluster performance</td>\n", - " <td id=\"T_9e889_row7_col3\" class=\"data row7 col3\" >validmind.model_validation.ClusterSizeDistribution</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row8_col0\" class=\"data row8 col0\" >embeddings_full_suite</td>\n", - " <td id=\"T_9e889_row8_col1\" class=\"data row8 col1\" >EmbeddingsFullSuite</td>\n", - " <td id=\"T_9e889_row8_col2\" class=\"data row8 col2\" >Full test suite for embeddings models.</td>\n", - " <td id=\"T_9e889_row8_col3\" class=\"data row8 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.embeddings.DescriptiveAnalytics, validmind.model_validation.embeddings.CosineSimilarityDistribution, validmind.model_validation.embeddings.ClusterDistribution, validmind.model_validation.embeddings.EmbeddingsVisualization2D, validmind.model_validation.embeddings.StabilityAnalysisRandomNoise, validmind.model_validation.embeddings.StabilityAnalysisSynonyms, validmind.model_validation.embeddings.StabilityAnalysisKeyword, validmind.model_validation.embeddings.StabilityAnalysisTranslation</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row9_col0\" class=\"data row9 col0\" >embeddings_metrics</td>\n", - " <td 
id=\"T_9e889_row9_col1\" class=\"data row9 col1\" >EmbeddingsMetrics</td>\n", - " <td id=\"T_9e889_row9_col2\" class=\"data row9 col2\" >Test suite for embeddings metrics</td>\n", - " <td id=\"T_9e889_row9_col3\" class=\"data row9 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.embeddings.DescriptiveAnalytics, validmind.model_validation.embeddings.CosineSimilarityDistribution, validmind.model_validation.embeddings.ClusterDistribution, validmind.model_validation.embeddings.EmbeddingsVisualization2D</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row10_col0\" class=\"data row10 col0\" >embeddings_model_performance</td>\n", - " <td id=\"T_9e889_row10_col1\" class=\"data row10 col1\" >EmbeddingsPerformance</td>\n", - " <td id=\"T_9e889_row10_col2\" class=\"data row10 col2\" >Test suite for embeddings model performance</td>\n", - " <td id=\"T_9e889_row10_col3\" class=\"data row10 col3\" >validmind.model_validation.embeddings.StabilityAnalysisRandomNoise, validmind.model_validation.embeddings.StabilityAnalysisSynonyms, validmind.model_validation.embeddings.StabilityAnalysisKeyword, validmind.model_validation.embeddings.StabilityAnalysisTranslation</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row11_col0\" class=\"data row11 col0\" >hyper_parameters_optimization</td>\n", - " <td id=\"T_9e889_row11_col1\" class=\"data row11 col1\" >KmeansParametersOptimization</td>\n", - " <td id=\"T_9e889_row11_col2\" class=\"data row11 col2\" >Test suite for sklearn hyperparameters optimization</td>\n", - " <td id=\"T_9e889_row11_col3\" class=\"data row11 col3\" >validmind.model_validation.sklearn.HyperParametersTuning, validmind.model_validation.sklearn.KMeansClustersOptimization</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row12_col0\" class=\"data row12 col0\" >llm_classifier_full_suite</td>\n", - " <td id=\"T_9e889_row12_col1\" class=\"data row12 col1\" 
>LLMClassifierFullSuite</td>\n", - " <td id=\"T_9e889_row12_col2\" class=\"data row12 col2\" >Full test suite for LLM classification models.</td>\n", - " <td id=\"T_9e889_row12_col3\" class=\"data row12 col3\" >validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.nlp.StopWords, validmind.data_validation.nlp.Punctuations, validmind.data_validation.nlp.CommonWords, validmind.data_validation.nlp.TextDescription, validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance, validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison, validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis, validmind.prompt_validation.Bias, validmind.prompt_validation.Clarity, validmind.prompt_validation.Conciseness, validmind.prompt_validation.Delimitation, validmind.prompt_validation.NegativeInstruction, validmind.prompt_validation.Robustness, validmind.prompt_validation.Specificity</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row13_col0\" class=\"data row13 col0\" >prompt_validation</td>\n", - " <td id=\"T_9e889_row13_col1\" class=\"data row13 col1\" >PromptValidation</td>\n", - " <td id=\"T_9e889_row13_col2\" class=\"data row13 col2\" >Test suite for prompt validation</td>\n", - " <td 
id=\"T_9e889_row13_col3\" class=\"data row13 col3\" >validmind.prompt_validation.Bias, validmind.prompt_validation.Clarity, validmind.prompt_validation.Conciseness, validmind.prompt_validation.Delimitation, validmind.prompt_validation.NegativeInstruction, validmind.prompt_validation.Robustness, validmind.prompt_validation.Specificity</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row14_col0\" class=\"data row14 col0\" >nlp_classifier_full_suite</td>\n", - " <td id=\"T_9e889_row14_col1\" class=\"data row14 col1\" >NLPClassifierFullSuite</td>\n", - " <td id=\"T_9e889_row14_col2\" class=\"data row14 col2\" >Full test suite for NLP classification models.</td>\n", - " <td id=\"T_9e889_row14_col3\" class=\"data row14 col3\" >validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.nlp.StopWords, validmind.data_validation.nlp.Punctuations, validmind.data_validation.nlp.CommonWords, validmind.data_validation.nlp.TextDescription, validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance, validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison, validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row15_col0\" class=\"data 
row15 col0\" >regression_metrics</td>\n", - " <td id=\"T_9e889_row15_col1\" class=\"data row15 col1\" >RegressionMetrics</td>\n", - " <td id=\"T_9e889_row15_col2\" class=\"data row15 col2\" >Test suite for performance metrics of regression metrics</td>\n", - " <td id=\"T_9e889_row15_col3\" class=\"data row15 col3\" >validmind.data_validation.DatasetSplit, validmind.model_validation.ModelMetadata, validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row16_col0\" class=\"data row16 col0\" >regression_model_description</td>\n", - " <td id=\"T_9e889_row16_col1\" class=\"data row16 col1\" >RegressionModelDescription</td>\n", - " <td id=\"T_9e889_row16_col2\" class=\"data row16 col2\" >Test suite for performance metric of regression model of statsmodels library</td>\n", - " <td id=\"T_9e889_row16_col3\" class=\"data row16 col3\" >validmind.data_validation.DatasetSplit, validmind.model_validation.ModelMetadata</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row17_col0\" class=\"data row17 col0\" >regression_models_evaluation</td>\n", - " <td id=\"T_9e889_row17_col1\" class=\"data row17 col1\" >RegressionModelsEvaluation</td>\n", - " <td id=\"T_9e889_row17_col2\" class=\"data row17 col2\" >Test suite for metrics comparison of regression model of statsmodels library</td>\n", - " <td id=\"T_9e889_row17_col3\" class=\"data row17 col3\" >validmind.model_validation.statsmodels.RegressionModelCoeffs, validmind.model_validation.sklearn.RegressionModelsPerformanceComparison</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row18_col0\" class=\"data row18 col0\" >regression_full_suite</td>\n", - " <td id=\"T_9e889_row18_col1\" class=\"data row18 col1\" >RegressionFullSuite</td>\n", - " <td id=\"T_9e889_row18_col2\" class=\"data row18 col2\" >Full test suite for regression models.</td>\n", - " <td id=\"T_9e889_row18_col3\" class=\"data row18 col3\" >validmind.data_validation.DatasetDescription, 
validmind.data_validation.DescriptiveStatistics, validmind.data_validation.PearsonCorrelationMatrix, validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.HighCardinality, validmind.data_validation.HighPearsonCorrelation, validmind.data_validation.MissingValues, validmind.data_validation.Skewness, validmind.data_validation.UniqueRows, validmind.data_validation.TooManyZeroValues, validmind.data_validation.DatasetSplit, validmind.model_validation.ModelMetadata, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.RegressionErrors, validmind.model_validation.sklearn.RegressionR2Square</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row19_col0\" class=\"data row19 col0\" >regression_performance</td>\n", - " <td id=\"T_9e889_row19_col1\" class=\"data row19 col1\" >RegressionPerformance</td>\n", - " <td id=\"T_9e889_row19_col2\" class=\"data row19 col2\" >Test suite for regression model performance</td>\n", - " <td id=\"T_9e889_row19_col3\" class=\"data row19 col3\" >validmind.model_validation.sklearn.RegressionErrors, validmind.model_validation.sklearn.RegressionR2Square</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row20_col0\" class=\"data row20 col0\" >summarization_metrics</td>\n", - " <td id=\"T_9e889_row20_col1\" class=\"data row20 col1\" >SummarizationMetrics</td>\n", - " <td id=\"T_9e889_row20_col2\" class=\"data row20 col2\" >Test suite for Summarization metrics</td>\n", - " <td id=\"T_9e889_row20_col3\" class=\"data row20 col3\" >validmind.model_validation.TokenDisparity, validmind.model_validation.BleuScore, validmind.model_validation.BertScore, validmind.model_validation.ContextualRecall</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row21_col0\" class=\"data row21 col0\" >tabular_dataset</td>\n", - " <td id=\"T_9e889_row21_col1\" class=\"data row21 col1\" >TabularDataset</td>\n", - " <td id=\"T_9e889_row21_col2\" class=\"data 
row21 col2\" >Test suite for tabular datasets.</td>\n", - " <td id=\"T_9e889_row21_col3\" class=\"data row21 col3\" >validmind.data_validation.DatasetDescription, validmind.data_validation.DescriptiveStatistics, validmind.data_validation.PearsonCorrelationMatrix, validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.HighCardinality, validmind.data_validation.HighPearsonCorrelation, validmind.data_validation.MissingValues, validmind.data_validation.Skewness, validmind.data_validation.UniqueRows, validmind.data_validation.TooManyZeroValues</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row22_col0\" class=\"data row22 col0\" >tabular_dataset_description</td>\n", - " <td id=\"T_9e889_row22_col1\" class=\"data row22 col1\" >TabularDatasetDescription</td>\n", - " <td id=\"T_9e889_row22_col2\" class=\"data row22 col2\" >Test suite to extract metadata and descriptive\n", - "statistics from a tabular dataset</td>\n", - " <td id=\"T_9e889_row22_col3\" class=\"data row22 col3\" >validmind.data_validation.DatasetDescription, validmind.data_validation.DescriptiveStatistics, validmind.data_validation.PearsonCorrelationMatrix</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row23_col0\" class=\"data row23 col0\" >tabular_data_quality</td>\n", - " <td id=\"T_9e889_row23_col1\" class=\"data row23 col1\" >TabularDataQuality</td>\n", - " <td id=\"T_9e889_row23_col2\" class=\"data row23 col2\" >Test suite for data quality on tabular datasets</td>\n", - " <td id=\"T_9e889_row23_col3\" class=\"data row23 col3\" >validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.HighCardinality, validmind.data_validation.HighPearsonCorrelation, validmind.data_validation.MissingValues, validmind.data_validation.Skewness, validmind.data_validation.UniqueRows, validmind.data_validation.TooManyZeroValues</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row24_col0\" 
class=\"data row24 col0\" >text_data_quality</td>\n", - " <td id=\"T_9e889_row24_col1\" class=\"data row24 col1\" >TextDataQuality</td>\n", - " <td id=\"T_9e889_row24_col2\" class=\"data row24 col2\" >Test suite for data quality on text data</td>\n", - " <td id=\"T_9e889_row24_col3\" class=\"data row24 col3\" >validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.nlp.StopWords, validmind.data_validation.nlp.Punctuations, validmind.data_validation.nlp.CommonWords, validmind.data_validation.nlp.TextDescription</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row25_col0\" class=\"data row25 col0\" >time_series_data_quality</td>\n", - " <td id=\"T_9e889_row25_col1\" class=\"data row25 col1\" >TimeSeriesDataQuality</td>\n", - " <td id=\"T_9e889_row25_col2\" class=\"data row25 col2\" >Test suite for data quality on time series datasets</td>\n", - " <td id=\"T_9e889_row25_col3\" class=\"data row25 col3\" >validmind.data_validation.TimeSeriesOutliers, validmind.data_validation.TimeSeriesMissingValues, validmind.data_validation.TimeSeriesFrequency</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row26_col0\" class=\"data row26 col0\" >time_series_dataset</td>\n", - " <td id=\"T_9e889_row26_col1\" class=\"data row26 col1\" >TimeSeriesDataset</td>\n", - " <td id=\"T_9e889_row26_col2\" class=\"data row26 col2\" >Test suite for time series datasets.</td>\n", - " <td id=\"T_9e889_row26_col3\" class=\"data row26 col3\" >validmind.data_validation.TimeSeriesOutliers, validmind.data_validation.TimeSeriesMissingValues, validmind.data_validation.TimeSeriesFrequency, validmind.data_validation.TimeSeriesLinePlot, validmind.data_validation.TimeSeriesHistogram, validmind.data_validation.ACFandPACFPlot, validmind.data_validation.SeasonalDecompose, validmind.data_validation.AutoSeasonality, validmind.data_validation.AutoStationarity, validmind.data_validation.RollingStatsPlot, validmind.data_validation.AutoAR, 
validmind.data_validation.AutoMA, validmind.data_validation.ScatterPlot, validmind.data_validation.LaggedCorrelationHeatmap, validmind.data_validation.EngleGrangerCoint, validmind.data_validation.SpreadPlot</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row27_col0\" class=\"data row27 col0\" >time_series_model_validation</td>\n", - " <td id=\"T_9e889_row27_col1\" class=\"data row27 col1\" >TimeSeriesModelValidation</td>\n", - " <td id=\"T_9e889_row27_col2\" class=\"data row27 col2\" >Test suite for time series model validation.</td>\n", - " <td id=\"T_9e889_row27_col3\" class=\"data row27 col3\" >validmind.data_validation.DatasetSplit, validmind.model_validation.ModelMetadata, validmind.model_validation.statsmodels.RegressionModelCoeffs, validmind.model_validation.sklearn.RegressionModelsPerformanceComparison</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row28_col0\" class=\"data row28 col0\" >time_series_multivariate</td>\n", - " <td id=\"T_9e889_row28_col1\" class=\"data row28 col1\" >TimeSeriesMultivariate</td>\n", - " <td id=\"T_9e889_row28_col2\" class=\"data row28 col2\" >This test suite provides a preliminary understanding of the features\n", - "and relationship in multivariate dataset. It presents various\n", - "multivariate visualizations that can help identify patterns, trends,\n", - "and relationships between pairs of variables. The visualizations are\n", - "designed to explore the relationships between multiple features\n", - "simultaneously. 
They allow you to quickly identify any patterns or\n", - "trends in the data, as well as any potential outliers or anomalies.\n", - "The individual feature distribution can also be explored to provide\n", - "insight into the range and frequency of values observed in the data.\n", - "This multivariate analysis test suite aims to provide an overview of\n", - "the data structure and guide further exploration and modeling.</td>\n", - " <td id=\"T_9e889_row28_col3\" class=\"data row28 col3\" >validmind.data_validation.ScatterPlot, validmind.data_validation.LaggedCorrelationHeatmap, validmind.data_validation.EngleGrangerCoint, validmind.data_validation.SpreadPlot</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_9e889_row29_col0\" class=\"data row29 col0\" >time_series_univariate</td>\n", - " <td id=\"T_9e889_row29_col1\" class=\"data row29 col1\" >TimeSeriesUnivariate</td>\n", - " <td id=\"T_9e889_row29_col2\" class=\"data row29 col2\" >This test suite provides a preliminary understanding of the target variable(s)\n", - "used in the time series dataset. It visualizations that present the raw time\n", - "series data and a histogram of the target variable(s).\n", - "\n", - "The raw time series data provides a visual inspection of the target variable's\n", - "behavior over time. This helps to identify any patterns or trends in the data,\n", - "as well as any potential outliers or anomalies. 
The histogram of the target\n", - "variable displays the distribution of values, providing insight into the range\n", - "and frequency of values observed in the data.</td>\n", - " <td id=\"T_9e889_row29_col3\" class=\"data row29 col3\" >validmind.data_validation.TimeSeriesLinePlot, validmind.data_validation.TimeSeriesHistogram, validmind.data_validation.ACFandPACFPlot, validmind.data_validation.SeasonalDecompose, validmind.data_validation.AutoSeasonality, validmind.data_validation.AutoStationarity, validmind.data_validation.RollingStatsPlot, validmind.data_validation.AutoAR, validmind.data_validation.AutoMA</td>\n", - " </tr>\n", - " </tbody>\n", - "</table>\n" + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" ], - "text/plain": [ - "<pandas.io.formats.style.Styler at 0x16a11ae00>" + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## List available test suites\n", + "After we import the ValidMind Library, we'll call [test_suites.list_suites()](https://docs.validmind.ai/validmind/validmind/test_suites.html#list_suites) to retrieve a structured list of all available test suites, that includes each suite's name, description, and associated tests:" ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import validmind as vm\n", - "\n", - 
"vm.test_suites.list_suites()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## View test suite details\n", - "\n", - "Use the [test_suites.describe_suite()](https://docs.validmind.ai/validmind/validmind/test_suites.html#describe_suite) function to retrieve information about a test suite, including its name, description, and the list of tests it contains. \n", - "\n", - "You can call `test_suites.describe_suite()` with just the test suite ID to get basic details, or pass an additional `verbose` parameter for a more comprehensive output: \n", - "\n", - "- **Test ID** - The identifier of the test suite you want to inspect.\n", - "- **Verbose** - A Boolean flag. Set `verbose=True` to return a full breakdown of the test suite." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ + }, { - "data": { - "text/html": [ - "<style type=\"text/css\">\n", - "#T_7cb1b th {\n", - " text-align: left;\n", - "}\n", - "#T_7cb1b_row0_col0, #T_7cb1b_row0_col1, #T_7cb1b_row0_col2, #T_7cb1b_row0_col3, #T_7cb1b_row0_col4, #T_7cb1b_row1_col0, #T_7cb1b_row1_col1, #T_7cb1b_row1_col2, #T_7cb1b_row1_col3, #T_7cb1b_row1_col4, #T_7cb1b_row2_col0, #T_7cb1b_row2_col1, #T_7cb1b_row2_col2, #T_7cb1b_row2_col3, #T_7cb1b_row2_col4, #T_7cb1b_row3_col0, #T_7cb1b_row3_col1, #T_7cb1b_row3_col2, #T_7cb1b_row3_col3, #T_7cb1b_row3_col4, #T_7cb1b_row4_col0, #T_7cb1b_row4_col1, #T_7cb1b_row4_col2, #T_7cb1b_row4_col3, #T_7cb1b_row4_col4, #T_7cb1b_row5_col0, #T_7cb1b_row5_col1, #T_7cb1b_row5_col2, #T_7cb1b_row5_col3, #T_7cb1b_row5_col4, #T_7cb1b_row6_col0, #T_7cb1b_row6_col1, #T_7cb1b_row6_col2, #T_7cb1b_row6_col3, #T_7cb1b_row6_col4, #T_7cb1b_row7_col0, #T_7cb1b_row7_col1, #T_7cb1b_row7_col2, #T_7cb1b_row7_col3, #T_7cb1b_row7_col4, #T_7cb1b_row8_col0, #T_7cb1b_row8_col1, #T_7cb1b_row8_col2, #T_7cb1b_row8_col3, #T_7cb1b_row8_col4, #T_7cb1b_row9_col0, #T_7cb1b_row9_col1, #T_7cb1b_row9_col2, 
#T_7cb1b_row9_col3, #T_7cb1b_row9_col4, #T_7cb1b_row10_col0, #T_7cb1b_row10_col1, #T_7cb1b_row10_col2, #T_7cb1b_row10_col3, #T_7cb1b_row10_col4, #T_7cb1b_row11_col0, #T_7cb1b_row11_col1, #T_7cb1b_row11_col2, #T_7cb1b_row11_col3, #T_7cb1b_row11_col4, #T_7cb1b_row12_col0, #T_7cb1b_row12_col1, #T_7cb1b_row12_col2, #T_7cb1b_row12_col3, #T_7cb1b_row12_col4, #T_7cb1b_row13_col0, #T_7cb1b_row13_col1, #T_7cb1b_row13_col2, #T_7cb1b_row13_col3, #T_7cb1b_row13_col4, #T_7cb1b_row14_col0, #T_7cb1b_row14_col1, #T_7cb1b_row14_col2, #T_7cb1b_row14_col3, #T_7cb1b_row14_col4, #T_7cb1b_row15_col0, #T_7cb1b_row15_col1, #T_7cb1b_row15_col2, #T_7cb1b_row15_col3, #T_7cb1b_row15_col4, #T_7cb1b_row16_col0, #T_7cb1b_row16_col1, #T_7cb1b_row16_col2, #T_7cb1b_row16_col3, #T_7cb1b_row16_col4, #T_7cb1b_row17_col0, #T_7cb1b_row17_col1, #T_7cb1b_row17_col2, #T_7cb1b_row17_col3, #T_7cb1b_row17_col4, #T_7cb1b_row18_col0, #T_7cb1b_row18_col1, #T_7cb1b_row18_col2, #T_7cb1b_row18_col3, #T_7cb1b_row18_col4, #T_7cb1b_row19_col0, #T_7cb1b_row19_col1, #T_7cb1b_row19_col2, #T_7cb1b_row19_col3, #T_7cb1b_row19_col4, #T_7cb1b_row20_col0, #T_7cb1b_row20_col1, #T_7cb1b_row20_col2, #T_7cb1b_row20_col3, #T_7cb1b_row20_col4, #T_7cb1b_row21_col0, #T_7cb1b_row21_col1, #T_7cb1b_row21_col2, #T_7cb1b_row21_col3, #T_7cb1b_row21_col4, #T_7cb1b_row22_col0, #T_7cb1b_row22_col1, #T_7cb1b_row22_col2, #T_7cb1b_row22_col3, #T_7cb1b_row22_col4, #T_7cb1b_row23_col0, #T_7cb1b_row23_col1, #T_7cb1b_row23_col2, #T_7cb1b_row23_col3, #T_7cb1b_row23_col4, #T_7cb1b_row24_col0, #T_7cb1b_row24_col1, #T_7cb1b_row24_col2, #T_7cb1b_row24_col3, #T_7cb1b_row24_col4, #T_7cb1b_row25_col0, #T_7cb1b_row25_col1, #T_7cb1b_row25_col2, #T_7cb1b_row25_col3, #T_7cb1b_row25_col4, #T_7cb1b_row26_col0, #T_7cb1b_row26_col1, #T_7cb1b_row26_col2, #T_7cb1b_row26_col3, #T_7cb1b_row26_col4, #T_7cb1b_row27_col0, #T_7cb1b_row27_col1, #T_7cb1b_row27_col2, #T_7cb1b_row27_col3, #T_7cb1b_row27_col4 {\n", - " text-align: left;\n", - "}\n", - "</style>\n", - "<table 
id=\"T_7cb1b\">\n", - " <thead>\n", - " <tr>\n", - " <th id=\"T_7cb1b_level0_col0\" class=\"col_heading level0 col0\" >Test Suite ID</th>\n", - " <th id=\"T_7cb1b_level0_col1\" class=\"col_heading level0 col1\" >Test Suite Name</th>\n", - " <th id=\"T_7cb1b_level0_col2\" class=\"col_heading level0 col2\" >Test Suite Section</th>\n", - " <th id=\"T_7cb1b_level0_col3\" class=\"col_heading level0 col3\" >Test ID</th>\n", - " <th id=\"T_7cb1b_level0_col4\" class=\"col_heading level0 col4\" >Test Name</th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row0_col0\" class=\"data row0 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row0_col1\" class=\"data row0 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row0_col2\" class=\"data row0 col2\" >tabular_dataset_description</td>\n", - " <td id=\"T_7cb1b_row0_col3\" class=\"data row0 col3\" >validmind.data_validation.DatasetDescription</td>\n", - " <td id=\"T_7cb1b_row0_col4\" class=\"data row0 col4\" >Dataset Description</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row1_col0\" class=\"data row1 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row1_col1\" class=\"data row1 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row1_col2\" class=\"data row1 col2\" >tabular_dataset_description</td>\n", - " <td id=\"T_7cb1b_row1_col3\" class=\"data row1 col3\" >validmind.data_validation.DescriptiveStatistics</td>\n", - " <td id=\"T_7cb1b_row1_col4\" class=\"data row1 col4\" >Descriptive Statistics</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row2_col0\" class=\"data row2 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row2_col1\" class=\"data row2 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row2_col2\" class=\"data row2 col2\" >tabular_dataset_description</td>\n", - " <td id=\"T_7cb1b_row2_col3\" class=\"data row2 col3\" >validmind.data_validation.PearsonCorrelationMatrix</td>\n", - " <td 
id=\"T_7cb1b_row2_col4\" class=\"data row2 col4\" >Pearson Correlation Matrix</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row3_col0\" class=\"data row3 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row3_col1\" class=\"data row3 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row3_col2\" class=\"data row3 col2\" >tabular_data_quality</td>\n", - " <td id=\"T_7cb1b_row3_col3\" class=\"data row3 col3\" >validmind.data_validation.ClassImbalance</td>\n", - " <td id=\"T_7cb1b_row3_col4\" class=\"data row3 col4\" >Class Imbalance</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row4_col0\" class=\"data row4 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row4_col1\" class=\"data row4 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row4_col2\" class=\"data row4 col2\" >tabular_data_quality</td>\n", - " <td id=\"T_7cb1b_row4_col3\" class=\"data row4 col3\" >validmind.data_validation.Duplicates</td>\n", - " <td id=\"T_7cb1b_row4_col4\" class=\"data row4 col4\" >Duplicates</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row5_col0\" class=\"data row5 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row5_col1\" class=\"data row5 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row5_col2\" class=\"data row5 col2\" >tabular_data_quality</td>\n", - " <td id=\"T_7cb1b_row5_col3\" class=\"data row5 col3\" >validmind.data_validation.HighCardinality</td>\n", - " <td id=\"T_7cb1b_row5_col4\" class=\"data row5 col4\" >High Cardinality</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row6_col0\" class=\"data row6 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row6_col1\" class=\"data row6 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row6_col2\" class=\"data row6 col2\" >tabular_data_quality</td>\n", - " <td id=\"T_7cb1b_row6_col3\" class=\"data row6 col3\" >validmind.data_validation.HighPearsonCorrelation</td>\n", - " <td id=\"T_7cb1b_row6_col4\" class=\"data 
row6 col4\" >High Pearson Correlation</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row7_col0\" class=\"data row7 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row7_col1\" class=\"data row7 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row7_col2\" class=\"data row7 col2\" >tabular_data_quality</td>\n", - " <td id=\"T_7cb1b_row7_col3\" class=\"data row7 col3\" >validmind.data_validation.MissingValues</td>\n", - " <td id=\"T_7cb1b_row7_col4\" class=\"data row7 col4\" >Missing Values</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row8_col0\" class=\"data row8 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row8_col1\" class=\"data row8 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row8_col2\" class=\"data row8 col2\" >tabular_data_quality</td>\n", - " <td id=\"T_7cb1b_row8_col3\" class=\"data row8 col3\" >validmind.data_validation.Skewness</td>\n", - " <td id=\"T_7cb1b_row8_col4\" class=\"data row8 col4\" >Skewness</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row9_col0\" class=\"data row9 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row9_col1\" class=\"data row9 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row9_col2\" class=\"data row9 col2\" >tabular_data_quality</td>\n", - " <td id=\"T_7cb1b_row9_col3\" class=\"data row9 col3\" >validmind.data_validation.UniqueRows</td>\n", - " <td id=\"T_7cb1b_row9_col4\" class=\"data row9 col4\" >Unique Rows</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row10_col0\" class=\"data row10 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row10_col1\" class=\"data row10 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row10_col2\" class=\"data row10 col2\" >tabular_data_quality</td>\n", - " <td id=\"T_7cb1b_row10_col3\" class=\"data row10 col3\" >validmind.data_validation.TooManyZeroValues</td>\n", - " <td id=\"T_7cb1b_row10_col4\" class=\"data row10 col4\" >Too Many Zero Values</td>\n", - " 
</tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row11_col0\" class=\"data row11 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row11_col1\" class=\"data row11 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row11_col2\" class=\"data row11 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row11_col3\" class=\"data row11 col3\" >validmind.model_validation.ModelMetadata</td>\n", - " <td id=\"T_7cb1b_row11_col4\" class=\"data row11 col4\" >Model Metadata</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row12_col0\" class=\"data row12 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row12_col1\" class=\"data row12 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row12_col2\" class=\"data row12 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row12_col3\" class=\"data row12 col3\" >validmind.data_validation.DatasetSplit</td>\n", - " <td id=\"T_7cb1b_row12_col4\" class=\"data row12 col4\" >Dataset Split</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row13_col0\" class=\"data row13 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row13_col1\" class=\"data row13 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row13_col2\" class=\"data row13 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row13_col3\" class=\"data row13 col3\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", - " <td id=\"T_7cb1b_row13_col4\" class=\"data row13 col4\" >Confusion Matrix</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row14_col0\" class=\"data row14 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row14_col1\" class=\"data row14 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row14_col2\" class=\"data row14 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row14_col3\" class=\"data row14 col3\" >validmind.model_validation.sklearn.ClassifierPerformance</td>\n", - " <td id=\"T_7cb1b_row14_col4\" class=\"data row14 col4\" >Classifier 
Performance</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row15_col0\" class=\"data row15 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row15_col1\" class=\"data row15 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row15_col2\" class=\"data row15 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row15_col3\" class=\"data row15 col3\" >validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", - " <td id=\"T_7cb1b_row15_col4\" class=\"data row15 col4\" >Permutation Feature Importance</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row16_col0\" class=\"data row16 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row16_col1\" class=\"data row16 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row16_col2\" class=\"data row16 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row16_col3\" class=\"data row16 col3\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", - " <td id=\"T_7cb1b_row16_col4\" class=\"data row16 col4\" >Precision Recall Curve</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row17_col0\" class=\"data row17 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row17_col1\" class=\"data row17 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row17_col2\" class=\"data row17 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row17_col3\" class=\"data row17 col3\" >validmind.model_validation.sklearn.ROCCurve</td>\n", - " <td id=\"T_7cb1b_row17_col4\" class=\"data row17 col4\" >ROC Curve</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row18_col0\" class=\"data row18 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row18_col1\" class=\"data row18 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row18_col2\" class=\"data row18 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row18_col3\" class=\"data row18 col3\" >validmind.model_validation.sklearn.PopulationStabilityIndex</td>\n", - " <td 
id=\"T_7cb1b_row18_col4\" class=\"data row18 col4\" >Population Stability Index</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row19_col0\" class=\"data row19 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row19_col1\" class=\"data row19 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row19_col2\" class=\"data row19 col2\" >classifier_metrics</td>\n", - " <td id=\"T_7cb1b_row19_col3\" class=\"data row19 col3\" >validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", - " <td id=\"T_7cb1b_row19_col4\" class=\"data row19 col4\" >SHAP Global Importance</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row20_col0\" class=\"data row20 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row20_col1\" class=\"data row20 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row20_col2\" class=\"data row20 col2\" >classifier_validation</td>\n", - " <td id=\"T_7cb1b_row20_col3\" class=\"data row20 col3\" >validmind.model_validation.sklearn.MinimumAccuracy</td>\n", - " <td id=\"T_7cb1b_row20_col4\" class=\"data row20 col4\" >Minimum Accuracy</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row21_col0\" class=\"data row21 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row21_col1\" class=\"data row21 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row21_col2\" class=\"data row21 col2\" >classifier_validation</td>\n", - " <td id=\"T_7cb1b_row21_col3\" class=\"data row21 col3\" >validmind.model_validation.sklearn.MinimumF1Score</td>\n", - " <td id=\"T_7cb1b_row21_col4\" class=\"data row21 col4\" >Minimum F1 Score</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row22_col0\" class=\"data row22 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row22_col1\" class=\"data row22 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row22_col2\" class=\"data row22 col2\" >classifier_validation</td>\n", - " <td id=\"T_7cb1b_row22_col3\" class=\"data row22 col3\" 
>validmind.model_validation.sklearn.MinimumROCAUCScore</td>\n", - " <td id=\"T_7cb1b_row22_col4\" class=\"data row22 col4\" >Minimum ROCAUC Score</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row23_col0\" class=\"data row23 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row23_col1\" class=\"data row23 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row23_col2\" class=\"data row23 col2\" >classifier_validation</td>\n", - " <td id=\"T_7cb1b_row23_col3\" class=\"data row23 col3\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", - " <td id=\"T_7cb1b_row23_col4\" class=\"data row23 col4\" >Training Test Degradation</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row24_col0\" class=\"data row24 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row24_col1\" class=\"data row24 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row24_col2\" class=\"data row24 col2\" >classifier_validation</td>\n", - " <td id=\"T_7cb1b_row24_col3\" class=\"data row24 col3\" >validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", - " <td id=\"T_7cb1b_row24_col4\" class=\"data row24 col4\" >Models Performance Comparison</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row25_col0\" class=\"data row25 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row25_col1\" class=\"data row25 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row25_col2\" class=\"data row25 col2\" >classifier_model_diagnosis</td>\n", - " <td id=\"T_7cb1b_row25_col3\" class=\"data row25 col3\" >validmind.model_validation.sklearn.OverfitDiagnosis</td>\n", - " <td id=\"T_7cb1b_row25_col4\" class=\"data row25 col4\" >Overfit Diagnosis</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row26_col0\" class=\"data row26 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row26_col1\" class=\"data row26 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row26_col2\" class=\"data row26 col2\" 
>classifier_model_diagnosis</td>\n", - " <td id=\"T_7cb1b_row26_col3\" class=\"data row26 col3\" >validmind.model_validation.sklearn.WeakspotsDiagnosis</td>\n", - " <td id=\"T_7cb1b_row26_col4\" class=\"data row26 col4\" >Weakspots Diagnosis</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_7cb1b_row27_col0\" class=\"data row27 col0\" >classifier_full_suite</td>\n", - " <td id=\"T_7cb1b_row27_col1\" class=\"data row27 col1\" >ClassifierFullSuite</td>\n", - " <td id=\"T_7cb1b_row27_col2\" class=\"data row27 col2\" >classifier_model_diagnosis</td>\n", - " <td id=\"T_7cb1b_row27_col3\" class=\"data row27 col3\" >validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", - " <td id=\"T_7cb1b_row27_col4\" class=\"data row27 col4\" >Robustness Diagnosis</td>\n", - " </tr>\n", - " </tbody>\n", - "</table>\n" + "cell_type": "code", + "metadata": {}, + "source": [ + "import validmind as vm\n", + "\n", + "vm.test_suites.list_suites()" ], - "text/plain": [ - "<pandas.io.formats.style.Styler at 0x16a167fa0>" + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/html": [ + "<style type=\"text/css\">\n", + "#T_9e889 th {\n", + " text-align: left;\n", + "}\n", + "#T_9e889_row0_col0, #T_9e889_row0_col1, #T_9e889_row0_col2, #T_9e889_row0_col3, #T_9e889_row1_col0, #T_9e889_row1_col1, #T_9e889_row1_col2, #T_9e889_row1_col3, #T_9e889_row2_col0, #T_9e889_row2_col1, #T_9e889_row2_col2, #T_9e889_row2_col3, #T_9e889_row3_col0, #T_9e889_row3_col1, #T_9e889_row3_col2, #T_9e889_row3_col3, #T_9e889_row4_col0, #T_9e889_row4_col1, #T_9e889_row4_col2, #T_9e889_row4_col3, #T_9e889_row5_col0, #T_9e889_row5_col1, #T_9e889_row5_col2, #T_9e889_row5_col3, #T_9e889_row6_col0, #T_9e889_row6_col1, #T_9e889_row6_col2, #T_9e889_row6_col3, #T_9e889_row7_col0, #T_9e889_row7_col1, #T_9e889_row7_col2, #T_9e889_row7_col3, #T_9e889_row8_col0, #T_9e889_row8_col1, #T_9e889_row8_col2, #T_9e889_row8_col3, #T_9e889_row9_col0, #T_9e889_row9_col1, #T_9e889_row9_col2, 
#T_9e889_row9_col3, #T_9e889_row10_col0, #T_9e889_row10_col1, #T_9e889_row10_col2, #T_9e889_row10_col3, #T_9e889_row11_col0, #T_9e889_row11_col1, #T_9e889_row11_col2, #T_9e889_row11_col3, #T_9e889_row12_col0, #T_9e889_row12_col1, #T_9e889_row12_col2, #T_9e889_row12_col3, #T_9e889_row13_col0, #T_9e889_row13_col1, #T_9e889_row13_col2, #T_9e889_row13_col3, #T_9e889_row14_col0, #T_9e889_row14_col1, #T_9e889_row14_col2, #T_9e889_row14_col3, #T_9e889_row15_col0, #T_9e889_row15_col1, #T_9e889_row15_col2, #T_9e889_row15_col3, #T_9e889_row16_col0, #T_9e889_row16_col1, #T_9e889_row16_col2, #T_9e889_row16_col3, #T_9e889_row17_col0, #T_9e889_row17_col1, #T_9e889_row17_col2, #T_9e889_row17_col3, #T_9e889_row18_col0, #T_9e889_row18_col1, #T_9e889_row18_col2, #T_9e889_row18_col3, #T_9e889_row19_col0, #T_9e889_row19_col1, #T_9e889_row19_col2, #T_9e889_row19_col3, #T_9e889_row20_col0, #T_9e889_row20_col1, #T_9e889_row20_col2, #T_9e889_row20_col3, #T_9e889_row21_col0, #T_9e889_row21_col1, #T_9e889_row21_col2, #T_9e889_row21_col3, #T_9e889_row22_col0, #T_9e889_row22_col1, #T_9e889_row22_col2, #T_9e889_row22_col3, #T_9e889_row23_col0, #T_9e889_row23_col1, #T_9e889_row23_col2, #T_9e889_row23_col3, #T_9e889_row24_col0, #T_9e889_row24_col1, #T_9e889_row24_col2, #T_9e889_row24_col3, #T_9e889_row25_col0, #T_9e889_row25_col1, #T_9e889_row25_col2, #T_9e889_row25_col3, #T_9e889_row26_col0, #T_9e889_row26_col1, #T_9e889_row26_col2, #T_9e889_row26_col3, #T_9e889_row27_col0, #T_9e889_row27_col1, #T_9e889_row27_col2, #T_9e889_row27_col3, #T_9e889_row28_col0, #T_9e889_row28_col1, #T_9e889_row28_col2, #T_9e889_row28_col3, #T_9e889_row29_col0, #T_9e889_row29_col1, #T_9e889_row29_col2, #T_9e889_row29_col3 {\n", + " text-align: left;\n", + "}\n", + "</style>\n", + "<table id=\"T_9e889\">\n", + " <thead>\n", + " <tr>\n", + " <th id=\"T_9e889_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", + " <th id=\"T_9e889_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", + " <th 
id=\"T_9e889_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", + " <th id=\"T_9e889_level0_col3\" class=\"col_heading level0 col3\" >Tests</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <td id=\"T_9e889_row0_col0\" class=\"data row0 col0\" >classifier_model_diagnosis</td>\n", + " <td id=\"T_9e889_row0_col1\" class=\"data row0 col1\" >ClassifierDiagnosis</td>\n", + " <td id=\"T_9e889_row0_col2\" class=\"data row0 col2\" >Test suite for sklearn classifier model diagnosis tests</td>\n", + " <td id=\"T_9e889_row0_col3\" class=\"data row0 col3\" >validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row1_col0\" class=\"data row1 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_9e889_row1_col1\" class=\"data row1 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_9e889_row1_col2\" class=\"data row1 col2\" >Full test suite for binary classification models.</td>\n", + " <td id=\"T_9e889_row1_col3\" class=\"data row1 col3\" >validmind.data_validation.DatasetDescription, validmind.data_validation.DescriptiveStatistics, validmind.data_validation.PearsonCorrelationMatrix, validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.HighCardinality, validmind.data_validation.HighPearsonCorrelation, validmind.data_validation.MissingValues, validmind.data_validation.Skewness, validmind.data_validation.UniqueRows, validmind.data_validation.TooManyZeroValues, validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, 
validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance, validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison, validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row2_col0\" class=\"data row2 col0\" >classifier_metrics</td>\n", + " <td id=\"T_9e889_row2_col1\" class=\"data row2 col1\" >ClassifierMetrics</td>\n", + " <td id=\"T_9e889_row2_col2\" class=\"data row2 col2\" >Test suite for sklearn classifier metrics</td>\n", + " <td id=\"T_9e889_row2_col3\" class=\"data row2 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row3_col0\" class=\"data row3 col0\" >classifier_model_validation</td>\n", + " <td id=\"T_9e889_row3_col1\" class=\"data row3 col1\" >ClassifierModelValidation</td>\n", + " <td id=\"T_9e889_row3_col2\" class=\"data row3 col2\" >Test suite for binary classification models.</td>\n", + " <td id=\"T_9e889_row3_col3\" class=\"data row3 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, 
validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance, validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison, validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row4_col0\" class=\"data row4 col0\" >classifier_validation</td>\n", + " <td id=\"T_9e889_row4_col1\" class=\"data row4 col1\" >ClassifierPerformance</td>\n", + " <td id=\"T_9e889_row4_col2\" class=\"data row4 col2\" >Test suite for sklearn classifier models</td>\n", + " <td id=\"T_9e889_row4_col3\" class=\"data row4 col3\" >validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row5_col0\" class=\"data row5 col0\" >cluster_full_suite</td>\n", + " <td id=\"T_9e889_row5_col1\" class=\"data row5 col1\" >ClusterFullSuite</td>\n", + " <td id=\"T_9e889_row5_col2\" class=\"data row5 col2\" >Full test suite for clustering models.</td>\n", + " <td id=\"T_9e889_row5_col3\" class=\"data row5 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.HomogeneityScore, 
validmind.model_validation.sklearn.CompletenessScore, validmind.model_validation.sklearn.VMeasure, validmind.model_validation.sklearn.AdjustedRandIndex, validmind.model_validation.sklearn.AdjustedMutualInformation, validmind.model_validation.sklearn.FowlkesMallowsScore, validmind.model_validation.sklearn.ClusterPerformanceMetrics, validmind.model_validation.sklearn.ClusterCosineSimilarity, validmind.model_validation.sklearn.SilhouettePlot, validmind.model_validation.ClusterSizeDistribution, validmind.model_validation.sklearn.HyperParametersTuning, validmind.model_validation.sklearn.KMeansClustersOptimization</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row6_col0\" class=\"data row6 col0\" >cluster_metrics</td>\n", + " <td id=\"T_9e889_row6_col1\" class=\"data row6 col1\" >ClusterMetrics</td>\n", + " <td id=\"T_9e889_row6_col2\" class=\"data row6 col2\" >Test suite for sklearn clustering metrics</td>\n", + " <td id=\"T_9e889_row6_col3\" class=\"data row6 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.HomogeneityScore, validmind.model_validation.sklearn.CompletenessScore, validmind.model_validation.sklearn.VMeasure, validmind.model_validation.sklearn.AdjustedRandIndex, validmind.model_validation.sklearn.AdjustedMutualInformation, validmind.model_validation.sklearn.FowlkesMallowsScore, validmind.model_validation.sklearn.ClusterPerformanceMetrics, validmind.model_validation.sklearn.ClusterCosineSimilarity, validmind.model_validation.sklearn.SilhouettePlot</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row7_col0\" class=\"data row7 col0\" >cluster_performance</td>\n", + " <td id=\"T_9e889_row7_col1\" class=\"data row7 col1\" >ClusterPerformance</td>\n", + " <td id=\"T_9e889_row7_col2\" class=\"data row7 col2\" >Test suite for sklearn cluster performance</td>\n", + " <td id=\"T_9e889_row7_col3\" class=\"data row7 col3\" 
>validmind.model_validation.ClusterSizeDistribution</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row8_col0\" class=\"data row8 col0\" >embeddings_full_suite</td>\n", + " <td id=\"T_9e889_row8_col1\" class=\"data row8 col1\" >EmbeddingsFullSuite</td>\n", + " <td id=\"T_9e889_row8_col2\" class=\"data row8 col2\" >Full test suite for embeddings models.</td>\n", + " <td id=\"T_9e889_row8_col3\" class=\"data row8 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.embeddings.DescriptiveAnalytics, validmind.model_validation.embeddings.CosineSimilarityDistribution, validmind.model_validation.embeddings.ClusterDistribution, validmind.model_validation.embeddings.EmbeddingsVisualization2D, validmind.model_validation.embeddings.StabilityAnalysisRandomNoise, validmind.model_validation.embeddings.StabilityAnalysisSynonyms, validmind.model_validation.embeddings.StabilityAnalysisKeyword, validmind.model_validation.embeddings.StabilityAnalysisTranslation</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row9_col0\" class=\"data row9 col0\" >embeddings_metrics</td>\n", + " <td id=\"T_9e889_row9_col1\" class=\"data row9 col1\" >EmbeddingsMetrics</td>\n", + " <td id=\"T_9e889_row9_col2\" class=\"data row9 col2\" >Test suite for embeddings metrics</td>\n", + " <td id=\"T_9e889_row9_col3\" class=\"data row9 col3\" >validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.embeddings.DescriptiveAnalytics, validmind.model_validation.embeddings.CosineSimilarityDistribution, validmind.model_validation.embeddings.ClusterDistribution, validmind.model_validation.embeddings.EmbeddingsVisualization2D</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row10_col0\" class=\"data row10 col0\" >embeddings_model_performance</td>\n", + " <td id=\"T_9e889_row10_col1\" class=\"data row10 col1\" >EmbeddingsPerformance</td>\n", + " <td id=\"T_9e889_row10_col2\" 
class=\"data row10 col2\" >Test suite for embeddings model performance</td>\n", + " <td id=\"T_9e889_row10_col3\" class=\"data row10 col3\" >validmind.model_validation.embeddings.StabilityAnalysisRandomNoise, validmind.model_validation.embeddings.StabilityAnalysisSynonyms, validmind.model_validation.embeddings.StabilityAnalysisKeyword, validmind.model_validation.embeddings.StabilityAnalysisTranslation</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row11_col0\" class=\"data row11 col0\" >hyper_parameters_optimization</td>\n", + " <td id=\"T_9e889_row11_col1\" class=\"data row11 col1\" >KmeansParametersOptimization</td>\n", + " <td id=\"T_9e889_row11_col2\" class=\"data row11 col2\" >Test suite for sklearn hyperparameters optimization</td>\n", + " <td id=\"T_9e889_row11_col3\" class=\"data row11 col3\" >validmind.model_validation.sklearn.HyperParametersTuning, validmind.model_validation.sklearn.KMeansClustersOptimization</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row12_col0\" class=\"data row12 col0\" >llm_classifier_full_suite</td>\n", + " <td id=\"T_9e889_row12_col1\" class=\"data row12 col1\" >LLMClassifierFullSuite</td>\n", + " <td id=\"T_9e889_row12_col2\" class=\"data row12 col2\" >Full test suite for LLM classification models.</td>\n", + " <td id=\"T_9e889_row12_col3\" class=\"data row12 col3\" >validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.nlp.StopWords, validmind.data_validation.nlp.Punctuations, validmind.data_validation.nlp.CommonWords, validmind.data_validation.nlp.TextDescription, validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, 
validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance, validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison, validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis, validmind.prompt_validation.Bias, validmind.prompt_validation.Clarity, validmind.prompt_validation.Conciseness, validmind.prompt_validation.Delimitation, validmind.prompt_validation.NegativeInstruction, validmind.prompt_validation.Robustness, validmind.prompt_validation.Specificity</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row13_col0\" class=\"data row13 col0\" >prompt_validation</td>\n", + " <td id=\"T_9e889_row13_col1\" class=\"data row13 col1\" >PromptValidation</td>\n", + " <td id=\"T_9e889_row13_col2\" class=\"data row13 col2\" >Test suite for prompt validation</td>\n", + " <td id=\"T_9e889_row13_col3\" class=\"data row13 col3\" >validmind.prompt_validation.Bias, validmind.prompt_validation.Clarity, validmind.prompt_validation.Conciseness, validmind.prompt_validation.Delimitation, validmind.prompt_validation.NegativeInstruction, validmind.prompt_validation.Robustness, validmind.prompt_validation.Specificity</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row14_col0\" class=\"data row14 col0\" >nlp_classifier_full_suite</td>\n", + " <td id=\"T_9e889_row14_col1\" class=\"data row14 col1\" >NLPClassifierFullSuite</td>\n", + " <td id=\"T_9e889_row14_col2\" class=\"data row14 col2\" >Full test suite for NLP classification models.</td>\n", + " <td id=\"T_9e889_row14_col3\" class=\"data row14 col3\" >validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, 
validmind.data_validation.nlp.StopWords, validmind.data_validation.nlp.Punctuations, validmind.data_validation.nlp.CommonWords, validmind.data_validation.nlp.TextDescription, validmind.model_validation.ModelMetadata, validmind.data_validation.DatasetSplit, validmind.model_validation.sklearn.ConfusionMatrix, validmind.model_validation.sklearn.ClassifierPerformance, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.PrecisionRecallCurve, validmind.model_validation.sklearn.ROCCurve, validmind.model_validation.sklearn.PopulationStabilityIndex, validmind.model_validation.sklearn.SHAPGlobalImportance, validmind.model_validation.sklearn.MinimumAccuracy, validmind.model_validation.sklearn.MinimumF1Score, validmind.model_validation.sklearn.MinimumROCAUCScore, validmind.model_validation.sklearn.TrainingTestDegradation, validmind.model_validation.sklearn.ModelsPerformanceComparison, validmind.model_validation.sklearn.OverfitDiagnosis, validmind.model_validation.sklearn.WeakspotsDiagnosis, validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row15_col0\" class=\"data row15 col0\" >regression_metrics</td>\n", + " <td id=\"T_9e889_row15_col1\" class=\"data row15 col1\" >RegressionMetrics</td>\n", + " <td id=\"T_9e889_row15_col2\" class=\"data row15 col2\" >Test suite for performance metrics of regression metrics</td>\n", + " <td id=\"T_9e889_row15_col3\" class=\"data row15 col3\" >validmind.data_validation.DatasetSplit, validmind.model_validation.ModelMetadata, validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row16_col0\" class=\"data row16 col0\" >regression_model_description</td>\n", + " <td id=\"T_9e889_row16_col1\" class=\"data row16 col1\" >RegressionModelDescription</td>\n", + " <td id=\"T_9e889_row16_col2\" class=\"data row16 col2\" >Test suite for performance metric of regression model of 
statsmodels library</td>\n", + " <td id=\"T_9e889_row16_col3\" class=\"data row16 col3\" >validmind.data_validation.DatasetSplit, validmind.model_validation.ModelMetadata</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row17_col0\" class=\"data row17 col0\" >regression_models_evaluation</td>\n", + " <td id=\"T_9e889_row17_col1\" class=\"data row17 col1\" >RegressionModelsEvaluation</td>\n", + " <td id=\"T_9e889_row17_col2\" class=\"data row17 col2\" >Test suite for metrics comparison of regression model of statsmodels library</td>\n", + " <td id=\"T_9e889_row17_col3\" class=\"data row17 col3\" >validmind.model_validation.statsmodels.RegressionModelCoeffs, validmind.model_validation.sklearn.RegressionModelsPerformanceComparison</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row18_col0\" class=\"data row18 col0\" >regression_full_suite</td>\n", + " <td id=\"T_9e889_row18_col1\" class=\"data row18 col1\" >RegressionFullSuite</td>\n", + " <td id=\"T_9e889_row18_col2\" class=\"data row18 col2\" >Full test suite for regression models.</td>\n", + " <td id=\"T_9e889_row18_col3\" class=\"data row18 col3\" >validmind.data_validation.DatasetDescription, validmind.data_validation.DescriptiveStatistics, validmind.data_validation.PearsonCorrelationMatrix, validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.HighCardinality, validmind.data_validation.HighPearsonCorrelation, validmind.data_validation.MissingValues, validmind.data_validation.Skewness, validmind.data_validation.UniqueRows, validmind.data_validation.TooManyZeroValues, validmind.data_validation.DatasetSplit, validmind.model_validation.ModelMetadata, validmind.model_validation.sklearn.PermutationFeatureImportance, validmind.model_validation.sklearn.RegressionErrors, validmind.model_validation.sklearn.RegressionR2Square</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row19_col0\" class=\"data row19 col0\" >regression_performance</td>\n", + 
" <td id=\"T_9e889_row19_col1\" class=\"data row19 col1\" >RegressionPerformance</td>\n", + " <td id=\"T_9e889_row19_col2\" class=\"data row19 col2\" >Test suite for regression model performance</td>\n", + " <td id=\"T_9e889_row19_col3\" class=\"data row19 col3\" >validmind.model_validation.sklearn.RegressionErrors, validmind.model_validation.sklearn.RegressionR2Square</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row20_col0\" class=\"data row20 col0\" >summarization_metrics</td>\n", + " <td id=\"T_9e889_row20_col1\" class=\"data row20 col1\" >SummarizationMetrics</td>\n", + " <td id=\"T_9e889_row20_col2\" class=\"data row20 col2\" >Test suite for Summarization metrics</td>\n", + " <td id=\"T_9e889_row20_col3\" class=\"data row20 col3\" >validmind.model_validation.TokenDisparity, validmind.model_validation.BleuScore, validmind.model_validation.BertScore, validmind.model_validation.ContextualRecall</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row21_col0\" class=\"data row21 col0\" >tabular_dataset</td>\n", + " <td id=\"T_9e889_row21_col1\" class=\"data row21 col1\" >TabularDataset</td>\n", + " <td id=\"T_9e889_row21_col2\" class=\"data row21 col2\" >Test suite for tabular datasets.</td>\n", + " <td id=\"T_9e889_row21_col3\" class=\"data row21 col3\" >validmind.data_validation.DatasetDescription, validmind.data_validation.DescriptiveStatistics, validmind.data_validation.PearsonCorrelationMatrix, validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.HighCardinality, validmind.data_validation.HighPearsonCorrelation, validmind.data_validation.MissingValues, validmind.data_validation.Skewness, validmind.data_validation.UniqueRows, validmind.data_validation.TooManyZeroValues</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row22_col0\" class=\"data row22 col0\" >tabular_dataset_description</td>\n", + " <td id=\"T_9e889_row22_col1\" class=\"data row22 col1\" 
>TabularDatasetDescription</td>\n", + " <td id=\"T_9e889_row22_col2\" class=\"data row22 col2\" >Test suite to extract metadata and descriptive\n", + "statistics from a tabular dataset</td>\n", + " <td id=\"T_9e889_row22_col3\" class=\"data row22 col3\" >validmind.data_validation.DatasetDescription, validmind.data_validation.DescriptiveStatistics, validmind.data_validation.PearsonCorrelationMatrix</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row23_col0\" class=\"data row23 col0\" >tabular_data_quality</td>\n", + " <td id=\"T_9e889_row23_col1\" class=\"data row23 col1\" >TabularDataQuality</td>\n", + " <td id=\"T_9e889_row23_col2\" class=\"data row23 col2\" >Test suite for data quality on tabular datasets</td>\n", + " <td id=\"T_9e889_row23_col3\" class=\"data row23 col3\" >validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.HighCardinality, validmind.data_validation.HighPearsonCorrelation, validmind.data_validation.MissingValues, validmind.data_validation.Skewness, validmind.data_validation.UniqueRows, validmind.data_validation.TooManyZeroValues</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row24_col0\" class=\"data row24 col0\" >text_data_quality</td>\n", + " <td id=\"T_9e889_row24_col1\" class=\"data row24 col1\" >TextDataQuality</td>\n", + " <td id=\"T_9e889_row24_col2\" class=\"data row24 col2\" >Test suite for data quality on text data</td>\n", + " <td id=\"T_9e889_row24_col3\" class=\"data row24 col3\" >validmind.data_validation.ClassImbalance, validmind.data_validation.Duplicates, validmind.data_validation.nlp.StopWords, validmind.data_validation.nlp.Punctuations, validmind.data_validation.nlp.CommonWords, validmind.data_validation.nlp.TextDescription</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row25_col0\" class=\"data row25 col0\" >time_series_data_quality</td>\n", + " <td id=\"T_9e889_row25_col1\" class=\"data row25 col1\" >TimeSeriesDataQuality</td>\n", + " <td 
id=\"T_9e889_row25_col2\" class=\"data row25 col2\" >Test suite for data quality on time series datasets</td>\n", + " <td id=\"T_9e889_row25_col3\" class=\"data row25 col3\" >validmind.data_validation.TimeSeriesOutliers, validmind.data_validation.TimeSeriesMissingValues, validmind.data_validation.TimeSeriesFrequency</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row26_col0\" class=\"data row26 col0\" >time_series_dataset</td>\n", + " <td id=\"T_9e889_row26_col1\" class=\"data row26 col1\" >TimeSeriesDataset</td>\n", + " <td id=\"T_9e889_row26_col2\" class=\"data row26 col2\" >Test suite for time series datasets.</td>\n", + " <td id=\"T_9e889_row26_col3\" class=\"data row26 col3\" >validmind.data_validation.TimeSeriesOutliers, validmind.data_validation.TimeSeriesMissingValues, validmind.data_validation.TimeSeriesFrequency, validmind.data_validation.TimeSeriesLinePlot, validmind.data_validation.TimeSeriesHistogram, validmind.data_validation.ACFandPACFPlot, validmind.data_validation.SeasonalDecompose, validmind.data_validation.AutoSeasonality, validmind.data_validation.AutoStationarity, validmind.data_validation.RollingStatsPlot, validmind.data_validation.AutoAR, validmind.data_validation.AutoMA, validmind.data_validation.ScatterPlot, validmind.data_validation.LaggedCorrelationHeatmap, validmind.data_validation.EngleGrangerCoint, validmind.data_validation.SpreadPlot</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row27_col0\" class=\"data row27 col0\" >time_series_model_validation</td>\n", + " <td id=\"T_9e889_row27_col1\" class=\"data row27 col1\" >TimeSeriesModelValidation</td>\n", + " <td id=\"T_9e889_row27_col2\" class=\"data row27 col2\" >Test suite for time series model validation.</td>\n", + " <td id=\"T_9e889_row27_col3\" class=\"data row27 col3\" >validmind.data_validation.DatasetSplit, validmind.model_validation.ModelMetadata, validmind.model_validation.statsmodels.RegressionModelCoeffs, 
validmind.model_validation.sklearn.RegressionModelsPerformanceComparison</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row28_col0\" class=\"data row28 col0\" >time_series_multivariate</td>\n", + " <td id=\"T_9e889_row28_col1\" class=\"data row28 col1\" >TimeSeriesMultivariate</td>\n", + " <td id=\"T_9e889_row28_col2\" class=\"data row28 col2\" >This test suite provides a preliminary understanding of the features\n", + "and relationship in multivariate dataset. It presents various\n", + "multivariate visualizations that can help identify patterns, trends,\n", + "and relationships between pairs of variables. The visualizations are\n", + "designed to explore the relationships between multiple features\n", + "simultaneously. They allow you to quickly identify any patterns or\n", + "trends in the data, as well as any potential outliers or anomalies.\n", + "The individual feature distribution can also be explored to provide\n", + "insight into the range and frequency of values observed in the data.\n", + "This multivariate analysis test suite aims to provide an overview of\n", + "the data structure and guide further exploration and modeling.</td>\n", + " <td id=\"T_9e889_row28_col3\" class=\"data row28 col3\" >validmind.data_validation.ScatterPlot, validmind.data_validation.LaggedCorrelationHeatmap, validmind.data_validation.EngleGrangerCoint, validmind.data_validation.SpreadPlot</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_9e889_row29_col0\" class=\"data row29 col0\" >time_series_univariate</td>\n", + " <td id=\"T_9e889_row29_col1\" class=\"data row29 col1\" >TimeSeriesUnivariate</td>\n", + " <td id=\"T_9e889_row29_col2\" class=\"data row29 col2\" >This test suite provides a preliminary understanding of the target variable(s)\n", + "used in the time series dataset. 
It provides visualizations that present the raw time\n", + "series data and a histogram of the target variable(s).\n", + "\n", + "The raw time series data provides a visual inspection of the target variable's\n", + "behavior over time. This helps to identify any patterns or trends in the data,\n", + "as well as any potential outliers or anomalies. The histogram of the target\n", + "variable displays the distribution of values, providing insight into the range\n", + "and frequency of values observed in the data.</td>\n", + " <td id=\"T_9e889_row29_col3\" class=\"data row29 col3\" >validmind.data_validation.TimeSeriesLinePlot, validmind.data_validation.TimeSeriesHistogram, validmind.data_validation.ACFandPACFPlot, validmind.data_validation.SeasonalDecompose, validmind.data_validation.AutoSeasonality, validmind.data_validation.AutoStationarity, validmind.data_validation.RollingStatsPlot, validmind.data_validation.AutoAR, validmind.data_validation.AutoMA</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n" + ], + "text/plain": [ + "<pandas.io.formats.style.Styler at 0x16a11ae00>" + ] + } + } ] - }, - "execution_count": 3, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "vm.test_suites.describe_suite(\"classifier_full_suite\", verbose=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### View test details\n", - "\n", - "To inspect a specific test in a suite, pass the name of the test to [tests.describe_test()](https://docs.validmind.ai/validmind/validmind/tests.html#describe_test) to get detailed information about the test such as its purpose, strengths and limitations:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ + }, { - "data": { - "text/html": [ - "\n", - " <div class=\"vm-accordion\" id=\"accordion-c38a3af7\">\n", - " \n", - " <div class=\"vm-accordion-item\">\n", - " <div class=\"vm-accordion-header\"\n", - " 
onclick=\"toggleAccordionItem('accordion-c38a3af7-item-0')\"\n", - " style=\"cursor: pointer; padding: 10px; background-color: #f8f9fa; border: 1px solid #dee2e6; font-weight: bold;\">\n", - " <span class=\"vm-accordion-toggle\" id=\"accordion-c38a3af7-item-0-toggle\">▶</span>\n", - " Test: Descriptive Statistics ('validmind.data_validation.DescriptiveStatistics')\n", - " </div>\n", - " <div class=\"vm-accordion-content\"\n", - " id=\"accordion-c38a3af7-item-0\"\n", - " style=\"display: none; padding: 15px; border: 1px solid #dee2e6; border-top: none;\">\n", - " \n", - "<div>\n", - " <h2>Descriptive Statistics</h2>\n", - " <div style=\"border: 1px solid #ddd; border-radius: 4px; padding: 10px; margin: 10px 0;\">\n", - " <p>Performs a detailed descriptive statistical analysis of both numerical and categorical data within a model's\n", - "dataset.</p>\n", - "<h3>Purpose</h3>\n", - "<p>The purpose of the Descriptive Statistics metric is to provide a comprehensive summary of both numerical and\n", - "categorical data within a dataset. This involves statistics such as count, mean, standard deviation, minimum and\n", - "maximum values for numerical data. For categorical data, it calculates the count, number of unique values, most\n", - "common value and its frequency, and the proportion of the most frequent value relative to the total. The goal is to\n", - "visualize the overall distribution of the variables in the dataset, aiding in understanding the model's behavior\n", - "and predicting its performance.</p>\n", - "<h3>Test Mechanism</h3>\n", - "<p>The testing mechanism utilizes two in-built functions of pandas dataframes: <code>describe()</code> for numerical fields and\n", - "<code>value_counts()</code> for categorical fields. The <code>describe()</code> function pulls out several summary statistics, while\n", - "<code>value_counts()</code> accounts for unique values. 
The resulting data is formatted into two distinct tables, one for\n", - "numerical and another for categorical variable summaries. These tables provide a clear summary of the main\n", - "characteristics of the variables, which can be instrumental in assessing the model's performance.</p>\n", - "<h3>Signs of High Risk</h3>\n", - "<ul>\n", - "<li>Skewed data or significant outliers can represent high risk. For numerical data, this may be reflected via a\n", - "significant difference between the mean and median (50% percentile).</li>\n", - "<li>For categorical data, a lack of diversity (low count of unique values), or overdominance of a single category\n", - "(high frequency of the top value) can indicate high risk.</li>\n", - "</ul>\n", - "<h3>Strengths</h3>\n", - "<ul>\n", - "<li>Provides a comprehensive summary of the dataset, shedding light on the distribution and characteristics of the\n", - "variables under consideration.</li>\n", - "<li>It is a versatile and robust method, applicable to both numerical and categorical data.</li>\n", - "<li>Helps highlight crucial anomalies such as outliers, extreme skewness, or lack of diversity, which are vital in\n", - "understanding model behavior during testing and validation.</li>\n", - "</ul>\n", - "<h3>Limitations</h3>\n", - "<ul>\n", - "<li>While this metric offers a high-level overview of the data, it may fail to detect subtle correlations or complex\n", - "patterns.</li>\n", - "<li>Does not offer any insights on the relationship between variables.</li>\n", - "<li>Alone, descriptive statistics cannot be used to infer properties about future unseen data.</li>\n", - "<li>Should be used in conjunction with other statistical tests to provide a comprehensive understanding of the\n", - "model's data.</li>\n", - "</ul>\n", - "\n", - " </div>\n", - "</div>\n", - "\n", - "<h4 class=\"vm_required_context\">\n", - " Required Inputs: <span style=\"font-size: 13px\"><i>dataset</i></span>\n", - "</h4>\n", - "\n", - "<div 
style=\"display: none;\">\n", - " <h4>Parameters:</h4>\n", - " <table class=\"vm_params_table\" style=\"display: none;\">\n", - " <tr>\n", - " <th>Parameter</th>\n", - " <th>Default Value</th>\n", - " </tr>\n", - " \n", - " </table>\n", - "</div>\n", - "\n", - "<div class=\"unset\">\n", - " <h3>How to Run:</h3>\n", - "\n", - " <button\n", - " onclick=\"(() => {e = document.getElementById('expandable_instructions_7e3e1a19-00f2-4e0b-95b6-720bc7e3ba8b'); e.style.display === 'none' ? e.style.display = 'block' : e.style.display = 'none'})()\"\n", - " >Show/Hide Instructions</button>\n", - "\n", - " <div id=\"expandable_instructions_7e3e1a19-00f2-4e0b-95b6-720bc7e3ba8b\" style=\"display: block;\">\n", - " <h4>Code:</h4>\n", - " <pre>\n", - " <code class='language-python'>\n", - "import validmind as vm\n", - "\n", - "# inputs dictionary maps your inputs to the expected input names\n", - "# keys are the expected input names and values are the actual inputs\n", - "# values may be string input_ids or the actual VMDataset or VMModel objects\n", - "inputs = {\n", - " \"dataset\": \"my_vm_dataset\"\n", - "}\n", - "params = {}\n", - "\n", - "# to run and view the result of this test, run the following code:\n", - "result = vm.tests.run_test(\n", - " \"validmind.data_validation.DescriptiveStatistics\", inputs=inputs, params=params\n", - ")\n", - "\n", - "# To see the result of the test, ensure that you have called `vm.init()` and then run:\n", - "result.log()</code>\n", - " </pre>\n", - " </div>\n", - "</div>\n", - "\n", - "<style>\n", - "h5.vm_required_context {\n", - " margin-top: 25px;\n", - "}\n", - "table.vm_params_table {\n", - " margin-top: 20px;\n", - " width: 350px;\n", - " border-collapse: collapse;\n", - " border-color: --jp-border-color0;\n", - "}\n", - "table.vm_params_table td, table.vm_params_table th {\n", - " text-align: right;\n", - "}\n", - "table.vm_params_table td:first-child, table.vm_params_table th:first-child {\n", - " text-align: left;\n", - "}\n", - 
"table.vm_params_table th {\n", - " background-color: --jp-content-color0;\n", - " font-weight: bold;\n", - " font-size: 14px !important;\n", - "}\n", - "table.vm_params_table tr:nth-child(even) {\n", - " background-color: --jp-layout-color1;\n", - "}\n", - "table.vm_params_table tr:nth-child(odd) {\n", - " background-color: --jp-layout-color2;\n", - "}\n", - "table.vm_params_table tr:hover {\n", - " background-color: --jp-layout-color3;\n", - "}\n", - "table.vm_params_table td, table.vm_params_table th {\n", - " padding: 5px;\n", - " border: .8px solid --jp-border-color0;\n", - "}\n", - "</style>\n", - "\n", - " </div>\n", - " </div>\n", - " \n", - " </div>\n", - "\n", - " <script>\n", - " function toggleAccordionItem(itemId) {\n", - " const content = document.getElementById(itemId);\n", - " const toggle = document.getElementById(itemId + '-toggle');\n", - "\n", - " if (content.style.display === 'none' || content.style.display === '') {\n", - " content.style.display = 'block';\n", - " toggle.innerHTML = '▼';\n", - " } else {\n", - " content.style.display = 'none';\n", - " toggle.innerHTML = '▶';\n", - " }\n", - " }\n", - " </script>\n", - " " + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## View test suite details\n", + "\n", + "Use the [test_suites.describe_suite()](https://docs.validmind.ai/validmind/validmind/test_suites.html#describe_suite) function to retrieve information about a test suite, including its name, description, and the list of tests it contains. \n", + "\n", + "You can call `test_suites.describe_suite()` with just the test suite ID to get basic details, or pass an additional `verbose` parameter for a more comprehensive output: \n", + "\n", + "- **Test suite ID** - The identifier of the test suite you want to inspect.\n", + "- **Verbose** - A Boolean flag. Set `verbose=True` to return a full breakdown of the test suite." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.test_suites.describe_suite(\"classifier_full_suite\", verbose=True)" ], - "text/plain": [ - "<IPython.core.display.HTML object>" + "execution_count": null, + "outputs": [ + { + "output_type": "execute_result", + "data": { + "text/html": [ + "<style type=\"text/css\">\n", + "#T_7cb1b th {\n", + " text-align: left;\n", + "}\n", + "#T_7cb1b_row0_col0, #T_7cb1b_row0_col1, #T_7cb1b_row0_col2, #T_7cb1b_row0_col3, #T_7cb1b_row0_col4, #T_7cb1b_row1_col0, #T_7cb1b_row1_col1, #T_7cb1b_row1_col2, #T_7cb1b_row1_col3, #T_7cb1b_row1_col4, #T_7cb1b_row2_col0, #T_7cb1b_row2_col1, #T_7cb1b_row2_col2, #T_7cb1b_row2_col3, #T_7cb1b_row2_col4, #T_7cb1b_row3_col0, #T_7cb1b_row3_col1, #T_7cb1b_row3_col2, #T_7cb1b_row3_col3, #T_7cb1b_row3_col4, #T_7cb1b_row4_col0, #T_7cb1b_row4_col1, #T_7cb1b_row4_col2, #T_7cb1b_row4_col3, #T_7cb1b_row4_col4, #T_7cb1b_row5_col0, #T_7cb1b_row5_col1, #T_7cb1b_row5_col2, #T_7cb1b_row5_col3, #T_7cb1b_row5_col4, #T_7cb1b_row6_col0, #T_7cb1b_row6_col1, #T_7cb1b_row6_col2, #T_7cb1b_row6_col3, #T_7cb1b_row6_col4, #T_7cb1b_row7_col0, #T_7cb1b_row7_col1, #T_7cb1b_row7_col2, #T_7cb1b_row7_col3, #T_7cb1b_row7_col4, #T_7cb1b_row8_col0, #T_7cb1b_row8_col1, #T_7cb1b_row8_col2, #T_7cb1b_row8_col3, #T_7cb1b_row8_col4, #T_7cb1b_row9_col0, #T_7cb1b_row9_col1, #T_7cb1b_row9_col2, #T_7cb1b_row9_col3, #T_7cb1b_row9_col4, #T_7cb1b_row10_col0, #T_7cb1b_row10_col1, #T_7cb1b_row10_col2, #T_7cb1b_row10_col3, #T_7cb1b_row10_col4, #T_7cb1b_row11_col0, #T_7cb1b_row11_col1, #T_7cb1b_row11_col2, #T_7cb1b_row11_col3, #T_7cb1b_row11_col4, #T_7cb1b_row12_col0, #T_7cb1b_row12_col1, #T_7cb1b_row12_col2, #T_7cb1b_row12_col3, #T_7cb1b_row12_col4, #T_7cb1b_row13_col0, #T_7cb1b_row13_col1, #T_7cb1b_row13_col2, #T_7cb1b_row13_col3, #T_7cb1b_row13_col4, #T_7cb1b_row14_col0, #T_7cb1b_row14_col1, #T_7cb1b_row14_col2, #T_7cb1b_row14_col3, #T_7cb1b_row14_col4, #T_7cb1b_row15_col0, #T_7cb1b_row15_col1, #T_7cb1b_row15_col2, 
#T_7cb1b_row15_col3, #T_7cb1b_row15_col4, #T_7cb1b_row16_col0, #T_7cb1b_row16_col1, #T_7cb1b_row16_col2, #T_7cb1b_row16_col3, #T_7cb1b_row16_col4, #T_7cb1b_row17_col0, #T_7cb1b_row17_col1, #T_7cb1b_row17_col2, #T_7cb1b_row17_col3, #T_7cb1b_row17_col4, #T_7cb1b_row18_col0, #T_7cb1b_row18_col1, #T_7cb1b_row18_col2, #T_7cb1b_row18_col3, #T_7cb1b_row18_col4, #T_7cb1b_row19_col0, #T_7cb1b_row19_col1, #T_7cb1b_row19_col2, #T_7cb1b_row19_col3, #T_7cb1b_row19_col4, #T_7cb1b_row20_col0, #T_7cb1b_row20_col1, #T_7cb1b_row20_col2, #T_7cb1b_row20_col3, #T_7cb1b_row20_col4, #T_7cb1b_row21_col0, #T_7cb1b_row21_col1, #T_7cb1b_row21_col2, #T_7cb1b_row21_col3, #T_7cb1b_row21_col4, #T_7cb1b_row22_col0, #T_7cb1b_row22_col1, #T_7cb1b_row22_col2, #T_7cb1b_row22_col3, #T_7cb1b_row22_col4, #T_7cb1b_row23_col0, #T_7cb1b_row23_col1, #T_7cb1b_row23_col2, #T_7cb1b_row23_col3, #T_7cb1b_row23_col4, #T_7cb1b_row24_col0, #T_7cb1b_row24_col1, #T_7cb1b_row24_col2, #T_7cb1b_row24_col3, #T_7cb1b_row24_col4, #T_7cb1b_row25_col0, #T_7cb1b_row25_col1, #T_7cb1b_row25_col2, #T_7cb1b_row25_col3, #T_7cb1b_row25_col4, #T_7cb1b_row26_col0, #T_7cb1b_row26_col1, #T_7cb1b_row26_col2, #T_7cb1b_row26_col3, #T_7cb1b_row26_col4, #T_7cb1b_row27_col0, #T_7cb1b_row27_col1, #T_7cb1b_row27_col2, #T_7cb1b_row27_col3, #T_7cb1b_row27_col4 {\n", + " text-align: left;\n", + "}\n", + "</style>\n", + "<table id=\"T_7cb1b\">\n", + " <thead>\n", + " <tr>\n", + " <th id=\"T_7cb1b_level0_col0\" class=\"col_heading level0 col0\" >Test Suite ID</th>\n", + " <th id=\"T_7cb1b_level0_col1\" class=\"col_heading level0 col1\" >Test Suite Name</th>\n", + " <th id=\"T_7cb1b_level0_col2\" class=\"col_heading level0 col2\" >Test Suite Section</th>\n", + " <th id=\"T_7cb1b_level0_col3\" class=\"col_heading level0 col3\" >Test ID</th>\n", + " <th id=\"T_7cb1b_level0_col4\" class=\"col_heading level0 col4\" >Test Name</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row0_col0\" class=\"data row0 col0\" 
>classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row0_col1\" class=\"data row0 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row0_col2\" class=\"data row0 col2\" >tabular_dataset_description</td>\n", + " <td id=\"T_7cb1b_row0_col3\" class=\"data row0 col3\" >validmind.data_validation.DatasetDescription</td>\n", + " <td id=\"T_7cb1b_row0_col4\" class=\"data row0 col4\" >Dataset Description</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row1_col0\" class=\"data row1 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row1_col1\" class=\"data row1 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row1_col2\" class=\"data row1 col2\" >tabular_dataset_description</td>\n", + " <td id=\"T_7cb1b_row1_col3\" class=\"data row1 col3\" >validmind.data_validation.DescriptiveStatistics</td>\n", + " <td id=\"T_7cb1b_row1_col4\" class=\"data row1 col4\" >Descriptive Statistics</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row2_col0\" class=\"data row2 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row2_col1\" class=\"data row2 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row2_col2\" class=\"data row2 col2\" >tabular_dataset_description</td>\n", + " <td id=\"T_7cb1b_row2_col3\" class=\"data row2 col3\" >validmind.data_validation.PearsonCorrelationMatrix</td>\n", + " <td id=\"T_7cb1b_row2_col4\" class=\"data row2 col4\" >Pearson Correlation Matrix</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row3_col0\" class=\"data row3 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row3_col1\" class=\"data row3 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row3_col2\" class=\"data row3 col2\" >tabular_data_quality</td>\n", + " <td id=\"T_7cb1b_row3_col3\" class=\"data row3 col3\" >validmind.data_validation.ClassImbalance</td>\n", + " <td id=\"T_7cb1b_row3_col4\" class=\"data row3 col4\" >Class Imbalance</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row4_col0\" class=\"data 
row4 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row4_col1\" class=\"data row4 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row4_col2\" class=\"data row4 col2\" >tabular_data_quality</td>\n", + " <td id=\"T_7cb1b_row4_col3\" class=\"data row4 col3\" >validmind.data_validation.Duplicates</td>\n", + " <td id=\"T_7cb1b_row4_col4\" class=\"data row4 col4\" >Duplicates</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row5_col0\" class=\"data row5 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row5_col1\" class=\"data row5 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row5_col2\" class=\"data row5 col2\" >tabular_data_quality</td>\n", + " <td id=\"T_7cb1b_row5_col3\" class=\"data row5 col3\" >validmind.data_validation.HighCardinality</td>\n", + " <td id=\"T_7cb1b_row5_col4\" class=\"data row5 col4\" >High Cardinality</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row6_col0\" class=\"data row6 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row6_col1\" class=\"data row6 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row6_col2\" class=\"data row6 col2\" >tabular_data_quality</td>\n", + " <td id=\"T_7cb1b_row6_col3\" class=\"data row6 col3\" >validmind.data_validation.HighPearsonCorrelation</td>\n", + " <td id=\"T_7cb1b_row6_col4\" class=\"data row6 col4\" >High Pearson Correlation</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row7_col0\" class=\"data row7 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row7_col1\" class=\"data row7 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row7_col2\" class=\"data row7 col2\" >tabular_data_quality</td>\n", + " <td id=\"T_7cb1b_row7_col3\" class=\"data row7 col3\" >validmind.data_validation.MissingValues</td>\n", + " <td id=\"T_7cb1b_row7_col4\" class=\"data row7 col4\" >Missing Values</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row8_col0\" class=\"data row8 col0\" >classifier_full_suite</td>\n", 
+ " <td id=\"T_7cb1b_row8_col1\" class=\"data row8 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row8_col2\" class=\"data row8 col2\" >tabular_data_quality</td>\n", + " <td id=\"T_7cb1b_row8_col3\" class=\"data row8 col3\" >validmind.data_validation.Skewness</td>\n", + " <td id=\"T_7cb1b_row8_col4\" class=\"data row8 col4\" >Skewness</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row9_col0\" class=\"data row9 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row9_col1\" class=\"data row9 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row9_col2\" class=\"data row9 col2\" >tabular_data_quality</td>\n", + " <td id=\"T_7cb1b_row9_col3\" class=\"data row9 col3\" >validmind.data_validation.UniqueRows</td>\n", + " <td id=\"T_7cb1b_row9_col4\" class=\"data row9 col4\" >Unique Rows</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row10_col0\" class=\"data row10 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row10_col1\" class=\"data row10 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row10_col2\" class=\"data row10 col2\" >tabular_data_quality</td>\n", + " <td id=\"T_7cb1b_row10_col3\" class=\"data row10 col3\" >validmind.data_validation.TooManyZeroValues</td>\n", + " <td id=\"T_7cb1b_row10_col4\" class=\"data row10 col4\" >Too Many Zero Values</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row11_col0\" class=\"data row11 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row11_col1\" class=\"data row11 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row11_col2\" class=\"data row11 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row11_col3\" class=\"data row11 col3\" >validmind.model_validation.ModelMetadata</td>\n", + " <td id=\"T_7cb1b_row11_col4\" class=\"data row11 col4\" >Model Metadata</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row12_col0\" class=\"data row12 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row12_col1\" 
class=\"data row12 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row12_col2\" class=\"data row12 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row12_col3\" class=\"data row12 col3\" >validmind.data_validation.DatasetSplit</td>\n", + " <td id=\"T_7cb1b_row12_col4\" class=\"data row12 col4\" >Dataset Split</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row13_col0\" class=\"data row13 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row13_col1\" class=\"data row13 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row13_col2\" class=\"data row13 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row13_col3\" class=\"data row13 col3\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", + " <td id=\"T_7cb1b_row13_col4\" class=\"data row13 col4\" >Confusion Matrix</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row14_col0\" class=\"data row14 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row14_col1\" class=\"data row14 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row14_col2\" class=\"data row14 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row14_col3\" class=\"data row14 col3\" >validmind.model_validation.sklearn.ClassifierPerformance</td>\n", + " <td id=\"T_7cb1b_row14_col4\" class=\"data row14 col4\" >Classifier Performance</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row15_col0\" class=\"data row15 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row15_col1\" class=\"data row15 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row15_col2\" class=\"data row15 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row15_col3\" class=\"data row15 col3\" >validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", + " <td id=\"T_7cb1b_row15_col4\" class=\"data row15 col4\" >Permutation Feature Importance</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row16_col0\" class=\"data row16 col0\" 
>classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row16_col1\" class=\"data row16 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row16_col2\" class=\"data row16 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row16_col3\" class=\"data row16 col3\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", + " <td id=\"T_7cb1b_row16_col4\" class=\"data row16 col4\" >Precision Recall Curve</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row17_col0\" class=\"data row17 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row17_col1\" class=\"data row17 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row17_col2\" class=\"data row17 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row17_col3\" class=\"data row17 col3\" >validmind.model_validation.sklearn.ROCCurve</td>\n", + " <td id=\"T_7cb1b_row17_col4\" class=\"data row17 col4\" >ROC Curve</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row18_col0\" class=\"data row18 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row18_col1\" class=\"data row18 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row18_col2\" class=\"data row18 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row18_col3\" class=\"data row18 col3\" >validmind.model_validation.sklearn.PopulationStabilityIndex</td>\n", + " <td id=\"T_7cb1b_row18_col4\" class=\"data row18 col4\" >Population Stability Index</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row19_col0\" class=\"data row19 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row19_col1\" class=\"data row19 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row19_col2\" class=\"data row19 col2\" >classifier_metrics</td>\n", + " <td id=\"T_7cb1b_row19_col3\" class=\"data row19 col3\" >validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", + " <td id=\"T_7cb1b_row19_col4\" class=\"data row19 col4\" >SHAP Global Importance</td>\n", + " </tr>\n", + " <tr>\n", + " <td 
id=\"T_7cb1b_row20_col0\" class=\"data row20 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row20_col1\" class=\"data row20 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row20_col2\" class=\"data row20 col2\" >classifier_validation</td>\n", + " <td id=\"T_7cb1b_row20_col3\" class=\"data row20 col3\" >validmind.model_validation.sklearn.MinimumAccuracy</td>\n", + " <td id=\"T_7cb1b_row20_col4\" class=\"data row20 col4\" >Minimum Accuracy</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row21_col0\" class=\"data row21 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row21_col1\" class=\"data row21 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row21_col2\" class=\"data row21 col2\" >classifier_validation</td>\n", + " <td id=\"T_7cb1b_row21_col3\" class=\"data row21 col3\" >validmind.model_validation.sklearn.MinimumF1Score</td>\n", + " <td id=\"T_7cb1b_row21_col4\" class=\"data row21 col4\" >Minimum F1 Score</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row22_col0\" class=\"data row22 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row22_col1\" class=\"data row22 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row22_col2\" class=\"data row22 col2\" >classifier_validation</td>\n", + " <td id=\"T_7cb1b_row22_col3\" class=\"data row22 col3\" >validmind.model_validation.sklearn.MinimumROCAUCScore</td>\n", + " <td id=\"T_7cb1b_row22_col4\" class=\"data row22 col4\" >Minimum ROCAUC Score</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row23_col0\" class=\"data row23 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row23_col1\" class=\"data row23 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row23_col2\" class=\"data row23 col2\" >classifier_validation</td>\n", + " <td id=\"T_7cb1b_row23_col3\" class=\"data row23 col3\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", + " <td id=\"T_7cb1b_row23_col4\" class=\"data row23 col4\" >Training 
Test Degradation</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row24_col0\" class=\"data row24 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row24_col1\" class=\"data row24 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row24_col2\" class=\"data row24 col2\" >classifier_validation</td>\n", + " <td id=\"T_7cb1b_row24_col3\" class=\"data row24 col3\" >validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", + " <td id=\"T_7cb1b_row24_col4\" class=\"data row24 col4\" >Models Performance Comparison</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row25_col0\" class=\"data row25 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row25_col1\" class=\"data row25 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row25_col2\" class=\"data row25 col2\" >classifier_model_diagnosis</td>\n", + " <td id=\"T_7cb1b_row25_col3\" class=\"data row25 col3\" >validmind.model_validation.sklearn.OverfitDiagnosis</td>\n", + " <td id=\"T_7cb1b_row25_col4\" class=\"data row25 col4\" >Overfit Diagnosis</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row26_col0\" class=\"data row26 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row26_col1\" class=\"data row26 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row26_col2\" class=\"data row26 col2\" >classifier_model_diagnosis</td>\n", + " <td id=\"T_7cb1b_row26_col3\" class=\"data row26 col3\" >validmind.model_validation.sklearn.WeakspotsDiagnosis</td>\n", + " <td id=\"T_7cb1b_row26_col4\" class=\"data row26 col4\" >Weakspots Diagnosis</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_7cb1b_row27_col0\" class=\"data row27 col0\" >classifier_full_suite</td>\n", + " <td id=\"T_7cb1b_row27_col1\" class=\"data row27 col1\" >ClassifierFullSuite</td>\n", + " <td id=\"T_7cb1b_row27_col2\" class=\"data row27 col2\" >classifier_model_diagnosis</td>\n", + " <td id=\"T_7cb1b_row27_col3\" class=\"data row27 col3\" 
>validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", + " <td id=\"T_7cb1b_row27_col4\" class=\"data row27 col4\" >Robustness Diagnosis</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n" + ], + "text/plain": [ + "<pandas.io.formats.style.Styler at 0x16a167fa0>" + ] + } + } ] - }, - "metadata": {}, - "output_type": "display_data" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### View test details\n", + "\n", + "To inspect a specific test in a suite, pass the name of the test to [tests.describe_test()](https://docs.validmind.ai/validmind/validmind/tests.html#describe_test) to get detailed information about the test such as its purpose, strengths and limitations:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.describe_test(\"validmind.data_validation.DescriptiveStatistics\")" + ], + "execution_count": null, + "outputs": [ + { + "output_type": "display_data", + "data": { + "text/html": [ + "\n", + " <div class=\"vm-accordion\" id=\"accordion-c38a3af7\">\n", + " \n", + " <div class=\"vm-accordion-item\">\n", + " <div class=\"vm-accordion-header\"\n", + " onclick=\"toggleAccordionItem('accordion-c38a3af7-item-0')\"\n", + " style=\"cursor: pointer; padding: 10px; background-color: #f8f9fa; border: 1px solid #dee2e6; font-weight: bold;\">\n", + " <span class=\"vm-accordion-toggle\" id=\"accordion-c38a3af7-item-0-toggle\">▶</span>\n", + " Test: Descriptive Statistics ('validmind.data_validation.DescriptiveStatistics')\n", + " </div>\n", + " <div class=\"vm-accordion-content\"\n", + " id=\"accordion-c38a3af7-item-0\"\n", + " style=\"display: none; padding: 15px; border: 1px solid #dee2e6; border-top: none;\">\n", + " \n", + "<div>\n", + " <h2>Descriptive Statistics</h2>\n", + " <div style=\"border: 1px solid #ddd; border-radius: 4px; padding: 10px; margin: 10px 0;\">\n", + " <p>Performs a detailed descriptive statistical analysis of both numerical and categorical 
data within a model's\n", + "dataset.</p>\n", + "<h3>Purpose</h3>\n", + "<p>The purpose of the Descriptive Statistics metric is to provide a comprehensive summary of both numerical and\n", + "categorical data within a dataset. This involves statistics such as count, mean, standard deviation, minimum and\n", + "maximum values for numerical data. For categorical data, it calculates the count, number of unique values, most\n", + "common value and its frequency, and the proportion of the most frequent value relative to the total. The goal is to\n", + "visualize the overall distribution of the variables in the dataset, aiding in understanding the model's behavior\n", + "and predicting its performance.</p>\n", + "<h3>Test Mechanism</h3>\n", + "<p>The testing mechanism utilizes two in-built functions of pandas dataframes: <code>describe()</code> for numerical fields and\n", + "<code>value_counts()</code> for categorical fields. The <code>describe()</code> function pulls out several summary statistics, while\n", + "<code>value_counts()</code> accounts for unique values. The resulting data is formatted into two distinct tables, one for\n", + "numerical and another for categorical variable summaries. These tables provide a clear summary of the main\n", + "characteristics of the variables, which can be instrumental in assessing the model's performance.</p>\n", + "<h3>Signs of High Risk</h3>\n", + "<ul>\n", + "<li>Skewed data or significant outliers can represent high risk. 
For numerical data, this may be reflected via a\n", + "significant difference between the mean and median (50% percentile).</li>\n", + "<li>For categorical data, a lack of diversity (low count of unique values), or overdominance of a single category\n", + "(high frequency of the top value) can indicate high risk.</li>\n", + "</ul>\n", + "<h3>Strengths</h3>\n", + "<ul>\n", + "<li>Provides a comprehensive summary of the dataset, shedding light on the distribution and characteristics of the\n", + "variables under consideration.</li>\n", + "<li>It is a versatile and robust method, applicable to both numerical and categorical data.</li>\n", + "<li>Helps highlight crucial anomalies such as outliers, extreme skewness, or lack of diversity, which are vital in\n", + "understanding model behavior during testing and validation.</li>\n", + "</ul>\n", + "<h3>Limitations</h3>\n", + "<ul>\n", + "<li>While this metric offers a high-level overview of the data, it may fail to detect subtle correlations or complex\n", + "patterns.</li>\n", + "<li>Does not offer any insights on the relationship between variables.</li>\n", + "<li>Alone, descriptive statistics cannot be used to infer properties about future unseen data.</li>\n", + "<li>Should be used in conjunction with other statistical tests to provide a comprehensive understanding of the\n", + "model's data.</li>\n", + "</ul>\n", + "\n", + " </div>\n", + "</div>\n", + "\n", + "<h4 class=\"vm_required_context\">\n", + " Required Inputs: <span style=\"font-size: 13px\"><i>dataset</i></span>\n", + "</h4>\n", + "\n", + "<div style=\"display: none;\">\n", + " <h4>Parameters:</h4>\n", + " <table class=\"vm_params_table\" style=\"display: none;\">\n", + " <tr>\n", + " <th>Parameter</th>\n", + " <th>Default Value</th>\n", + " </tr>\n", + " \n", + " </table>\n", + "</div>\n", + "\n", + "<div class=\"unset\">\n", + " <h3>How to Run:</h3>\n", + "\n", + " <button\n", + " onclick=\"(() => {e = 
document.getElementById('expandable_instructions_7e3e1a19-00f2-4e0b-95b6-720bc7e3ba8b'); e.style.display === 'none' ? e.style.display = 'block' : e.style.display = 'none'})()\"\n", + " >Show/Hide Instructions</button>\n", + "\n", + " <div id=\"expandable_instructions_7e3e1a19-00f2-4e0b-95b6-720bc7e3ba8b\" style=\"display: block;\">\n", + " <h4>Code:</h4>\n", + " <pre>\n", + " <code class='language-python'>\n", + "import validmind as vm\n", + "\n", + "# inputs dictionary maps your inputs to the expected input names\n", + "# keys are the expected input names and values are the actual inputs\n", + "# values may be string input_ids or the actual VMDataset or VMModel objects\n", + "inputs = {\n", + " \"dataset\": \"my_vm_dataset\"\n", + "}\n", + "params = {}\n", + "\n", + "# to run and view the result of this test, run the following code:\n", + "result = vm.tests.run_test(\n", + " \"validmind.data_validation.DescriptiveStatistics\", inputs=inputs, params=params\n", + ")\n", + "\n", + "# To see the result of the test, ensure that you have called `vm.init()` and then run:\n", + "result.log()</code>\n", + " </pre>\n", + " </div>\n", + "</div>\n", + "\n", + "<style>\n", + "h5.vm_required_context {\n", + " margin-top: 25px;\n", + "}\n", + "table.vm_params_table {\n", + " margin-top: 20px;\n", + " width: 350px;\n", + " border-collapse: collapse;\n", + " border-color: --jp-border-color0;\n", + "}\n", + "table.vm_params_table td, table.vm_params_table th {\n", + " text-align: right;\n", + "}\n", + "table.vm_params_table td:first-child, table.vm_params_table th:first-child {\n", + " text-align: left;\n", + "}\n", + "table.vm_params_table th {\n", + " background-color: --jp-content-color0;\n", + " font-weight: bold;\n", + " font-size: 14px !important;\n", + "}\n", + "table.vm_params_table tr:nth-child(even) {\n", + " background-color: --jp-layout-color1;\n", + "}\n", + "table.vm_params_table tr:nth-child(odd) {\n", + " background-color: --jp-layout-color2;\n", + "}\n", + 
"table.vm_params_table tr:hover {\n", + " background-color: --jp-layout-color3;\n", + "}\n", + "table.vm_params_table td, table.vm_params_table th {\n", + " padding: 5px;\n", + " border: .8px solid --jp-border-color0;\n", + "}\n", + "</style>\n", + "\n", + " </div>\n", + " </div>\n", + " \n", + " </div>\n", + "\n", + " <script>\n", + " function toggleAccordionItem(itemId) {\n", + " const content = document.getElementById(itemId);\n", + " const toggle = document.getElementById(itemId + '-toggle');\n", + "\n", + " if (content.style.display === 'none' || content.style.display === '') {\n", + " content.style.display = 'block';\n", + " toggle.innerHTML = '▼';\n", + " } else {\n", + " content.style.display = 'none';\n", + " toggle.innerHTML = '▶';\n", + " }\n", + " }\n", + " </script>\n", + " " + ], + "text/plain": [ + "<IPython.core.display.HTML object>" + ] + } + } + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "Now that you’ve learned how to identify ValidMind test suites relevant to your use cases, we encourage you to explore our interactive notebooks to discover additional tests, learn how to run them, and effectively document your records (models).\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn more about the individual tests available in the ValidMind Library</b></span>\n", + "<br></br>\n", + "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" style=\"color: #DE257E;\"><b>Explore tests</b></a> notebook for more code examples and usage of key functions.</div>\n", + "\n", + "<a id='toc5_1__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to 
help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you'll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-daee3ccea95b41b4b4bc81230a4a55f5" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" } - ], - "source": [ - "vm.tests.describe_test(\"validmind.data_validation.DescriptiveStatistics\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "Now that you’ve learned how to identify ValidMind test suites relevant to your use cases, we encourage you to explore our interactive notebooks to discover additional tests, learn how to run them, and effectively document your records (models).\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn more about the individual tests available in the ValidMind Library</b></span>\n", - "<br></br>\n", - "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" style=\"color: #DE257E;\"><b>Explore tests</b></a> notebook for more code examples and usage of key functions.</div>\n", - "\n", - "<a id='toc5_1__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test 
suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you'll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-daee3ccea95b41b4b4bc81230a4a55f5", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} + "nbformat": 4, + "nbformat_minor": 2 +} \ No newline at end of file diff --git a/site/notebooks/how_to/tests/explore_tests/explore_tests.ipynb b/site/notebooks/how_to/tests/explore_tests/explore_tests.ipynb index 015777bfe4..048459ea72 100644 --- a/site/notebooks/how_to/tests/explore_tests/explore_tests.ipynb +++ b/site/notebooks/how_to/tests/explore_tests/explore_tests.ipynb @@ -1,4463 +1,4469 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Explore tests\n", - "\n", - "Explore the individual out-the-box tests available in the ValidMind Library, and identify which tests to run to evaluate different aspects of your model. Browse available tests, view their descriptions, and filter by tags or task type to find tests relevant to your use case." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Install the ValidMind Library](#toc2__) \n", - "- [List all available tests](#toc3__) \n", - "- [Understand tags and task types](#toc4__) \n", - "- [Filter tests by tags and task types](#toc5__) \n", - "- [Store test sets for use](#toc6__) \n", - "- [Next steps](#toc7__) \n", - " - [Discover more learning resources](#toc7_1__) \n", - "- [Upgrade ValidMind](#toc8__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. 
\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. 
Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## List all available tests" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Start by importing the functions from the [validmind.tests](https://docs.validmind.ai/validmind/validmind/tests.html) module for listing tests, listing tasks, listing tags, and listing tasks and tags to access these functions in the rest of this notebook:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.tests import (\n", - " list_tests,\n", - 
" list_tasks,\n", - " list_tags,\n", - " list_tasks_and_tags,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Use [list_tests()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to retrieve all available ValidMind tests, which returns a DataFrame with the following columns:\n", - "\n", - "- **ID** – A unique identifier for each test.\n", - "- **Name** – The test’s name.\n", - "- **Description** – A short summary of what the test evaluates.\n", - "- **Tags** – Keywords that describe what the test does or applies to.\n", - "- **Tasks** – The type of modeling task the test supports." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Explore tests\n", + "\n", + "Explore the individual out-the-box tests available in the ValidMind Library, and identify which tests to run to evaluate different aspects of your model. Browse available tests, view their descriptions, and filter by tags or task type to find tests relevant to your use case." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Install the ValidMind Library](#toc2__) \n", + "- [List all available tests](#toc3__) \n", + "- [Understand tags and task types](#toc4__) \n", + "- [Filter tests by tags and task types](#toc5__) \n", + "- [Store test sets for use](#toc6__) \n", + "- [Next steps](#toc7__) \n", + " - [Discover more learning resources](#toc7_1__) \n", + "- [Upgrade ValidMind](#toc8__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. 
\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%pip install -q validmind" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## List all available tests" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Start by importing the functions from the [validmind.tests](https://docs.validmind.ai/validmind/validmind/tests.html) module for listing tests, listing tasks, listing tags, and listing tasks and tags to access these functions in the rest of this notebook:" + ] + }, { - "data": { - "text/html": [ - "<style type=\"text/css\">\n", - "#T_0502a th {\n", - " text-align: left;\n", - "}\n", - "#T_0502a_row0_col0, #T_0502a_row0_col1, #T_0502a_row0_col2, #T_0502a_row0_col3, #T_0502a_row0_col4, #T_0502a_row0_col5, #T_0502a_row0_col6, #T_0502a_row0_col7, #T_0502a_row0_col8, #T_0502a_row1_col0, #T_0502a_row1_col1, #T_0502a_row1_col2, #T_0502a_row1_col3, #T_0502a_row1_col4, #T_0502a_row1_col5, #T_0502a_row1_col6, #T_0502a_row1_col7, #T_0502a_row1_col8, #T_0502a_row2_col0, #T_0502a_row2_col1, #T_0502a_row2_col2, #T_0502a_row2_col3, #T_0502a_row2_col4, #T_0502a_row2_col5, #T_0502a_row2_col6, #T_0502a_row2_col7, #T_0502a_row2_col8, #T_0502a_row3_col0, #T_0502a_row3_col1, #T_0502a_row3_col2, #T_0502a_row3_col3, #T_0502a_row3_col4, #T_0502a_row3_col5, #T_0502a_row3_col6, #T_0502a_row3_col7, 
#T_0502a_row189_col2, #T_0502a_row189_col3, #T_0502a_row189_col4, #T_0502a_row189_col5, #T_0502a_row189_col6, #T_0502a_row189_col7, #T_0502a_row189_col8, #T_0502a_row190_col0, #T_0502a_row190_col1, #T_0502a_row190_col2, #T_0502a_row190_col3, #T_0502a_row190_col4, #T_0502a_row190_col5, #T_0502a_row190_col6, #T_0502a_row190_col7, #T_0502a_row190_col8, #T_0502a_row191_col0, #T_0502a_row191_col1, #T_0502a_row191_col2, #T_0502a_row191_col3, #T_0502a_row191_col4, #T_0502a_row191_col5, #T_0502a_row191_col6, #T_0502a_row191_col7, #T_0502a_row191_col8, #T_0502a_row192_col0, #T_0502a_row192_col1, #T_0502a_row192_col2, #T_0502a_row192_col3, #T_0502a_row192_col4, #T_0502a_row192_col5, #T_0502a_row192_col6, #T_0502a_row192_col7, #T_0502a_row192_col8, #T_0502a_row193_col0, #T_0502a_row193_col1, #T_0502a_row193_col2, #T_0502a_row193_col3, #T_0502a_row193_col4, #T_0502a_row193_col5, #T_0502a_row193_col6, #T_0502a_row193_col7, #T_0502a_row193_col8, #T_0502a_row194_col0, #T_0502a_row194_col1, #T_0502a_row194_col2, #T_0502a_row194_col3, #T_0502a_row194_col4, #T_0502a_row194_col5, #T_0502a_row194_col6, #T_0502a_row194_col7, #T_0502a_row194_col8 {\n", - " text-align: left;\n", - "}\n", - "</style>\n", - "<table id=\"T_0502a\">\n", - " <thead>\n", - " <tr>\n", - " <th id=\"T_0502a_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", - " <th id=\"T_0502a_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", - " <th id=\"T_0502a_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", - " <th id=\"T_0502a_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", - " <th id=\"T_0502a_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", - " <th id=\"T_0502a_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", - " <th id=\"T_0502a_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", - " <th id=\"T_0502a_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", - " <th id=\"T_0502a_level0_col8\" 
class=\"col_heading level0 col8\" >Tasks</th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " <tr>\n", - " <td id=\"T_0502a_row0_col0\" class=\"data row0 col0\" >validmind.data_validation.ACFandPACFPlot</td>\n", - " <td id=\"T_0502a_row0_col1\" class=\"data row0 col1\" >AC Fand PACF Plot</td>\n", - " <td id=\"T_0502a_row0_col2\" class=\"data row0 col2\" >Analyzes time series data using Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots to...</td>\n", - " <td id=\"T_0502a_row0_col3\" class=\"data row0 col3\" >True</td>\n", - " <td id=\"T_0502a_row0_col4\" class=\"data row0 col4\" >False</td>\n", - " <td id=\"T_0502a_row0_col5\" class=\"data row0 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row0_col6\" class=\"data row0 col6\" >{}</td>\n", - " <td id=\"T_0502a_row0_col7\" class=\"data row0 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'visualization']</td>\n", - " <td id=\"T_0502a_row0_col8\" class=\"data row0 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row1_col0\" class=\"data row1 col0\" >validmind.data_validation.ADF</td>\n", - " <td id=\"T_0502a_row1_col1\" class=\"data row1 col1\" >ADF</td>\n", - " <td id=\"T_0502a_row1_col2\" class=\"data row1 col2\" >Assesses the stationarity of a time series dataset using the Augmented Dickey-Fuller (ADF) test....</td>\n", - " <td id=\"T_0502a_row1_col3\" class=\"data row1 col3\" >False</td>\n", - " <td id=\"T_0502a_row1_col4\" class=\"data row1 col4\" >True</td>\n", - " <td id=\"T_0502a_row1_col5\" class=\"data row1 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row1_col6\" class=\"data row1 col6\" >{}</td>\n", - " <td id=\"T_0502a_row1_col7\" class=\"data row1 col7\" >['time_series_data', 'statsmodels', 'forecasting', 'statistical_test', 'stationarity']</td>\n", - " <td id=\"T_0502a_row1_col8\" class=\"data row1 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row2_col0\" class=\"data row2 col0\" 
>validmind.data_validation.AutoAR</td>\n", - " <td id=\"T_0502a_row2_col1\" class=\"data row2 col1\" >Auto AR</td>\n", - " <td id=\"T_0502a_row2_col2\" class=\"data row2 col2\" >Automatically identifies the optimal Autoregressive (AR) order for a time series using BIC and AIC criteria....</td>\n", - " <td id=\"T_0502a_row2_col3\" class=\"data row2 col3\" >False</td>\n", - " <td id=\"T_0502a_row2_col4\" class=\"data row2 col4\" >True</td>\n", - " <td id=\"T_0502a_row2_col5\" class=\"data row2 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row2_col6\" class=\"data row2 col6\" >{'max_ar_order': {'type': 'int', 'default': 3}}</td>\n", - " <td id=\"T_0502a_row2_col7\" class=\"data row2 col7\" >['time_series_data', 'statsmodels', 'forecasting', 'statistical_test']</td>\n", - " <td id=\"T_0502a_row2_col8\" class=\"data row2 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row3_col0\" class=\"data row3 col0\" >validmind.data_validation.AutoMA</td>\n", - " <td id=\"T_0502a_row3_col1\" class=\"data row3 col1\" >Auto MA</td>\n", - " <td id=\"T_0502a_row3_col2\" class=\"data row3 col2\" >Automatically selects the optimal Moving Average (MA) order for each variable in a time series dataset based on...</td>\n", - " <td id=\"T_0502a_row3_col3\" class=\"data row3 col3\" >False</td>\n", - " <td id=\"T_0502a_row3_col4\" class=\"data row3 col4\" >True</td>\n", - " <td id=\"T_0502a_row3_col5\" class=\"data row3 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row3_col6\" class=\"data row3 col6\" >{'max_ma_order': {'type': 'int', 'default': 3}}</td>\n", - " <td id=\"T_0502a_row3_col7\" class=\"data row3 col7\" >['time_series_data', 'statsmodels', 'forecasting', 'statistical_test']</td>\n", - " <td id=\"T_0502a_row3_col8\" class=\"data row3 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row4_col0\" class=\"data row4 col0\" >validmind.data_validation.AutoStationarity</td>\n", - " <td id=\"T_0502a_row4_col1\" class=\"data 
row4 col1\" >Auto Stationarity</td>\n", - " <td id=\"T_0502a_row4_col2\" class=\"data row4 col2\" >Automates Augmented Dickey-Fuller test to assess stationarity across multiple time series in a DataFrame....</td>\n", - " <td id=\"T_0502a_row4_col3\" class=\"data row4 col3\" >False</td>\n", - " <td id=\"T_0502a_row4_col4\" class=\"data row4 col4\" >True</td>\n", - " <td id=\"T_0502a_row4_col5\" class=\"data row4 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row4_col6\" class=\"data row4 col6\" >{'max_order': {'type': 'int', 'default': 5}, 'threshold': {'type': 'float', 'default': 0.05}}</td>\n", - " <td id=\"T_0502a_row4_col7\" class=\"data row4 col7\" >['time_series_data', 'statsmodels', 'forecasting', 'statistical_test']</td>\n", - " <td id=\"T_0502a_row4_col8\" class=\"data row4 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row5_col0\" class=\"data row5 col0\" >validmind.data_validation.BivariateScatterPlots</td>\n", - " <td id=\"T_0502a_row5_col1\" class=\"data row5 col1\" >Bivariate Scatter Plots</td>\n", - " <td id=\"T_0502a_row5_col2\" class=\"data row5 col2\" >Generates bivariate scatterplots to visually inspect relationships between pairs of numerical predictor variables...</td>\n", - " <td id=\"T_0502a_row5_col3\" class=\"data row5 col3\" >True</td>\n", - " <td id=\"T_0502a_row5_col4\" class=\"data row5 col4\" >False</td>\n", - " <td id=\"T_0502a_row5_col5\" class=\"data row5 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row5_col6\" class=\"data row5 col6\" >{}</td>\n", - " <td id=\"T_0502a_row5_col7\" class=\"data row5 col7\" >['tabular_data', 'numerical_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row5_col8\" class=\"data row5 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row6_col0\" class=\"data row6 col0\" >validmind.data_validation.BoxPierce</td>\n", - " <td id=\"T_0502a_row6_col1\" class=\"data row6 col1\" >Box Pierce</td>\n", - " <td id=\"T_0502a_row6_col2\" 
class=\"data row6 col2\" >Detects autocorrelation in time-series data through the Box-Pierce test to validate model performance....</td>\n", - " <td id=\"T_0502a_row6_col3\" class=\"data row6 col3\" >False</td>\n", - " <td id=\"T_0502a_row6_col4\" class=\"data row6 col4\" >True</td>\n", - " <td id=\"T_0502a_row6_col5\" class=\"data row6 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row6_col6\" class=\"data row6 col6\" >{}</td>\n", - " <td id=\"T_0502a_row6_col7\" class=\"data row6 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row6_col8\" class=\"data row6 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row7_col0\" class=\"data row7 col0\" >validmind.data_validation.ChiSquaredFeaturesTable</td>\n", - " <td id=\"T_0502a_row7_col1\" class=\"data row7 col1\" >Chi Squared Features Table</td>\n", - " <td id=\"T_0502a_row7_col2\" class=\"data row7 col2\" >Assesses the statistical association between categorical features and a target variable using the Chi-Squared test....</td>\n", - " <td id=\"T_0502a_row7_col3\" class=\"data row7 col3\" >False</td>\n", - " <td id=\"T_0502a_row7_col4\" class=\"data row7 col4\" >True</td>\n", - " <td id=\"T_0502a_row7_col5\" class=\"data row7 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row7_col6\" class=\"data row7 col6\" >{'p_threshold': {'type': '_empty', 'default': 0.05}}</td>\n", - " <td id=\"T_0502a_row7_col7\" class=\"data row7 col7\" >['tabular_data', 'categorical_data', 'statistical_test']</td>\n", - " <td id=\"T_0502a_row7_col8\" class=\"data row7 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row8_col0\" class=\"data row8 col0\" >validmind.data_validation.ClassImbalance</td>\n", - " <td id=\"T_0502a_row8_col1\" class=\"data row8 col1\" >Class Imbalance</td>\n", - " <td id=\"T_0502a_row8_col2\" class=\"data row8 col2\" >Evaluates and quantifies class distribution imbalance in a dataset used 
by a machine learning model....</td>\n", - " <td id=\"T_0502a_row8_col3\" class=\"data row8 col3\" >True</td>\n", - " <td id=\"T_0502a_row8_col4\" class=\"data row8 col4\" >True</td>\n", - " <td id=\"T_0502a_row8_col5\" class=\"data row8 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row8_col6\" class=\"data row8 col6\" >{'min_percent_threshold': {'type': 'int', 'default': 10}}</td>\n", - " <td id=\"T_0502a_row8_col7\" class=\"data row8 col7\" >['tabular_data', 'binary_classification', 'multiclass_classification', 'data_quality']</td>\n", - " <td id=\"T_0502a_row8_col8\" class=\"data row8 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row9_col0\" class=\"data row9 col0\" >validmind.data_validation.DatasetDescription</td>\n", - " <td id=\"T_0502a_row9_col1\" class=\"data row9 col1\" >Dataset Description</td>\n", - " <td id=\"T_0502a_row9_col2\" class=\"data row9 col2\" >Provides comprehensive analysis and statistical summaries of each column in a machine learning model's dataset....</td>\n", - " <td id=\"T_0502a_row9_col3\" class=\"data row9 col3\" >False</td>\n", - " <td id=\"T_0502a_row9_col4\" class=\"data row9 col4\" >True</td>\n", - " <td id=\"T_0502a_row9_col5\" class=\"data row9 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row9_col6\" class=\"data row9 col6\" >{}</td>\n", - " <td id=\"T_0502a_row9_col7\" class=\"data row9 col7\" >['tabular_data', 'time_series_data', 'text_data']</td>\n", - " <td id=\"T_0502a_row9_col8\" class=\"data row9 col8\" >['classification', 'regression', 'text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row10_col0\" class=\"data row10 col0\" >validmind.data_validation.DatasetSplit</td>\n", - " <td id=\"T_0502a_row10_col1\" class=\"data row10 col1\" >Dataset Split</td>\n", - " <td id=\"T_0502a_row10_col2\" class=\"data row10 col2\" >Evaluates and visualizes the distribution proportions among training, testing, and validation datasets of an 
ML...</td>\n", - " <td id=\"T_0502a_row10_col3\" class=\"data row10 col3\" >False</td>\n", - " <td id=\"T_0502a_row10_col4\" class=\"data row10 col4\" >True</td>\n", - " <td id=\"T_0502a_row10_col5\" class=\"data row10 col5\" >['datasets']</td>\n", - " <td id=\"T_0502a_row10_col6\" class=\"data row10 col6\" >{}</td>\n", - " <td id=\"T_0502a_row10_col7\" class=\"data row10 col7\" >['tabular_data', 'time_series_data', 'text_data']</td>\n", - " <td id=\"T_0502a_row10_col8\" class=\"data row10 col8\" >['classification', 'regression', 'text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row11_col0\" class=\"data row11 col0\" >validmind.data_validation.DescriptiveStatistics</td>\n", - " <td id=\"T_0502a_row11_col1\" class=\"data row11 col1\" >Descriptive Statistics</td>\n", - " <td id=\"T_0502a_row11_col2\" class=\"data row11 col2\" >Performs a detailed descriptive statistical analysis of both numerical and categorical data within a model's...</td>\n", - " <td id=\"T_0502a_row11_col3\" class=\"data row11 col3\" >False</td>\n", - " <td id=\"T_0502a_row11_col4\" class=\"data row11 col4\" >True</td>\n", - " <td id=\"T_0502a_row11_col5\" class=\"data row11 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row11_col6\" class=\"data row11 col6\" >{}</td>\n", - " <td id=\"T_0502a_row11_col7\" class=\"data row11 col7\" >['tabular_data', 'time_series_data', 'data_quality']</td>\n", - " <td id=\"T_0502a_row11_col8\" class=\"data row11 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row12_col0\" class=\"data row12 col0\" >validmind.data_validation.DickeyFullerGLS</td>\n", - " <td id=\"T_0502a_row12_col1\" class=\"data row12 col1\" >Dickey Fuller GLS</td>\n", - " <td id=\"T_0502a_row12_col2\" class=\"data row12 col2\" >Assesses stationarity in time series data using the Dickey-Fuller GLS test to determine the order of integration....</td>\n", - " <td id=\"T_0502a_row12_col3\" 
class=\"data row12 col3\" >False</td>\n", - " <td id=\"T_0502a_row12_col4\" class=\"data row12 col4\" >True</td>\n", - " <td id=\"T_0502a_row12_col5\" class=\"data row12 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row12_col6\" class=\"data row12 col6\" >{}</td>\n", - " <td id=\"T_0502a_row12_col7\" class=\"data row12 col7\" >['time_series_data', 'forecasting', 'unit_root_test']</td>\n", - " <td id=\"T_0502a_row12_col8\" class=\"data row12 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row13_col0\" class=\"data row13 col0\" >validmind.data_validation.Duplicates</td>\n", - " <td id=\"T_0502a_row13_col1\" class=\"data row13 col1\" >Duplicates</td>\n", - " <td id=\"T_0502a_row13_col2\" class=\"data row13 col2\" >Tests dataset for duplicate entries, ensuring model reliability via data quality verification....</td>\n", - " <td id=\"T_0502a_row13_col3\" class=\"data row13 col3\" >False</td>\n", - " <td id=\"T_0502a_row13_col4\" class=\"data row13 col4\" >True</td>\n", - " <td id=\"T_0502a_row13_col5\" class=\"data row13 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row13_col6\" class=\"data row13 col6\" >{'min_threshold': {'type': '_empty', 'default': 1}}</td>\n", - " <td id=\"T_0502a_row13_col7\" class=\"data row13 col7\" >['tabular_data', 'data_quality', 'text_data']</td>\n", - " <td id=\"T_0502a_row13_col8\" class=\"data row13 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row14_col0\" class=\"data row14 col0\" >validmind.data_validation.EngleGrangerCoint</td>\n", - " <td id=\"T_0502a_row14_col1\" class=\"data row14 col1\" >Engle Granger Coint</td>\n", - " <td id=\"T_0502a_row14_col2\" class=\"data row14 col2\" >Assesses the degree of co-movement between pairs of time series data using the Engle-Granger cointegration test....</td>\n", - " <td id=\"T_0502a_row14_col3\" class=\"data row14 col3\" >False</td>\n", - " <td id=\"T_0502a_row14_col4\" class=\"data row14 col4\" 
>True</td>\n", - " <td id=\"T_0502a_row14_col5\" class=\"data row14 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row14_col6\" class=\"data row14 col6\" >{'threshold': {'type': 'float', 'default': 0.05}}</td>\n", - " <td id=\"T_0502a_row14_col7\" class=\"data row14 col7\" >['time_series_data', 'statistical_test', 'forecasting']</td>\n", - " <td id=\"T_0502a_row14_col8\" class=\"data row14 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row15_col0\" class=\"data row15 col0\" >validmind.data_validation.FeatureTargetCorrelationPlot</td>\n", - " <td id=\"T_0502a_row15_col1\" class=\"data row15 col1\" >Feature Target Correlation Plot</td>\n", - " <td id=\"T_0502a_row15_col2\" class=\"data row15 col2\" >Visualizes the correlation between input features and the model's target output in a color-coded horizontal bar...</td>\n", - " <td id=\"T_0502a_row15_col3\" class=\"data row15 col3\" >True</td>\n", - " <td id=\"T_0502a_row15_col4\" class=\"data row15 col4\" >False</td>\n", - " <td id=\"T_0502a_row15_col5\" class=\"data row15 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row15_col6\" class=\"data row15 col6\" >{'fig_height': {'type': '_empty', 'default': 600}}</td>\n", - " <td id=\"T_0502a_row15_col7\" class=\"data row15 col7\" >['tabular_data', 'visualization', 'correlation']</td>\n", - " <td id=\"T_0502a_row15_col8\" class=\"data row15 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row16_col0\" class=\"data row16 col0\" >validmind.data_validation.HighCardinality</td>\n", - " <td id=\"T_0502a_row16_col1\" class=\"data row16 col1\" >High Cardinality</td>\n", - " <td id=\"T_0502a_row16_col2\" class=\"data row16 col2\" >Assesses the number of unique values in categorical columns to detect high cardinality and potential overfitting....</td>\n", - " <td id=\"T_0502a_row16_col3\" class=\"data row16 col3\" >False</td>\n", - " <td id=\"T_0502a_row16_col4\" class=\"data row16 col4\" 
>True</td>\n", - " <td id=\"T_0502a_row16_col5\" class=\"data row16 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row16_col6\" class=\"data row16 col6\" >{'num_threshold': {'type': 'int', 'default': 100}, 'percent_threshold': {'type': 'float', 'default': 0.1}, 'threshold_type': {'type': 'str', 'default': 'percent'}}</td>\n", - " <td id=\"T_0502a_row16_col7\" class=\"data row16 col7\" >['tabular_data', 'data_quality', 'categorical_data']</td>\n", - " <td id=\"T_0502a_row16_col8\" class=\"data row16 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row17_col0\" class=\"data row17 col0\" >validmind.data_validation.HighPearsonCorrelation</td>\n", - " <td id=\"T_0502a_row17_col1\" class=\"data row17 col1\" >High Pearson Correlation</td>\n", - " <td id=\"T_0502a_row17_col2\" class=\"data row17 col2\" >Identifies highly correlated feature pairs in a dataset suggesting feature redundancy or multicollinearity....</td>\n", - " <td id=\"T_0502a_row17_col3\" class=\"data row17 col3\" >False</td>\n", - " <td id=\"T_0502a_row17_col4\" class=\"data row17 col4\" >True</td>\n", - " <td id=\"T_0502a_row17_col5\" class=\"data row17 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row17_col6\" class=\"data row17 col6\" >{'max_threshold': {'type': 'float', 'default': 0.3}, 'top_n_correlations': {'type': 'int', 'default': 10}, 'feature_columns': {'type': 'list', 'default': None}}</td>\n", - " <td id=\"T_0502a_row17_col7\" class=\"data row17 col7\" >['tabular_data', 'data_quality', 'correlation']</td>\n", - " <td id=\"T_0502a_row17_col8\" class=\"data row17 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row18_col0\" class=\"data row18 col0\" >validmind.data_validation.IQROutliersBarPlot</td>\n", - " <td id=\"T_0502a_row18_col1\" class=\"data row18 col1\" >IQR Outliers Bar Plot</td>\n", - " <td id=\"T_0502a_row18_col2\" class=\"data row18 col2\" >Visualizes outlier distribution across 
percentiles in numerical data using the Interquartile Range (IQR) method....</td>\n", - " <td id=\"T_0502a_row18_col3\" class=\"data row18 col3\" >True</td>\n", - " <td id=\"T_0502a_row18_col4\" class=\"data row18 col4\" >False</td>\n", - " <td id=\"T_0502a_row18_col5\" class=\"data row18 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row18_col6\" class=\"data row18 col6\" >{'threshold': {'type': 'float', 'default': 1.5}, 'fig_width': {'type': 'int', 'default': 800}}</td>\n", - " <td id=\"T_0502a_row18_col7\" class=\"data row18 col7\" >['tabular_data', 'visualization', 'numerical_data']</td>\n", - " <td id=\"T_0502a_row18_col8\" class=\"data row18 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row19_col0\" class=\"data row19 col0\" >validmind.data_validation.IQROutliersTable</td>\n", - " <td id=\"T_0502a_row19_col1\" class=\"data row19 col1\" >IQR Outliers Table</td>\n", - " <td id=\"T_0502a_row19_col2\" class=\"data row19 col2\" >Determines and summarizes outliers in numerical features using the Interquartile Range method....</td>\n", - " <td id=\"T_0502a_row19_col3\" class=\"data row19 col3\" >False</td>\n", - " <td id=\"T_0502a_row19_col4\" class=\"data row19 col4\" >True</td>\n", - " <td id=\"T_0502a_row19_col5\" class=\"data row19 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row19_col6\" class=\"data row19 col6\" >{'threshold': {'type': 'float', 'default': 1.5}}</td>\n", - " <td id=\"T_0502a_row19_col7\" class=\"data row19 col7\" >['tabular_data', 'numerical_data']</td>\n", - " <td id=\"T_0502a_row19_col8\" class=\"data row19 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row20_col0\" class=\"data row20 col0\" >validmind.data_validation.IsolationForestOutliers</td>\n", - " <td id=\"T_0502a_row20_col1\" class=\"data row20 col1\" >Isolation Forest Outliers</td>\n", - " <td id=\"T_0502a_row20_col2\" class=\"data row20 col2\" >Detects outliers in a dataset 
using the Isolation Forest algorithm and visualizes results through scatter plots....</td>\n", - " <td id=\"T_0502a_row20_col3\" class=\"data row20 col3\" >True</td>\n", - " <td id=\"T_0502a_row20_col4\" class=\"data row20 col4\" >False</td>\n", - " <td id=\"T_0502a_row20_col5\" class=\"data row20 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row20_col6\" class=\"data row20 col6\" >{'random_state': {'type': 'int', 'default': 0}, 'contamination': {'type': 'float', 'default': 0.1}, 'feature_columns': {'type': 'list', 'default': None}}</td>\n", - " <td id=\"T_0502a_row20_col7\" class=\"data row20 col7\" >['tabular_data', 'anomaly_detection']</td>\n", - " <td id=\"T_0502a_row20_col8\" class=\"data row20 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row21_col0\" class=\"data row21 col0\" >validmind.data_validation.JarqueBera</td>\n", - " <td id=\"T_0502a_row21_col1\" class=\"data row21 col1\" >Jarque Bera</td>\n", - " <td id=\"T_0502a_row21_col2\" class=\"data row21 col2\" >Assesses normality of dataset features in an ML model using the Jarque-Bera test....</td>\n", - " <td id=\"T_0502a_row21_col3\" class=\"data row21 col3\" >False</td>\n", - " <td id=\"T_0502a_row21_col4\" class=\"data row21 col4\" >True</td>\n", - " <td id=\"T_0502a_row21_col5\" class=\"data row21 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row21_col6\" class=\"data row21 col6\" >{}</td>\n", - " <td id=\"T_0502a_row21_col7\" class=\"data row21 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row21_col8\" class=\"data row21 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row22_col0\" class=\"data row22 col0\" >validmind.data_validation.KPSS</td>\n", - " <td id=\"T_0502a_row22_col1\" class=\"data row22 col1\" >KPSS</td>\n", - " <td id=\"T_0502a_row22_col2\" class=\"data row22 col2\" >Assesses the stationarity of time-series data in a machine 
learning model using the KPSS unit root test....</td>\n", - " <td id=\"T_0502a_row22_col3\" class=\"data row22 col3\" >False</td>\n", - " <td id=\"T_0502a_row22_col4\" class=\"data row22 col4\" >True</td>\n", - " <td id=\"T_0502a_row22_col5\" class=\"data row22 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row22_col6\" class=\"data row22 col6\" >{}</td>\n", - " <td id=\"T_0502a_row22_col7\" class=\"data row22 col7\" >['time_series_data', 'stationarity', 'unit_root_test', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row22_col8\" class=\"data row22 col8\" >['data_validation']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row23_col0\" class=\"data row23 col0\" >validmind.data_validation.LJungBox</td>\n", - " <td id=\"T_0502a_row23_col1\" class=\"data row23 col1\" >L Jung Box</td>\n", - " <td id=\"T_0502a_row23_col2\" class=\"data row23 col2\" >Assesses autocorrelations in dataset features by performing a Ljung-Box test on each feature....</td>\n", - " <td id=\"T_0502a_row23_col3\" class=\"data row23 col3\" >False</td>\n", - " <td id=\"T_0502a_row23_col4\" class=\"data row23 col4\" >True</td>\n", - " <td id=\"T_0502a_row23_col5\" class=\"data row23 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row23_col6\" class=\"data row23 col6\" >{}</td>\n", - " <td id=\"T_0502a_row23_col7\" class=\"data row23 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row23_col8\" class=\"data row23 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row24_col0\" class=\"data row24 col0\" >validmind.data_validation.LaggedCorrelationHeatmap</td>\n", - " <td id=\"T_0502a_row24_col1\" class=\"data row24 col1\" >Lagged Correlation Heatmap</td>\n", - " <td id=\"T_0502a_row24_col2\" class=\"data row24 col2\" >Assesses and visualizes correlation between target variable and lagged independent variables in a time-series...</td>\n", - " <td id=\"T_0502a_row24_col3\" class=\"data row24 col3\" 
>True</td>\n", - " <td id=\"T_0502a_row24_col4\" class=\"data row24 col4\" >False</td>\n", - " <td id=\"T_0502a_row24_col5\" class=\"data row24 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row24_col6\" class=\"data row24 col6\" >{'num_lags': {'type': 'int', 'default': 10}}</td>\n", - " <td id=\"T_0502a_row24_col7\" class=\"data row24 col7\" >['time_series_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row24_col8\" class=\"data row24 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row25_col0\" class=\"data row25 col0\" >validmind.data_validation.MissingValues</td>\n", - " <td id=\"T_0502a_row25_col1\" class=\"data row25 col1\" >Missing Values</td>\n", - " <td id=\"T_0502a_row25_col2\" class=\"data row25 col2\" >Evaluates dataset quality by ensuring missing value ratio across all features does not exceed a set threshold....</td>\n", - " <td id=\"T_0502a_row25_col3\" class=\"data row25 col3\" >False</td>\n", - " <td id=\"T_0502a_row25_col4\" class=\"data row25 col4\" >True</td>\n", - " <td id=\"T_0502a_row25_col5\" class=\"data row25 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row25_col6\" class=\"data row25 col6\" >{'min_threshold': {'type': 'int', 'default': 1}}</td>\n", - " <td id=\"T_0502a_row25_col7\" class=\"data row25 col7\" >['tabular_data', 'data_quality']</td>\n", - " <td id=\"T_0502a_row25_col8\" class=\"data row25 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row26_col0\" class=\"data row26 col0\" >validmind.data_validation.MissingValuesBarPlot</td>\n", - " <td id=\"T_0502a_row26_col1\" class=\"data row26 col1\" >Missing Values Bar Plot</td>\n", - " <td id=\"T_0502a_row26_col2\" class=\"data row26 col2\" >Assesses the percentage and distribution of missing values in the dataset via a bar plot, with emphasis on...</td>\n", - " <td id=\"T_0502a_row26_col3\" class=\"data row26 col3\" >True</td>\n", - " <td id=\"T_0502a_row26_col4\" class=\"data row26 col4\" 
>False</td>\n", - " <td id=\"T_0502a_row26_col5\" class=\"data row26 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row26_col6\" class=\"data row26 col6\" >{'threshold': {'type': 'int', 'default': 80}, 'fig_height': {'type': 'int', 'default': 600}}</td>\n", - " <td id=\"T_0502a_row26_col7\" class=\"data row26 col7\" >['tabular_data', 'data_quality', 'visualization']</td>\n", - " <td id=\"T_0502a_row26_col8\" class=\"data row26 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row27_col0\" class=\"data row27 col0\" >validmind.data_validation.MutualInformation</td>\n", - " <td id=\"T_0502a_row27_col1\" class=\"data row27 col1\" >Mutual Information</td>\n", - " <td id=\"T_0502a_row27_col2\" class=\"data row27 col2\" >Calculates mutual information scores between features and target variable to evaluate feature relevance....</td>\n", - " <td id=\"T_0502a_row27_col3\" class=\"data row27 col3\" >True</td>\n", - " <td id=\"T_0502a_row27_col4\" class=\"data row27 col4\" >False</td>\n", - " <td id=\"T_0502a_row27_col5\" class=\"data row27 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row27_col6\" class=\"data row27 col6\" >{'min_threshold': {'type': 'float', 'default': 0.01}, 'task': {'type': 'str', 'default': 'classification'}}</td>\n", - " <td id=\"T_0502a_row27_col7\" class=\"data row27 col7\" >['feature_selection', 'data_analysis']</td>\n", - " <td id=\"T_0502a_row27_col8\" class=\"data row27 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row28_col0\" class=\"data row28 col0\" >validmind.data_validation.PearsonCorrelationMatrix</td>\n", - " <td id=\"T_0502a_row28_col1\" class=\"data row28 col1\" >Pearson Correlation Matrix</td>\n", - " <td id=\"T_0502a_row28_col2\" class=\"data row28 col2\" >Evaluates linear dependency between numerical variables in a dataset via a Pearson Correlation coefficient heat map....</td>\n", - " <td id=\"T_0502a_row28_col3\" class=\"data 
row28 col3\" >True</td>\n", - " <td id=\"T_0502a_row28_col4\" class=\"data row28 col4\" >False</td>\n", - " <td id=\"T_0502a_row28_col5\" class=\"data row28 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row28_col6\" class=\"data row28 col6\" >{}</td>\n", - " <td id=\"T_0502a_row28_col7\" class=\"data row28 col7\" >['tabular_data', 'numerical_data', 'correlation']</td>\n", - " <td id=\"T_0502a_row28_col8\" class=\"data row28 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row29_col0\" class=\"data row29 col0\" >validmind.data_validation.PhillipsPerronArch</td>\n", - " <td id=\"T_0502a_row29_col1\" class=\"data row29 col1\" >Phillips Perron Arch</td>\n", - " <td id=\"T_0502a_row29_col2\" class=\"data row29 col2\" >Assesses the stationarity of time series data in each feature of the ML model using the Phillips-Perron test....</td>\n", - " <td id=\"T_0502a_row29_col3\" class=\"data row29 col3\" >False</td>\n", - " <td id=\"T_0502a_row29_col4\" class=\"data row29 col4\" >True</td>\n", - " <td id=\"T_0502a_row29_col5\" class=\"data row29 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row29_col6\" class=\"data row29 col6\" >{}</td>\n", - " <td id=\"T_0502a_row29_col7\" class=\"data row29 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'unit_root_test']</td>\n", - " <td id=\"T_0502a_row29_col8\" class=\"data row29 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row30_col0\" class=\"data row30 col0\" >validmind.data_validation.ProtectedClassesDescription</td>\n", - " <td id=\"T_0502a_row30_col1\" class=\"data row30 col1\" >Protected Classes Description</td>\n", - " <td id=\"T_0502a_row30_col2\" class=\"data row30 col2\" >Visualizes the distribution of protected classes in the dataset relative to the target variable...</td>\n", - " <td id=\"T_0502a_row30_col3\" class=\"data row30 col3\" >True</td>\n", - " <td id=\"T_0502a_row30_col4\" class=\"data row30 col4\" 
>True</td>\n", - " <td id=\"T_0502a_row30_col5\" class=\"data row30 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row30_col6\" class=\"data row30 col6\" >{'protected_classes': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row30_col7\" class=\"data row30 col7\" >['bias_and_fairness', 'descriptive_statistics']</td>\n", - " <td id=\"T_0502a_row30_col8\" class=\"data row30 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row31_col0\" class=\"data row31 col0\" >validmind.data_validation.RollingStatsPlot</td>\n", - " <td id=\"T_0502a_row31_col1\" class=\"data row31 col1\" >Rolling Stats Plot</td>\n", - " <td id=\"T_0502a_row31_col2\" class=\"data row31 col2\" >Evaluates the stationarity of time series data by plotting its rolling mean and standard deviation over a specified...</td>\n", - " <td id=\"T_0502a_row31_col3\" class=\"data row31 col3\" >True</td>\n", - " <td id=\"T_0502a_row31_col4\" class=\"data row31 col4\" >False</td>\n", - " <td id=\"T_0502a_row31_col5\" class=\"data row31 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row31_col6\" class=\"data row31 col6\" >{'window_size': {'type': 'int', 'default': 12}}</td>\n", - " <td id=\"T_0502a_row31_col7\" class=\"data row31 col7\" >['time_series_data', 'visualization', 'stationarity']</td>\n", - " <td id=\"T_0502a_row31_col8\" class=\"data row31 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row32_col0\" class=\"data row32 col0\" >validmind.data_validation.RunsTest</td>\n", - " <td id=\"T_0502a_row32_col1\" class=\"data row32 col1\" >Runs Test</td>\n", - " <td id=\"T_0502a_row32_col2\" class=\"data row32 col2\" >Executes Runs Test on ML model to detect non-random patterns in output data sequence....</td>\n", - " <td id=\"T_0502a_row32_col3\" class=\"data row32 col3\" >False</td>\n", - " <td id=\"T_0502a_row32_col4\" class=\"data row32 col4\" >True</td>\n", - " <td id=\"T_0502a_row32_col5\" class=\"data row32 
col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row32_col6\" class=\"data row32 col6\" >{}</td>\n", - " <td id=\"T_0502a_row32_col7\" class=\"data row32 col7\" >['tabular_data', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row32_col8\" class=\"data row32 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row33_col0\" class=\"data row33 col0\" >validmind.data_validation.ScatterPlot</td>\n", - " <td id=\"T_0502a_row33_col1\" class=\"data row33 col1\" >Scatter Plot</td>\n", - " <td id=\"T_0502a_row33_col2\" class=\"data row33 col2\" >Assesses visual relationships, patterns, and outliers among features in a dataset through scatter plot matrices....</td>\n", - " <td id=\"T_0502a_row33_col3\" class=\"data row33 col3\" >True</td>\n", - " <td id=\"T_0502a_row33_col4\" class=\"data row33 col4\" >False</td>\n", - " <td id=\"T_0502a_row33_col5\" class=\"data row33 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row33_col6\" class=\"data row33 col6\" >{}</td>\n", - " <td id=\"T_0502a_row33_col7\" class=\"data row33 col7\" >['tabular_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row33_col8\" class=\"data row33 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row34_col0\" class=\"data row34 col0\" >validmind.data_validation.ScoreBandDefaultRates</td>\n", - " <td id=\"T_0502a_row34_col1\" class=\"data row34 col1\" >Score Band Default Rates</td>\n", - " <td id=\"T_0502a_row34_col2\" class=\"data row34 col2\" >Analyzes default rates and population distribution across credit score bands....</td>\n", - " <td id=\"T_0502a_row34_col3\" class=\"data row34 col3\" >False</td>\n", - " <td id=\"T_0502a_row34_col4\" class=\"data row34 col4\" >True</td>\n", - " <td id=\"T_0502a_row34_col5\" class=\"data row34 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row34_col6\" class=\"data row34 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 
'score_bands': {'type': 'list', 'default': None}}</td>\n", - " <td id=\"T_0502a_row34_col7\" class=\"data row34 col7\" >['visualization', 'credit_risk', 'scorecard']</td>\n", - " <td id=\"T_0502a_row34_col8\" class=\"data row34 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row35_col0\" class=\"data row35 col0\" >validmind.data_validation.SeasonalDecompose</td>\n", - " <td id=\"T_0502a_row35_col1\" class=\"data row35 col1\" >Seasonal Decompose</td>\n", - " <td id=\"T_0502a_row35_col2\" class=\"data row35 col2\" >Assesses patterns and seasonality in a time series dataset by decomposing its features into foundational components....</td>\n", - " <td id=\"T_0502a_row35_col3\" class=\"data row35 col3\" >True</td>\n", - " <td id=\"T_0502a_row35_col4\" class=\"data row35 col4\" >False</td>\n", - " <td id=\"T_0502a_row35_col5\" class=\"data row35 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row35_col6\" class=\"data row35 col6\" >{'seasonal_model': {'type': 'str', 'default': 'additive'}}</td>\n", - " <td id=\"T_0502a_row35_col7\" class=\"data row35 col7\" >['time_series_data', 'seasonality', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row35_col8\" class=\"data row35 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row36_col0\" class=\"data row36 col0\" >validmind.data_validation.ShapiroWilk</td>\n", - " <td id=\"T_0502a_row36_col1\" class=\"data row36 col1\" >Shapiro Wilk</td>\n", - " <td id=\"T_0502a_row36_col2\" class=\"data row36 col2\" >Evaluates feature-wise normality of training data using the Shapiro-Wilk test....</td>\n", - " <td id=\"T_0502a_row36_col3\" class=\"data row36 col3\" >False</td>\n", - " <td id=\"T_0502a_row36_col4\" class=\"data row36 col4\" >True</td>\n", - " <td id=\"T_0502a_row36_col5\" class=\"data row36 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row36_col6\" class=\"data row36 col6\" >{}</td>\n", - " <td id=\"T_0502a_row36_col7\" class=\"data row36 col7\" 
>['tabular_data', 'data_distribution', 'statistical_test']</td>\n", - " <td id=\"T_0502a_row36_col8\" class=\"data row36 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row37_col0\" class=\"data row37 col0\" >validmind.data_validation.Skewness</td>\n", - " <td id=\"T_0502a_row37_col1\" class=\"data row37 col1\" >Skewness</td>\n", - " <td id=\"T_0502a_row37_col2\" class=\"data row37 col2\" >Evaluates the skewness of numerical data in a dataset to check against a defined threshold, aiming to ensure data...</td>\n", - " <td id=\"T_0502a_row37_col3\" class=\"data row37 col3\" >False</td>\n", - " <td id=\"T_0502a_row37_col4\" class=\"data row37 col4\" >True</td>\n", - " <td id=\"T_0502a_row37_col5\" class=\"data row37 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row37_col6\" class=\"data row37 col6\" >{'max_threshold': {'type': '_empty', 'default': 1}}</td>\n", - " <td id=\"T_0502a_row37_col7\" class=\"data row37 col7\" >['data_quality', 'tabular_data']</td>\n", - " <td id=\"T_0502a_row37_col8\" class=\"data row37 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row38_col0\" class=\"data row38 col0\" >validmind.data_validation.SpreadPlot</td>\n", - " <td id=\"T_0502a_row38_col1\" class=\"data row38 col1\" >Spread Plot</td>\n", - " <td id=\"T_0502a_row38_col2\" class=\"data row38 col2\" >Assesses potential correlations between pairs of time series variables through visualization to enhance...</td>\n", - " <td id=\"T_0502a_row38_col3\" class=\"data row38 col3\" >True</td>\n", - " <td id=\"T_0502a_row38_col4\" class=\"data row38 col4\" >False</td>\n", - " <td id=\"T_0502a_row38_col5\" class=\"data row38 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row38_col6\" class=\"data row38 col6\" >{}</td>\n", - " <td id=\"T_0502a_row38_col7\" class=\"data row38 col7\" >['time_series_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row38_col8\" class=\"data row38 col8\" 
>['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row39_col0\" class=\"data row39 col0\" >validmind.data_validation.TabularCategoricalBarPlots</td>\n", - " <td id=\"T_0502a_row39_col1\" class=\"data row39 col1\" >Tabular Categorical Bar Plots</td>\n", - " <td id=\"T_0502a_row39_col2\" class=\"data row39 col2\" >Generates and visualizes bar plots for each category in categorical features to evaluate the dataset's composition....</td>\n", - " <td id=\"T_0502a_row39_col3\" class=\"data row39 col3\" >True</td>\n", - " <td id=\"T_0502a_row39_col4\" class=\"data row39 col4\" >False</td>\n", - " <td id=\"T_0502a_row39_col5\" class=\"data row39 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row39_col6\" class=\"data row39 col6\" >{}</td>\n", - " <td id=\"T_0502a_row39_col7\" class=\"data row39 col7\" >['tabular_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row39_col8\" class=\"data row39 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row40_col0\" class=\"data row40 col0\" >validmind.data_validation.TabularDateTimeHistograms</td>\n", - " <td id=\"T_0502a_row40_col1\" class=\"data row40 col1\" >Tabular Date Time Histograms</td>\n", - " <td id=\"T_0502a_row40_col2\" class=\"data row40 col2\" >Generates histograms to provide graphical insight into the distribution of time intervals in a model's datetime...</td>\n", - " <td id=\"T_0502a_row40_col3\" class=\"data row40 col3\" >True</td>\n", - " <td id=\"T_0502a_row40_col4\" class=\"data row40 col4\" >False</td>\n", - " <td id=\"T_0502a_row40_col5\" class=\"data row40 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row40_col6\" class=\"data row40 col6\" >{}</td>\n", - " <td id=\"T_0502a_row40_col7\" class=\"data row40 col7\" >['time_series_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row40_col8\" class=\"data row40 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row41_col0\" 
class=\"data row41 col0\" >validmind.data_validation.TabularDescriptionTables</td>\n", - " <td id=\"T_0502a_row41_col1\" class=\"data row41 col1\" >Tabular Description Tables</td>\n", - " <td id=\"T_0502a_row41_col2\" class=\"data row41 col2\" >Summarizes key descriptive statistics for numerical, categorical, and datetime variables in a dataset....</td>\n", - " <td id=\"T_0502a_row41_col3\" class=\"data row41 col3\" >False</td>\n", - " <td id=\"T_0502a_row41_col4\" class=\"data row41 col4\" >True</td>\n", - " <td id=\"T_0502a_row41_col5\" class=\"data row41 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row41_col6\" class=\"data row41 col6\" >{}</td>\n", - " <td id=\"T_0502a_row41_col7\" class=\"data row41 col7\" >['tabular_data']</td>\n", - " <td id=\"T_0502a_row41_col8\" class=\"data row41 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row42_col0\" class=\"data row42 col0\" >validmind.data_validation.TabularNumericalHistograms</td>\n", - " <td id=\"T_0502a_row42_col1\" class=\"data row42 col1\" >Tabular Numerical Histograms</td>\n", - " <td id=\"T_0502a_row42_col2\" class=\"data row42 col2\" >Generates histograms for each numerical feature in a dataset to provide visual insights into data distribution and...</td>\n", - " <td id=\"T_0502a_row42_col3\" class=\"data row42 col3\" >True</td>\n", - " <td id=\"T_0502a_row42_col4\" class=\"data row42 col4\" >False</td>\n", - " <td id=\"T_0502a_row42_col5\" class=\"data row42 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row42_col6\" class=\"data row42 col6\" >{}</td>\n", - " <td id=\"T_0502a_row42_col7\" class=\"data row42 col7\" >['tabular_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row42_col8\" class=\"data row42 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row43_col0\" class=\"data row43 col0\" >validmind.data_validation.TargetRateBarPlots</td>\n", - " <td id=\"T_0502a_row43_col1\" class=\"data 
row43 col1\" >Target Rate Bar Plots</td>\n", - " <td id=\"T_0502a_row43_col2\" class=\"data row43 col2\" >Generates bar plots visualizing the default rates of categorical features for a classification machine learning...</td>\n", - " <td id=\"T_0502a_row43_col3\" class=\"data row43 col3\" >True</td>\n", - " <td id=\"T_0502a_row43_col4\" class=\"data row43 col4\" >False</td>\n", - " <td id=\"T_0502a_row43_col5\" class=\"data row43 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row43_col6\" class=\"data row43 col6\" >{}</td>\n", - " <td id=\"T_0502a_row43_col7\" class=\"data row43 col7\" >['tabular_data', 'visualization', 'categorical_data']</td>\n", - " <td id=\"T_0502a_row43_col8\" class=\"data row43 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row44_col0\" class=\"data row44 col0\" >validmind.data_validation.TimeSeriesDescription</td>\n", - " <td id=\"T_0502a_row44_col1\" class=\"data row44 col1\" >Time Series Description</td>\n", - " <td id=\"T_0502a_row44_col2\" class=\"data row44 col2\" >Generates a detailed analysis for the provided time series dataset, summarizing key statistics to identify trends,...</td>\n", - " <td id=\"T_0502a_row44_col3\" class=\"data row44 col3\" >False</td>\n", - " <td id=\"T_0502a_row44_col4\" class=\"data row44 col4\" >True</td>\n", - " <td id=\"T_0502a_row44_col5\" class=\"data row44 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row44_col6\" class=\"data row44 col6\" >{}</td>\n", - " <td id=\"T_0502a_row44_col7\" class=\"data row44 col7\" >['time_series_data', 'analysis']</td>\n", - " <td id=\"T_0502a_row44_col8\" class=\"data row44 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row45_col0\" class=\"data row45 col0\" >validmind.data_validation.TimeSeriesDescriptiveStatistics</td>\n", - " <td id=\"T_0502a_row45_col1\" class=\"data row45 col1\" >Time Series Descriptive Statistics</td>\n", - " <td id=\"T_0502a_row45_col2\" class=\"data row45 col2\" 
>Evaluates the descriptive statistics of a time series dataset to identify trends, patterns, and data quality issues....</td>\n", - " <td id=\"T_0502a_row45_col3\" class=\"data row45 col3\" >False</td>\n", - " <td id=\"T_0502a_row45_col4\" class=\"data row45 col4\" >True</td>\n", - " <td id=\"T_0502a_row45_col5\" class=\"data row45 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row45_col6\" class=\"data row45 col6\" >{}</td>\n", - " <td id=\"T_0502a_row45_col7\" class=\"data row45 col7\" >['time_series_data', 'analysis']</td>\n", - " <td id=\"T_0502a_row45_col8\" class=\"data row45 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row46_col0\" class=\"data row46 col0\" >validmind.data_validation.TimeSeriesFrequency</td>\n", - " <td id=\"T_0502a_row46_col1\" class=\"data row46 col1\" >Time Series Frequency</td>\n", - " <td id=\"T_0502a_row46_col2\" class=\"data row46 col2\" >Evaluates consistency of time series data frequency and generates a frequency plot....</td>\n", - " <td id=\"T_0502a_row46_col3\" class=\"data row46 col3\" >True</td>\n", - " <td id=\"T_0502a_row46_col4\" class=\"data row46 col4\" >True</td>\n", - " <td id=\"T_0502a_row46_col5\" class=\"data row46 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row46_col6\" class=\"data row46 col6\" >{}</td>\n", - " <td id=\"T_0502a_row46_col7\" class=\"data row46 col7\" >['time_series_data']</td>\n", - " <td id=\"T_0502a_row46_col8\" class=\"data row46 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row47_col0\" class=\"data row47 col0\" >validmind.data_validation.TimeSeriesHistogram</td>\n", - " <td id=\"T_0502a_row47_col1\" class=\"data row47 col1\" >Time Series Histogram</td>\n", - " <td id=\"T_0502a_row47_col2\" class=\"data row47 col2\" >Visualizes distribution of time-series data using histograms and Kernel Density Estimation (KDE) lines....</td>\n", - " <td id=\"T_0502a_row47_col3\" class=\"data row47 col3\" >True</td>\n", - " <td 
id=\"T_0502a_row47_col4\" class=\"data row47 col4\" >False</td>\n", - " <td id=\"T_0502a_row47_col5\" class=\"data row47 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row47_col6\" class=\"data row47 col6\" >{'nbins': {'type': '_empty', 'default': 30}}</td>\n", - " <td id=\"T_0502a_row47_col7\" class=\"data row47 col7\" >['data_validation', 'visualization', 'time_series_data']</td>\n", - " <td id=\"T_0502a_row47_col8\" class=\"data row47 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row48_col0\" class=\"data row48 col0\" >validmind.data_validation.TimeSeriesLinePlot</td>\n", - " <td id=\"T_0502a_row48_col1\" class=\"data row48 col1\" >Time Series Line Plot</td>\n", - " <td id=\"T_0502a_row48_col2\" class=\"data row48 col2\" >Generates and analyses time-series data through line plots revealing trends, patterns, anomalies over time....</td>\n", - " <td id=\"T_0502a_row48_col3\" class=\"data row48 col3\" >True</td>\n", - " <td id=\"T_0502a_row48_col4\" class=\"data row48 col4\" >False</td>\n", - " <td id=\"T_0502a_row48_col5\" class=\"data row48 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row48_col6\" class=\"data row48 col6\" >{}</td>\n", - " <td id=\"T_0502a_row48_col7\" class=\"data row48 col7\" >['time_series_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row48_col8\" class=\"data row48 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row49_col0\" class=\"data row49 col0\" >validmind.data_validation.TimeSeriesMissingValues</td>\n", - " <td id=\"T_0502a_row49_col1\" class=\"data row49 col1\" >Time Series Missing Values</td>\n", - " <td id=\"T_0502a_row49_col2\" class=\"data row49 col2\" >Validates time-series data quality by confirming the count of missing values is below a certain threshold....</td>\n", - " <td id=\"T_0502a_row49_col3\" class=\"data row49 col3\" >True</td>\n", - " <td id=\"T_0502a_row49_col4\" class=\"data row49 col4\" >True</td>\n", - " <td 
id=\"T_0502a_row49_col5\" class=\"data row49 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row49_col6\" class=\"data row49 col6\" >{'min_threshold': {'type': 'int', 'default': 1}}</td>\n", - " <td id=\"T_0502a_row49_col7\" class=\"data row49 col7\" >['time_series_data']</td>\n", - " <td id=\"T_0502a_row49_col8\" class=\"data row49 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row50_col0\" class=\"data row50 col0\" >validmind.data_validation.TimeSeriesOutliers</td>\n", - " <td id=\"T_0502a_row50_col1\" class=\"data row50 col1\" >Time Series Outliers</td>\n", - " <td id=\"T_0502a_row50_col2\" class=\"data row50 col2\" >Identifies and visualizes outliers in time-series data using the z-score method....</td>\n", - " <td id=\"T_0502a_row50_col3\" class=\"data row50 col3\" >False</td>\n", - " <td id=\"T_0502a_row50_col4\" class=\"data row50 col4\" >True</td>\n", - " <td id=\"T_0502a_row50_col5\" class=\"data row50 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row50_col6\" class=\"data row50 col6\" >{'zscore_threshold': {'type': 'int', 'default': 3}}</td>\n", - " <td id=\"T_0502a_row50_col7\" class=\"data row50 col7\" >['time_series_data']</td>\n", - " <td id=\"T_0502a_row50_col8\" class=\"data row50 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row51_col0\" class=\"data row51 col0\" >validmind.data_validation.TooManyZeroValues</td>\n", - " <td id=\"T_0502a_row51_col1\" class=\"data row51 col1\" >Too Many Zero Values</td>\n", - " <td id=\"T_0502a_row51_col2\" class=\"data row51 col2\" >Identifies numerical columns in a dataset that contain an excessive number of zero values, defined by a threshold...</td>\n", - " <td id=\"T_0502a_row51_col3\" class=\"data row51 col3\" >False</td>\n", - " <td id=\"T_0502a_row51_col4\" class=\"data row51 col4\" >True</td>\n", - " <td id=\"T_0502a_row51_col5\" class=\"data row51 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row51_col6\" class=\"data row51 col6\" 
>{'max_percent_threshold': {'type': 'float', 'default': 0.03}}</td>\n", - " <td id=\"T_0502a_row51_col7\" class=\"data row51 col7\" >['tabular_data']</td>\n", - " <td id=\"T_0502a_row51_col8\" class=\"data row51 col8\" >['regression', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row52_col0\" class=\"data row52 col0\" >validmind.data_validation.UniqueRows</td>\n", - " <td id=\"T_0502a_row52_col1\" class=\"data row52 col1\" >Unique Rows</td>\n", - " <td id=\"T_0502a_row52_col2\" class=\"data row52 col2\" >Verifies the diversity of the dataset by ensuring that the count of unique rows exceeds a prescribed threshold....</td>\n", - " <td id=\"T_0502a_row52_col3\" class=\"data row52 col3\" >False</td>\n", - " <td id=\"T_0502a_row52_col4\" class=\"data row52 col4\" >True</td>\n", - " <td id=\"T_0502a_row52_col5\" class=\"data row52 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row52_col6\" class=\"data row52 col6\" >{'min_percent_threshold': {'type': 'float', 'default': 1}}</td>\n", - " <td id=\"T_0502a_row52_col7\" class=\"data row52 col7\" >['tabular_data']</td>\n", - " <td id=\"T_0502a_row52_col8\" class=\"data row52 col8\" >['regression', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row53_col0\" class=\"data row53 col0\" >validmind.data_validation.WOEBinPlots</td>\n", - " <td id=\"T_0502a_row53_col1\" class=\"data row53 col1\" >WOE Bin Plots</td>\n", - " <td id=\"T_0502a_row53_col2\" class=\"data row53 col2\" >Generates visualizations of Weight of Evidence (WoE) and Information Value (IV) for understanding predictive power...</td>\n", - " <td id=\"T_0502a_row53_col3\" class=\"data row53 col3\" >True</td>\n", - " <td id=\"T_0502a_row53_col4\" class=\"data row53 col4\" >False</td>\n", - " <td id=\"T_0502a_row53_col5\" class=\"data row53 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row53_col6\" class=\"data row53 col6\" >{'breaks_adj': {'type': 'list', 'default': None}, 'fig_height': {'type': 'int', 
'default': 600}, 'fig_width': {'type': 'int', 'default': 500}}</td>\n", - " <td id=\"T_0502a_row53_col7\" class=\"data row53 col7\" >['tabular_data', 'visualization', 'categorical_data']</td>\n", - " <td id=\"T_0502a_row53_col8\" class=\"data row53 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row54_col0\" class=\"data row54 col0\" >validmind.data_validation.WOEBinTable</td>\n", - " <td id=\"T_0502a_row54_col1\" class=\"data row54 col1\" >WOE Bin Table</td>\n", - " <td id=\"T_0502a_row54_col2\" class=\"data row54 col2\" >Assesses the Weight of Evidence (WoE) and Information Value (IV) of each feature to evaluate its predictive power...</td>\n", - " <td id=\"T_0502a_row54_col3\" class=\"data row54 col3\" >False</td>\n", - " <td id=\"T_0502a_row54_col4\" class=\"data row54 col4\" >True</td>\n", - " <td id=\"T_0502a_row54_col5\" class=\"data row54 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row54_col6\" class=\"data row54 col6\" >{'breaks_adj': {'type': 'list', 'default': None}}</td>\n", - " <td id=\"T_0502a_row54_col7\" class=\"data row54 col7\" >['tabular_data', 'categorical_data']</td>\n", - " <td id=\"T_0502a_row54_col8\" class=\"data row54 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row55_col0\" class=\"data row55 col0\" >validmind.data_validation.ZivotAndrewsArch</td>\n", - " <td id=\"T_0502a_row55_col1\" class=\"data row55 col1\" >Zivot Andrews Arch</td>\n", - " <td id=\"T_0502a_row55_col2\" class=\"data row55 col2\" >Evaluates the order of integration and stationarity of time series data using the Zivot-Andrews unit root test....</td>\n", - " <td id=\"T_0502a_row55_col3\" class=\"data row55 col3\" >False</td>\n", - " <td id=\"T_0502a_row55_col4\" class=\"data row55 col4\" >True</td>\n", - " <td id=\"T_0502a_row55_col5\" class=\"data row55 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row55_col6\" class=\"data row55 col6\" >{}</td>\n", - " <td id=\"T_0502a_row55_col7\" 
class=\"data row55 col7\" >['time_series_data', 'stationarity', 'unit_root_test']</td>\n", - " <td id=\"T_0502a_row55_col8\" class=\"data row55 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row56_col0\" class=\"data row56 col0\" >validmind.data_validation.nlp.CommonWords</td>\n", - " <td id=\"T_0502a_row56_col1\" class=\"data row56 col1\" >Common Words</td>\n", - " <td id=\"T_0502a_row56_col2\" class=\"data row56 col2\" >Assesses the most frequent non-stopwords in a text column for identifying prevalent language patterns....</td>\n", - " <td id=\"T_0502a_row56_col3\" class=\"data row56 col3\" >True</td>\n", - " <td id=\"T_0502a_row56_col4\" class=\"data row56 col4\" >False</td>\n", - " <td id=\"T_0502a_row56_col5\" class=\"data row56 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row56_col6\" class=\"data row56 col6\" >{}</td>\n", - " <td id=\"T_0502a_row56_col7\" class=\"data row56 col7\" >['nlp', 'text_data', 'visualization', 'frequency_analysis']</td>\n", - " <td id=\"T_0502a_row56_col8\" class=\"data row56 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row57_col0\" class=\"data row57 col0\" >validmind.data_validation.nlp.Hashtags</td>\n", - " <td id=\"T_0502a_row57_col1\" class=\"data row57 col1\" >Hashtags</td>\n", - " <td id=\"T_0502a_row57_col2\" class=\"data row57 col2\" >Assesses hashtag frequency in a text column, highlighting usage trends and potential dataset bias or spam....</td>\n", - " <td id=\"T_0502a_row57_col3\" class=\"data row57 col3\" >True</td>\n", - " <td id=\"T_0502a_row57_col4\" class=\"data row57 col4\" >False</td>\n", - " <td id=\"T_0502a_row57_col5\" class=\"data row57 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row57_col6\" class=\"data row57 col6\" >{'top_hashtags': {'type': 'int', 'default': 25}}</td>\n", - " <td id=\"T_0502a_row57_col7\" class=\"data row57 col7\" >['nlp', 'text_data', 'visualization', 
'frequency_analysis']</td>\n", - " <td id=\"T_0502a_row57_col8\" class=\"data row57 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row58_col0\" class=\"data row58 col0\" >validmind.data_validation.nlp.LanguageDetection</td>\n", - " <td id=\"T_0502a_row58_col1\" class=\"data row58 col1\" >Language Detection</td>\n", - " <td id=\"T_0502a_row58_col2\" class=\"data row58 col2\" >Assesses the diversity of languages in a textual dataset by detecting and visualizing the distribution of languages....</td>\n", - " <td id=\"T_0502a_row58_col3\" class=\"data row58 col3\" >True</td>\n", - " <td id=\"T_0502a_row58_col4\" class=\"data row58 col4\" >False</td>\n", - " <td id=\"T_0502a_row58_col5\" class=\"data row58 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row58_col6\" class=\"data row58 col6\" >{}</td>\n", - " <td id=\"T_0502a_row58_col7\" class=\"data row58 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row58_col8\" class=\"data row58 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row59_col0\" class=\"data row59 col0\" >validmind.data_validation.nlp.Mentions</td>\n", - " <td id=\"T_0502a_row59_col1\" class=\"data row59 col1\" >Mentions</td>\n", - " <td id=\"T_0502a_row59_col2\" class=\"data row59 col2\" >Calculates and visualizes frequencies of '@' prefixed mentions in a text-based dataset for NLP model analysis....</td>\n", - " <td id=\"T_0502a_row59_col3\" class=\"data row59 col3\" >True</td>\n", - " <td id=\"T_0502a_row59_col4\" class=\"data row59 col4\" >False</td>\n", - " <td id=\"T_0502a_row59_col5\" class=\"data row59 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row59_col6\" class=\"data row59 col6\" >{'top_mentions': {'type': 'int', 'default': 25}}</td>\n", - " <td id=\"T_0502a_row59_col7\" class=\"data row59 col7\" >['nlp', 'text_data', 'visualization', 'frequency_analysis']</td>\n", - " <td 
id=\"T_0502a_row59_col8\" class=\"data row59 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row60_col0\" class=\"data row60 col0\" >validmind.data_validation.nlp.PolarityAndSubjectivity</td>\n", - " <td id=\"T_0502a_row60_col1\" class=\"data row60 col1\" >Polarity And Subjectivity</td>\n", - " <td id=\"T_0502a_row60_col2\" class=\"data row60 col2\" >Analyzes the polarity and subjectivity of text data within a given dataset to visualize the sentiment distribution....</td>\n", - " <td id=\"T_0502a_row60_col3\" class=\"data row60 col3\" >True</td>\n", - " <td id=\"T_0502a_row60_col4\" class=\"data row60 col4\" >True</td>\n", - " <td id=\"T_0502a_row60_col5\" class=\"data row60 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row60_col6\" class=\"data row60 col6\" >{'threshold_subjectivity': {'type': '_empty', 'default': 0.5}, 'threshold_polarity': {'type': '_empty', 'default': 0}}</td>\n", - " <td id=\"T_0502a_row60_col7\" class=\"data row60 col7\" >['nlp', 'text_data', 'data_validation']</td>\n", - " <td id=\"T_0502a_row60_col8\" class=\"data row60 col8\" >['nlp']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row61_col0\" class=\"data row61 col0\" >validmind.data_validation.nlp.Punctuations</td>\n", - " <td id=\"T_0502a_row61_col1\" class=\"data row61 col1\" >Punctuations</td>\n", - " <td id=\"T_0502a_row61_col2\" class=\"data row61 col2\" >Analyzes and visualizes the frequency distribution of punctuation usage in a given text dataset....</td>\n", - " <td id=\"T_0502a_row61_col3\" class=\"data row61 col3\" >True</td>\n", - " <td id=\"T_0502a_row61_col4\" class=\"data row61 col4\" >False</td>\n", - " <td id=\"T_0502a_row61_col5\" class=\"data row61 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row61_col6\" class=\"data row61 col6\" >{'count_mode': {'type': '_empty', 'default': 'token'}}</td>\n", - " <td id=\"T_0502a_row61_col7\" class=\"data row61 col7\" >['nlp', 'text_data', 
'visualization', 'frequency_analysis']</td>\n", - " <td id=\"T_0502a_row61_col8\" class=\"data row61 col8\" >['text_classification', 'text_summarization', 'nlp']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row62_col0\" class=\"data row62 col0\" >validmind.data_validation.nlp.Sentiment</td>\n", - " <td id=\"T_0502a_row62_col1\" class=\"data row62 col1\" >Sentiment</td>\n", - " <td id=\"T_0502a_row62_col2\" class=\"data row62 col2\" >Analyzes the sentiment of text data within a dataset using the VADER sentiment analysis tool....</td>\n", - " <td id=\"T_0502a_row62_col3\" class=\"data row62 col3\" >True</td>\n", - " <td id=\"T_0502a_row62_col4\" class=\"data row62 col4\" >False</td>\n", - " <td id=\"T_0502a_row62_col5\" class=\"data row62 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row62_col6\" class=\"data row62 col6\" >{}</td>\n", - " <td id=\"T_0502a_row62_col7\" class=\"data row62 col7\" >['nlp', 'text_data', 'data_validation']</td>\n", - " <td id=\"T_0502a_row62_col8\" class=\"data row62 col8\" >['nlp']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row63_col0\" class=\"data row63 col0\" >validmind.data_validation.nlp.StopWords</td>\n", - " <td id=\"T_0502a_row63_col1\" class=\"data row63 col1\" >Stop Words</td>\n", - " <td id=\"T_0502a_row63_col2\" class=\"data row63 col2\" >Evaluates and visualizes the frequency of English stop words in a text dataset against a defined threshold....</td>\n", - " <td id=\"T_0502a_row63_col3\" class=\"data row63 col3\" >True</td>\n", - " <td id=\"T_0502a_row63_col4\" class=\"data row63 col4\" >True</td>\n", - " <td id=\"T_0502a_row63_col5\" class=\"data row63 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row63_col6\" class=\"data row63 col6\" >{'min_percent_threshold': {'type': 'float', 'default': 0.5}, 'num_words': {'type': 'int', 'default': 25}}</td>\n", - " <td id=\"T_0502a_row63_col7\" class=\"data row63 col7\" >['nlp', 'text_data', 'frequency_analysis', 'visualization']</td>\n", - " <td 
id=\"T_0502a_row63_col8\" class=\"data row63 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row64_col0\" class=\"data row64 col0\" >validmind.data_validation.nlp.TextDescription</td>\n", - " <td id=\"T_0502a_row64_col1\" class=\"data row64 col1\" >Text Description</td>\n", - " <td id=\"T_0502a_row64_col2\" class=\"data row64 col2\" >Conducts comprehensive textual analysis on a dataset using NLTK to evaluate various parameters and generate...</td>\n", - " <td id=\"T_0502a_row64_col3\" class=\"data row64 col3\" >True</td>\n", - " <td id=\"T_0502a_row64_col4\" class=\"data row64 col4\" >False</td>\n", - " <td id=\"T_0502a_row64_col5\" class=\"data row64 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row64_col6\" class=\"data row64 col6\" >{'unwanted_tokens': {'type': 'set', 'default': {'s', 'mrs', 'us', \"''\", ' ', 'ms', 'dr', 'dollar', '``', 'mr', \"'s\", \"s'\"}}, 'lang': {'type': 'str', 'default': 'english'}}</td>\n", - " <td id=\"T_0502a_row64_col7\" class=\"data row64 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row64_col8\" class=\"data row64 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row65_col0\" class=\"data row65 col0\" >validmind.data_validation.nlp.Toxicity</td>\n", - " <td id=\"T_0502a_row65_col1\" class=\"data row65 col1\" >Toxicity</td>\n", - " <td id=\"T_0502a_row65_col2\" class=\"data row65 col2\" >Assesses the toxicity of text data within a dataset to visualize the distribution of toxicity scores....</td>\n", - " <td id=\"T_0502a_row65_col3\" class=\"data row65 col3\" >True</td>\n", - " <td id=\"T_0502a_row65_col4\" class=\"data row65 col4\" >False</td>\n", - " <td id=\"T_0502a_row65_col5\" class=\"data row65 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row65_col6\" class=\"data row65 col6\" >{}</td>\n", - " <td id=\"T_0502a_row65_col7\" class=\"data row65 col7\" >['nlp', 
'text_data', 'data_validation']</td>\n", - " <td id=\"T_0502a_row65_col8\" class=\"data row65 col8\" >['nlp']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row66_col0\" class=\"data row66 col0\" >validmind.model_validation.BertScore</td>\n", - " <td id=\"T_0502a_row66_col1\" class=\"data row66 col1\" >Bert Score</td>\n", - " <td id=\"T_0502a_row66_col2\" class=\"data row66 col2\" >Assesses the quality of machine-generated text using BERTScore metrics and visualizes results through histograms...</td>\n", - " <td id=\"T_0502a_row66_col3\" class=\"data row66 col3\" >True</td>\n", - " <td id=\"T_0502a_row66_col4\" class=\"data row66 col4\" >True</td>\n", - " <td id=\"T_0502a_row66_col5\" class=\"data row66 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row66_col6\" class=\"data row66 col6\" >{'evaluation_model': {'type': '_empty', 'default': 'distilbert-base-uncased'}}</td>\n", - " <td id=\"T_0502a_row66_col7\" class=\"data row66 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row66_col8\" class=\"data row66 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row67_col0\" class=\"data row67 col0\" >validmind.model_validation.BleuScore</td>\n", - " <td id=\"T_0502a_row67_col1\" class=\"data row67 col1\" >Bleu Score</td>\n", - " <td id=\"T_0502a_row67_col2\" class=\"data row67 col2\" >Evaluates the quality of machine-generated text using BLEU metrics and visualizes the results through histograms...</td>\n", - " <td id=\"T_0502a_row67_col3\" class=\"data row67 col3\" >True</td>\n", - " <td id=\"T_0502a_row67_col4\" class=\"data row67 col4\" >True</td>\n", - " <td id=\"T_0502a_row67_col5\" class=\"data row67 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row67_col6\" class=\"data row67 col6\" >{}</td>\n", - " <td id=\"T_0502a_row67_col7\" class=\"data row67 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row67_col8\" 
class=\"data row67 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row68_col0\" class=\"data row68 col0\" >validmind.model_validation.ClusterSizeDistribution</td>\n", - " <td id=\"T_0502a_row68_col1\" class=\"data row68 col1\" >Cluster Size Distribution</td>\n", - " <td id=\"T_0502a_row68_col2\" class=\"data row68 col2\" >Assesses the performance of clustering models by comparing the distribution of cluster sizes in model predictions...</td>\n", - " <td id=\"T_0502a_row68_col3\" class=\"data row68 col3\" >True</td>\n", - " <td id=\"T_0502a_row68_col4\" class=\"data row68 col4\" >False</td>\n", - " <td id=\"T_0502a_row68_col5\" class=\"data row68 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row68_col6\" class=\"data row68 col6\" >{}</td>\n", - " <td id=\"T_0502a_row68_col7\" class=\"data row68 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row68_col8\" class=\"data row68 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row69_col0\" class=\"data row69 col0\" >validmind.model_validation.ContextualRecall</td>\n", - " <td id=\"T_0502a_row69_col1\" class=\"data row69 col1\" >Contextual Recall</td>\n", - " <td id=\"T_0502a_row69_col2\" class=\"data row69 col2\" >Evaluates a Natural Language Generation model's ability to generate contextually relevant and factually correct...</td>\n", - " <td id=\"T_0502a_row69_col3\" class=\"data row69 col3\" >True</td>\n", - " <td id=\"T_0502a_row69_col4\" class=\"data row69 col4\" >True</td>\n", - " <td id=\"T_0502a_row69_col5\" class=\"data row69 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row69_col6\" class=\"data row69 col6\" >{}</td>\n", - " <td id=\"T_0502a_row69_col7\" class=\"data row69 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row69_col8\" class=\"data row69 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " 
<td id=\"T_0502a_row70_col0\" class=\"data row70 col0\" >validmind.model_validation.FeaturesAUC</td>\n", - " <td id=\"T_0502a_row70_col1\" class=\"data row70 col1\" >Features AUC</td>\n", - " <td id=\"T_0502a_row70_col2\" class=\"data row70 col2\" >Evaluates the discriminatory power of each individual feature within a binary classification model by calculating...</td>\n", - " <td id=\"T_0502a_row70_col3\" class=\"data row70 col3\" >True</td>\n", - " <td id=\"T_0502a_row70_col4\" class=\"data row70 col4\" >False</td>\n", - " <td id=\"T_0502a_row70_col5\" class=\"data row70 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row70_col6\" class=\"data row70 col6\" >{'fontsize': {'type': 'int', 'default': 12}, 'figure_height': {'type': 'int', 'default': 500}}</td>\n", - " <td id=\"T_0502a_row70_col7\" class=\"data row70 col7\" >['feature_importance', 'AUC', 'visualization']</td>\n", - " <td id=\"T_0502a_row70_col8\" class=\"data row70 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row71_col0\" class=\"data row71 col0\" >validmind.model_validation.MeteorScore</td>\n", - " <td id=\"T_0502a_row71_col1\" class=\"data row71 col1\" >Meteor Score</td>\n", - " <td id=\"T_0502a_row71_col2\" class=\"data row71 col2\" >Assesses the quality of machine-generated translations by comparing them to human-produced references using the...</td>\n", - " <td id=\"T_0502a_row71_col3\" class=\"data row71 col3\" >True</td>\n", - " <td id=\"T_0502a_row71_col4\" class=\"data row71 col4\" >True</td>\n", - " <td id=\"T_0502a_row71_col5\" class=\"data row71 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row71_col6\" class=\"data row71 col6\" >{}</td>\n", - " <td id=\"T_0502a_row71_col7\" class=\"data row71 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row71_col8\" class=\"data row71 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row72_col0\" class=\"data 
row72 col0\" >validmind.model_validation.ModelMetadata</td>\n", - " <td id=\"T_0502a_row72_col1\" class=\"data row72 col1\" >Model Metadata</td>\n", - " <td id=\"T_0502a_row72_col2\" class=\"data row72 col2\" >Compare metadata of different models and generate a summary table with the results....</td>\n", - " <td id=\"T_0502a_row72_col3\" class=\"data row72 col3\" >False</td>\n", - " <td id=\"T_0502a_row72_col4\" class=\"data row72 col4\" >True</td>\n", - " <td id=\"T_0502a_row72_col5\" class=\"data row72 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row72_col6\" class=\"data row72 col6\" >{}</td>\n", - " <td id=\"T_0502a_row72_col7\" class=\"data row72 col7\" >['model_training', 'metadata']</td>\n", - " <td id=\"T_0502a_row72_col8\" class=\"data row72 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row73_col0\" class=\"data row73 col0\" >validmind.model_validation.ModelPredictionResiduals</td>\n", - " <td id=\"T_0502a_row73_col1\" class=\"data row73 col1\" >Model Prediction Residuals</td>\n", - " <td id=\"T_0502a_row73_col2\" class=\"data row73 col2\" >Assesses normality and behavior of residuals in regression models through visualization and statistical tests....</td>\n", - " <td id=\"T_0502a_row73_col3\" class=\"data row73 col3\" >True</td>\n", - " <td id=\"T_0502a_row73_col4\" class=\"data row73 col4\" >True</td>\n", - " <td id=\"T_0502a_row73_col5\" class=\"data row73 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row73_col6\" class=\"data row73 col6\" >{'nbins': {'type': 'int', 'default': 100}, 'p_value_threshold': {'type': 'float', 'default': 0.05}, 'start_date': {'type': None, 'default': None}, 'end_date': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row73_col7\" class=\"data row73 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row73_col8\" class=\"data row73 col8\" >['residual_analysis', 'visualization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td 
id=\"T_0502a_row74_col0\" class=\"data row74 col0\" >validmind.model_validation.RegardScore</td>\n", - " <td id=\"T_0502a_row74_col1\" class=\"data row74 col1\" >Regard Score</td>\n", - " <td id=\"T_0502a_row74_col2\" class=\"data row74 col2\" >Assesses the sentiment and potential biases in text generated by NLP models by computing and visualizing regard...</td>\n", - " <td id=\"T_0502a_row74_col3\" class=\"data row74 col3\" >True</td>\n", - " <td id=\"T_0502a_row74_col4\" class=\"data row74 col4\" >True</td>\n", - " <td id=\"T_0502a_row74_col5\" class=\"data row74 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row74_col6\" class=\"data row74 col6\" >{}</td>\n", - " <td id=\"T_0502a_row74_col7\" class=\"data row74 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row74_col8\" class=\"data row74 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row75_col0\" class=\"data row75 col0\" >validmind.model_validation.RegressionResidualsPlot</td>\n", - " <td id=\"T_0502a_row75_col1\" class=\"data row75 col1\" >Regression Residuals Plot</td>\n", - " <td id=\"T_0502a_row75_col2\" class=\"data row75 col2\" >Evaluates regression model performance using residual distribution and actual vs. 
predicted plots....</td>\n", - " <td id=\"T_0502a_row75_col3\" class=\"data row75 col3\" >True</td>\n", - " <td id=\"T_0502a_row75_col4\" class=\"data row75 col4\" >False</td>\n", - " <td id=\"T_0502a_row75_col5\" class=\"data row75 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row75_col6\" class=\"data row75 col6\" >{'bin_size': {'type': 'float', 'default': 0.1}}</td>\n", - " <td id=\"T_0502a_row75_col7\" class=\"data row75 col7\" >['model_performance', 'visualization']</td>\n", - " <td id=\"T_0502a_row75_col8\" class=\"data row75 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row76_col0\" class=\"data row76 col0\" >validmind.model_validation.RougeScore</td>\n", - " <td id=\"T_0502a_row76_col1\" class=\"data row76 col1\" >Rouge Score</td>\n", - " <td id=\"T_0502a_row76_col2\" class=\"data row76 col2\" >Assesses the quality of machine-generated text using ROUGE metrics and visualizes the results to provide...</td>\n", - " <td id=\"T_0502a_row76_col3\" class=\"data row76 col3\" >True</td>\n", - " <td id=\"T_0502a_row76_col4\" class=\"data row76 col4\" >True</td>\n", - " <td id=\"T_0502a_row76_col5\" class=\"data row76 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row76_col6\" class=\"data row76 col6\" >{'metric': {'type': 'str', 'default': 'rouge-1'}}</td>\n", - " <td id=\"T_0502a_row76_col7\" class=\"data row76 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row76_col8\" class=\"data row76 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row77_col0\" class=\"data row77 col0\" >validmind.model_validation.TimeSeriesPredictionWithCI</td>\n", - " <td id=\"T_0502a_row77_col1\" class=\"data row77 col1\" >Time Series Prediction With CI</td>\n", - " <td id=\"T_0502a_row77_col2\" class=\"data row77 col2\" >Assesses predictive accuracy and uncertainty in time series models, highlighting breaches beyond confidence...</td>\n", - " 
<td id=\"T_0502a_row77_col3\" class=\"data row77 col3\" >True</td>\n", - " <td id=\"T_0502a_row77_col4\" class=\"data row77 col4\" >True</td>\n", - " <td id=\"T_0502a_row77_col5\" class=\"data row77 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row77_col6\" class=\"data row77 col6\" >{'confidence': {'type': 'float', 'default': 0.95}}</td>\n", - " <td id=\"T_0502a_row77_col7\" class=\"data row77 col7\" >['model_predictions', 'visualization']</td>\n", - " <td id=\"T_0502a_row77_col8\" class=\"data row77 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row78_col0\" class=\"data row78 col0\" >validmind.model_validation.TimeSeriesPredictionsPlot</td>\n", - " <td id=\"T_0502a_row78_col1\" class=\"data row78 col1\" >Time Series Predictions Plot</td>\n", - " <td id=\"T_0502a_row78_col2\" class=\"data row78 col2\" >Plot actual vs predicted values for time series data and generate a visual comparison for the model....</td>\n", - " <td id=\"T_0502a_row78_col3\" class=\"data row78 col3\" >True</td>\n", - " <td id=\"T_0502a_row78_col4\" class=\"data row78 col4\" >False</td>\n", - " <td id=\"T_0502a_row78_col5\" class=\"data row78 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row78_col6\" class=\"data row78 col6\" >{}</td>\n", - " <td id=\"T_0502a_row78_col7\" class=\"data row78 col7\" >['model_predictions', 'visualization']</td>\n", - " <td id=\"T_0502a_row78_col8\" class=\"data row78 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row79_col0\" class=\"data row79 col0\" >validmind.model_validation.TimeSeriesR2SquareBySegments</td>\n", - " <td id=\"T_0502a_row79_col1\" class=\"data row79 col1\" >Time Series R2 Square By Segments</td>\n", - " <td id=\"T_0502a_row79_col2\" class=\"data row79 col2\" >Evaluates the R-Squared values of regression models over specified time segments in time series data to assess...</td>\n", - " <td 
id=\"T_0502a_row79_col3\" class=\"data row79 col3\" >True</td>\n", - " <td id=\"T_0502a_row79_col4\" class=\"data row79 col4\" >True</td>\n", - " <td id=\"T_0502a_row79_col5\" class=\"data row79 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row79_col6\" class=\"data row79 col6\" >{'segments': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row79_col7\" class=\"data row79 col7\" >['model_performance', 'sklearn']</td>\n", - " <td id=\"T_0502a_row79_col8\" class=\"data row79 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row80_col0\" class=\"data row80 col0\" >validmind.model_validation.TokenDisparity</td>\n", - " <td id=\"T_0502a_row80_col1\" class=\"data row80 col1\" >Token Disparity</td>\n", - " <td id=\"T_0502a_row80_col2\" class=\"data row80 col2\" >Evaluates the token disparity between reference and generated texts, visualizing the results through histograms and...</td>\n", - " <td id=\"T_0502a_row80_col3\" class=\"data row80 col3\" >True</td>\n", - " <td id=\"T_0502a_row80_col4\" class=\"data row80 col4\" >True</td>\n", - " <td id=\"T_0502a_row80_col5\" class=\"data row80 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row80_col6\" class=\"data row80 col6\" >{}</td>\n", - " <td id=\"T_0502a_row80_col7\" class=\"data row80 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row80_col8\" class=\"data row80 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row81_col0\" class=\"data row81 col0\" >validmind.model_validation.ToxicityScore</td>\n", - " <td id=\"T_0502a_row81_col1\" class=\"data row81 col1\" >Toxicity Score</td>\n", - " <td id=\"T_0502a_row81_col2\" class=\"data row81 col2\" >Assesses the toxicity levels of texts generated by NLP models to identify and mitigate harmful or offensive content....</td>\n", - " <td id=\"T_0502a_row81_col3\" class=\"data row81 col3\" 
>True</td>\n", - " <td id=\"T_0502a_row81_col4\" class=\"data row81 col4\" >True</td>\n", - " <td id=\"T_0502a_row81_col5\" class=\"data row81 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row81_col6\" class=\"data row81 col6\" >{}</td>\n", - " <td id=\"T_0502a_row81_col7\" class=\"data row81 col7\" >['nlp', 'text_data', 'visualization']</td>\n", - " <td id=\"T_0502a_row81_col8\" class=\"data row81 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row82_col0\" class=\"data row82 col0\" >validmind.model_validation.embeddings.ClusterDistribution</td>\n", - " <td id=\"T_0502a_row82_col1\" class=\"data row82 col1\" >Cluster Distribution</td>\n", - " <td id=\"T_0502a_row82_col2\" class=\"data row82 col2\" >Assesses the distribution of text embeddings across clusters produced by a model using KMeans clustering....</td>\n", - " <td id=\"T_0502a_row82_col3\" class=\"data row82 col3\" >True</td>\n", - " <td id=\"T_0502a_row82_col4\" class=\"data row82 col4\" >False</td>\n", - " <td id=\"T_0502a_row82_col5\" class=\"data row82 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row82_col6\" class=\"data row82 col6\" >{'num_clusters': {'type': 'int', 'default': 5}}</td>\n", - " <td id=\"T_0502a_row82_col7\" class=\"data row82 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", - " <td id=\"T_0502a_row82_col8\" class=\"data row82 col8\" >['feature_extraction']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row83_col0\" class=\"data row83 col0\" >validmind.model_validation.embeddings.CosineSimilarityComparison</td>\n", - " <td id=\"T_0502a_row83_col1\" class=\"data row83 col1\" >Cosine Similarity Comparison</td>\n", - " <td id=\"T_0502a_row83_col2\" class=\"data row83 col2\" >Assesses the similarity between embeddings generated by different models using Cosine Similarity, providing both...</td>\n", - " <td id=\"T_0502a_row83_col3\" class=\"data row83 col3\" >True</td>\n", 
- " <td id=\"T_0502a_row83_col4\" class=\"data row83 col4\" >True</td>\n", - " <td id=\"T_0502a_row83_col5\" class=\"data row83 col5\" >['dataset', 'models']</td>\n", - " <td id=\"T_0502a_row83_col6\" class=\"data row83 col6\" >{}</td>\n", - " <td id=\"T_0502a_row83_col7\" class=\"data row83 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", - " <td id=\"T_0502a_row83_col8\" class=\"data row83 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row84_col0\" class=\"data row84 col0\" >validmind.model_validation.embeddings.CosineSimilarityDistribution</td>\n", - " <td id=\"T_0502a_row84_col1\" class=\"data row84 col1\" >Cosine Similarity Distribution</td>\n", - " <td id=\"T_0502a_row84_col2\" class=\"data row84 col2\" >Assesses the similarity between predicted text embeddings from a model using a Cosine Similarity distribution...</td>\n", - " <td id=\"T_0502a_row84_col3\" class=\"data row84 col3\" >True</td>\n", - " <td id=\"T_0502a_row84_col4\" class=\"data row84 col4\" >False</td>\n", - " <td id=\"T_0502a_row84_col5\" class=\"data row84 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row84_col6\" class=\"data row84 col6\" >{}</td>\n", - " <td id=\"T_0502a_row84_col7\" class=\"data row84 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", - " <td id=\"T_0502a_row84_col8\" class=\"data row84 col8\" >['feature_extraction']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row85_col0\" class=\"data row85 col0\" >validmind.model_validation.embeddings.CosineSimilarityHeatmap</td>\n", - " <td id=\"T_0502a_row85_col1\" class=\"data row85 col1\" >Cosine Similarity Heatmap</td>\n", - " <td id=\"T_0502a_row85_col2\" class=\"data row85 col2\" >Generates an interactive heatmap to visualize the cosine similarities among embeddings derived from a given model....</td>\n", - " <td id=\"T_0502a_row85_col3\" class=\"data row85 col3\" >True</td>\n", - " <td 
id=\"T_0502a_row85_col4\" class=\"data row85 col4\" >False</td>\n", - " <td id=\"T_0502a_row85_col5\" class=\"data row85 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row85_col6\" class=\"data row85 col6\" >{'title': {'type': '_empty', 'default': 'Cosine Similarity Matrix'}, 'color': {'type': '_empty', 'default': 'Cosine Similarity'}, 'xaxis_title': {'type': '_empty', 'default': 'Index'}, 'yaxis_title': {'type': '_empty', 'default': 'Index'}, 'color_scale': {'type': '_empty', 'default': 'Blues'}}</td>\n", - " <td id=\"T_0502a_row85_col7\" class=\"data row85 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", - " <td id=\"T_0502a_row85_col8\" class=\"data row85 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row86_col0\" class=\"data row86 col0\" >validmind.model_validation.embeddings.DescriptiveAnalytics</td>\n", - " <td id=\"T_0502a_row86_col1\" class=\"data row86 col1\" >Descriptive Analytics</td>\n", - " <td id=\"T_0502a_row86_col2\" class=\"data row86 col2\" >Evaluates statistical properties of text embeddings in an ML model via mean, median, and standard deviation...</td>\n", - " <td id=\"T_0502a_row86_col3\" class=\"data row86 col3\" >True</td>\n", - " <td id=\"T_0502a_row86_col4\" class=\"data row86 col4\" >False</td>\n", - " <td id=\"T_0502a_row86_col5\" class=\"data row86 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row86_col6\" class=\"data row86 col6\" >{}</td>\n", - " <td id=\"T_0502a_row86_col7\" class=\"data row86 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", - " <td id=\"T_0502a_row86_col8\" class=\"data row86 col8\" >['feature_extraction']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row87_col0\" class=\"data row87 col0\" >validmind.model_validation.embeddings.EmbeddingsVisualization2D</td>\n", - " <td id=\"T_0502a_row87_col1\" class=\"data row87 col1\" >Embeddings Visualization2 D</td>\n", - " 
<td id=\"T_0502a_row87_col2\" class=\"data row87 col2\" >Visualizes 2D representation of text embeddings generated by a model using t-SNE technique....</td>\n", - " <td id=\"T_0502a_row87_col3\" class=\"data row87 col3\" >True</td>\n", - " <td id=\"T_0502a_row87_col4\" class=\"data row87 col4\" >False</td>\n", - " <td id=\"T_0502a_row87_col5\" class=\"data row87 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row87_col6\" class=\"data row87 col6\" >{'cluster_column': {'type': None, 'default': None}, 'perplexity': {'type': 'int', 'default': 30}}</td>\n", - " <td id=\"T_0502a_row87_col7\" class=\"data row87 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", - " <td id=\"T_0502a_row87_col8\" class=\"data row87 col8\" >['feature_extraction']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row88_col0\" class=\"data row88 col0\" >validmind.model_validation.embeddings.EuclideanDistanceComparison</td>\n", - " <td id=\"T_0502a_row88_col1\" class=\"data row88 col1\" >Euclidean Distance Comparison</td>\n", - " <td id=\"T_0502a_row88_col2\" class=\"data row88 col2\" >Assesses and visualizes the dissimilarity between model embeddings using Euclidean distance, providing insights...</td>\n", - " <td id=\"T_0502a_row88_col3\" class=\"data row88 col3\" >True</td>\n", - " <td id=\"T_0502a_row88_col4\" class=\"data row88 col4\" >True</td>\n", - " <td id=\"T_0502a_row88_col5\" class=\"data row88 col5\" >['dataset', 'models']</td>\n", - " <td id=\"T_0502a_row88_col6\" class=\"data row88 col6\" >{}</td>\n", - " <td id=\"T_0502a_row88_col7\" class=\"data row88 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", - " <td id=\"T_0502a_row88_col8\" class=\"data row88 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row89_col0\" class=\"data row89 col0\" >validmind.model_validation.embeddings.EuclideanDistanceHeatmap</td>\n", - " <td id=\"T_0502a_row89_col1\" 
class=\"data row89 col1\" >Euclidean Distance Heatmap</td>\n", - " <td id=\"T_0502a_row89_col2\" class=\"data row89 col2\" >Generates an interactive heatmap to visualize the Euclidean distances among embeddings derived from a given model....</td>\n", - " <td id=\"T_0502a_row89_col3\" class=\"data row89 col3\" >True</td>\n", - " <td id=\"T_0502a_row89_col4\" class=\"data row89 col4\" >False</td>\n", - " <td id=\"T_0502a_row89_col5\" class=\"data row89 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row89_col6\" class=\"data row89 col6\" >{'title': {'type': '_empty', 'default': 'Euclidean Distance Matrix'}, 'color': {'type': '_empty', 'default': 'Euclidean Distance'}, 'xaxis_title': {'type': '_empty', 'default': 'Index'}, 'yaxis_title': {'type': '_empty', 'default': 'Index'}, 'color_scale': {'type': '_empty', 'default': 'Blues'}}</td>\n", - " <td id=\"T_0502a_row89_col7\" class=\"data row89 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", - " <td id=\"T_0502a_row89_col8\" class=\"data row89 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row90_col0\" class=\"data row90 col0\" >validmind.model_validation.embeddings.PCAComponentsPairwisePlots</td>\n", - " <td id=\"T_0502a_row90_col1\" class=\"data row90 col1\" >PCA Components Pairwise Plots</td>\n", - " <td id=\"T_0502a_row90_col2\" class=\"data row90 col2\" >Generates scatter plots for pairwise combinations of principal component analysis (PCA) components of model...</td>\n", - " <td id=\"T_0502a_row90_col3\" class=\"data row90 col3\" >True</td>\n", - " <td id=\"T_0502a_row90_col4\" class=\"data row90 col4\" >False</td>\n", - " <td id=\"T_0502a_row90_col5\" class=\"data row90 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row90_col6\" class=\"data row90 col6\" >{'n_components': {'type': 'int', 'default': 3}}</td>\n", - " <td id=\"T_0502a_row90_col7\" class=\"data row90 col7\" >['visualization', 
'dimensionality_reduction', 'embeddings']</td>\n", - " <td id=\"T_0502a_row90_col8\" class=\"data row90 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row91_col0\" class=\"data row91 col0\" >validmind.model_validation.embeddings.StabilityAnalysisKeyword</td>\n", - " <td id=\"T_0502a_row91_col1\" class=\"data row91 col1\" >Stability Analysis Keyword</td>\n", - " <td id=\"T_0502a_row91_col2\" class=\"data row91 col2\" >Evaluates robustness of embedding models to keyword swaps in the test dataset....</td>\n", - " <td id=\"T_0502a_row91_col3\" class=\"data row91 col3\" >True</td>\n", - " <td id=\"T_0502a_row91_col4\" class=\"data row91 col4\" >True</td>\n", - " <td id=\"T_0502a_row91_col5\" class=\"data row91 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row91_col6\" class=\"data row91 col6\" >{'keyword_dict': {'type': None, 'default': None}, 'mean_similarity_threshold': {'type': 'float', 'default': 0.7}}</td>\n", - " <td id=\"T_0502a_row91_col7\" class=\"data row91 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", - " <td id=\"T_0502a_row91_col8\" class=\"data row91 col8\" >['feature_extraction']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row92_col0\" class=\"data row92 col0\" >validmind.model_validation.embeddings.StabilityAnalysisRandomNoise</td>\n", - " <td id=\"T_0502a_row92_col1\" class=\"data row92 col1\" >Stability Analysis Random Noise</td>\n", - " <td id=\"T_0502a_row92_col2\" class=\"data row92 col2\" >Assesses the robustness of text embeddings models to random noise introduced via text perturbations....</td>\n", - " <td id=\"T_0502a_row92_col3\" class=\"data row92 col3\" >True</td>\n", - " <td id=\"T_0502a_row92_col4\" class=\"data row92 col4\" >True</td>\n", - " <td id=\"T_0502a_row92_col5\" class=\"data row92 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row92_col6\" class=\"data row92 col6\" >{'probability': {'type': 'float', 
'default': 0.02}, 'mean_similarity_threshold': {'type': 'float', 'default': 0.7}}</td>\n", - " <td id=\"T_0502a_row92_col7\" class=\"data row92 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", - " <td id=\"T_0502a_row92_col8\" class=\"data row92 col8\" >['feature_extraction']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row93_col0\" class=\"data row93 col0\" >validmind.model_validation.embeddings.StabilityAnalysisSynonyms</td>\n", - " <td id=\"T_0502a_row93_col1\" class=\"data row93 col1\" >Stability Analysis Synonyms</td>\n", - " <td id=\"T_0502a_row93_col2\" class=\"data row93 col2\" >Evaluates the stability of text embeddings models when words in test data are replaced by their synonyms randomly....</td>\n", - " <td id=\"T_0502a_row93_col3\" class=\"data row93 col3\" >True</td>\n", - " <td id=\"T_0502a_row93_col4\" class=\"data row93 col4\" >True</td>\n", - " <td id=\"T_0502a_row93_col5\" class=\"data row93 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row93_col6\" class=\"data row93 col6\" >{'probability': {'type': 'float', 'default': 0.02}, 'mean_similarity_threshold': {'type': 'float', 'default': 0.7}}</td>\n", - " <td id=\"T_0502a_row93_col7\" class=\"data row93 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", - " <td id=\"T_0502a_row93_col8\" class=\"data row93 col8\" >['feature_extraction']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row94_col0\" class=\"data row94 col0\" >validmind.model_validation.embeddings.StabilityAnalysisTranslation</td>\n", - " <td id=\"T_0502a_row94_col1\" class=\"data row94 col1\" >Stability Analysis Translation</td>\n", - " <td id=\"T_0502a_row94_col2\" class=\"data row94 col2\" >Evaluates robustness of text embeddings models to noise introduced by translating the original text to another...</td>\n", - " <td id=\"T_0502a_row94_col3\" class=\"data row94 col3\" >True</td>\n", - " <td id=\"T_0502a_row94_col4\" class=\"data row94 col4\" >True</td>\n", - " 
<td id=\"T_0502a_row94_col5\" class=\"data row94 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row94_col6\" class=\"data row94 col6\" >{'source_lang': {'type': 'str', 'default': 'en'}, 'target_lang': {'type': 'str', 'default': 'fr'}, 'mean_similarity_threshold': {'type': 'float', 'default': 0.7}}</td>\n", - " <td id=\"T_0502a_row94_col7\" class=\"data row94 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", - " <td id=\"T_0502a_row94_col8\" class=\"data row94 col8\" >['feature_extraction']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row95_col0\" class=\"data row95 col0\" >validmind.model_validation.embeddings.TSNEComponentsPairwisePlots</td>\n", - " <td id=\"T_0502a_row95_col1\" class=\"data row95 col1\" >TSNE Components Pairwise Plots</td>\n", - " <td id=\"T_0502a_row95_col2\" class=\"data row95 col2\" >Creates scatter plots for pairwise combinations of t-SNE components to visualize embeddings and highlight potential...</td>\n", - " <td id=\"T_0502a_row95_col3\" class=\"data row95 col3\" >True</td>\n", - " <td id=\"T_0502a_row95_col4\" class=\"data row95 col4\" >False</td>\n", - " <td id=\"T_0502a_row95_col5\" class=\"data row95 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row95_col6\" class=\"data row95 col6\" >{'n_components': {'type': 'int', 'default': 2}, 'perplexity': {'type': 'int', 'default': 30}, 'title': {'type': 'str', 'default': 't-SNE'}}</td>\n", - " <td id=\"T_0502a_row95_col7\" class=\"data row95 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", - " <td id=\"T_0502a_row95_col8\" class=\"data row95 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row96_col0\" class=\"data row96 col0\" >validmind.model_validation.ragas.AnswerCorrectness</td>\n", - " <td id=\"T_0502a_row96_col1\" class=\"data row96 col1\" >Answer Correctness</td>\n", - " <td id=\"T_0502a_row96_col2\" class=\"data row96 col2\" 
>Evaluates the correctness of answers in a dataset with respect to the provided ground...</td>\n", - " <td id=\"T_0502a_row96_col3\" class=\"data row96 col3\" >True</td>\n", - " <td id=\"T_0502a_row96_col4\" class=\"data row96 col4\" >True</td>\n", - " <td id=\"T_0502a_row96_col5\" class=\"data row96 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row96_col6\" class=\"data row96 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'response_column': {'type': 'str', 'default': 'response'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row96_col7\" class=\"data row96 col7\" >['ragas', 'llm']</td>\n", - " <td id=\"T_0502a_row96_col8\" class=\"data row96 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row97_col0\" class=\"data row97 col0\" >validmind.model_validation.ragas.AspectCritic</td>\n", - " <td id=\"T_0502a_row97_col1\" class=\"data row97 col1\" >Aspect Critic</td>\n", - " <td id=\"T_0502a_row97_col2\" class=\"data row97 col2\" >Evaluates generations against the following aspects: harmfulness, maliciousness,...</td>\n", - " <td id=\"T_0502a_row97_col3\" class=\"data row97 col3\" >True</td>\n", - " <td id=\"T_0502a_row97_col4\" class=\"data row97 col4\" >True</td>\n", - " <td id=\"T_0502a_row97_col5\" class=\"data row97 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row97_col6\" class=\"data row97 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'response_column': {'type': 'str', 'default': 'response'}, 'retrieved_contexts_column': {'type': None, 'default': None}, 'aspects': {'type': None, 'default': ['coherence', 'conciseness', 'correctness', 'harmfulness', 'maliciousness']}, 'additional_aspects': {'type': None, 'default': None}, 'judge_llm': {'type': '_empty', 'default': None}, 
'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row97_col7\" class=\"data row97 col7\" >['ragas', 'llm', 'qualitative']</td>\n", - " <td id=\"T_0502a_row97_col8\" class=\"data row97 col8\" >['text_summarization', 'text_generation', 'text_qa']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row98_col0\" class=\"data row98 col0\" >validmind.model_validation.ragas.ContextEntityRecall</td>\n", - " <td id=\"T_0502a_row98_col1\" class=\"data row98 col1\" >Context Entity Recall</td>\n", - " <td id=\"T_0502a_row98_col2\" class=\"data row98 col2\" >Evaluates the context entity recall for dataset entries and visualizes the results....</td>\n", - " <td id=\"T_0502a_row98_col3\" class=\"data row98 col3\" >True</td>\n", - " <td id=\"T_0502a_row98_col4\" class=\"data row98 col4\" >True</td>\n", - " <td id=\"T_0502a_row98_col5\" class=\"data row98 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row98_col6\" class=\"data row98 col6\" >{'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row98_col7\" class=\"data row98 col7\" >['ragas', 'llm', 'retrieval_performance']</td>\n", - " <td id=\"T_0502a_row98_col8\" class=\"data row98 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row99_col0\" class=\"data row99 col0\" >validmind.model_validation.ragas.ContextPrecision</td>\n", - " <td id=\"T_0502a_row99_col1\" class=\"data row99 col1\" >Context Precision</td>\n", - " <td id=\"T_0502a_row99_col2\" class=\"data row99 col2\" >Context Precision is a metric that evaluates whether all of the ground-truth...</td>\n", - " <td id=\"T_0502a_row99_col3\" class=\"data row99 col3\" >True</td>\n", - " <td id=\"T_0502a_row99_col4\" class=\"data row99 col4\" 
>True</td>\n", - " <td id=\"T_0502a_row99_col5\" class=\"data row99 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row99_col6\" class=\"data row99 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row99_col7\" class=\"data row99 col7\" >['ragas', 'llm', 'retrieval_performance']</td>\n", - " <td id=\"T_0502a_row99_col8\" class=\"data row99 col8\" >['text_qa', 'text_generation', 'text_summarization', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row100_col0\" class=\"data row100 col0\" >validmind.model_validation.ragas.ContextPrecisionWithoutReference</td>\n", - " <td id=\"T_0502a_row100_col1\" class=\"data row100 col1\" >Context Precision Without Reference</td>\n", - " <td id=\"T_0502a_row100_col2\" class=\"data row100 col2\" >Context Precision Without Reference is a metric used to evaluate the relevance of...</td>\n", - " <td id=\"T_0502a_row100_col3\" class=\"data row100 col3\" >True</td>\n", - " <td id=\"T_0502a_row100_col4\" class=\"data row100 col4\" >True</td>\n", - " <td id=\"T_0502a_row100_col5\" class=\"data row100 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row100_col6\" class=\"data row100 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'response_column': {'type': 'str', 'default': 'response'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row100_col7\" class=\"data row100 col7\" >['ragas', 'llm', 'retrieval_performance']</td>\n", - " <td id=\"T_0502a_row100_col8\" class=\"data row100 col8\" >['text_qa', 'text_generation', 
'text_summarization', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row101_col0\" class=\"data row101 col0\" >validmind.model_validation.ragas.ContextRecall</td>\n", - " <td id=\"T_0502a_row101_col1\" class=\"data row101 col1\" >Context Recall</td>\n", - " <td id=\"T_0502a_row101_col2\" class=\"data row101 col2\" >Context recall measures the extent to which the retrieved context aligns with the...</td>\n", - " <td id=\"T_0502a_row101_col3\" class=\"data row101 col3\" >True</td>\n", - " <td id=\"T_0502a_row101_col4\" class=\"data row101 col4\" >True</td>\n", - " <td id=\"T_0502a_row101_col5\" class=\"data row101 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row101_col6\" class=\"data row101 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row101_col7\" class=\"data row101 col7\" >['ragas', 'llm', 'retrieval_performance']</td>\n", - " <td id=\"T_0502a_row101_col8\" class=\"data row101 col8\" >['text_qa', 'text_generation', 'text_summarization', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row102_col0\" class=\"data row102 col0\" >validmind.model_validation.ragas.Faithfulness</td>\n", - " <td id=\"T_0502a_row102_col1\" class=\"data row102 col1\" >Faithfulness</td>\n", - " <td id=\"T_0502a_row102_col2\" class=\"data row102 col2\" >Evaluates the faithfulness of the generated answers with respect to retrieved contexts....</td>\n", - " <td id=\"T_0502a_row102_col3\" class=\"data row102 col3\" >True</td>\n", - " <td id=\"T_0502a_row102_col4\" class=\"data row102 col4\" >True</td>\n", - " <td id=\"T_0502a_row102_col5\" class=\"data row102 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row102_col6\" 
class=\"data row102 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'response_column': {'type': 'str', 'default': 'response'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row102_col7\" class=\"data row102 col7\" >['ragas', 'llm', 'rag_performance']</td>\n", - " <td id=\"T_0502a_row102_col8\" class=\"data row102 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row103_col0\" class=\"data row103 col0\" >validmind.model_validation.ragas.NoiseSensitivity</td>\n", - " <td id=\"T_0502a_row103_col1\" class=\"data row103 col1\" >Noise Sensitivity</td>\n", - " <td id=\"T_0502a_row103_col2\" class=\"data row103 col2\" >Assesses the sensitivity of a Large Language Model (LLM) to noise in retrieved context by measuring how often it...</td>\n", - " <td id=\"T_0502a_row103_col3\" class=\"data row103 col3\" >True</td>\n", - " <td id=\"T_0502a_row103_col4\" class=\"data row103 col4\" >True</td>\n", - " <td id=\"T_0502a_row103_col5\" class=\"data row103 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row103_col6\" class=\"data row103 col6\" >{'response_column': {'type': 'str', 'default': 'response'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'focus': {'type': 'str', 'default': 'relevant'}, 'user_input_column': {'type': 'str', 'default': 'user_input'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row103_col7\" class=\"data row103 col7\" >['ragas', 'llm', 'rag_performance']</td>\n", - " <td id=\"T_0502a_row103_col8\" class=\"data row103 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " 
<td id=\"T_0502a_row104_col0\" class=\"data row104 col0\" >validmind.model_validation.ragas.ResponseRelevancy</td>\n", - " <td id=\"T_0502a_row104_col1\" class=\"data row104 col1\" >Response Relevancy</td>\n", - " <td id=\"T_0502a_row104_col2\" class=\"data row104 col2\" >Assesses how pertinent the generated answer is to the given prompt....</td>\n", - " <td id=\"T_0502a_row104_col3\" class=\"data row104 col3\" >True</td>\n", - " <td id=\"T_0502a_row104_col4\" class=\"data row104 col4\" >True</td>\n", - " <td id=\"T_0502a_row104_col5\" class=\"data row104 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row104_col6\" class=\"data row104 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'retrieved_contexts_column': {'type': 'str', 'default': None}, 'response_column': {'type': 'str', 'default': 'response'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row104_col7\" class=\"data row104 col7\" >['ragas', 'llm', 'rag_performance']</td>\n", - " <td id=\"T_0502a_row104_col8\" class=\"data row104 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row105_col0\" class=\"data row105 col0\" >validmind.model_validation.ragas.SemanticSimilarity</td>\n", - " <td id=\"T_0502a_row105_col1\" class=\"data row105 col1\" >Semantic Similarity</td>\n", - " <td id=\"T_0502a_row105_col2\" class=\"data row105 col2\" >Calculates the semantic similarity between generated responses and ground truths...</td>\n", - " <td id=\"T_0502a_row105_col3\" class=\"data row105 col3\" >True</td>\n", - " <td id=\"T_0502a_row105_col4\" class=\"data row105 col4\" >True</td>\n", - " <td id=\"T_0502a_row105_col5\" class=\"data row105 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row105_col6\" class=\"data row105 col6\" >{'response_column': {'type': 'str', 'default': 'response'}, 'reference_column': {'type': 'str', 'default': 
'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row105_col7\" class=\"data row105 col7\" >['ragas', 'llm']</td>\n", - " <td id=\"T_0502a_row105_col8\" class=\"data row105 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row106_col0\" class=\"data row106 col0\" >validmind.model_validation.sklearn.AdjustedMutualInformation</td>\n", - " <td id=\"T_0502a_row106_col1\" class=\"data row106 col1\" >Adjusted Mutual Information</td>\n", - " <td id=\"T_0502a_row106_col2\" class=\"data row106 col2\" >Evaluates clustering model performance by measuring mutual information between true and predicted labels, adjusting...</td>\n", - " <td id=\"T_0502a_row106_col3\" class=\"data row106 col3\" >False</td>\n", - " <td id=\"T_0502a_row106_col4\" class=\"data row106 col4\" >True</td>\n", - " <td id=\"T_0502a_row106_col5\" class=\"data row106 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row106_col6\" class=\"data row106 col6\" >{}</td>\n", - " <td id=\"T_0502a_row106_col7\" class=\"data row106 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_0502a_row106_col8\" class=\"data row106 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row107_col0\" class=\"data row107 col0\" >validmind.model_validation.sklearn.AdjustedRandIndex</td>\n", - " <td id=\"T_0502a_row107_col1\" class=\"data row107 col1\" >Adjusted Rand Index</td>\n", - " <td id=\"T_0502a_row107_col2\" class=\"data row107 col2\" >Measures the similarity between two data clusters using the Adjusted Rand Index (ARI) metric in clustering machine...</td>\n", - " <td id=\"T_0502a_row107_col3\" class=\"data row107 col3\" >False</td>\n", - " <td id=\"T_0502a_row107_col4\" class=\"data row107 col4\" >True</td>\n", - " <td id=\"T_0502a_row107_col5\" class=\"data row107 col5\" >['model', 
'dataset']</td>\n", - " <td id=\"T_0502a_row107_col6\" class=\"data row107 col6\" >{}</td>\n", - " <td id=\"T_0502a_row107_col7\" class=\"data row107 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_0502a_row107_col8\" class=\"data row107 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row108_col0\" class=\"data row108 col0\" >validmind.model_validation.sklearn.CalibrationCurve</td>\n", - " <td id=\"T_0502a_row108_col1\" class=\"data row108 col1\" >Calibration Curve</td>\n", - " <td id=\"T_0502a_row108_col2\" class=\"data row108 col2\" >Evaluates the calibration of probability estimates by comparing predicted probabilities against observed...</td>\n", - " <td id=\"T_0502a_row108_col3\" class=\"data row108 col3\" >True</td>\n", - " <td id=\"T_0502a_row108_col4\" class=\"data row108 col4\" >False</td>\n", - " <td id=\"T_0502a_row108_col5\" class=\"data row108 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row108_col6\" class=\"data row108 col6\" >{'n_bins': {'type': 'int', 'default': 10}}</td>\n", - " <td id=\"T_0502a_row108_col7\" class=\"data row108 col7\" >['sklearn', 'model_performance', 'classification']</td>\n", - " <td id=\"T_0502a_row108_col8\" class=\"data row108 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row109_col0\" class=\"data row109 col0\" >validmind.model_validation.sklearn.ClassifierPerformance</td>\n", - " <td id=\"T_0502a_row109_col1\" class=\"data row109 col1\" >Classifier Performance</td>\n", - " <td id=\"T_0502a_row109_col2\" class=\"data row109 col2\" >Evaluates performance of binary or multiclass classification models using precision, recall, F1-Score, accuracy,...</td>\n", - " <td id=\"T_0502a_row109_col3\" class=\"data row109 col3\" >False</td>\n", - " <td id=\"T_0502a_row109_col4\" class=\"data row109 col4\" >True</td>\n", - " <td id=\"T_0502a_row109_col5\" class=\"data row109 col5\" >['dataset', 'model']</td>\n", - " <td 
id=\"T_0502a_row109_col6\" class=\"data row109 col6\" >{'average': {'type': 'str', 'default': 'macro'}}</td>\n", - " <td id=\"T_0502a_row109_col7\" class=\"data row109 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_0502a_row109_col8\" class=\"data row109 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row110_col0\" class=\"data row110 col0\" >validmind.model_validation.sklearn.ClassifierThresholdOptimization</td>\n", - " <td id=\"T_0502a_row110_col1\" class=\"data row110 col1\" >Classifier Threshold Optimization</td>\n", - " <td id=\"T_0502a_row110_col2\" class=\"data row110 col2\" >Analyzes and visualizes different threshold optimization methods for binary classification models....</td>\n", - " <td id=\"T_0502a_row110_col3\" class=\"data row110 col3\" >False</td>\n", - " <td id=\"T_0502a_row110_col4\" class=\"data row110 col4\" >True</td>\n", - " <td id=\"T_0502a_row110_col5\" class=\"data row110 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row110_col6\" class=\"data row110 col6\" >{'methods': {'type': None, 'default': None}, 'target_recall': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row110_col7\" class=\"data row110 col7\" >['model_validation', 'threshold_optimization', 'classification_metrics']</td>\n", - " <td id=\"T_0502a_row110_col8\" class=\"data row110 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row111_col0\" class=\"data row111 col0\" >validmind.model_validation.sklearn.ClusterCosineSimilarity</td>\n", - " <td id=\"T_0502a_row111_col1\" class=\"data row111 col1\" >Cluster Cosine Similarity</td>\n", - " <td id=\"T_0502a_row111_col2\" class=\"data row111 col2\" >Measures the intra-cluster similarity of a clustering model using cosine similarity....</td>\n", - " <td id=\"T_0502a_row111_col3\" class=\"data row111 col3\" >False</td>\n", - " <td 
id=\"T_0502a_row111_col4\" class=\"data row111 col4\" >True</td>\n", - " <td id=\"T_0502a_row111_col5\" class=\"data row111 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row111_col6\" class=\"data row111 col6\" >{}</td>\n", - " <td id=\"T_0502a_row111_col7\" class=\"data row111 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_0502a_row111_col8\" class=\"data row111 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row112_col0\" class=\"data row112 col0\" >validmind.model_validation.sklearn.ClusterPerformanceMetrics</td>\n", - " <td id=\"T_0502a_row112_col1\" class=\"data row112 col1\" >Cluster Performance Metrics</td>\n", - " <td id=\"T_0502a_row112_col2\" class=\"data row112 col2\" >Evaluates the performance of clustering machine learning models using multiple established metrics....</td>\n", - " <td id=\"T_0502a_row112_col3\" class=\"data row112 col3\" >False</td>\n", - " <td id=\"T_0502a_row112_col4\" class=\"data row112 col4\" >True</td>\n", - " <td id=\"T_0502a_row112_col5\" class=\"data row112 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row112_col6\" class=\"data row112 col6\" >{}</td>\n", - " <td id=\"T_0502a_row112_col7\" class=\"data row112 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_0502a_row112_col8\" class=\"data row112 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row113_col0\" class=\"data row113 col0\" >validmind.model_validation.sklearn.CompletenessScore</td>\n", - " <td id=\"T_0502a_row113_col1\" class=\"data row113 col1\" >Completeness Score</td>\n", - " <td id=\"T_0502a_row113_col2\" class=\"data row113 col2\" >Evaluates a clustering model's capacity to categorize instances from a single class into the same cluster....</td>\n", - " <td id=\"T_0502a_row113_col3\" class=\"data row113 col3\" >False</td>\n", - " <td id=\"T_0502a_row113_col4\" class=\"data row113 col4\" >True</td>\n", - " <td 
id=\"T_0502a_row113_col5\" class=\"data row113 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row113_col6\" class=\"data row113 col6\" >{}</td>\n", - " <td id=\"T_0502a_row113_col7\" class=\"data row113 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_0502a_row113_col8\" class=\"data row113 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row114_col0\" class=\"data row114 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", - " <td id=\"T_0502a_row114_col1\" class=\"data row114 col1\" >Confusion Matrix</td>\n", - " <td id=\"T_0502a_row114_col2\" class=\"data row114 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", - " <td id=\"T_0502a_row114_col3\" class=\"data row114 col3\" >True</td>\n", - " <td id=\"T_0502a_row114_col4\" class=\"data row114 col4\" >False</td>\n", - " <td id=\"T_0502a_row114_col5\" class=\"data row114 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row114_col6\" class=\"data row114 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_0502a_row114_col7\" class=\"data row114 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_0502a_row114_col8\" class=\"data row114 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row115_col0\" class=\"data row115 col0\" >validmind.model_validation.sklearn.FeatureImportance</td>\n", - " <td id=\"T_0502a_row115_col1\" class=\"data row115 col1\" >Feature Importance</td>\n", - " <td id=\"T_0502a_row115_col2\" class=\"data row115 col2\" >Compute feature importance scores for a given model and generate a summary table...</td>\n", - " <td id=\"T_0502a_row115_col3\" class=\"data row115 col3\" >False</td>\n", - " <td id=\"T_0502a_row115_col4\" class=\"data row115 col4\" 
>True</td>\n", - " <td id=\"T_0502a_row115_col5\" class=\"data row115 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row115_col6\" class=\"data row115 col6\" >{'num_features': {'type': 'int', 'default': 3}}</td>\n", - " <td id=\"T_0502a_row115_col7\" class=\"data row115 col7\" >['model_explainability', 'sklearn']</td>\n", - " <td id=\"T_0502a_row115_col8\" class=\"data row115 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row116_col0\" class=\"data row116 col0\" >validmind.model_validation.sklearn.FowlkesMallowsScore</td>\n", - " <td id=\"T_0502a_row116_col1\" class=\"data row116 col1\" >Fowlkes Mallows Score</td>\n", - " <td id=\"T_0502a_row116_col2\" class=\"data row116 col2\" >Evaluates the similarity between predicted and actual cluster assignments in a model using the Fowlkes-Mallows...</td>\n", - " <td id=\"T_0502a_row116_col3\" class=\"data row116 col3\" >False</td>\n", - " <td id=\"T_0502a_row116_col4\" class=\"data row116 col4\" >True</td>\n", - " <td id=\"T_0502a_row116_col5\" class=\"data row116 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row116_col6\" class=\"data row116 col6\" >{}</td>\n", - " <td id=\"T_0502a_row116_col7\" class=\"data row116 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row116_col8\" class=\"data row116 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row117_col0\" class=\"data row117 col0\" >validmind.model_validation.sklearn.HomogeneityScore</td>\n", - " <td id=\"T_0502a_row117_col1\" class=\"data row117 col1\" >Homogeneity Score</td>\n", - " <td id=\"T_0502a_row117_col2\" class=\"data row117 col2\" >Assesses clustering homogeneity by comparing true and predicted labels, scoring from 0 (heterogeneous) to 1...</td>\n", - " <td id=\"T_0502a_row117_col3\" class=\"data row117 col3\" >False</td>\n", - " <td id=\"T_0502a_row117_col4\" class=\"data row117 col4\" >True</td>\n", - " <td 
id=\"T_0502a_row117_col5\" class=\"data row117 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row117_col6\" class=\"data row117 col6\" >{}</td>\n", - " <td id=\"T_0502a_row117_col7\" class=\"data row117 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row117_col8\" class=\"data row117 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row118_col0\" class=\"data row118 col0\" >validmind.model_validation.sklearn.HyperParametersTuning</td>\n", - " <td id=\"T_0502a_row118_col1\" class=\"data row118 col1\" >Hyper Parameters Tuning</td>\n", - " <td id=\"T_0502a_row118_col2\" class=\"data row118 col2\" >Performs exhaustive grid search over specified parameter ranges to find optimal model configurations...</td>\n", - " <td id=\"T_0502a_row118_col3\" class=\"data row118 col3\" >False</td>\n", - " <td id=\"T_0502a_row118_col4\" class=\"data row118 col4\" >True</td>\n", - " <td id=\"T_0502a_row118_col5\" class=\"data row118 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row118_col6\" class=\"data row118 col6\" >{'param_grid': {'type': 'dict', 'default': None}, 'scoring': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}, 'fit_params': {'type': 'dict', 'default': None}}</td>\n", - " <td id=\"T_0502a_row118_col7\" class=\"data row118 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row118_col8\" class=\"data row118 col8\" >['clustering', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row119_col0\" class=\"data row119 col0\" >validmind.model_validation.sklearn.KMeansClustersOptimization</td>\n", - " <td id=\"T_0502a_row119_col1\" class=\"data row119 col1\" >K Means Clusters Optimization</td>\n", - " <td id=\"T_0502a_row119_col2\" class=\"data row119 col2\" >Optimizes the number of clusters in K-means models using Elbow and Silhouette methods....</td>\n", - " <td id=\"T_0502a_row119_col3\" class=\"data row119 col3\" 
>True</td>\n", - " <td id=\"T_0502a_row119_col4\" class=\"data row119 col4\" >False</td>\n", - " <td id=\"T_0502a_row119_col5\" class=\"data row119 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row119_col6\" class=\"data row119 col6\" >{'n_clusters': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row119_col7\" class=\"data row119 col7\" >['sklearn', 'model_performance', 'kmeans']</td>\n", - " <td id=\"T_0502a_row119_col8\" class=\"data row119 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row120_col0\" class=\"data row120 col0\" >validmind.model_validation.sklearn.MinimumAccuracy</td>\n", - " <td id=\"T_0502a_row120_col1\" class=\"data row120 col1\" >Minimum Accuracy</td>\n", - " <td id=\"T_0502a_row120_col2\" class=\"data row120 col2\" >Checks if the model's prediction accuracy meets or surpasses a specified threshold....</td>\n", - " <td id=\"T_0502a_row120_col3\" class=\"data row120 col3\" >False</td>\n", - " <td id=\"T_0502a_row120_col4\" class=\"data row120 col4\" >True</td>\n", - " <td id=\"T_0502a_row120_col5\" class=\"data row120 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row120_col6\" class=\"data row120 col6\" >{'min_threshold': {'type': 'float', 'default': 0.7}}</td>\n", - " <td id=\"T_0502a_row120_col7\" class=\"data row120 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_0502a_row120_col8\" class=\"data row120 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row121_col0\" class=\"data row121 col0\" >validmind.model_validation.sklearn.MinimumF1Score</td>\n", - " <td id=\"T_0502a_row121_col1\" class=\"data row121 col1\" >Minimum F1 Score</td>\n", - " <td id=\"T_0502a_row121_col2\" class=\"data row121 col2\" >Assesses if the model's F1 score on the validation set meets a predefined minimum threshold, ensuring balanced...</td>\n", - " <td 
id=\"T_0502a_row121_col3\" class=\"data row121 col3\" >False</td>\n", - " <td id=\"T_0502a_row121_col4\" class=\"data row121 col4\" >True</td>\n", - " <td id=\"T_0502a_row121_col5\" class=\"data row121 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row121_col6\" class=\"data row121 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_0502a_row121_col7\" class=\"data row121 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_0502a_row121_col8\" class=\"data row121 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row122_col0\" class=\"data row122 col0\" >validmind.model_validation.sklearn.MinimumROCAUCScore</td>\n", - " <td id=\"T_0502a_row122_col1\" class=\"data row122 col1\" >Minimum ROCAUC Score</td>\n", - " <td id=\"T_0502a_row122_col2\" class=\"data row122 col2\" >Validates model by checking if the ROC AUC score meets or surpasses a specified threshold....</td>\n", - " <td id=\"T_0502a_row122_col3\" class=\"data row122 col3\" >False</td>\n", - " <td id=\"T_0502a_row122_col4\" class=\"data row122 col4\" >True</td>\n", - " <td id=\"T_0502a_row122_col5\" class=\"data row122 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row122_col6\" class=\"data row122 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_0502a_row122_col7\" class=\"data row122 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_0502a_row122_col8\" class=\"data row122 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row123_col0\" class=\"data row123 col0\" >validmind.model_validation.sklearn.ModelParameters</td>\n", - " <td id=\"T_0502a_row123_col1\" class=\"data row123 col1\" >Model Parameters</td>\n", - " <td id=\"T_0502a_row123_col2\" class=\"data row123 col2\" 
>Extracts and displays model parameters in a structured format for transparency and reproducibility....</td>\n", - " <td id=\"T_0502a_row123_col3\" class=\"data row123 col3\" >False</td>\n", - " <td id=\"T_0502a_row123_col4\" class=\"data row123 col4\" >True</td>\n", - " <td id=\"T_0502a_row123_col5\" class=\"data row123 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row123_col6\" class=\"data row123 col6\" >{'model_params': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row123_col7\" class=\"data row123 col7\" >['model_training', 'metadata']</td>\n", - " <td id=\"T_0502a_row123_col8\" class=\"data row123 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row124_col0\" class=\"data row124 col0\" >validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", - " <td id=\"T_0502a_row124_col1\" class=\"data row124 col1\" >Models Performance Comparison</td>\n", - " <td id=\"T_0502a_row124_col2\" class=\"data row124 col2\" >Evaluates and compares the performance of multiple Machine Learning models using various metrics like accuracy,...</td>\n", - " <td id=\"T_0502a_row124_col3\" class=\"data row124 col3\" >False</td>\n", - " <td id=\"T_0502a_row124_col4\" class=\"data row124 col4\" >True</td>\n", - " <td id=\"T_0502a_row124_col5\" class=\"data row124 col5\" >['dataset', 'models']</td>\n", - " <td id=\"T_0502a_row124_col6\" class=\"data row124 col6\" >{}</td>\n", - " <td id=\"T_0502a_row124_col7\" class=\"data row124 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'model_comparison']</td>\n", - " <td id=\"T_0502a_row124_col8\" class=\"data row124 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row125_col0\" class=\"data row125 col0\" >validmind.model_validation.sklearn.OverfitDiagnosis</td>\n", - " <td id=\"T_0502a_row125_col1\" class=\"data row125 col1\" >Overfit Diagnosis</td>\n", - " <td 
id=\"T_0502a_row125_col2\" class=\"data row125 col2\" >Assesses potential overfitting in a model's predictions, identifying regions where performance between training and...</td>\n", - " <td id=\"T_0502a_row125_col3\" class=\"data row125 col3\" >True</td>\n", - " <td id=\"T_0502a_row125_col4\" class=\"data row125 col4\" >True</td>\n", - " <td id=\"T_0502a_row125_col5\" class=\"data row125 col5\" >['model', 'datasets']</td>\n", - " <td id=\"T_0502a_row125_col6\" class=\"data row125 col6\" >{'metric': {'type': 'str', 'default': None}, 'cut_off_threshold': {'type': 'float', 'default': 0.04}}</td>\n", - " <td id=\"T_0502a_row125_col7\" class=\"data row125 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'linear_regression', 'model_diagnosis']</td>\n", - " <td id=\"T_0502a_row125_col8\" class=\"data row125 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row126_col0\" class=\"data row126 col0\" >validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", - " <td id=\"T_0502a_row126_col1\" class=\"data row126 col1\" >Permutation Feature Importance</td>\n", - " <td id=\"T_0502a_row126_col2\" class=\"data row126 col2\" >Assesses the significance of each feature in a model by evaluating the impact on model performance when feature...</td>\n", - " <td id=\"T_0502a_row126_col3\" class=\"data row126 col3\" >True</td>\n", - " <td id=\"T_0502a_row126_col4\" class=\"data row126 col4\" >False</td>\n", - " <td id=\"T_0502a_row126_col5\" class=\"data row126 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row126_col6\" class=\"data row126 col6\" >{'fontsize': {'type': None, 'default': None}, 'figure_height': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row126_col7\" class=\"data row126 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", - " <td id=\"T_0502a_row126_col8\" class=\"data row126 col8\" 
>['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row127_col0\" class=\"data row127 col0\" >validmind.model_validation.sklearn.PopulationStabilityIndex</td>\n", - " <td id=\"T_0502a_row127_col1\" class=\"data row127 col1\" >Population Stability Index</td>\n", - " <td id=\"T_0502a_row127_col2\" class=\"data row127 col2\" >Assesses the Population Stability Index (PSI) to quantify the stability of an ML model's predictions across...</td>\n", - " <td id=\"T_0502a_row127_col3\" class=\"data row127 col3\" >True</td>\n", - " <td id=\"T_0502a_row127_col4\" class=\"data row127 col4\" >True</td>\n", - " <td id=\"T_0502a_row127_col5\" class=\"data row127 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row127_col6\" class=\"data row127 col6\" >{'num_bins': {'type': 'int', 'default': 10}, 'mode': {'type': 'str', 'default': 'fixed'}}</td>\n", - " <td id=\"T_0502a_row127_col7\" class=\"data row127 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_0502a_row127_col8\" class=\"data row127 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row128_col0\" class=\"data row128 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", - " <td id=\"T_0502a_row128_col1\" class=\"data row128 col1\" >Precision Recall Curve</td>\n", - " <td id=\"T_0502a_row128_col2\" class=\"data row128 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", - " <td id=\"T_0502a_row128_col3\" class=\"data row128 col3\" >True</td>\n", - " <td id=\"T_0502a_row128_col4\" class=\"data row128 col4\" >False</td>\n", - " <td id=\"T_0502a_row128_col5\" class=\"data row128 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row128_col6\" class=\"data row128 col6\" >{}</td>\n", - " <td id=\"T_0502a_row128_col7\" class=\"data row128 
col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_0502a_row128_col8\" class=\"data row128 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row129_col0\" class=\"data row129 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", - " <td id=\"T_0502a_row129_col1\" class=\"data row129 col1\" >ROC Curve</td>\n", - " <td id=\"T_0502a_row129_col2\" class=\"data row129 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", - " <td id=\"T_0502a_row129_col3\" class=\"data row129 col3\" >True</td>\n", - " <td id=\"T_0502a_row129_col4\" class=\"data row129 col4\" >False</td>\n", - " <td id=\"T_0502a_row129_col5\" class=\"data row129 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row129_col6\" class=\"data row129 col6\" >{}</td>\n", - " <td id=\"T_0502a_row129_col7\" class=\"data row129 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_0502a_row129_col8\" class=\"data row129 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row130_col0\" class=\"data row130 col0\" >validmind.model_validation.sklearn.RegressionErrors</td>\n", - " <td id=\"T_0502a_row130_col1\" class=\"data row130 col1\" >Regression Errors</td>\n", - " <td id=\"T_0502a_row130_col2\" class=\"data row130 col2\" >Assesses the performance and error distribution of a regression model using various error metrics....</td>\n", - " <td id=\"T_0502a_row130_col3\" class=\"data row130 col3\" >False</td>\n", - " <td id=\"T_0502a_row130_col4\" class=\"data row130 col4\" >True</td>\n", - " <td id=\"T_0502a_row130_col5\" class=\"data row130 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row130_col6\" class=\"data row130 col6\" >{}</td>\n", - " <td 
id=\"T_0502a_row130_col7\" class=\"data row130 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row130_col8\" class=\"data row130 col8\" >['regression', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row131_col0\" class=\"data row131 col0\" >validmind.model_validation.sklearn.RegressionErrorsComparison</td>\n", - " <td id=\"T_0502a_row131_col1\" class=\"data row131 col1\" >Regression Errors Comparison</td>\n", - " <td id=\"T_0502a_row131_col2\" class=\"data row131 col2\" >Assesses multiple regression error metrics to compare model performance across different datasets, emphasizing...</td>\n", - " <td id=\"T_0502a_row131_col3\" class=\"data row131 col3\" >False</td>\n", - " <td id=\"T_0502a_row131_col4\" class=\"data row131 col4\" >True</td>\n", - " <td id=\"T_0502a_row131_col5\" class=\"data row131 col5\" >['datasets', 'models']</td>\n", - " <td id=\"T_0502a_row131_col6\" class=\"data row131 col6\" >{}</td>\n", - " <td id=\"T_0502a_row131_col7\" class=\"data row131 col7\" >['model_performance', 'sklearn']</td>\n", - " <td id=\"T_0502a_row131_col8\" class=\"data row131 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row132_col0\" class=\"data row132 col0\" >validmind.model_validation.sklearn.RegressionPerformance</td>\n", - " <td id=\"T_0502a_row132_col1\" class=\"data row132 col1\" >Regression Performance</td>\n", - " <td id=\"T_0502a_row132_col2\" class=\"data row132 col2\" >Evaluates the performance of a regression model using five different metrics: MAE, MSE, RMSE, MAPE, and MBD....</td>\n", - " <td id=\"T_0502a_row132_col3\" class=\"data row132 col3\" >False</td>\n", - " <td id=\"T_0502a_row132_col4\" class=\"data row132 col4\" >True</td>\n", - " <td id=\"T_0502a_row132_col5\" class=\"data row132 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row132_col6\" class=\"data row132 col6\" >{}</td>\n", - " <td id=\"T_0502a_row132_col7\" class=\"data 
row132 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row132_col8\" class=\"data row132 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row133_col0\" class=\"data row133 col0\" >validmind.model_validation.sklearn.RegressionR2Square</td>\n", - " <td id=\"T_0502a_row133_col1\" class=\"data row133 col1\" >Regression R2 Square</td>\n", - " <td id=\"T_0502a_row133_col2\" class=\"data row133 col2\" >Assesses the overall goodness-of-fit of a regression model by evaluating R-squared (R2) and Adjusted R-squared (Adj...</td>\n", - " <td id=\"T_0502a_row133_col3\" class=\"data row133 col3\" >False</td>\n", - " <td id=\"T_0502a_row133_col4\" class=\"data row133 col4\" >True</td>\n", - " <td id=\"T_0502a_row133_col5\" class=\"data row133 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row133_col6\" class=\"data row133 col6\" >{}</td>\n", - " <td id=\"T_0502a_row133_col7\" class=\"data row133 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row133_col8\" class=\"data row133 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row134_col0\" class=\"data row134 col0\" >validmind.model_validation.sklearn.RegressionR2SquareComparison</td>\n", - " <td id=\"T_0502a_row134_col1\" class=\"data row134 col1\" >Regression R2 Square Comparison</td>\n", - " <td id=\"T_0502a_row134_col2\" class=\"data row134 col2\" >Compares R-Squared and Adjusted R-Squared values for different regression models across multiple datasets to assess...</td>\n", - " <td id=\"T_0502a_row134_col3\" class=\"data row134 col3\" >False</td>\n", - " <td id=\"T_0502a_row134_col4\" class=\"data row134 col4\" >True</td>\n", - " <td id=\"T_0502a_row134_col5\" class=\"data row134 col5\" >['datasets', 'models']</td>\n", - " <td id=\"T_0502a_row134_col6\" class=\"data row134 col6\" >{}</td>\n", - " <td id=\"T_0502a_row134_col7\" class=\"data row134 col7\" >['model_performance', 'sklearn']</td>\n", - " <td 
id=\"T_0502a_row134_col8\" class=\"data row134 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row135_col0\" class=\"data row135 col0\" >validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", - " <td id=\"T_0502a_row135_col1\" class=\"data row135 col1\" >Robustness Diagnosis</td>\n", - " <td id=\"T_0502a_row135_col2\" class=\"data row135 col2\" >Assesses the robustness of a machine learning model by evaluating performance decay under noisy conditions....</td>\n", - " <td id=\"T_0502a_row135_col3\" class=\"data row135 col3\" >True</td>\n", - " <td id=\"T_0502a_row135_col4\" class=\"data row135 col4\" >True</td>\n", - " <td id=\"T_0502a_row135_col5\" class=\"data row135 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row135_col6\" class=\"data row135 col6\" >{'metric': {'type': 'str', 'default': None}, 'scaling_factor_std_dev_list': {'type': None, 'default': [0.1, 0.2, 0.3, 0.4, 0.5]}, 'performance_decay_threshold': {'type': 'float', 'default': 0.05}}</td>\n", - " <td id=\"T_0502a_row135_col7\" class=\"data row135 col7\" >['sklearn', 'model_diagnosis', 'visualization']</td>\n", - " <td id=\"T_0502a_row135_col8\" class=\"data row135 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row136_col0\" class=\"data row136 col0\" >validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", - " <td id=\"T_0502a_row136_col1\" class=\"data row136 col1\" >SHAP Global Importance</td>\n", - " <td id=\"T_0502a_row136_col2\" class=\"data row136 col2\" >Evaluates and visualizes global feature importance using SHAP values for model explanation and risk identification....</td>\n", - " <td id=\"T_0502a_row136_col3\" class=\"data row136 col3\" >False</td>\n", - " <td id=\"T_0502a_row136_col4\" class=\"data row136 col4\" >True</td>\n", - " <td id=\"T_0502a_row136_col5\" class=\"data row136 col5\" >['model', 'dataset']</td>\n", - " <td 
id=\"T_0502a_row136_col6\" class=\"data row136 col6\" >{'kernel_explainer_samples': {'type': 'int', 'default': 10}, 'tree_or_linear_explainer_samples': {'type': 'int', 'default': 200}, 'class_of_interest': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row136_col7\" class=\"data row136 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", - " <td id=\"T_0502a_row136_col8\" class=\"data row136 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row137_col0\" class=\"data row137 col0\" >validmind.model_validation.sklearn.ScoreProbabilityAlignment</td>\n", - " <td id=\"T_0502a_row137_col1\" class=\"data row137 col1\" >Score Probability Alignment</td>\n", - " <td id=\"T_0502a_row137_col2\" class=\"data row137 col2\" >Analyzes the alignment between credit scores and predicted probabilities....</td>\n", - " <td id=\"T_0502a_row137_col3\" class=\"data row137 col3\" >True</td>\n", - " <td id=\"T_0502a_row137_col4\" class=\"data row137 col4\" >True</td>\n", - " <td id=\"T_0502a_row137_col5\" class=\"data row137 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row137_col6\" class=\"data row137 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'n_bins': {'type': 'int', 'default': 10}}</td>\n", - " <td id=\"T_0502a_row137_col7\" class=\"data row137 col7\" >['visualization', 'credit_risk', 'calibration']</td>\n", - " <td id=\"T_0502a_row137_col8\" class=\"data row137 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row138_col0\" class=\"data row138 col0\" >validmind.model_validation.sklearn.SilhouettePlot</td>\n", - " <td id=\"T_0502a_row138_col1\" class=\"data row138 col1\" >Silhouette Plot</td>\n", - " <td id=\"T_0502a_row138_col2\" class=\"data row138 col2\" >Calculates and visualizes Silhouette Score, assessing the degree of data point suitability to its cluster in ML...</td>\n", - 
" <td id=\"T_0502a_row138_col3\" class=\"data row138 col3\" >True</td>\n", - " <td id=\"T_0502a_row138_col4\" class=\"data row138 col4\" >True</td>\n", - " <td id=\"T_0502a_row138_col5\" class=\"data row138 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row138_col6\" class=\"data row138 col6\" >{}</td>\n", - " <td id=\"T_0502a_row138_col7\" class=\"data row138 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row138_col8\" class=\"data row138 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row139_col0\" class=\"data row139 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", - " <td id=\"T_0502a_row139_col1\" class=\"data row139 col1\" >Training Test Degradation</td>\n", - " <td id=\"T_0502a_row139_col2\" class=\"data row139 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", - " <td id=\"T_0502a_row139_col3\" class=\"data row139 col3\" >False</td>\n", - " <td id=\"T_0502a_row139_col4\" class=\"data row139 col4\" >True</td>\n", - " <td id=\"T_0502a_row139_col5\" class=\"data row139 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row139_col6\" class=\"data row139 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", - " <td id=\"T_0502a_row139_col7\" class=\"data row139 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_0502a_row139_col8\" class=\"data row139 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row140_col0\" class=\"data row140 col0\" >validmind.model_validation.sklearn.VMeasure</td>\n", - " <td id=\"T_0502a_row140_col1\" class=\"data row140 col1\" >V Measure</td>\n", - " <td id=\"T_0502a_row140_col2\" class=\"data row140 col2\" >Evaluates homogeneity and completeness of a clustering model using the V Measure 
Score....</td>\n", - " <td id=\"T_0502a_row140_col3\" class=\"data row140 col3\" >False</td>\n", - " <td id=\"T_0502a_row140_col4\" class=\"data row140 col4\" >True</td>\n", - " <td id=\"T_0502a_row140_col5\" class=\"data row140 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row140_col6\" class=\"data row140 col6\" >{}</td>\n", - " <td id=\"T_0502a_row140_col7\" class=\"data row140 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_0502a_row140_col8\" class=\"data row140 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row141_col0\" class=\"data row141 col0\" >validmind.model_validation.sklearn.WeakspotsDiagnosis</td>\n", - " <td id=\"T_0502a_row141_col1\" class=\"data row141 col1\" >Weakspots Diagnosis</td>\n", - " <td id=\"T_0502a_row141_col2\" class=\"data row141 col2\" >Identifies and visualizes weak spots in a machine learning model's performance across various sections of the...</td>\n", - " <td id=\"T_0502a_row141_col3\" class=\"data row141 col3\" >True</td>\n", - " <td id=\"T_0502a_row141_col4\" class=\"data row141 col4\" >True</td>\n", - " <td id=\"T_0502a_row141_col5\" class=\"data row141 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row141_col6\" class=\"data row141 col6\" >{'features_columns': {'type': None, 'default': None}, 'metrics': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row141_col7\" class=\"data row141 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_diagnosis', 'visualization']</td>\n", - " <td id=\"T_0502a_row141_col8\" class=\"data row141 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row142_col0\" class=\"data row142 col0\" >validmind.model_validation.statsmodels.AutoARIMA</td>\n", - " <td id=\"T_0502a_row142_col1\" class=\"data row142 col1\" >Auto ARIMA</td>\n", - " <td id=\"T_0502a_row142_col2\" class=\"data 
row142 col2\" >Evaluates ARIMA models for time-series forecasting, ranking them using Bayesian and Akaike Information Criteria....</td>\n", - " <td id=\"T_0502a_row142_col3\" class=\"data row142 col3\" >False</td>\n", - " <td id=\"T_0502a_row142_col4\" class=\"data row142 col4\" >True</td>\n", - " <td id=\"T_0502a_row142_col5\" class=\"data row142 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row142_col6\" class=\"data row142 col6\" >{}</td>\n", - " <td id=\"T_0502a_row142_col7\" class=\"data row142 col7\" >['time_series_data', 'forecasting', 'model_selection', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row142_col8\" class=\"data row142 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row143_col0\" class=\"data row143 col0\" >validmind.model_validation.statsmodels.CumulativePredictionProbabilities</td>\n", - " <td id=\"T_0502a_row143_col1\" class=\"data row143 col1\" >Cumulative Prediction Probabilities</td>\n", - " <td id=\"T_0502a_row143_col2\" class=\"data row143 col2\" >Visualizes cumulative probabilities of positive and negative classes for both training and testing in classification models....</td>\n", - " <td id=\"T_0502a_row143_col3\" class=\"data row143 col3\" >True</td>\n", - " <td id=\"T_0502a_row143_col4\" class=\"data row143 col4\" >False</td>\n", - " <td id=\"T_0502a_row143_col5\" class=\"data row143 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row143_col6\" class=\"data row143 col6\" >{'title': {'type': 'str', 'default': 'Cumulative Probabilities'}}</td>\n", - " <td id=\"T_0502a_row143_col7\" class=\"data row143 col7\" >['visualization', 'credit_risk']</td>\n", - " <td id=\"T_0502a_row143_col8\" class=\"data row143 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row144_col0\" class=\"data row144 col0\" >validmind.model_validation.statsmodels.DurbinWatsonTest</td>\n", - " <td id=\"T_0502a_row144_col1\" class=\"data row144 col1\" >Durbin Watson Test</td>\n", 
- " <td id=\"T_0502a_row144_col2\" class=\"data row144 col2\" >Assesses autocorrelation in time series data features using the Durbin-Watson statistic....</td>\n", - " <td id=\"T_0502a_row144_col3\" class=\"data row144 col3\" >False</td>\n", - " <td id=\"T_0502a_row144_col4\" class=\"data row144 col4\" >True</td>\n", - " <td id=\"T_0502a_row144_col5\" class=\"data row144 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row144_col6\" class=\"data row144 col6\" >{'threshold': {'type': None, 'default': [1.5, 2.5]}}</td>\n", - " <td id=\"T_0502a_row144_col7\" class=\"data row144 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row144_col8\" class=\"data row144 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row145_col0\" class=\"data row145 col0\" >validmind.model_validation.statsmodels.GINITable</td>\n", - " <td id=\"T_0502a_row145_col1\" class=\"data row145 col1\" >GINI Table</td>\n", - " <td id=\"T_0502a_row145_col2\" class=\"data row145 col2\" >Evaluates classification model performance using AUC, GINI, and KS metrics for training and test datasets....</td>\n", - " <td id=\"T_0502a_row145_col3\" class=\"data row145 col3\" >False</td>\n", - " <td id=\"T_0502a_row145_col4\" class=\"data row145 col4\" >True</td>\n", - " <td id=\"T_0502a_row145_col5\" class=\"data row145 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row145_col6\" class=\"data row145 col6\" >{}</td>\n", - " <td id=\"T_0502a_row145_col7\" class=\"data row145 col7\" >['model_performance']</td>\n", - " <td id=\"T_0502a_row145_col8\" class=\"data row145 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row146_col0\" class=\"data row146 col0\" >validmind.model_validation.statsmodels.KolmogorovSmirnov</td>\n", - " <td id=\"T_0502a_row146_col1\" class=\"data row146 col1\" >Kolmogorov Smirnov</td>\n", - " <td id=\"T_0502a_row146_col2\" class=\"data row146 col2\" 
>Assesses whether each feature in the dataset aligns with a normal distribution using the Kolmogorov-Smirnov test....</td>\n", - " <td id=\"T_0502a_row146_col3\" class=\"data row146 col3\" >False</td>\n", - " <td id=\"T_0502a_row146_col4\" class=\"data row146 col4\" >True</td>\n", - " <td id=\"T_0502a_row146_col5\" class=\"data row146 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row146_col6\" class=\"data row146 col6\" >{'dist': {'type': 'str', 'default': 'norm'}}</td>\n", - " <td id=\"T_0502a_row146_col7\" class=\"data row146 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row146_col8\" class=\"data row146 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row147_col0\" class=\"data row147 col0\" >validmind.model_validation.statsmodels.Lilliefors</td>\n", - " <td id=\"T_0502a_row147_col1\" class=\"data row147 col1\" >Lilliefors</td>\n", - " <td id=\"T_0502a_row147_col2\" class=\"data row147 col2\" >Assesses the normality of feature distributions in an ML model's training dataset using the Lilliefors test....</td>\n", - " <td id=\"T_0502a_row147_col3\" class=\"data row147 col3\" >False</td>\n", - " <td id=\"T_0502a_row147_col4\" class=\"data row147 col4\" >True</td>\n", - " <td id=\"T_0502a_row147_col5\" class=\"data row147 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row147_col6\" class=\"data row147 col6\" >{}</td>\n", - " <td id=\"T_0502a_row147_col7\" class=\"data row147 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_0502a_row147_col8\" class=\"data row147 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row148_col0\" class=\"data row148 col0\" >validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram</td>\n", - " <td id=\"T_0502a_row148_col1\" class=\"data row148 col1\" >Prediction Probabilities Histogram</td>\n", 
- " <td id=\"T_0502a_row148_col2\" class=\"data row148 col2\" >Assesses the predictive probability distribution for binary classification to evaluate model performance and...</td>\n", - " <td id=\"T_0502a_row148_col3\" class=\"data row148 col3\" >True</td>\n", - " <td id=\"T_0502a_row148_col4\" class=\"data row148 col4\" >False</td>\n", - " <td id=\"T_0502a_row148_col5\" class=\"data row148 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row148_col6\" class=\"data row148 col6\" >{'title': {'type': 'str', 'default': 'Histogram of Predictive Probabilities'}}</td>\n", - " <td id=\"T_0502a_row148_col7\" class=\"data row148 col7\" >['visualization', 'credit_risk']</td>\n", - " <td id=\"T_0502a_row148_col8\" class=\"data row148 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row149_col0\" class=\"data row149 col0\" >validmind.model_validation.statsmodels.RegressionCoeffs</td>\n", - " <td id=\"T_0502a_row149_col1\" class=\"data row149 col1\" >Regression Coeffs</td>\n", - " <td id=\"T_0502a_row149_col2\" class=\"data row149 col2\" >Assesses the significance and uncertainty of predictor variables in a regression model through visualization of...</td>\n", - " <td id=\"T_0502a_row149_col3\" class=\"data row149 col3\" >True</td>\n", - " <td id=\"T_0502a_row149_col4\" class=\"data row149 col4\" >True</td>\n", - " <td id=\"T_0502a_row149_col5\" class=\"data row149 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row149_col6\" class=\"data row149 col6\" >{}</td>\n", - " <td id=\"T_0502a_row149_col7\" class=\"data row149 col7\" >['tabular_data', 'visualization', 'model_training']</td>\n", - " <td id=\"T_0502a_row149_col8\" class=\"data row149 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row150_col0\" class=\"data row150 col0\" >validmind.model_validation.statsmodels.RegressionFeatureSignificance</td>\n", - " <td id=\"T_0502a_row150_col1\" class=\"data row150 col1\" >Regression Feature 
Significance</td>\n", - " <td id=\"T_0502a_row150_col2\" class=\"data row150 col2\" >Assesses and visualizes the statistical significance of features in a regression model....</td>\n", - " <td id=\"T_0502a_row150_col3\" class=\"data row150 col3\" >True</td>\n", - " <td id=\"T_0502a_row150_col4\" class=\"data row150 col4\" >False</td>\n", - " <td id=\"T_0502a_row150_col5\" class=\"data row150 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row150_col6\" class=\"data row150 col6\" >{'fontsize': {'type': 'int', 'default': 10}, 'p_threshold': {'type': 'float', 'default': 0.05}}</td>\n", - " <td id=\"T_0502a_row150_col7\" class=\"data row150 col7\" >['statistical_test', 'model_interpretation', 'visualization', 'feature_importance']</td>\n", - " <td id=\"T_0502a_row150_col8\" class=\"data row150 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row151_col0\" class=\"data row151 col0\" >validmind.model_validation.statsmodels.RegressionModelForecastPlot</td>\n", - " <td id=\"T_0502a_row151_col1\" class=\"data row151 col1\" >Regression Model Forecast Plot</td>\n", - " <td id=\"T_0502a_row151_col2\" class=\"data row151 col2\" >Generates plots to visually compare the forecasted outcomes of a regression model against actual observed values over...</td>\n", - " <td id=\"T_0502a_row151_col3\" class=\"data row151 col3\" >True</td>\n", - " <td id=\"T_0502a_row151_col4\" class=\"data row151 col4\" >False</td>\n", - " <td id=\"T_0502a_row151_col5\" class=\"data row151 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row151_col6\" class=\"data row151 col6\" >{'start_date': {'type': None, 'default': None}, 'end_date': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row151_col7\" class=\"data row151 col7\" >['time_series_data', 'forecasting', 'visualization']</td>\n", - " <td id=\"T_0502a_row151_col8\" class=\"data row151 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row152_col0\" class=\"data 
row152 col0\" >validmind.model_validation.statsmodels.RegressionModelForecastPlotLevels</td>\n", - " <td id=\"T_0502a_row152_col1\" class=\"data row152 col1\" >Regression Model Forecast Plot Levels</td>\n", - " <td id=\"T_0502a_row152_col2\" class=\"data row152 col2\" >Assesses the alignment between forecasted and observed values in regression models through visual plots...</td>\n", - " <td id=\"T_0502a_row152_col3\" class=\"data row152 col3\" >True</td>\n", - " <td id=\"T_0502a_row152_col4\" class=\"data row152 col4\" >False</td>\n", - " <td id=\"T_0502a_row152_col5\" class=\"data row152 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row152_col6\" class=\"data row152 col6\" >{}</td>\n", - " <td id=\"T_0502a_row152_col7\" class=\"data row152 col7\" >['time_series_data', 'forecasting', 'visualization']</td>\n", - " <td id=\"T_0502a_row152_col8\" class=\"data row152 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row153_col0\" class=\"data row153 col0\" >validmind.model_validation.statsmodels.RegressionModelSensitivityPlot</td>\n", - " <td id=\"T_0502a_row153_col1\" class=\"data row153 col1\" >Regression Model Sensitivity Plot</td>\n", - " <td id=\"T_0502a_row153_col2\" class=\"data row153 col2\" >Assesses the sensitivity of a regression model to changes in independent variables by applying shocks and...</td>\n", - " <td id=\"T_0502a_row153_col3\" class=\"data row153 col3\" >True</td>\n", - " <td id=\"T_0502a_row153_col4\" class=\"data row153 col4\" >False</td>\n", - " <td id=\"T_0502a_row153_col5\" class=\"data row153 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row153_col6\" class=\"data row153 col6\" >{'shocks': {'type': None, 'default': [0.1]}, 'transformation': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_0502a_row153_col7\" class=\"data row153 col7\" >['senstivity_analysis', 'visualization']</td>\n", - " <td id=\"T_0502a_row153_col8\" class=\"data row153 col8\" >['regression']</td>\n", - " 
</tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row154_col0\" class=\"data row154 col0\" >validmind.model_validation.statsmodels.RegressionModelSummary</td>\n", - " <td id=\"T_0502a_row154_col1\" class=\"data row154 col1\" >Regression Model Summary</td>\n", - " <td id=\"T_0502a_row154_col2\" class=\"data row154 col2\" >Evaluates regression model performance using metrics including R-Squared, Adjusted R-Squared, MSE, and RMSE....</td>\n", - " <td id=\"T_0502a_row154_col3\" class=\"data row154 col3\" >False</td>\n", - " <td id=\"T_0502a_row154_col4\" class=\"data row154 col4\" >True</td>\n", - " <td id=\"T_0502a_row154_col5\" class=\"data row154 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row154_col6\" class=\"data row154 col6\" >{}</td>\n", - " <td id=\"T_0502a_row154_col7\" class=\"data row154 col7\" >['model_performance', 'regression']</td>\n", - " <td id=\"T_0502a_row154_col8\" class=\"data row154 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row155_col0\" class=\"data row155 col0\" >validmind.model_validation.statsmodels.RegressionPermutationFeatureImportance</td>\n", - " <td id=\"T_0502a_row155_col1\" class=\"data row155 col1\" >Regression Permutation Feature Importance</td>\n", - " <td id=\"T_0502a_row155_col2\" class=\"data row155 col2\" >Assesses the significance of each feature in a model by evaluating the impact on model performance when feature...</td>\n", - " <td id=\"T_0502a_row155_col3\" class=\"data row155 col3\" >True</td>\n", - " <td id=\"T_0502a_row155_col4\" class=\"data row155 col4\" >False</td>\n", - " <td id=\"T_0502a_row155_col5\" class=\"data row155 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row155_col6\" class=\"data row155 col6\" >{'fontsize': {'type': 'int', 'default': 12}, 'figure_height': {'type': 'int', 'default': 500}}</td>\n", - " <td id=\"T_0502a_row155_col7\" class=\"data row155 col7\" >['statsmodels', 'feature_importance', 'visualization']</td>\n", - " <td 
id=\"T_0502a_row155_col8\" class=\"data row155 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row156_col0\" class=\"data row156 col0\" >validmind.model_validation.statsmodels.ScorecardHistogram</td>\n", - " <td id=\"T_0502a_row156_col1\" class=\"data row156 col1\" >Scorecard Histogram</td>\n", - " <td id=\"T_0502a_row156_col2\" class=\"data row156 col2\" >The Scorecard Histogram test evaluates the distribution of credit scores between default and non-default instances,...</td>\n", - " <td id=\"T_0502a_row156_col3\" class=\"data row156 col3\" >True</td>\n", - " <td id=\"T_0502a_row156_col4\" class=\"data row156 col4\" >False</td>\n", - " <td id=\"T_0502a_row156_col5\" class=\"data row156 col5\" >['dataset']</td>\n", - " <td id=\"T_0502a_row156_col6\" class=\"data row156 col6\" >{'title': {'type': 'str', 'default': 'Histogram of Scores'}, 'score_column': {'type': 'str', 'default': 'score'}}</td>\n", - " <td id=\"T_0502a_row156_col7\" class=\"data row156 col7\" >['visualization', 'credit_risk', 'logistic_regression']</td>\n", - " <td id=\"T_0502a_row156_col8\" class=\"data row156 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row157_col0\" class=\"data row157 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", - " <td id=\"T_0502a_row157_col1\" class=\"data row157 col1\" >Calibration Curve Drift</td>\n", - " <td id=\"T_0502a_row157_col2\" class=\"data row157 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", - " <td id=\"T_0502a_row157_col3\" class=\"data row157 col3\" >True</td>\n", - " <td id=\"T_0502a_row157_col4\" class=\"data row157 col4\" >True</td>\n", - " <td id=\"T_0502a_row157_col5\" class=\"data row157 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row157_col6\" class=\"data row157 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", - " <td 
id=\"T_0502a_row157_col7\" class=\"data row157 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_0502a_row157_col8\" class=\"data row157 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row158_col0\" class=\"data row158 col0\" >validmind.ongoing_monitoring.ClassDiscriminationDrift</td>\n", - " <td id=\"T_0502a_row158_col1\" class=\"data row158 col1\" >Class Discrimination Drift</td>\n", - " <td id=\"T_0502a_row158_col2\" class=\"data row158 col2\" >Compares classification discrimination metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_0502a_row158_col3\" class=\"data row158 col3\" >False</td>\n", - " <td id=\"T_0502a_row158_col4\" class=\"data row158 col4\" >True</td>\n", - " <td id=\"T_0502a_row158_col5\" class=\"data row158 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row158_col6\" class=\"data row158 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_0502a_row158_col7\" class=\"data row158 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_0502a_row158_col8\" class=\"data row158 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row159_col0\" class=\"data row159 col0\" >validmind.ongoing_monitoring.ClassImbalanceDrift</td>\n", - " <td id=\"T_0502a_row159_col1\" class=\"data row159 col1\" >Class Imbalance Drift</td>\n", - " <td id=\"T_0502a_row159_col2\" class=\"data row159 col2\" >Evaluates drift in class distribution between reference and monitoring datasets....</td>\n", - " <td id=\"T_0502a_row159_col3\" class=\"data row159 col3\" >True</td>\n", - " <td id=\"T_0502a_row159_col4\" class=\"data row159 col4\" >True</td>\n", - " <td id=\"T_0502a_row159_col5\" class=\"data row159 col5\" >['datasets']</td>\n", - " <td id=\"T_0502a_row159_col6\" 
class=\"data row159 col6\" >{'drift_pct_threshold': {'type': 'float', 'default': 5.0}, 'title': {'type': 'str', 'default': 'Class Distribution Drift'}}</td>\n", - " <td id=\"T_0502a_row159_col7\" class=\"data row159 col7\" >['tabular_data', 'binary_classification', 'multiclass_classification']</td>\n", - " <td id=\"T_0502a_row159_col8\" class=\"data row159 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row160_col0\" class=\"data row160 col0\" >validmind.ongoing_monitoring.ClassificationAccuracyDrift</td>\n", - " <td id=\"T_0502a_row160_col1\" class=\"data row160 col1\" >Classification Accuracy Drift</td>\n", - " <td id=\"T_0502a_row160_col2\" class=\"data row160 col2\" >Compares classification accuracy metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_0502a_row160_col3\" class=\"data row160 col3\" >False</td>\n", - " <td id=\"T_0502a_row160_col4\" class=\"data row160 col4\" >True</td>\n", - " <td id=\"T_0502a_row160_col5\" class=\"data row160 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row160_col6\" class=\"data row160 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_0502a_row160_col7\" class=\"data row160 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_0502a_row160_col8\" class=\"data row160 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row161_col0\" class=\"data row161 col0\" >validmind.ongoing_monitoring.ConfusionMatrixDrift</td>\n", - " <td id=\"T_0502a_row161_col1\" class=\"data row161 col1\" >Confusion Matrix Drift</td>\n", - " <td id=\"T_0502a_row161_col2\" class=\"data row161 col2\" >Compares confusion matrix metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_0502a_row161_col3\" class=\"data row161 col3\" >False</td>\n", - " <td id=\"T_0502a_row161_col4\" class=\"data row161 
col4\" >True</td>\n", - " <td id=\"T_0502a_row161_col5\" class=\"data row161 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row161_col6\" class=\"data row161 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_0502a_row161_col7\" class=\"data row161 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_0502a_row161_col8\" class=\"data row161 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row162_col0\" class=\"data row162 col0\" >validmind.ongoing_monitoring.CumulativePredictionProbabilitiesDrift</td>\n", - " <td id=\"T_0502a_row162_col1\" class=\"data row162 col1\" >Cumulative Prediction Probabilities Drift</td>\n", - " <td id=\"T_0502a_row162_col2\" class=\"data row162 col2\" >Compares cumulative prediction probability distributions between reference and monitoring datasets....</td>\n", - " <td id=\"T_0502a_row162_col3\" class=\"data row162 col3\" >True</td>\n", - " <td id=\"T_0502a_row162_col4\" class=\"data row162 col4\" >False</td>\n", - " <td id=\"T_0502a_row162_col5\" class=\"data row162 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row162_col6\" class=\"data row162 col6\" >{}</td>\n", - " <td id=\"T_0502a_row162_col7\" class=\"data row162 col7\" >['visualization', 'credit_risk']</td>\n", - " <td id=\"T_0502a_row162_col8\" class=\"data row162 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row163_col0\" class=\"data row163 col0\" >validmind.ongoing_monitoring.FeatureDrift</td>\n", - " <td id=\"T_0502a_row163_col1\" class=\"data row163 col1\" >Feature Drift</td>\n", - " <td id=\"T_0502a_row163_col2\" class=\"data row163 col2\" >Evaluates changes in feature distribution over time to identify potential model drift....</td>\n", - " <td id=\"T_0502a_row163_col3\" class=\"data row163 col3\" >True</td>\n", - " <td id=\"T_0502a_row163_col4\" 
class=\"data row163 col4\" >True</td>\n", - " <td id=\"T_0502a_row163_col5\" class=\"data row163 col5\" >['datasets']</td>\n", - " <td id=\"T_0502a_row163_col6\" class=\"data row163 col6\" >{'bins': {'type': '_empty', 'default': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]}, 'feature_columns': {'type': '_empty', 'default': None}, 'psi_threshold': {'type': '_empty', 'default': 0.2}}</td>\n", - " <td id=\"T_0502a_row163_col7\" class=\"data row163 col7\" >['visualization']</td>\n", - " <td id=\"T_0502a_row163_col8\" class=\"data row163 col8\" >['monitoring']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row164_col0\" class=\"data row164 col0\" >validmind.ongoing_monitoring.PredictionAcrossEachFeature</td>\n", - " <td id=\"T_0502a_row164_col1\" class=\"data row164 col1\" >Prediction Across Each Feature</td>\n", - " <td id=\"T_0502a_row164_col2\" class=\"data row164 col2\" >Assesses differences in model predictions across individual features between reference and monitoring datasets...</td>\n", - " <td id=\"T_0502a_row164_col3\" class=\"data row164 col3\" >True</td>\n", - " <td id=\"T_0502a_row164_col4\" class=\"data row164 col4\" >False</td>\n", - " <td id=\"T_0502a_row164_col5\" class=\"data row164 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row164_col6\" class=\"data row164 col6\" >{}</td>\n", - " <td id=\"T_0502a_row164_col7\" class=\"data row164 col7\" >['visualization']</td>\n", - " <td id=\"T_0502a_row164_col8\" class=\"data row164 col8\" >['monitoring']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row165_col0\" class=\"data row165 col0\" >validmind.ongoing_monitoring.PredictionCorrelation</td>\n", - " <td id=\"T_0502a_row165_col1\" class=\"data row165 col1\" >Prediction Correlation</td>\n", - " <td id=\"T_0502a_row165_col2\" class=\"data row165 col2\" >Assesses correlation changes between model predictions from reference and monitoring datasets to detect potential...</td>\n", - " <td id=\"T_0502a_row165_col3\" class=\"data 
row165 col3\" >True</td>\n", - " <td id=\"T_0502a_row165_col4\" class=\"data row165 col4\" >True</td>\n", - " <td id=\"T_0502a_row165_col5\" class=\"data row165 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row165_col6\" class=\"data row165 col6\" >{'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", - " <td id=\"T_0502a_row165_col7\" class=\"data row165 col7\" >['visualization']</td>\n", - " <td id=\"T_0502a_row165_col8\" class=\"data row165 col8\" >['monitoring']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row166_col0\" class=\"data row166 col0\" >validmind.ongoing_monitoring.PredictionProbabilitiesHistogramDrift</td>\n", - " <td id=\"T_0502a_row166_col1\" class=\"data row166 col1\" >Prediction Probabilities Histogram Drift</td>\n", - " <td id=\"T_0502a_row166_col2\" class=\"data row166 col2\" >Compares prediction probability distributions between reference and monitoring datasets....</td>\n", - " <td id=\"T_0502a_row166_col3\" class=\"data row166 col3\" >True</td>\n", - " <td id=\"T_0502a_row166_col4\" class=\"data row166 col4\" >True</td>\n", - " <td id=\"T_0502a_row166_col5\" class=\"data row166 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row166_col6\" class=\"data row166 col6\" >{'title': {'type': '_empty', 'default': 'Prediction Probabilities Histogram Drift'}, 'drift_pct_threshold': {'type': 'float', 'default': 20.0}}</td>\n", - " <td id=\"T_0502a_row166_col7\" class=\"data row166 col7\" >['visualization', 'credit_risk']</td>\n", - " <td id=\"T_0502a_row166_col8\" class=\"data row166 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row167_col0\" class=\"data row167 col0\" >validmind.ongoing_monitoring.PredictionQuantilesAcrossFeatures</td>\n", - " <td id=\"T_0502a_row167_col1\" class=\"data row167 col1\" >Prediction Quantiles Across Features</td>\n", - " <td id=\"T_0502a_row167_col2\" class=\"data row167 col2\" >Assesses differences in model prediction distributions 
across individual features between reference...</td>\n", - " <td id=\"T_0502a_row167_col3\" class=\"data row167 col3\" >True</td>\n", - " <td id=\"T_0502a_row167_col4\" class=\"data row167 col4\" >False</td>\n", - " <td id=\"T_0502a_row167_col5\" class=\"data row167 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row167_col6\" class=\"data row167 col6\" >{}</td>\n", - " <td id=\"T_0502a_row167_col7\" class=\"data row167 col7\" >['visualization']</td>\n", - " <td id=\"T_0502a_row167_col8\" class=\"data row167 col8\" >['monitoring']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row168_col0\" class=\"data row168 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", - " <td id=\"T_0502a_row168_col1\" class=\"data row168 col1\" >ROC Curve Drift</td>\n", - " <td id=\"T_0502a_row168_col2\" class=\"data row168 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", - " <td id=\"T_0502a_row168_col3\" class=\"data row168 col3\" >True</td>\n", - " <td id=\"T_0502a_row168_col4\" class=\"data row168 col4\" >False</td>\n", - " <td id=\"T_0502a_row168_col5\" class=\"data row168 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row168_col6\" class=\"data row168 col6\" >{}</td>\n", - " <td id=\"T_0502a_row168_col7\" class=\"data row168 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_0502a_row168_col8\" class=\"data row168 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row169_col0\" class=\"data row169 col0\" >validmind.ongoing_monitoring.ScoreBandsDrift</td>\n", - " <td id=\"T_0502a_row169_col1\" class=\"data row169 col1\" >Score Bands Drift</td>\n", - " <td id=\"T_0502a_row169_col2\" class=\"data row169 col2\" >Analyzes drift in population distribution and default rates across score bands....</td>\n", - " <td id=\"T_0502a_row169_col3\" class=\"data row169 col3\" >False</td>\n", - " <td 
id=\"T_0502a_row169_col4\" class=\"data row169 col4\" >True</td>\n", - " <td id=\"T_0502a_row169_col5\" class=\"data row169 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row169_col6\" class=\"data row169 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'score_bands': {'type': 'list', 'default': None}, 'drift_threshold': {'type': 'float', 'default': 20.0}}</td>\n", - " <td id=\"T_0502a_row169_col7\" class=\"data row169 col7\" >['visualization', 'credit_risk', 'scorecard']</td>\n", - " <td id=\"T_0502a_row169_col8\" class=\"data row169 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row170_col0\" class=\"data row170 col0\" >validmind.ongoing_monitoring.ScorecardHistogramDrift</td>\n", - " <td id=\"T_0502a_row170_col1\" class=\"data row170 col1\" >Scorecard Histogram Drift</td>\n", - " <td id=\"T_0502a_row170_col2\" class=\"data row170 col2\" >Compares score distributions between reference and monitoring datasets for each class....</td>\n", - " <td id=\"T_0502a_row170_col3\" class=\"data row170 col3\" >True</td>\n", - " <td id=\"T_0502a_row170_col4\" class=\"data row170 col4\" >True</td>\n", - " <td id=\"T_0502a_row170_col5\" class=\"data row170 col5\" >['datasets']</td>\n", - " <td id=\"T_0502a_row170_col6\" class=\"data row170 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'title': {'type': 'str', 'default': 'Scorecard Histogram Drift'}, 'drift_pct_threshold': {'type': 'float', 'default': 20.0}}</td>\n", - " <td id=\"T_0502a_row170_col7\" class=\"data row170 col7\" >['visualization', 'credit_risk', 'logistic_regression']</td>\n", - " <td id=\"T_0502a_row170_col8\" class=\"data row170 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row171_col0\" class=\"data row171 col0\" >validmind.ongoing_monitoring.TargetPredictionDistributionPlot</td>\n", - " <td id=\"T_0502a_row171_col1\" class=\"data row171 col1\" >Target Prediction Distribution Plot</td>\n", - " 
<td id=\"T_0502a_row171_col2\" class=\"data row171 col2\" >Assesses differences in prediction distributions between a reference dataset and a monitoring dataset to identify...</td>\n", - " <td id=\"T_0502a_row171_col3\" class=\"data row171 col3\" >True</td>\n", - " <td id=\"T_0502a_row171_col4\" class=\"data row171 col4\" >True</td>\n", - " <td id=\"T_0502a_row171_col5\" class=\"data row171 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_0502a_row171_col6\" class=\"data row171 col6\" >{'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", - " <td id=\"T_0502a_row171_col7\" class=\"data row171 col7\" >['visualization']</td>\n", - " <td id=\"T_0502a_row171_col8\" class=\"data row171 col8\" >['monitoring']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row172_col0\" class=\"data row172 col0\" >validmind.prompt_validation.Bias</td>\n", - " <td id=\"T_0502a_row172_col1\" class=\"data row172 col1\" >Bias</td>\n", - " <td id=\"T_0502a_row172_col2\" class=\"data row172 col2\" >Assesses potential bias in a Large Language Model by analyzing the distribution and order of exemplars in the...</td>\n", - " <td id=\"T_0502a_row172_col3\" class=\"data row172 col3\" >False</td>\n", - " <td id=\"T_0502a_row172_col4\" class=\"data row172 col4\" >True</td>\n", - " <td id=\"T_0502a_row172_col5\" class=\"data row172 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row172_col6\" class=\"data row172 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row172_col7\" class=\"data row172 col7\" >['llm', 'few_shot']</td>\n", - " <td id=\"T_0502a_row172_col8\" class=\"data row172 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row173_col0\" class=\"data row173 col0\" >validmind.prompt_validation.Clarity</td>\n", - " <td id=\"T_0502a_row173_col1\" class=\"data row173 col1\" >Clarity</td>\n", - " <td 
id=\"T_0502a_row173_col2\" class=\"data row173 col2\" >Evaluates and scores the clarity of prompts in a Large Language Model based on specified guidelines....</td>\n", - " <td id=\"T_0502a_row173_col3\" class=\"data row173 col3\" >False</td>\n", - " <td id=\"T_0502a_row173_col4\" class=\"data row173 col4\" >True</td>\n", - " <td id=\"T_0502a_row173_col5\" class=\"data row173 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row173_col6\" class=\"data row173 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row173_col7\" class=\"data row173 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", - " <td id=\"T_0502a_row173_col8\" class=\"data row173 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row174_col0\" class=\"data row174 col0\" >validmind.prompt_validation.Conciseness</td>\n", - " <td id=\"T_0502a_row174_col1\" class=\"data row174 col1\" >Conciseness</td>\n", - " <td id=\"T_0502a_row174_col2\" class=\"data row174 col2\" >Analyzes and grades the conciseness of prompts provided to a Large Language Model....</td>\n", - " <td id=\"T_0502a_row174_col3\" class=\"data row174 col3\" >False</td>\n", - " <td id=\"T_0502a_row174_col4\" class=\"data row174 col4\" >True</td>\n", - " <td id=\"T_0502a_row174_col5\" class=\"data row174 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row174_col6\" class=\"data row174 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row174_col7\" class=\"data row174 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", - " <td id=\"T_0502a_row174_col8\" class=\"data row174 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row175_col0\" class=\"data row175 col0\" >validmind.prompt_validation.Delimitation</td>\n", - " <td id=\"T_0502a_row175_col1\" 
class=\"data row175 col1\" >Delimitation</td>\n", - " <td id=\"T_0502a_row175_col2\" class=\"data row175 col2\" >Evaluates the proper use of delimiters in prompts provided to Large Language Models....</td>\n", - " <td id=\"T_0502a_row175_col3\" class=\"data row175 col3\" >False</td>\n", - " <td id=\"T_0502a_row175_col4\" class=\"data row175 col4\" >True</td>\n", - " <td id=\"T_0502a_row175_col5\" class=\"data row175 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row175_col6\" class=\"data row175 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row175_col7\" class=\"data row175 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", - " <td id=\"T_0502a_row175_col8\" class=\"data row175 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row176_col0\" class=\"data row176 col0\" >validmind.prompt_validation.NegativeInstruction</td>\n", - " <td id=\"T_0502a_row176_col1\" class=\"data row176 col1\" >Negative Instruction</td>\n", - " <td id=\"T_0502a_row176_col2\" class=\"data row176 col2\" >Evaluates and grades the use of affirmative, proactive language over negative instructions in LLM prompts....</td>\n", - " <td id=\"T_0502a_row176_col3\" class=\"data row176 col3\" >False</td>\n", - " <td id=\"T_0502a_row176_col4\" class=\"data row176 col4\" >True</td>\n", - " <td id=\"T_0502a_row176_col5\" class=\"data row176 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row176_col6\" class=\"data row176 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row176_col7\" class=\"data row176 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", - " <td id=\"T_0502a_row176_col8\" class=\"data row176 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row177_col0\" class=\"data row177 col0\" 
>validmind.prompt_validation.Robustness</td>\n", - " <td id=\"T_0502a_row177_col1\" class=\"data row177 col1\" >Robustness</td>\n", - " <td id=\"T_0502a_row177_col2\" class=\"data row177 col2\" >Assesses the robustness of prompts provided to a Large Language Model under varying conditions and contexts. This test...</td>\n", - " <td id=\"T_0502a_row177_col3\" class=\"data row177 col3\" >False</td>\n", - " <td id=\"T_0502a_row177_col4\" class=\"data row177 col4\" >True</td>\n", - " <td id=\"T_0502a_row177_col5\" class=\"data row177 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row177_col6\" class=\"data row177 col6\" >{'num_tests': {'type': '_empty', 'default': 10}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row177_col7\" class=\"data row177 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", - " <td id=\"T_0502a_row177_col8\" class=\"data row177 col8\" >['text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row178_col0\" class=\"data row178 col0\" >validmind.prompt_validation.Specificity</td>\n", - " <td id=\"T_0502a_row178_col1\" class=\"data row178 col1\" >Specificity</td>\n", - " <td id=\"T_0502a_row178_col2\" class=\"data row178 col2\" >Evaluates and scores the specificity of prompts provided to a Large Language Model (LLM), based on clarity, detail,...</td>\n", - " <td id=\"T_0502a_row178_col3\" class=\"data row178 col3\" >False</td>\n", - " <td id=\"T_0502a_row178_col4\" class=\"data row178 col4\" >True</td>\n", - " <td id=\"T_0502a_row178_col5\" class=\"data row178 col5\" >['model']</td>\n", - " <td id=\"T_0502a_row178_col6\" class=\"data row178 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_0502a_row178_col7\" class=\"data row178 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", - " <td id=\"T_0502a_row178_col8\" class=\"data row178 col8\" >['text_classification', 
'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row179_col0\" class=\"data row179 col0\" >validmind.unit_metrics.classification.Accuracy</td>\n", - " <td id=\"T_0502a_row179_col1\" class=\"data row179 col1\" >Accuracy</td>\n", - " <td id=\"T_0502a_row179_col2\" class=\"data row179 col2\" >Calculates the accuracy of a model</td>\n", - " <td id=\"T_0502a_row179_col3\" class=\"data row179 col3\" >False</td>\n", - " <td id=\"T_0502a_row179_col4\" class=\"data row179 col4\" >False</td>\n", - " <td id=\"T_0502a_row179_col5\" class=\"data row179 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row179_col6\" class=\"data row179 col6\" >{}</td>\n", - " <td id=\"T_0502a_row179_col7\" class=\"data row179 col7\" >['classification']</td>\n", - " <td id=\"T_0502a_row179_col8\" class=\"data row179 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row180_col0\" class=\"data row180 col0\" >validmind.unit_metrics.classification.F1</td>\n", - " <td id=\"T_0502a_row180_col1\" class=\"data row180 col1\" >F1</td>\n", - " <td id=\"T_0502a_row180_col2\" class=\"data row180 col2\" >Calculates the F1 score for a classification model.</td>\n", - " <td id=\"T_0502a_row180_col3\" class=\"data row180 col3\" >False</td>\n", - " <td id=\"T_0502a_row180_col4\" class=\"data row180 col4\" >False</td>\n", - " <td id=\"T_0502a_row180_col5\" class=\"data row180 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row180_col6\" class=\"data row180 col6\" >{}</td>\n", - " <td id=\"T_0502a_row180_col7\" class=\"data row180 col7\" >['classification']</td>\n", - " <td id=\"T_0502a_row180_col8\" class=\"data row180 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row181_col0\" class=\"data row181 col0\" >validmind.unit_metrics.classification.Precision</td>\n", - " <td id=\"T_0502a_row181_col1\" class=\"data row181 col1\" >Precision</td>\n", - " <td id=\"T_0502a_row181_col2\" class=\"data row181 
col2\" >Calculates the precision for a classification model.</td>\n", - " <td id=\"T_0502a_row181_col3\" class=\"data row181 col3\" >False</td>\n", - " <td id=\"T_0502a_row181_col4\" class=\"data row181 col4\" >False</td>\n", - " <td id=\"T_0502a_row181_col5\" class=\"data row181 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row181_col6\" class=\"data row181 col6\" >{}</td>\n", - " <td id=\"T_0502a_row181_col7\" class=\"data row181 col7\" >['classification']</td>\n", - " <td id=\"T_0502a_row181_col8\" class=\"data row181 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row182_col0\" class=\"data row182 col0\" >validmind.unit_metrics.classification.ROC_AUC</td>\n", - " <td id=\"T_0502a_row182_col1\" class=\"data row182 col1\" >ROC AUC</td>\n", - " <td id=\"T_0502a_row182_col2\" class=\"data row182 col2\" >Calculates the ROC AUC for a classification model.</td>\n", - " <td id=\"T_0502a_row182_col3\" class=\"data row182 col3\" >False</td>\n", - " <td id=\"T_0502a_row182_col4\" class=\"data row182 col4\" >False</td>\n", - " <td id=\"T_0502a_row182_col5\" class=\"data row182 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row182_col6\" class=\"data row182 col6\" >{}</td>\n", - " <td id=\"T_0502a_row182_col7\" class=\"data row182 col7\" >['classification']</td>\n", - " <td id=\"T_0502a_row182_col8\" class=\"data row182 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row183_col0\" class=\"data row183 col0\" >validmind.unit_metrics.classification.Recall</td>\n", - " <td id=\"T_0502a_row183_col1\" class=\"data row183 col1\" >Recall</td>\n", - " <td id=\"T_0502a_row183_col2\" class=\"data row183 col2\" >Calculates the recall for a classification model.</td>\n", - " <td id=\"T_0502a_row183_col3\" class=\"data row183 col3\" >False</td>\n", - " <td id=\"T_0502a_row183_col4\" class=\"data row183 col4\" >False</td>\n", - " <td id=\"T_0502a_row183_col5\" class=\"data row183 col5\" 
>['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row183_col6\" class=\"data row183 col6\" >{}</td>\n", - " <td id=\"T_0502a_row183_col7\" class=\"data row183 col7\" >['classification']</td>\n", - " <td id=\"T_0502a_row183_col8\" class=\"data row183 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row184_col0\" class=\"data row184 col0\" >validmind.unit_metrics.regression.AdjustedRSquaredScore</td>\n", - " <td id=\"T_0502a_row184_col1\" class=\"data row184 col1\" >Adjusted R Squared Score</td>\n", - " <td id=\"T_0502a_row184_col2\" class=\"data row184 col2\" >Calculates the adjusted R-squared score for a regression model.</td>\n", - " <td id=\"T_0502a_row184_col3\" class=\"data row184 col3\" >False</td>\n", - " <td id=\"T_0502a_row184_col4\" class=\"data row184 col4\" >False</td>\n", - " <td id=\"T_0502a_row184_col5\" class=\"data row184 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row184_col6\" class=\"data row184 col6\" >{}</td>\n", - " <td id=\"T_0502a_row184_col7\" class=\"data row184 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row184_col8\" class=\"data row184 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row185_col0\" class=\"data row185 col0\" >validmind.unit_metrics.regression.GiniCoefficient</td>\n", - " <td id=\"T_0502a_row185_col1\" class=\"data row185 col1\" >Gini Coefficient</td>\n", - " <td id=\"T_0502a_row185_col2\" class=\"data row185 col2\" >Calculates the Gini coefficient for a regression model.</td>\n", - " <td id=\"T_0502a_row185_col3\" class=\"data row185 col3\" >False</td>\n", - " <td id=\"T_0502a_row185_col4\" class=\"data row185 col4\" >False</td>\n", - " <td id=\"T_0502a_row185_col5\" class=\"data row185 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row185_col6\" class=\"data row185 col6\" >{}</td>\n", - " <td id=\"T_0502a_row185_col7\" class=\"data row185 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row185_col8\" class=\"data 
row185 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row186_col0\" class=\"data row186 col0\" >validmind.unit_metrics.regression.HuberLoss</td>\n", - " <td id=\"T_0502a_row186_col1\" class=\"data row186 col1\" >Huber Loss</td>\n", - " <td id=\"T_0502a_row186_col2\" class=\"data row186 col2\" >Calculates the Huber loss for a regression model.</td>\n", - " <td id=\"T_0502a_row186_col3\" class=\"data row186 col3\" >False</td>\n", - " <td id=\"T_0502a_row186_col4\" class=\"data row186 col4\" >False</td>\n", - " <td id=\"T_0502a_row186_col5\" class=\"data row186 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row186_col6\" class=\"data row186 col6\" >{}</td>\n", - " <td id=\"T_0502a_row186_col7\" class=\"data row186 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row186_col8\" class=\"data row186 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row187_col0\" class=\"data row187 col0\" >validmind.unit_metrics.regression.KolmogorovSmirnovStatistic</td>\n", - " <td id=\"T_0502a_row187_col1\" class=\"data row187 col1\" >Kolmogorov Smirnov Statistic</td>\n", - " <td id=\"T_0502a_row187_col2\" class=\"data row187 col2\" >Calculates the Kolmogorov-Smirnov statistic for a regression model.</td>\n", - " <td id=\"T_0502a_row187_col3\" class=\"data row187 col3\" >False</td>\n", - " <td id=\"T_0502a_row187_col4\" class=\"data row187 col4\" >False</td>\n", - " <td id=\"T_0502a_row187_col5\" class=\"data row187 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_0502a_row187_col6\" class=\"data row187 col6\" >{}</td>\n", - " <td id=\"T_0502a_row187_col7\" class=\"data row187 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row187_col8\" class=\"data row187 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row188_col0\" class=\"data row188 col0\" >validmind.unit_metrics.regression.MeanAbsoluteError</td>\n", - " <td id=\"T_0502a_row188_col1\" class=\"data row188 col1\" 
>Mean Absolute Error</td>\n", - " <td id=\"T_0502a_row188_col2\" class=\"data row188 col2\" >Calculates the mean absolute error for a regression model.</td>\n", - " <td id=\"T_0502a_row188_col3\" class=\"data row188 col3\" >False</td>\n", - " <td id=\"T_0502a_row188_col4\" class=\"data row188 col4\" >False</td>\n", - " <td id=\"T_0502a_row188_col5\" class=\"data row188 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row188_col6\" class=\"data row188 col6\" >{}</td>\n", - " <td id=\"T_0502a_row188_col7\" class=\"data row188 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row188_col8\" class=\"data row188 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row189_col0\" class=\"data row189 col0\" >validmind.unit_metrics.regression.MeanAbsolutePercentageError</td>\n", - " <td id=\"T_0502a_row189_col1\" class=\"data row189 col1\" >Mean Absolute Percentage Error</td>\n", - " <td id=\"T_0502a_row189_col2\" class=\"data row189 col2\" >Calculates the mean absolute percentage error for a regression model.</td>\n", - " <td id=\"T_0502a_row189_col3\" class=\"data row189 col3\" >False</td>\n", - " <td id=\"T_0502a_row189_col4\" class=\"data row189 col4\" >False</td>\n", - " <td id=\"T_0502a_row189_col5\" class=\"data row189 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row189_col6\" class=\"data row189 col6\" >{}</td>\n", - " <td id=\"T_0502a_row189_col7\" class=\"data row189 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row189_col8\" class=\"data row189 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row190_col0\" class=\"data row190 col0\" >validmind.unit_metrics.regression.MeanBiasDeviation</td>\n", - " <td id=\"T_0502a_row190_col1\" class=\"data row190 col1\" >Mean Bias Deviation</td>\n", - " <td id=\"T_0502a_row190_col2\" class=\"data row190 col2\" >Calculates the mean bias deviation for a regression model.</td>\n", - " <td id=\"T_0502a_row190_col3\" class=\"data row190 col3\" 
>False</td>\n", - " <td id=\"T_0502a_row190_col4\" class=\"data row190 col4\" >False</td>\n", - " <td id=\"T_0502a_row190_col5\" class=\"data row190 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row190_col6\" class=\"data row190 col6\" >{}</td>\n", - " <td id=\"T_0502a_row190_col7\" class=\"data row190 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row190_col8\" class=\"data row190 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row191_col0\" class=\"data row191 col0\" >validmind.unit_metrics.regression.MeanSquaredError</td>\n", - " <td id=\"T_0502a_row191_col1\" class=\"data row191 col1\" >Mean Squared Error</td>\n", - " <td id=\"T_0502a_row191_col2\" class=\"data row191 col2\" >Calculates the mean squared error for a regression model.</td>\n", - " <td id=\"T_0502a_row191_col3\" class=\"data row191 col3\" >False</td>\n", - " <td id=\"T_0502a_row191_col4\" class=\"data row191 col4\" >False</td>\n", - " <td id=\"T_0502a_row191_col5\" class=\"data row191 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row191_col6\" class=\"data row191 col6\" >{}</td>\n", - " <td id=\"T_0502a_row191_col7\" class=\"data row191 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row191_col8\" class=\"data row191 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row192_col0\" class=\"data row192 col0\" >validmind.unit_metrics.regression.QuantileLoss</td>\n", - " <td id=\"T_0502a_row192_col1\" class=\"data row192 col1\" >Quantile Loss</td>\n", - " <td id=\"T_0502a_row192_col2\" class=\"data row192 col2\" >Calculates the quantile loss for a regression model.</td>\n", - " <td id=\"T_0502a_row192_col3\" class=\"data row192 col3\" >False</td>\n", - " <td id=\"T_0502a_row192_col4\" class=\"data row192 col4\" >False</td>\n", - " <td id=\"T_0502a_row192_col5\" class=\"data row192 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row192_col6\" class=\"data row192 col6\" >{'quantile': {'type': 
'_empty', 'default': 0.5}}</td>\n", - " <td id=\"T_0502a_row192_col7\" class=\"data row192 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row192_col8\" class=\"data row192 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row193_col0\" class=\"data row193 col0\" >validmind.unit_metrics.regression.RSquaredScore</td>\n", - " <td id=\"T_0502a_row193_col1\" class=\"data row193 col1\" >R Squared Score</td>\n", - " <td id=\"T_0502a_row193_col2\" class=\"data row193 col2\" >Calculates the R-squared score for a regression model.</td>\n", - " <td id=\"T_0502a_row193_col3\" class=\"data row193 col3\" >False</td>\n", - " <td id=\"T_0502a_row193_col4\" class=\"data row193 col4\" >False</td>\n", - " <td id=\"T_0502a_row193_col5\" class=\"data row193 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row193_col6\" class=\"data row193 col6\" >{}</td>\n", - " <td id=\"T_0502a_row193_col7\" class=\"data row193 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row193_col8\" class=\"data row193 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_0502a_row194_col0\" class=\"data row194 col0\" >validmind.unit_metrics.regression.RootMeanSquaredError</td>\n", - " <td id=\"T_0502a_row194_col1\" class=\"data row194 col1\" >Root Mean Squared Error</td>\n", - " <td id=\"T_0502a_row194_col2\" class=\"data row194 col2\" >Calculates the root mean squared error for a regression model.</td>\n", - " <td id=\"T_0502a_row194_col3\" class=\"data row194 col3\" >False</td>\n", - " <td id=\"T_0502a_row194_col4\" class=\"data row194 col4\" >False</td>\n", - " <td id=\"T_0502a_row194_col5\" class=\"data row194 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_0502a_row194_col6\" class=\"data row194 col6\" >{}</td>\n", - " <td id=\"T_0502a_row194_col7\" class=\"data row194 col7\" >['regression']</td>\n", - " <td id=\"T_0502a_row194_col8\" class=\"data row194 col8\" >['regression']</td>\n", - " </tr>\n", - " </tbody>\n", - "</table>\n" 
+ "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from validmind.tests import (\n", + " list_tests,\n", + " list_tasks,\n", + " list_tags,\n", + " list_tasks_and_tags,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Use [list_tests()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to retrieve all available ValidMind tests, which returns a DataFrame with the following columns:\n", + "\n", + "- **ID** – A unique identifier for each test.\n", + "- **Name** – The test’s name.\n", + "- **Description** – A short summary of what the test evaluates.\n", + "- **Tags** – Keywords that describe what the test does or applies to.\n", + "- **Tasks** – The type of modeling task the test supports." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "<style type=\"text/css\">\n", + "#T_0502a th {\n", + " text-align: left;\n", + "}\n", + "#T_0502a_row0_col0, #T_0502a_row0_col1, #T_0502a_row0_col2, #T_0502a_row0_col3, #T_0502a_row0_col4, #T_0502a_row0_col5, #T_0502a_row0_col6, #T_0502a_row0_col7, #T_0502a_row0_col8, #T_0502a_row1_col0, #T_0502a_row1_col1, #T_0502a_row1_col2, #T_0502a_row1_col3, #T_0502a_row1_col4, #T_0502a_row1_col5, #T_0502a_row1_col6, #T_0502a_row1_col7, #T_0502a_row1_col8, #T_0502a_row2_col0, #T_0502a_row2_col1, #T_0502a_row2_col2, #T_0502a_row2_col3, #T_0502a_row2_col4, #T_0502a_row2_col5, #T_0502a_row2_col6, #T_0502a_row2_col7, #T_0502a_row2_col8, #T_0502a_row3_col0, #T_0502a_row3_col1, #T_0502a_row3_col2, #T_0502a_row3_col3, #T_0502a_row3_col4, #T_0502a_row3_col5, #T_0502a_row3_col6, #T_0502a_row3_col7, #T_0502a_row3_col8, #T_0502a_row4_col0, #T_0502a_row4_col1, #T_0502a_row4_col2, #T_0502a_row4_col3, #T_0502a_row4_col4, #T_0502a_row4_col5, #T_0502a_row4_col6, #T_0502a_row4_col7, #T_0502a_row4_col8, #T_0502a_row5_col0, #T_0502a_row5_col1, #T_0502a_row5_col2, 
#T_0502a_row5_col3, #T_0502a_row5_col4, #T_0502a_row5_col5, #T_0502a_row5_col6, #T_0502a_row5_col7, #T_0502a_row5_col8, #T_0502a_row6_col0, #T_0502a_row6_col1, #T_0502a_row6_col2, #T_0502a_row6_col3, #T_0502a_row6_col4, #T_0502a_row6_col5, #T_0502a_row6_col6, #T_0502a_row6_col7, #T_0502a_row6_col8, #T_0502a_row7_col0, #T_0502a_row7_col1, #T_0502a_row7_col2, #T_0502a_row7_col3, #T_0502a_row7_col4, #T_0502a_row7_col5, #T_0502a_row7_col6, #T_0502a_row7_col7, #T_0502a_row7_col8, #T_0502a_row8_col0, #T_0502a_row8_col1, #T_0502a_row8_col2, #T_0502a_row8_col3, #T_0502a_row8_col4, #T_0502a_row8_col5, #T_0502a_row8_col6, #T_0502a_row8_col7, #T_0502a_row8_col8, #T_0502a_row9_col0, #T_0502a_row9_col1, #T_0502a_row9_col2, #T_0502a_row9_col3, #T_0502a_row9_col4, #T_0502a_row9_col5, #T_0502a_row9_col6, #T_0502a_row9_col7, #T_0502a_row9_col8, #T_0502a_row10_col0, #T_0502a_row10_col1, #T_0502a_row10_col2, #T_0502a_row10_col3, #T_0502a_row10_col4, #T_0502a_row10_col5, #T_0502a_row10_col6, #T_0502a_row10_col7, #T_0502a_row10_col8, #T_0502a_row11_col0, #T_0502a_row11_col1, #T_0502a_row11_col2, #T_0502a_row11_col3, #T_0502a_row11_col4, #T_0502a_row11_col5, #T_0502a_row11_col6, #T_0502a_row11_col7, #T_0502a_row11_col8, #T_0502a_row12_col0, #T_0502a_row12_col1, #T_0502a_row12_col2, #T_0502a_row12_col3, #T_0502a_row12_col4, #T_0502a_row12_col5, #T_0502a_row12_col6, #T_0502a_row12_col7, #T_0502a_row12_col8, #T_0502a_row13_col0, #T_0502a_row13_col1, #T_0502a_row13_col2, #T_0502a_row13_col3, #T_0502a_row13_col4, #T_0502a_row13_col5, #T_0502a_row13_col6, #T_0502a_row13_col7, #T_0502a_row13_col8, #T_0502a_row14_col0, #T_0502a_row14_col1, #T_0502a_row14_col2, #T_0502a_row14_col3, #T_0502a_row14_col4, #T_0502a_row14_col5, #T_0502a_row14_col6, #T_0502a_row14_col7, #T_0502a_row14_col8, #T_0502a_row15_col0, #T_0502a_row15_col1, #T_0502a_row15_col2, #T_0502a_row15_col3, #T_0502a_row15_col4, #T_0502a_row15_col5, #T_0502a_row15_col6, #T_0502a_row15_col7, #T_0502a_row15_col8, #T_0502a_row16_col0, 
#T_0502a_row16_col1, #T_0502a_row16_col2, #T_0502a_row16_col3, #T_0502a_row16_col4, #T_0502a_row16_col5, #T_0502a_row16_col6, #T_0502a_row16_col7, #T_0502a_row16_col8, #T_0502a_row17_col0, #T_0502a_row17_col1, #T_0502a_row17_col2, #T_0502a_row17_col3, #T_0502a_row17_col4, #T_0502a_row17_col5, #T_0502a_row17_col6, #T_0502a_row17_col7, #T_0502a_row17_col8, #T_0502a_row18_col0, #T_0502a_row18_col1, #T_0502a_row18_col2, #T_0502a_row18_col3, #T_0502a_row18_col4, #T_0502a_row18_col5, #T_0502a_row18_col6, #T_0502a_row18_col7, #T_0502a_row18_col8, #T_0502a_row19_col0, #T_0502a_row19_col1, #T_0502a_row19_col2, #T_0502a_row19_col3, #T_0502a_row19_col4, #T_0502a_row19_col5, #T_0502a_row19_col6, #T_0502a_row19_col7, #T_0502a_row19_col8, #T_0502a_row20_col0, #T_0502a_row20_col1, #T_0502a_row20_col2, #T_0502a_row20_col3, #T_0502a_row20_col4, #T_0502a_row20_col5, #T_0502a_row20_col6, #T_0502a_row20_col7, #T_0502a_row20_col8, #T_0502a_row21_col0, #T_0502a_row21_col1, #T_0502a_row21_col2, #T_0502a_row21_col3, #T_0502a_row21_col4, #T_0502a_row21_col5, #T_0502a_row21_col6, #T_0502a_row21_col7, #T_0502a_row21_col8, #T_0502a_row22_col0, #T_0502a_row22_col1, #T_0502a_row22_col2, #T_0502a_row22_col3, #T_0502a_row22_col4, #T_0502a_row22_col5, #T_0502a_row22_col6, #T_0502a_row22_col7, #T_0502a_row22_col8, #T_0502a_row23_col0, #T_0502a_row23_col1, #T_0502a_row23_col2, #T_0502a_row23_col3, #T_0502a_row23_col4, #T_0502a_row23_col5, #T_0502a_row23_col6, #T_0502a_row23_col7, #T_0502a_row23_col8, #T_0502a_row24_col0, #T_0502a_row24_col1, #T_0502a_row24_col2, #T_0502a_row24_col3, #T_0502a_row24_col4, #T_0502a_row24_col5, #T_0502a_row24_col6, #T_0502a_row24_col7, #T_0502a_row24_col8, #T_0502a_row25_col0, #T_0502a_row25_col1, #T_0502a_row25_col2, #T_0502a_row25_col3, #T_0502a_row25_col4, #T_0502a_row25_col5, #T_0502a_row25_col6, #T_0502a_row25_col7, #T_0502a_row25_col8, #T_0502a_row26_col0, #T_0502a_row26_col1, #T_0502a_row26_col2, #T_0502a_row26_col3, #T_0502a_row26_col4, #T_0502a_row26_col5, 
#T_0502a_row26_col6, #T_0502a_row26_col7, #T_0502a_row26_col8, #T_0502a_row27_col0, #T_0502a_row27_col1, #T_0502a_row27_col2, #T_0502a_row27_col3, #T_0502a_row27_col4, #T_0502a_row27_col5, #T_0502a_row27_col6, #T_0502a_row27_col7, #T_0502a_row27_col8, #T_0502a_row28_col0, #T_0502a_row28_col1, #T_0502a_row28_col2, #T_0502a_row28_col3, #T_0502a_row28_col4, #T_0502a_row28_col5, #T_0502a_row28_col6, #T_0502a_row28_col7, #T_0502a_row28_col8, #T_0502a_row29_col0, #T_0502a_row29_col1, #T_0502a_row29_col2, #T_0502a_row29_col3, #T_0502a_row29_col4, #T_0502a_row29_col5, #T_0502a_row29_col6, #T_0502a_row29_col7, #T_0502a_row29_col8, #T_0502a_row30_col0, #T_0502a_row30_col1, #T_0502a_row30_col2, #T_0502a_row30_col3, #T_0502a_row30_col4, #T_0502a_row30_col5, #T_0502a_row30_col6, #T_0502a_row30_col7, #T_0502a_row30_col8, #T_0502a_row31_col0, #T_0502a_row31_col1, #T_0502a_row31_col2, #T_0502a_row31_col3, #T_0502a_row31_col4, #T_0502a_row31_col5, #T_0502a_row31_col6, #T_0502a_row31_col7, #T_0502a_row31_col8, #T_0502a_row32_col0, #T_0502a_row32_col1, #T_0502a_row32_col2, #T_0502a_row32_col3, #T_0502a_row32_col4, #T_0502a_row32_col5, #T_0502a_row32_col6, #T_0502a_row32_col7, #T_0502a_row32_col8, #T_0502a_row33_col0, #T_0502a_row33_col1, #T_0502a_row33_col2, #T_0502a_row33_col3, #T_0502a_row33_col4, #T_0502a_row33_col5, #T_0502a_row33_col6, #T_0502a_row33_col7, #T_0502a_row33_col8, #T_0502a_row34_col0, #T_0502a_row34_col1, #T_0502a_row34_col2, #T_0502a_row34_col3, #T_0502a_row34_col4, #T_0502a_row34_col5, #T_0502a_row34_col6, #T_0502a_row34_col7, #T_0502a_row34_col8, #T_0502a_row35_col0, #T_0502a_row35_col1, #T_0502a_row35_col2, #T_0502a_row35_col3, #T_0502a_row35_col4, #T_0502a_row35_col5, #T_0502a_row35_col6, #T_0502a_row35_col7, #T_0502a_row35_col8, #T_0502a_row36_col0, #T_0502a_row36_col1, #T_0502a_row36_col2, #T_0502a_row36_col3, #T_0502a_row36_col4, #T_0502a_row36_col5, #T_0502a_row36_col6, #T_0502a_row36_col7, #T_0502a_row36_col8, #T_0502a_row37_col0, #T_0502a_row37_col1, 
#T_0502a_row37_col2, #T_0502a_row37_col3, #T_0502a_row37_col4, #T_0502a_row37_col5, #T_0502a_row37_col6, #T_0502a_row37_col7, #T_0502a_row37_col8, #T_0502a_row38_col0, #T_0502a_row38_col1, #T_0502a_row38_col2, #T_0502a_row38_col3, #T_0502a_row38_col4, #T_0502a_row38_col5, #T_0502a_row38_col6, #T_0502a_row38_col7, #T_0502a_row38_col8, #T_0502a_row39_col0, #T_0502a_row39_col1, #T_0502a_row39_col2, #T_0502a_row39_col3, #T_0502a_row39_col4, #T_0502a_row39_col5, #T_0502a_row39_col6, #T_0502a_row39_col7, #T_0502a_row39_col8, #T_0502a_row40_col0, #T_0502a_row40_col1, #T_0502a_row40_col2, #T_0502a_row40_col3, #T_0502a_row40_col4, #T_0502a_row40_col5, #T_0502a_row40_col6, #T_0502a_row40_col7, #T_0502a_row40_col8, #T_0502a_row41_col0, #T_0502a_row41_col1, #T_0502a_row41_col2, #T_0502a_row41_col3, #T_0502a_row41_col4, #T_0502a_row41_col5, #T_0502a_row41_col6, #T_0502a_row41_col7, #T_0502a_row41_col8, #T_0502a_row42_col0, #T_0502a_row42_col1, #T_0502a_row42_col2, #T_0502a_row42_col3, #T_0502a_row42_col4, #T_0502a_row42_col5, #T_0502a_row42_col6, #T_0502a_row42_col7, #T_0502a_row42_col8, #T_0502a_row43_col0, #T_0502a_row43_col1, #T_0502a_row43_col2, #T_0502a_row43_col3, #T_0502a_row43_col4, #T_0502a_row43_col5, #T_0502a_row43_col6, #T_0502a_row43_col7, #T_0502a_row43_col8, #T_0502a_row44_col0, #T_0502a_row44_col1, #T_0502a_row44_col2, #T_0502a_row44_col3, #T_0502a_row44_col4, #T_0502a_row44_col5, #T_0502a_row44_col6, #T_0502a_row44_col7, #T_0502a_row44_col8, #T_0502a_row45_col0, #T_0502a_row45_col1, #T_0502a_row45_col2, #T_0502a_row45_col3, #T_0502a_row45_col4, #T_0502a_row45_col5, #T_0502a_row45_col6, #T_0502a_row45_col7, #T_0502a_row45_col8, #T_0502a_row46_col0, #T_0502a_row46_col1, #T_0502a_row46_col2, #T_0502a_row46_col3, #T_0502a_row46_col4, #T_0502a_row46_col5, #T_0502a_row46_col6, #T_0502a_row46_col7, #T_0502a_row46_col8, #T_0502a_row47_col0, #T_0502a_row47_col1, #T_0502a_row47_col2, #T_0502a_row47_col3, #T_0502a_row47_col4, #T_0502a_row47_col5, #T_0502a_row47_col6, 
#T_0502a_row47_col7, #T_0502a_row47_col8, #T_0502a_row48_col0, #T_0502a_row48_col1, #T_0502a_row48_col2, #T_0502a_row48_col3, #T_0502a_row48_col4, #T_0502a_row48_col5, #T_0502a_row48_col6, #T_0502a_row48_col7, #T_0502a_row48_col8, #T_0502a_row49_col0, #T_0502a_row49_col1, #T_0502a_row49_col2, #T_0502a_row49_col3, #T_0502a_row49_col4, #T_0502a_row49_col5, #T_0502a_row49_col6, #T_0502a_row49_col7, #T_0502a_row49_col8, #T_0502a_row50_col0, #T_0502a_row50_col1, #T_0502a_row50_col2, #T_0502a_row50_col3, #T_0502a_row50_col4, #T_0502a_row50_col5, #T_0502a_row50_col6, #T_0502a_row50_col7, #T_0502a_row50_col8, #T_0502a_row51_col0, #T_0502a_row51_col1, #T_0502a_row51_col2, #T_0502a_row51_col3, #T_0502a_row51_col4, #T_0502a_row51_col5, #T_0502a_row51_col6, #T_0502a_row51_col7, #T_0502a_row51_col8, #T_0502a_row52_col0, #T_0502a_row52_col1, #T_0502a_row52_col2, #T_0502a_row52_col3, #T_0502a_row52_col4, #T_0502a_row52_col5, #T_0502a_row52_col6, #T_0502a_row52_col7, #T_0502a_row52_col8, #T_0502a_row53_col0, #T_0502a_row53_col1, #T_0502a_row53_col2, #T_0502a_row53_col3, #T_0502a_row53_col4, #T_0502a_row53_col5, #T_0502a_row53_col6, #T_0502a_row53_col7, #T_0502a_row53_col8, #T_0502a_row54_col0, #T_0502a_row54_col1, #T_0502a_row54_col2, #T_0502a_row54_col3, #T_0502a_row54_col4, #T_0502a_row54_col5, #T_0502a_row54_col6, #T_0502a_row54_col7, #T_0502a_row54_col8, #T_0502a_row55_col0, #T_0502a_row55_col1, #T_0502a_row55_col2, #T_0502a_row55_col3, #T_0502a_row55_col4, #T_0502a_row55_col5, #T_0502a_row55_col6, #T_0502a_row55_col7, #T_0502a_row55_col8, #T_0502a_row56_col0, #T_0502a_row56_col1, #T_0502a_row56_col2, #T_0502a_row56_col3, #T_0502a_row56_col4, #T_0502a_row56_col5, #T_0502a_row56_col6, #T_0502a_row56_col7, #T_0502a_row56_col8, #T_0502a_row57_col0, #T_0502a_row57_col1, #T_0502a_row57_col2, #T_0502a_row57_col3, #T_0502a_row57_col4, #T_0502a_row57_col5, #T_0502a_row57_col6, #T_0502a_row57_col7, #T_0502a_row57_col8, #T_0502a_row58_col0, #T_0502a_row58_col1, #T_0502a_row58_col2, 
#T_0502a_row58_col3, #T_0502a_row58_col4, #T_0502a_row58_col5, #T_0502a_row58_col6, #T_0502a_row58_col7, #T_0502a_row58_col8, #T_0502a_row59_col0, #T_0502a_row59_col1, #T_0502a_row59_col2, #T_0502a_row59_col3, #T_0502a_row59_col4, #T_0502a_row59_col5, #T_0502a_row59_col6, #T_0502a_row59_col7, #T_0502a_row59_col8, #T_0502a_row60_col0, #T_0502a_row60_col1, #T_0502a_row60_col2, #T_0502a_row60_col3, #T_0502a_row60_col4, #T_0502a_row60_col5, #T_0502a_row60_col6, #T_0502a_row60_col7, #T_0502a_row60_col8, #T_0502a_row61_col0, #T_0502a_row61_col1, #T_0502a_row61_col2, #T_0502a_row61_col3, #T_0502a_row61_col4, #T_0502a_row61_col5, #T_0502a_row61_col6, #T_0502a_row61_col7, #T_0502a_row61_col8, #T_0502a_row62_col0, #T_0502a_row62_col1, #T_0502a_row62_col2, #T_0502a_row62_col3, #T_0502a_row62_col4, #T_0502a_row62_col5, #T_0502a_row62_col6, #T_0502a_row62_col7, #T_0502a_row62_col8, #T_0502a_row63_col0, #T_0502a_row63_col1, #T_0502a_row63_col2, #T_0502a_row63_col3, #T_0502a_row63_col4, #T_0502a_row63_col5, #T_0502a_row63_col6, #T_0502a_row63_col7, #T_0502a_row63_col8, #T_0502a_row64_col0, #T_0502a_row64_col1, #T_0502a_row64_col2, #T_0502a_row64_col3, #T_0502a_row64_col4, #T_0502a_row64_col5, #T_0502a_row64_col6, #T_0502a_row64_col7, #T_0502a_row64_col8, #T_0502a_row65_col0, #T_0502a_row65_col1, #T_0502a_row65_col2, #T_0502a_row65_col3, #T_0502a_row65_col4, #T_0502a_row65_col5, #T_0502a_row65_col6, #T_0502a_row65_col7, #T_0502a_row65_col8, #T_0502a_row66_col0, #T_0502a_row66_col1, #T_0502a_row66_col2, #T_0502a_row66_col3, #T_0502a_row66_col4, #T_0502a_row66_col5, #T_0502a_row66_col6, #T_0502a_row66_col7, #T_0502a_row66_col8, #T_0502a_row67_col0, #T_0502a_row67_col1, #T_0502a_row67_col2, #T_0502a_row67_col3, #T_0502a_row67_col4, #T_0502a_row67_col5, #T_0502a_row67_col6, #T_0502a_row67_col7, #T_0502a_row67_col8, #T_0502a_row68_col0, #T_0502a_row68_col1, #T_0502a_row68_col2, #T_0502a_row68_col3, #T_0502a_row68_col4, #T_0502a_row68_col5, #T_0502a_row68_col6, #T_0502a_row68_col7, 
#T_0502a_row68_col8, #T_0502a_row69_col0, #T_0502a_row69_col1, #T_0502a_row69_col2, #T_0502a_row69_col3, #T_0502a_row69_col4, #T_0502a_row69_col5, #T_0502a_row69_col6, #T_0502a_row69_col7, #T_0502a_row69_col8, #T_0502a_row70_col0, #T_0502a_row70_col1, #T_0502a_row70_col2, #T_0502a_row70_col3, #T_0502a_row70_col4, #T_0502a_row70_col5, #T_0502a_row70_col6, #T_0502a_row70_col7, #T_0502a_row70_col8, #T_0502a_row71_col0, #T_0502a_row71_col1, #T_0502a_row71_col2, #T_0502a_row71_col3, #T_0502a_row71_col4, #T_0502a_row71_col5, #T_0502a_row71_col6, #T_0502a_row71_col7, #T_0502a_row71_col8, #T_0502a_row72_col0, #T_0502a_row72_col1, #T_0502a_row72_col2, #T_0502a_row72_col3, #T_0502a_row72_col4, #T_0502a_row72_col5, #T_0502a_row72_col6, #T_0502a_row72_col7, #T_0502a_row72_col8, #T_0502a_row73_col0, #T_0502a_row73_col1, #T_0502a_row73_col2, #T_0502a_row73_col3, #T_0502a_row73_col4, #T_0502a_row73_col5, #T_0502a_row73_col6, #T_0502a_row73_col7, #T_0502a_row73_col8, #T_0502a_row74_col0, #T_0502a_row74_col1, #T_0502a_row74_col2, #T_0502a_row74_col3, #T_0502a_row74_col4, #T_0502a_row74_col5, #T_0502a_row74_col6, #T_0502a_row74_col7, #T_0502a_row74_col8, #T_0502a_row75_col0, #T_0502a_row75_col1, #T_0502a_row75_col2, #T_0502a_row75_col3, #T_0502a_row75_col4, #T_0502a_row75_col5, #T_0502a_row75_col6, #T_0502a_row75_col7, #T_0502a_row75_col8, #T_0502a_row76_col0, #T_0502a_row76_col1, #T_0502a_row76_col2, #T_0502a_row76_col3, #T_0502a_row76_col4, #T_0502a_row76_col5, #T_0502a_row76_col6, #T_0502a_row76_col7, #T_0502a_row76_col8, #T_0502a_row77_col0, #T_0502a_row77_col1, #T_0502a_row77_col2, #T_0502a_row77_col3, #T_0502a_row77_col4, #T_0502a_row77_col5, #T_0502a_row77_col6, #T_0502a_row77_col7, #T_0502a_row77_col8, #T_0502a_row78_col0, #T_0502a_row78_col1, #T_0502a_row78_col2, #T_0502a_row78_col3, #T_0502a_row78_col4, #T_0502a_row78_col5, #T_0502a_row78_col6, #T_0502a_row78_col7, #T_0502a_row78_col8, #T_0502a_row79_col0, #T_0502a_row79_col1, #T_0502a_row79_col2, #T_0502a_row79_col3, 
#T_0502a_row79_col4, #T_0502a_row79_col5, #T_0502a_row79_col6, #T_0502a_row79_col7, #T_0502a_row79_col8, #T_0502a_row80_col0, #T_0502a_row80_col1, #T_0502a_row80_col2, #T_0502a_row80_col3, #T_0502a_row80_col4, #T_0502a_row80_col5, #T_0502a_row80_col6, #T_0502a_row80_col7, #T_0502a_row80_col8, #T_0502a_row81_col0, #T_0502a_row81_col1, #T_0502a_row81_col2, #T_0502a_row81_col3, #T_0502a_row81_col4, #T_0502a_row81_col5, #T_0502a_row81_col6, #T_0502a_row81_col7, #T_0502a_row81_col8, #T_0502a_row82_col0, #T_0502a_row82_col1, #T_0502a_row82_col2, #T_0502a_row82_col3, #T_0502a_row82_col4, #T_0502a_row82_col5, #T_0502a_row82_col6, #T_0502a_row82_col7, #T_0502a_row82_col8, #T_0502a_row83_col0, #T_0502a_row83_col1, #T_0502a_row83_col2, #T_0502a_row83_col3, #T_0502a_row83_col4, #T_0502a_row83_col5, #T_0502a_row83_col6, #T_0502a_row83_col7, #T_0502a_row83_col8, #T_0502a_row84_col0, #T_0502a_row84_col1, #T_0502a_row84_col2, #T_0502a_row84_col3, #T_0502a_row84_col4, #T_0502a_row84_col5, #T_0502a_row84_col6, #T_0502a_row84_col7, #T_0502a_row84_col8, #T_0502a_row85_col0, #T_0502a_row85_col1, #T_0502a_row85_col2, #T_0502a_row85_col3, #T_0502a_row85_col4, #T_0502a_row85_col5, #T_0502a_row85_col6, #T_0502a_row85_col7, #T_0502a_row85_col8, #T_0502a_row86_col0, #T_0502a_row86_col1, #T_0502a_row86_col2, #T_0502a_row86_col3, #T_0502a_row86_col4, #T_0502a_row86_col5, #T_0502a_row86_col6, #T_0502a_row86_col7, #T_0502a_row86_col8, #T_0502a_row87_col0, #T_0502a_row87_col1, #T_0502a_row87_col2, #T_0502a_row87_col3, #T_0502a_row87_col4, #T_0502a_row87_col5, #T_0502a_row87_col6, #T_0502a_row87_col7, #T_0502a_row87_col8, #T_0502a_row88_col0, #T_0502a_row88_col1, #T_0502a_row88_col2, #T_0502a_row88_col3, #T_0502a_row88_col4, #T_0502a_row88_col5, #T_0502a_row88_col6, #T_0502a_row88_col7, #T_0502a_row88_col8, #T_0502a_row89_col0, #T_0502a_row89_col1, #T_0502a_row89_col2, #T_0502a_row89_col3, #T_0502a_row89_col4, #T_0502a_row89_col5, #T_0502a_row89_col6, #T_0502a_row89_col7, #T_0502a_row89_col8, 
#T_0502a_row90_col0, #T_0502a_row90_col1, #T_0502a_row90_col2, #T_0502a_row90_col3, #T_0502a_row90_col4, #T_0502a_row90_col5, #T_0502a_row90_col6, #T_0502a_row90_col7, #T_0502a_row90_col8, #T_0502a_row91_col0, #T_0502a_row91_col1, #T_0502a_row91_col2, #T_0502a_row91_col3, #T_0502a_row91_col4, #T_0502a_row91_col5, #T_0502a_row91_col6, #T_0502a_row91_col7, #T_0502a_row91_col8, #T_0502a_row92_col0, #T_0502a_row92_col1, #T_0502a_row92_col2, #T_0502a_row92_col3, #T_0502a_row92_col4, #T_0502a_row92_col5, #T_0502a_row92_col6, #T_0502a_row92_col7, #T_0502a_row92_col8, #T_0502a_row93_col0, #T_0502a_row93_col1, #T_0502a_row93_col2, #T_0502a_row93_col3, #T_0502a_row93_col4, #T_0502a_row93_col5, #T_0502a_row93_col6, #T_0502a_row93_col7, #T_0502a_row93_col8, #T_0502a_row94_col0, #T_0502a_row94_col1, #T_0502a_row94_col2, #T_0502a_row94_col3, #T_0502a_row94_col4, #T_0502a_row94_col5, #T_0502a_row94_col6, #T_0502a_row94_col7, #T_0502a_row94_col8, #T_0502a_row95_col0, #T_0502a_row95_col1, #T_0502a_row95_col2, #T_0502a_row95_col3, #T_0502a_row95_col4, #T_0502a_row95_col5, #T_0502a_row95_col6, #T_0502a_row95_col7, #T_0502a_row95_col8, #T_0502a_row96_col0, #T_0502a_row96_col1, #T_0502a_row96_col2, #T_0502a_row96_col3, #T_0502a_row96_col4, #T_0502a_row96_col5, #T_0502a_row96_col6, #T_0502a_row96_col7, #T_0502a_row96_col8, #T_0502a_row97_col0, #T_0502a_row97_col1, #T_0502a_row97_col2, #T_0502a_row97_col3, #T_0502a_row97_col4, #T_0502a_row97_col5, #T_0502a_row97_col6, #T_0502a_row97_col7, #T_0502a_row97_col8, #T_0502a_row98_col0, #T_0502a_row98_col1, #T_0502a_row98_col2, #T_0502a_row98_col3, #T_0502a_row98_col4, #T_0502a_row98_col5, #T_0502a_row98_col6, #T_0502a_row98_col7, #T_0502a_row98_col8, #T_0502a_row99_col0, #T_0502a_row99_col1, #T_0502a_row99_col2, #T_0502a_row99_col3, #T_0502a_row99_col4, #T_0502a_row99_col5, #T_0502a_row99_col6, #T_0502a_row99_col7, #T_0502a_row99_col8, #T_0502a_row100_col0, #T_0502a_row100_col1, #T_0502a_row100_col2, #T_0502a_row100_col3, #T_0502a_row100_col4, 
#T_0502a_row100_col5, #T_0502a_row100_col6, #T_0502a_row100_col7, #T_0502a_row100_col8, #T_0502a_row101_col0, #T_0502a_row101_col1, #T_0502a_row101_col2, #T_0502a_row101_col3, #T_0502a_row101_col4, #T_0502a_row101_col5, #T_0502a_row101_col6, #T_0502a_row101_col7, #T_0502a_row101_col8, #T_0502a_row102_col0, #T_0502a_row102_col1, #T_0502a_row102_col2, #T_0502a_row102_col3, #T_0502a_row102_col4, #T_0502a_row102_col5, #T_0502a_row102_col6, #T_0502a_row102_col7, #T_0502a_row102_col8, #T_0502a_row103_col0, #T_0502a_row103_col1, #T_0502a_row103_col2, #T_0502a_row103_col3, #T_0502a_row103_col4, #T_0502a_row103_col5, #T_0502a_row103_col6, #T_0502a_row103_col7, #T_0502a_row103_col8, #T_0502a_row104_col0, #T_0502a_row104_col1, #T_0502a_row104_col2, #T_0502a_row104_col3, #T_0502a_row104_col4, #T_0502a_row104_col5, #T_0502a_row104_col6, #T_0502a_row104_col7, #T_0502a_row104_col8, #T_0502a_row105_col0, #T_0502a_row105_col1, #T_0502a_row105_col2, #T_0502a_row105_col3, #T_0502a_row105_col4, #T_0502a_row105_col5, #T_0502a_row105_col6, #T_0502a_row105_col7, #T_0502a_row105_col8, #T_0502a_row106_col0, #T_0502a_row106_col1, #T_0502a_row106_col2, #T_0502a_row106_col3, #T_0502a_row106_col4, #T_0502a_row106_col5, #T_0502a_row106_col6, #T_0502a_row106_col7, #T_0502a_row106_col8, #T_0502a_row107_col0, #T_0502a_row107_col1, #T_0502a_row107_col2, #T_0502a_row107_col3, #T_0502a_row107_col4, #T_0502a_row107_col5, #T_0502a_row107_col6, #T_0502a_row107_col7, #T_0502a_row107_col8, #T_0502a_row108_col0, #T_0502a_row108_col1, #T_0502a_row108_col2, #T_0502a_row108_col3, #T_0502a_row108_col4, #T_0502a_row108_col5, #T_0502a_row108_col6, #T_0502a_row108_col7, #T_0502a_row108_col8, #T_0502a_row109_col0, #T_0502a_row109_col1, #T_0502a_row109_col2, #T_0502a_row109_col3, #T_0502a_row109_col4, #T_0502a_row109_col5, #T_0502a_row109_col6, #T_0502a_row109_col7, #T_0502a_row109_col8, #T_0502a_row110_col0, #T_0502a_row110_col1, #T_0502a_row110_col2, #T_0502a_row110_col3, #T_0502a_row110_col4, 
#T_0502a_row110_col5, #T_0502a_row110_col6, #T_0502a_row110_col7, #T_0502a_row110_col8, #T_0502a_row111_col0, #T_0502a_row111_col1, #T_0502a_row111_col2, #T_0502a_row111_col3, #T_0502a_row111_col4, #T_0502a_row111_col5, #T_0502a_row111_col6, #T_0502a_row111_col7, #T_0502a_row111_col8, #T_0502a_row112_col0, #T_0502a_row112_col1, #T_0502a_row112_col2, #T_0502a_row112_col3, #T_0502a_row112_col4, #T_0502a_row112_col5, #T_0502a_row112_col6, #T_0502a_row112_col7, #T_0502a_row112_col8, #T_0502a_row113_col0, #T_0502a_row113_col1, #T_0502a_row113_col2, #T_0502a_row113_col3, #T_0502a_row113_col4, #T_0502a_row113_col5, #T_0502a_row113_col6, #T_0502a_row113_col7, #T_0502a_row113_col8, #T_0502a_row114_col0, #T_0502a_row114_col1, #T_0502a_row114_col2, #T_0502a_row114_col3, #T_0502a_row114_col4, #T_0502a_row114_col5, #T_0502a_row114_col6, #T_0502a_row114_col7, #T_0502a_row114_col8, #T_0502a_row115_col0, #T_0502a_row115_col1, #T_0502a_row115_col2, #T_0502a_row115_col3, #T_0502a_row115_col4, #T_0502a_row115_col5, #T_0502a_row115_col6, #T_0502a_row115_col7, #T_0502a_row115_col8, #T_0502a_row116_col0, #T_0502a_row116_col1, #T_0502a_row116_col2, #T_0502a_row116_col3, #T_0502a_row116_col4, #T_0502a_row116_col5, #T_0502a_row116_col6, #T_0502a_row116_col7, #T_0502a_row116_col8, #T_0502a_row117_col0, #T_0502a_row117_col1, #T_0502a_row117_col2, #T_0502a_row117_col3, #T_0502a_row117_col4, #T_0502a_row117_col5, #T_0502a_row117_col6, #T_0502a_row117_col7, #T_0502a_row117_col8, #T_0502a_row118_col0, #T_0502a_row118_col1, #T_0502a_row118_col2, #T_0502a_row118_col3, #T_0502a_row118_col4, #T_0502a_row118_col5, #T_0502a_row118_col6, #T_0502a_row118_col7, #T_0502a_row118_col8, #T_0502a_row119_col0, #T_0502a_row119_col1, #T_0502a_row119_col2, #T_0502a_row119_col3, #T_0502a_row119_col4, #T_0502a_row119_col5, #T_0502a_row119_col6, #T_0502a_row119_col7, #T_0502a_row119_col8, #T_0502a_row120_col0, #T_0502a_row120_col1, #T_0502a_row120_col2, #T_0502a_row120_col3, #T_0502a_row120_col4, 
#T_0502a_row120_col5, #T_0502a_row120_col6, #T_0502a_row120_col7, #T_0502a_row120_col8, #T_0502a_row121_col0, #T_0502a_row121_col1, #T_0502a_row121_col2, #T_0502a_row121_col3, #T_0502a_row121_col4, #T_0502a_row121_col5, #T_0502a_row121_col6, #T_0502a_row121_col7, #T_0502a_row121_col8, #T_0502a_row122_col0, #T_0502a_row122_col1, #T_0502a_row122_col2, #T_0502a_row122_col3, #T_0502a_row122_col4, #T_0502a_row122_col5, #T_0502a_row122_col6, #T_0502a_row122_col7, #T_0502a_row122_col8, #T_0502a_row123_col0, #T_0502a_row123_col1, #T_0502a_row123_col2, #T_0502a_row123_col3, #T_0502a_row123_col4, #T_0502a_row123_col5, #T_0502a_row123_col6, #T_0502a_row123_col7, #T_0502a_row123_col8, #T_0502a_row124_col0, #T_0502a_row124_col1, #T_0502a_row124_col2, #T_0502a_row124_col3, #T_0502a_row124_col4, #T_0502a_row124_col5, #T_0502a_row124_col6, #T_0502a_row124_col7, #T_0502a_row124_col8, #T_0502a_row125_col0, #T_0502a_row125_col1, #T_0502a_row125_col2, #T_0502a_row125_col3, #T_0502a_row125_col4, #T_0502a_row125_col5, #T_0502a_row125_col6, #T_0502a_row125_col7, #T_0502a_row125_col8, #T_0502a_row126_col0, #T_0502a_row126_col1, #T_0502a_row126_col2, #T_0502a_row126_col3, #T_0502a_row126_col4, #T_0502a_row126_col5, #T_0502a_row126_col6, #T_0502a_row126_col7, #T_0502a_row126_col8, #T_0502a_row127_col0, #T_0502a_row127_col1, #T_0502a_row127_col2, #T_0502a_row127_col3, #T_0502a_row127_col4, #T_0502a_row127_col5, #T_0502a_row127_col6, #T_0502a_row127_col7, #T_0502a_row127_col8, #T_0502a_row128_col0, #T_0502a_row128_col1, #T_0502a_row128_col2, #T_0502a_row128_col3, #T_0502a_row128_col4, #T_0502a_row128_col5, #T_0502a_row128_col6, #T_0502a_row128_col7, #T_0502a_row128_col8, #T_0502a_row129_col0, #T_0502a_row129_col1, #T_0502a_row129_col2, #T_0502a_row129_col3, #T_0502a_row129_col4, #T_0502a_row129_col5, #T_0502a_row129_col6, #T_0502a_row129_col7, #T_0502a_row129_col8, #T_0502a_row130_col0, #T_0502a_row130_col1, #T_0502a_row130_col2, #T_0502a_row130_col3, #T_0502a_row130_col4, 
#T_0502a_row130_col5, #T_0502a_row130_col6, #T_0502a_row130_col7, #T_0502a_row130_col8, #T_0502a_row131_col0, #T_0502a_row131_col1, #T_0502a_row131_col2, #T_0502a_row131_col3, #T_0502a_row131_col4, #T_0502a_row131_col5, #T_0502a_row131_col6, #T_0502a_row131_col7, #T_0502a_row131_col8, #T_0502a_row132_col0, #T_0502a_row132_col1, #T_0502a_row132_col2, #T_0502a_row132_col3, #T_0502a_row132_col4, #T_0502a_row132_col5, #T_0502a_row132_col6, #T_0502a_row132_col7, #T_0502a_row132_col8, #T_0502a_row133_col0, #T_0502a_row133_col1, #T_0502a_row133_col2, #T_0502a_row133_col3, #T_0502a_row133_col4, #T_0502a_row133_col5, #T_0502a_row133_col6, #T_0502a_row133_col7, #T_0502a_row133_col8, #T_0502a_row134_col0, #T_0502a_row134_col1, #T_0502a_row134_col2, #T_0502a_row134_col3, #T_0502a_row134_col4, #T_0502a_row134_col5, #T_0502a_row134_col6, #T_0502a_row134_col7, #T_0502a_row134_col8, #T_0502a_row135_col0, #T_0502a_row135_col1, #T_0502a_row135_col2, #T_0502a_row135_col3, #T_0502a_row135_col4, #T_0502a_row135_col5, #T_0502a_row135_col6, #T_0502a_row135_col7, #T_0502a_row135_col8, #T_0502a_row136_col0, #T_0502a_row136_col1, #T_0502a_row136_col2, #T_0502a_row136_col3, #T_0502a_row136_col4, #T_0502a_row136_col5, #T_0502a_row136_col6, #T_0502a_row136_col7, #T_0502a_row136_col8, #T_0502a_row137_col0, #T_0502a_row137_col1, #T_0502a_row137_col2, #T_0502a_row137_col3, #T_0502a_row137_col4, #T_0502a_row137_col5, #T_0502a_row137_col6, #T_0502a_row137_col7, #T_0502a_row137_col8, #T_0502a_row138_col0, #T_0502a_row138_col1, #T_0502a_row138_col2, #T_0502a_row138_col3, #T_0502a_row138_col4, #T_0502a_row138_col5, #T_0502a_row138_col6, #T_0502a_row138_col7, #T_0502a_row138_col8, #T_0502a_row139_col0, #T_0502a_row139_col1, #T_0502a_row139_col2, #T_0502a_row139_col3, #T_0502a_row139_col4, #T_0502a_row139_col5, #T_0502a_row139_col6, #T_0502a_row139_col7, #T_0502a_row139_col8, #T_0502a_row140_col0, #T_0502a_row140_col1, #T_0502a_row140_col2, #T_0502a_row140_col3, #T_0502a_row140_col4, 
#T_0502a_row140_col5, #T_0502a_row140_col6, #T_0502a_row140_col7, #T_0502a_row140_col8, #T_0502a_row141_col0, #T_0502a_row141_col1, #T_0502a_row141_col2, #T_0502a_row141_col3, #T_0502a_row141_col4, #T_0502a_row141_col5, #T_0502a_row141_col6, #T_0502a_row141_col7, #T_0502a_row141_col8, #T_0502a_row142_col0, #T_0502a_row142_col1, #T_0502a_row142_col2, #T_0502a_row142_col3, #T_0502a_row142_col4, #T_0502a_row142_col5, #T_0502a_row142_col6, #T_0502a_row142_col7, #T_0502a_row142_col8, #T_0502a_row143_col0, #T_0502a_row143_col1, #T_0502a_row143_col2, #T_0502a_row143_col3, #T_0502a_row143_col4, #T_0502a_row143_col5, #T_0502a_row143_col6, #T_0502a_row143_col7, #T_0502a_row143_col8, #T_0502a_row144_col0, #T_0502a_row144_col1, #T_0502a_row144_col2, #T_0502a_row144_col3, #T_0502a_row144_col4, #T_0502a_row144_col5, #T_0502a_row144_col6, #T_0502a_row144_col7, #T_0502a_row144_col8, #T_0502a_row145_col0, #T_0502a_row145_col1, #T_0502a_row145_col2, #T_0502a_row145_col3, #T_0502a_row145_col4, #T_0502a_row145_col5, #T_0502a_row145_col6, #T_0502a_row145_col7, #T_0502a_row145_col8, #T_0502a_row146_col0, #T_0502a_row146_col1, #T_0502a_row146_col2, #T_0502a_row146_col3, #T_0502a_row146_col4, #T_0502a_row146_col5, #T_0502a_row146_col6, #T_0502a_row146_col7, #T_0502a_row146_col8, #T_0502a_row147_col0, #T_0502a_row147_col1, #T_0502a_row147_col2, #T_0502a_row147_col3, #T_0502a_row147_col4, #T_0502a_row147_col5, #T_0502a_row147_col6, #T_0502a_row147_col7, #T_0502a_row147_col8, #T_0502a_row148_col0, #T_0502a_row148_col1, #T_0502a_row148_col2, #T_0502a_row148_col3, #T_0502a_row148_col4, #T_0502a_row148_col5, #T_0502a_row148_col6, #T_0502a_row148_col7, #T_0502a_row148_col8, #T_0502a_row149_col0, #T_0502a_row149_col1, #T_0502a_row149_col2, #T_0502a_row149_col3, #T_0502a_row149_col4, #T_0502a_row149_col5, #T_0502a_row149_col6, #T_0502a_row149_col7, #T_0502a_row149_col8, #T_0502a_row150_col0, #T_0502a_row150_col1, #T_0502a_row150_col2, #T_0502a_row150_col3, #T_0502a_row150_col4, 
#T_0502a_row150_col5, #T_0502a_row150_col6, #T_0502a_row150_col7, #T_0502a_row150_col8, #T_0502a_row151_col0, #T_0502a_row151_col1, #T_0502a_row151_col2, #T_0502a_row151_col3, #T_0502a_row151_col4, #T_0502a_row151_col5, #T_0502a_row151_col6, #T_0502a_row151_col7, #T_0502a_row151_col8, #T_0502a_row152_col0, #T_0502a_row152_col1, #T_0502a_row152_col2, #T_0502a_row152_col3, #T_0502a_row152_col4, #T_0502a_row152_col5, #T_0502a_row152_col6, #T_0502a_row152_col7, #T_0502a_row152_col8, #T_0502a_row153_col0, #T_0502a_row153_col1, #T_0502a_row153_col2, #T_0502a_row153_col3, #T_0502a_row153_col4, #T_0502a_row153_col5, #T_0502a_row153_col6, #T_0502a_row153_col7, #T_0502a_row153_col8, #T_0502a_row154_col0, #T_0502a_row154_col1, #T_0502a_row154_col2, #T_0502a_row154_col3, #T_0502a_row154_col4, #T_0502a_row154_col5, #T_0502a_row154_col6, #T_0502a_row154_col7, #T_0502a_row154_col8, #T_0502a_row155_col0, #T_0502a_row155_col1, #T_0502a_row155_col2, #T_0502a_row155_col3, #T_0502a_row155_col4, #T_0502a_row155_col5, #T_0502a_row155_col6, #T_0502a_row155_col7, #T_0502a_row155_col8, #T_0502a_row156_col0, #T_0502a_row156_col1, #T_0502a_row156_col2, #T_0502a_row156_col3, #T_0502a_row156_col4, #T_0502a_row156_col5, #T_0502a_row156_col6, #T_0502a_row156_col7, #T_0502a_row156_col8, #T_0502a_row157_col0, #T_0502a_row157_col1, #T_0502a_row157_col2, #T_0502a_row157_col3, #T_0502a_row157_col4, #T_0502a_row157_col5, #T_0502a_row157_col6, #T_0502a_row157_col7, #T_0502a_row157_col8, #T_0502a_row158_col0, #T_0502a_row158_col1, #T_0502a_row158_col2, #T_0502a_row158_col3, #T_0502a_row158_col4, #T_0502a_row158_col5, #T_0502a_row158_col6, #T_0502a_row158_col7, #T_0502a_row158_col8, #T_0502a_row159_col0, #T_0502a_row159_col1, #T_0502a_row159_col2, #T_0502a_row159_col3, #T_0502a_row159_col4, #T_0502a_row159_col5, #T_0502a_row159_col6, #T_0502a_row159_col7, #T_0502a_row159_col8, #T_0502a_row160_col0, #T_0502a_row160_col1, #T_0502a_row160_col2, #T_0502a_row160_col3, #T_0502a_row160_col4, 
#T_0502a_row160_col5, #T_0502a_row160_col6, #T_0502a_row160_col7, #T_0502a_row160_col8, #T_0502a_row161_col0, #T_0502a_row161_col1, #T_0502a_row161_col2, #T_0502a_row161_col3, #T_0502a_row161_col4, #T_0502a_row161_col5, #T_0502a_row161_col6, #T_0502a_row161_col7, #T_0502a_row161_col8, #T_0502a_row162_col0, #T_0502a_row162_col1, #T_0502a_row162_col2, #T_0502a_row162_col3, #T_0502a_row162_col4, #T_0502a_row162_col5, #T_0502a_row162_col6, #T_0502a_row162_col7, #T_0502a_row162_col8, #T_0502a_row163_col0, #T_0502a_row163_col1, #T_0502a_row163_col2, #T_0502a_row163_col3, #T_0502a_row163_col4, #T_0502a_row163_col5, #T_0502a_row163_col6, #T_0502a_row163_col7, #T_0502a_row163_col8, #T_0502a_row164_col0, #T_0502a_row164_col1, #T_0502a_row164_col2, #T_0502a_row164_col3, #T_0502a_row164_col4, #T_0502a_row164_col5, #T_0502a_row164_col6, #T_0502a_row164_col7, #T_0502a_row164_col8, #T_0502a_row165_col0, #T_0502a_row165_col1, #T_0502a_row165_col2, #T_0502a_row165_col3, #T_0502a_row165_col4, #T_0502a_row165_col5, #T_0502a_row165_col6, #T_0502a_row165_col7, #T_0502a_row165_col8, #T_0502a_row166_col0, #T_0502a_row166_col1, #T_0502a_row166_col2, #T_0502a_row166_col3, #T_0502a_row166_col4, #T_0502a_row166_col5, #T_0502a_row166_col6, #T_0502a_row166_col7, #T_0502a_row166_col8, #T_0502a_row167_col0, #T_0502a_row167_col1, #T_0502a_row167_col2, #T_0502a_row167_col3, #T_0502a_row167_col4, #T_0502a_row167_col5, #T_0502a_row167_col6, #T_0502a_row167_col7, #T_0502a_row167_col8, #T_0502a_row168_col0, #T_0502a_row168_col1, #T_0502a_row168_col2, #T_0502a_row168_col3, #T_0502a_row168_col4, #T_0502a_row168_col5, #T_0502a_row168_col6, #T_0502a_row168_col7, #T_0502a_row168_col8, #T_0502a_row169_col0, #T_0502a_row169_col1, #T_0502a_row169_col2, #T_0502a_row169_col3, #T_0502a_row169_col4, #T_0502a_row169_col5, #T_0502a_row169_col6, #T_0502a_row169_col7, #T_0502a_row169_col8, #T_0502a_row170_col0, #T_0502a_row170_col1, #T_0502a_row170_col2, #T_0502a_row170_col3, #T_0502a_row170_col4, 
#T_0502a_row170_col5, #T_0502a_row170_col6, #T_0502a_row170_col7, #T_0502a_row170_col8, #T_0502a_row171_col0, #T_0502a_row171_col1, #T_0502a_row171_col2, #T_0502a_row171_col3, #T_0502a_row171_col4, #T_0502a_row171_col5, #T_0502a_row171_col6, #T_0502a_row171_col7, #T_0502a_row171_col8, #T_0502a_row172_col0, #T_0502a_row172_col1, #T_0502a_row172_col2, #T_0502a_row172_col3, #T_0502a_row172_col4, #T_0502a_row172_col5, #T_0502a_row172_col6, #T_0502a_row172_col7, #T_0502a_row172_col8, #T_0502a_row173_col0, #T_0502a_row173_col1, #T_0502a_row173_col2, #T_0502a_row173_col3, #T_0502a_row173_col4, #T_0502a_row173_col5, #T_0502a_row173_col6, #T_0502a_row173_col7, #T_0502a_row173_col8, #T_0502a_row174_col0, #T_0502a_row174_col1, #T_0502a_row174_col2, #T_0502a_row174_col3, #T_0502a_row174_col4, #T_0502a_row174_col5, #T_0502a_row174_col6, #T_0502a_row174_col7, #T_0502a_row174_col8, #T_0502a_row175_col0, #T_0502a_row175_col1, #T_0502a_row175_col2, #T_0502a_row175_col3, #T_0502a_row175_col4, #T_0502a_row175_col5, #T_0502a_row175_col6, #T_0502a_row175_col7, #T_0502a_row175_col8, #T_0502a_row176_col0, #T_0502a_row176_col1, #T_0502a_row176_col2, #T_0502a_row176_col3, #T_0502a_row176_col4, #T_0502a_row176_col5, #T_0502a_row176_col6, #T_0502a_row176_col7, #T_0502a_row176_col8, #T_0502a_row177_col0, #T_0502a_row177_col1, #T_0502a_row177_col2, #T_0502a_row177_col3, #T_0502a_row177_col4, #T_0502a_row177_col5, #T_0502a_row177_col6, #T_0502a_row177_col7, #T_0502a_row177_col8, #T_0502a_row178_col0, #T_0502a_row178_col1, #T_0502a_row178_col2, #T_0502a_row178_col3, #T_0502a_row178_col4, #T_0502a_row178_col5, #T_0502a_row178_col6, #T_0502a_row178_col7, #T_0502a_row178_col8, #T_0502a_row179_col0, #T_0502a_row179_col1, #T_0502a_row179_col2, #T_0502a_row179_col3, #T_0502a_row179_col4, #T_0502a_row179_col5, #T_0502a_row179_col6, #T_0502a_row179_col7, #T_0502a_row179_col8, #T_0502a_row180_col0, #T_0502a_row180_col1, #T_0502a_row180_col2, #T_0502a_row180_col3, #T_0502a_row180_col4, 
#T_0502a_row180_col5, #T_0502a_row180_col6, #T_0502a_row180_col7, #T_0502a_row180_col8, #T_0502a_row181_col0, #T_0502a_row181_col1, #T_0502a_row181_col2, #T_0502a_row181_col3, #T_0502a_row181_col4, #T_0502a_row181_col5, #T_0502a_row181_col6, #T_0502a_row181_col7, #T_0502a_row181_col8, #T_0502a_row182_col0, #T_0502a_row182_col1, #T_0502a_row182_col2, #T_0502a_row182_col3, #T_0502a_row182_col4, #T_0502a_row182_col5, #T_0502a_row182_col6, #T_0502a_row182_col7, #T_0502a_row182_col8, #T_0502a_row183_col0, #T_0502a_row183_col1, #T_0502a_row183_col2, #T_0502a_row183_col3, #T_0502a_row183_col4, #T_0502a_row183_col5, #T_0502a_row183_col6, #T_0502a_row183_col7, #T_0502a_row183_col8, #T_0502a_row184_col0, #T_0502a_row184_col1, #T_0502a_row184_col2, #T_0502a_row184_col3, #T_0502a_row184_col4, #T_0502a_row184_col5, #T_0502a_row184_col6, #T_0502a_row184_col7, #T_0502a_row184_col8, #T_0502a_row185_col0, #T_0502a_row185_col1, #T_0502a_row185_col2, #T_0502a_row185_col3, #T_0502a_row185_col4, #T_0502a_row185_col5, #T_0502a_row185_col6, #T_0502a_row185_col7, #T_0502a_row185_col8, #T_0502a_row186_col0, #T_0502a_row186_col1, #T_0502a_row186_col2, #T_0502a_row186_col3, #T_0502a_row186_col4, #T_0502a_row186_col5, #T_0502a_row186_col6, #T_0502a_row186_col7, #T_0502a_row186_col8, #T_0502a_row187_col0, #T_0502a_row187_col1, #T_0502a_row187_col2, #T_0502a_row187_col3, #T_0502a_row187_col4, #T_0502a_row187_col5, #T_0502a_row187_col6, #T_0502a_row187_col7, #T_0502a_row187_col8, #T_0502a_row188_col0, #T_0502a_row188_col1, #T_0502a_row188_col2, #T_0502a_row188_col3, #T_0502a_row188_col4, #T_0502a_row188_col5, #T_0502a_row188_col6, #T_0502a_row188_col7, #T_0502a_row188_col8, #T_0502a_row189_col0, #T_0502a_row189_col1, #T_0502a_row189_col2, #T_0502a_row189_col3, #T_0502a_row189_col4, #T_0502a_row189_col5, #T_0502a_row189_col6, #T_0502a_row189_col7, #T_0502a_row189_col8, #T_0502a_row190_col0, #T_0502a_row190_col1, #T_0502a_row190_col2, #T_0502a_row190_col3, #T_0502a_row190_col4, 
#T_0502a_row190_col5, #T_0502a_row190_col6, #T_0502a_row190_col7, #T_0502a_row190_col8, #T_0502a_row191_col0, #T_0502a_row191_col1, #T_0502a_row191_col2, #T_0502a_row191_col3, #T_0502a_row191_col4, #T_0502a_row191_col5, #T_0502a_row191_col6, #T_0502a_row191_col7, #T_0502a_row191_col8, #T_0502a_row192_col0, #T_0502a_row192_col1, #T_0502a_row192_col2, #T_0502a_row192_col3, #T_0502a_row192_col4, #T_0502a_row192_col5, #T_0502a_row192_col6, #T_0502a_row192_col7, #T_0502a_row192_col8, #T_0502a_row193_col0, #T_0502a_row193_col1, #T_0502a_row193_col2, #T_0502a_row193_col3, #T_0502a_row193_col4, #T_0502a_row193_col5, #T_0502a_row193_col6, #T_0502a_row193_col7, #T_0502a_row193_col8, #T_0502a_row194_col0, #T_0502a_row194_col1, #T_0502a_row194_col2, #T_0502a_row194_col3, #T_0502a_row194_col4, #T_0502a_row194_col5, #T_0502a_row194_col6, #T_0502a_row194_col7, #T_0502a_row194_col8 {\n", + " text-align: left;\n", + "}\n", + "</style>\n", + "<table id=\"T_0502a\">\n", + " <thead>\n", + " <tr>\n", + " <th id=\"T_0502a_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", + " <th id=\"T_0502a_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", + " <th id=\"T_0502a_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", + " <th id=\"T_0502a_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", + " <th id=\"T_0502a_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", + " <th id=\"T_0502a_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", + " <th id=\"T_0502a_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", + " <th id=\"T_0502a_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", + " <th id=\"T_0502a_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <td id=\"T_0502a_row0_col0\" class=\"data row0 col0\" >validmind.data_validation.ACFandPACFPlot</td>\n", + " <td id=\"T_0502a_row0_col1\" class=\"data 
row0 col1\" >AC Fand PACF Plot</td>\n", + " <td id=\"T_0502a_row0_col2\" class=\"data row0 col2\" >Analyzes time series data using Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots to...</td>\n", + " <td id=\"T_0502a_row0_col3\" class=\"data row0 col3\" >True</td>\n", + " <td id=\"T_0502a_row0_col4\" class=\"data row0 col4\" >False</td>\n", + " <td id=\"T_0502a_row0_col5\" class=\"data row0 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row0_col6\" class=\"data row0 col6\" >{}</td>\n", + " <td id=\"T_0502a_row0_col7\" class=\"data row0 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'visualization']</td>\n", + " <td id=\"T_0502a_row0_col8\" class=\"data row0 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row1_col0\" class=\"data row1 col0\" >validmind.data_validation.ADF</td>\n", + " <td id=\"T_0502a_row1_col1\" class=\"data row1 col1\" >ADF</td>\n", + " <td id=\"T_0502a_row1_col2\" class=\"data row1 col2\" >Assesses the stationarity of a time series dataset using the Augmented Dickey-Fuller (ADF) test....</td>\n", + " <td id=\"T_0502a_row1_col3\" class=\"data row1 col3\" >False</td>\n", + " <td id=\"T_0502a_row1_col4\" class=\"data row1 col4\" >True</td>\n", + " <td id=\"T_0502a_row1_col5\" class=\"data row1 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row1_col6\" class=\"data row1 col6\" >{}</td>\n", + " <td id=\"T_0502a_row1_col7\" class=\"data row1 col7\" >['time_series_data', 'statsmodels', 'forecasting', 'statistical_test', 'stationarity']</td>\n", + " <td id=\"T_0502a_row1_col8\" class=\"data row1 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row2_col0\" class=\"data row2 col0\" >validmind.data_validation.AutoAR</td>\n", + " <td id=\"T_0502a_row2_col1\" class=\"data row2 col1\" >Auto AR</td>\n", + " <td id=\"T_0502a_row2_col2\" class=\"data row2 col2\" >Automatically identifies the optimal Autoregressive (AR) order for a time series using 
BIC and AIC criteria....</td>\n", + " <td id=\"T_0502a_row2_col3\" class=\"data row2 col3\" >False</td>\n", + " <td id=\"T_0502a_row2_col4\" class=\"data row2 col4\" >True</td>\n", + " <td id=\"T_0502a_row2_col5\" class=\"data row2 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row2_col6\" class=\"data row2 col6\" >{'max_ar_order': {'type': 'int', 'default': 3}}</td>\n", + " <td id=\"T_0502a_row2_col7\" class=\"data row2 col7\" >['time_series_data', 'statsmodels', 'forecasting', 'statistical_test']</td>\n", + " <td id=\"T_0502a_row2_col8\" class=\"data row2 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row3_col0\" class=\"data row3 col0\" >validmind.data_validation.AutoMA</td>\n", + " <td id=\"T_0502a_row3_col1\" class=\"data row3 col1\" >Auto MA</td>\n", + " <td id=\"T_0502a_row3_col2\" class=\"data row3 col2\" >Automatically selects the optimal Moving Average (MA) order for each variable in a time series dataset based on...</td>\n", + " <td id=\"T_0502a_row3_col3\" class=\"data row3 col3\" >False</td>\n", + " <td id=\"T_0502a_row3_col4\" class=\"data row3 col4\" >True</td>\n", + " <td id=\"T_0502a_row3_col5\" class=\"data row3 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row3_col6\" class=\"data row3 col6\" >{'max_ma_order': {'type': 'int', 'default': 3}}</td>\n", + " <td id=\"T_0502a_row3_col7\" class=\"data row3 col7\" >['time_series_data', 'statsmodels', 'forecasting', 'statistical_test']</td>\n", + " <td id=\"T_0502a_row3_col8\" class=\"data row3 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row4_col0\" class=\"data row4 col0\" >validmind.data_validation.AutoStationarity</td>\n", + " <td id=\"T_0502a_row4_col1\" class=\"data row4 col1\" >Auto Stationarity</td>\n", + " <td id=\"T_0502a_row4_col2\" class=\"data row4 col2\" >Automates Augmented Dickey-Fuller test to assess stationarity across multiple time series in a DataFrame....</td>\n", + " <td id=\"T_0502a_row4_col3\" class=\"data 
row4 col3\" >False</td>\n", + " <td id=\"T_0502a_row4_col4\" class=\"data row4 col4\" >True</td>\n", + " <td id=\"T_0502a_row4_col5\" class=\"data row4 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row4_col6\" class=\"data row4 col6\" >{'max_order': {'type': 'int', 'default': 5}, 'threshold': {'type': 'float', 'default': 0.05}}</td>\n", + " <td id=\"T_0502a_row4_col7\" class=\"data row4 col7\" >['time_series_data', 'statsmodels', 'forecasting', 'statistical_test']</td>\n", + " <td id=\"T_0502a_row4_col8\" class=\"data row4 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row5_col0\" class=\"data row5 col0\" >validmind.data_validation.BivariateScatterPlots</td>\n", + " <td id=\"T_0502a_row5_col1\" class=\"data row5 col1\" >Bivariate Scatter Plots</td>\n", + " <td id=\"T_0502a_row5_col2\" class=\"data row5 col2\" >Generates bivariate scatterplots to visually inspect relationships between pairs of numerical predictor variables...</td>\n", + " <td id=\"T_0502a_row5_col3\" class=\"data row5 col3\" >True</td>\n", + " <td id=\"T_0502a_row5_col4\" class=\"data row5 col4\" >False</td>\n", + " <td id=\"T_0502a_row5_col5\" class=\"data row5 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row5_col6\" class=\"data row5 col6\" >{}</td>\n", + " <td id=\"T_0502a_row5_col7\" class=\"data row5 col7\" >['tabular_data', 'numerical_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row5_col8\" class=\"data row5 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row6_col0\" class=\"data row6 col0\" >validmind.data_validation.BoxPierce</td>\n", + " <td id=\"T_0502a_row6_col1\" class=\"data row6 col1\" >Box Pierce</td>\n", + " <td id=\"T_0502a_row6_col2\" class=\"data row6 col2\" >Detects autocorrelation in time-series data through the Box-Pierce test to validate model performance....</td>\n", + " <td id=\"T_0502a_row6_col3\" class=\"data row6 col3\" >False</td>\n", + " <td id=\"T_0502a_row6_col4\" class=\"data row6 
col4\" >True</td>\n", + " <td id=\"T_0502a_row6_col5\" class=\"data row6 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row6_col6\" class=\"data row6 col6\" >{}</td>\n", + " <td id=\"T_0502a_row6_col7\" class=\"data row6 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row6_col8\" class=\"data row6 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row7_col0\" class=\"data row7 col0\" >validmind.data_validation.ChiSquaredFeaturesTable</td>\n", + " <td id=\"T_0502a_row7_col1\" class=\"data row7 col1\" >Chi Squared Features Table</td>\n", + " <td id=\"T_0502a_row7_col2\" class=\"data row7 col2\" >Assesses the statistical association between categorical features and a target variable using the Chi-Squared test....</td>\n", + " <td id=\"T_0502a_row7_col3\" class=\"data row7 col3\" >False</td>\n", + " <td id=\"T_0502a_row7_col4\" class=\"data row7 col4\" >True</td>\n", + " <td id=\"T_0502a_row7_col5\" class=\"data row7 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row7_col6\" class=\"data row7 col6\" >{'p_threshold': {'type': '_empty', 'default': 0.05}}</td>\n", + " <td id=\"T_0502a_row7_col7\" class=\"data row7 col7\" >['tabular_data', 'categorical_data', 'statistical_test']</td>\n", + " <td id=\"T_0502a_row7_col8\" class=\"data row7 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row8_col0\" class=\"data row8 col0\" >validmind.data_validation.ClassImbalance</td>\n", + " <td id=\"T_0502a_row8_col1\" class=\"data row8 col1\" >Class Imbalance</td>\n", + " <td id=\"T_0502a_row8_col2\" class=\"data row8 col2\" >Evaluates and quantifies class distribution imbalance in a dataset used by a machine learning model....</td>\n", + " <td id=\"T_0502a_row8_col3\" class=\"data row8 col3\" >True</td>\n", + " <td id=\"T_0502a_row8_col4\" class=\"data row8 col4\" >True</td>\n", + " <td id=\"T_0502a_row8_col5\" class=\"data row8 col5\" >['dataset']</td>\n", 
+ " <td id=\"T_0502a_row8_col6\" class=\"data row8 col6\" >{'min_percent_threshold': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_0502a_row8_col7\" class=\"data row8 col7\" >['tabular_data', 'binary_classification', 'multiclass_classification', 'data_quality']</td>\n", + " <td id=\"T_0502a_row8_col8\" class=\"data row8 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row9_col0\" class=\"data row9 col0\" >validmind.data_validation.DatasetDescription</td>\n", + " <td id=\"T_0502a_row9_col1\" class=\"data row9 col1\" >Dataset Description</td>\n", + " <td id=\"T_0502a_row9_col2\" class=\"data row9 col2\" >Provides comprehensive analysis and statistical summaries of each column in a machine learning model's dataset....</td>\n", + " <td id=\"T_0502a_row9_col3\" class=\"data row9 col3\" >False</td>\n", + " <td id=\"T_0502a_row9_col4\" class=\"data row9 col4\" >True</td>\n", + " <td id=\"T_0502a_row9_col5\" class=\"data row9 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row9_col6\" class=\"data row9 col6\" >{}</td>\n", + " <td id=\"T_0502a_row9_col7\" class=\"data row9 col7\" >['tabular_data', 'time_series_data', 'text_data']</td>\n", + " <td id=\"T_0502a_row9_col8\" class=\"data row9 col8\" >['classification', 'regression', 'text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row10_col0\" class=\"data row10 col0\" >validmind.data_validation.DatasetSplit</td>\n", + " <td id=\"T_0502a_row10_col1\" class=\"data row10 col1\" >Dataset Split</td>\n", + " <td id=\"T_0502a_row10_col2\" class=\"data row10 col2\" >Evaluates and visualizes the distribution proportions among training, testing, and validation datasets of an ML...</td>\n", + " <td id=\"T_0502a_row10_col3\" class=\"data row10 col3\" >False</td>\n", + " <td id=\"T_0502a_row10_col4\" class=\"data row10 col4\" >True</td>\n", + " <td id=\"T_0502a_row10_col5\" class=\"data row10 col5\" >['datasets']</td>\n", + " <td 
id=\"T_0502a_row10_col6\" class=\"data row10 col6\" >{}</td>\n", + " <td id=\"T_0502a_row10_col7\" class=\"data row10 col7\" >['tabular_data', 'time_series_data', 'text_data']</td>\n", + " <td id=\"T_0502a_row10_col8\" class=\"data row10 col8\" >['classification', 'regression', 'text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row11_col0\" class=\"data row11 col0\" >validmind.data_validation.DescriptiveStatistics</td>\n", + " <td id=\"T_0502a_row11_col1\" class=\"data row11 col1\" >Descriptive Statistics</td>\n", + " <td id=\"T_0502a_row11_col2\" class=\"data row11 col2\" >Performs a detailed descriptive statistical analysis of both numerical and categorical data within a model's...</td>\n", + " <td id=\"T_0502a_row11_col3\" class=\"data row11 col3\" >False</td>\n", + " <td id=\"T_0502a_row11_col4\" class=\"data row11 col4\" >True</td>\n", + " <td id=\"T_0502a_row11_col5\" class=\"data row11 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row11_col6\" class=\"data row11 col6\" >{}</td>\n", + " <td id=\"T_0502a_row11_col7\" class=\"data row11 col7\" >['tabular_data', 'time_series_data', 'data_quality']</td>\n", + " <td id=\"T_0502a_row11_col8\" class=\"data row11 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row12_col0\" class=\"data row12 col0\" >validmind.data_validation.DickeyFullerGLS</td>\n", + " <td id=\"T_0502a_row12_col1\" class=\"data row12 col1\" >Dickey Fuller GLS</td>\n", + " <td id=\"T_0502a_row12_col2\" class=\"data row12 col2\" >Assesses stationarity in time series data using the Dickey-Fuller GLS test to determine the order of integration....</td>\n", + " <td id=\"T_0502a_row12_col3\" class=\"data row12 col3\" >False</td>\n", + " <td id=\"T_0502a_row12_col4\" class=\"data row12 col4\" >True</td>\n", + " <td id=\"T_0502a_row12_col5\" class=\"data row12 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row12_col6\" class=\"data row12 col6\" 
>{}</td>\n", + " <td id=\"T_0502a_row12_col7\" class=\"data row12 col7\" >['time_series_data', 'forecasting', 'unit_root_test']</td>\n", + " <td id=\"T_0502a_row12_col8\" class=\"data row12 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row13_col0\" class=\"data row13 col0\" >validmind.data_validation.Duplicates</td>\n", + " <td id=\"T_0502a_row13_col1\" class=\"data row13 col1\" >Duplicates</td>\n", + " <td id=\"T_0502a_row13_col2\" class=\"data row13 col2\" >Tests dataset for duplicate entries, ensuring model reliability via data quality verification....</td>\n", + " <td id=\"T_0502a_row13_col3\" class=\"data row13 col3\" >False</td>\n", + " <td id=\"T_0502a_row13_col4\" class=\"data row13 col4\" >True</td>\n", + " <td id=\"T_0502a_row13_col5\" class=\"data row13 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row13_col6\" class=\"data row13 col6\" >{'min_threshold': {'type': '_empty', 'default': 1}}</td>\n", + " <td id=\"T_0502a_row13_col7\" class=\"data row13 col7\" >['tabular_data', 'data_quality', 'text_data']</td>\n", + " <td id=\"T_0502a_row13_col8\" class=\"data row13 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row14_col0\" class=\"data row14 col0\" >validmind.data_validation.EngleGrangerCoint</td>\n", + " <td id=\"T_0502a_row14_col1\" class=\"data row14 col1\" >Engle Granger Coint</td>\n", + " <td id=\"T_0502a_row14_col2\" class=\"data row14 col2\" >Assesses the degree of co-movement between pairs of time series data using the Engle-Granger cointegration test....</td>\n", + " <td id=\"T_0502a_row14_col3\" class=\"data row14 col3\" >False</td>\n", + " <td id=\"T_0502a_row14_col4\" class=\"data row14 col4\" >True</td>\n", + " <td id=\"T_0502a_row14_col5\" class=\"data row14 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row14_col6\" class=\"data row14 col6\" >{'threshold': {'type': 'float', 'default': 0.05}}</td>\n", + " <td id=\"T_0502a_row14_col7\" class=\"data 
row14 col7\" >['time_series_data', 'statistical_test', 'forecasting']</td>\n", + " <td id=\"T_0502a_row14_col8\" class=\"data row14 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row15_col0\" class=\"data row15 col0\" >validmind.data_validation.FeatureTargetCorrelationPlot</td>\n", + " <td id=\"T_0502a_row15_col1\" class=\"data row15 col1\" >Feature Target Correlation Plot</td>\n", + " <td id=\"T_0502a_row15_col2\" class=\"data row15 col2\" >Visualizes the correlation between input features and the model's target output in a color-coded horizontal bar...</td>\n", + " <td id=\"T_0502a_row15_col3\" class=\"data row15 col3\" >True</td>\n", + " <td id=\"T_0502a_row15_col4\" class=\"data row15 col4\" >False</td>\n", + " <td id=\"T_0502a_row15_col5\" class=\"data row15 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row15_col6\" class=\"data row15 col6\" >{'fig_height': {'type': '_empty', 'default': 600}}</td>\n", + " <td id=\"T_0502a_row15_col7\" class=\"data row15 col7\" >['tabular_data', 'visualization', 'correlation']</td>\n", + " <td id=\"T_0502a_row15_col8\" class=\"data row15 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row16_col0\" class=\"data row16 col0\" >validmind.data_validation.HighCardinality</td>\n", + " <td id=\"T_0502a_row16_col1\" class=\"data row16 col1\" >High Cardinality</td>\n", + " <td id=\"T_0502a_row16_col2\" class=\"data row16 col2\" >Assesses the number of unique values in categorical columns to detect high cardinality and potential overfitting....</td>\n", + " <td id=\"T_0502a_row16_col3\" class=\"data row16 col3\" >False</td>\n", + " <td id=\"T_0502a_row16_col4\" class=\"data row16 col4\" >True</td>\n", + " <td id=\"T_0502a_row16_col5\" class=\"data row16 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row16_col6\" class=\"data row16 col6\" >{'num_threshold': {'type': 'int', 'default': 100}, 'percent_threshold': {'type': 'float', 'default': 0.1}, 
'threshold_type': {'type': 'str', 'default': 'percent'}}</td>\n", + " <td id=\"T_0502a_row16_col7\" class=\"data row16 col7\" >['tabular_data', 'data_quality', 'categorical_data']</td>\n", + " <td id=\"T_0502a_row16_col8\" class=\"data row16 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row17_col0\" class=\"data row17 col0\" >validmind.data_validation.HighPearsonCorrelation</td>\n", + " <td id=\"T_0502a_row17_col1\" class=\"data row17 col1\" >High Pearson Correlation</td>\n", + " <td id=\"T_0502a_row17_col2\" class=\"data row17 col2\" >Identifies highly correlated feature pairs in a dataset suggesting feature redundancy or multicollinearity....</td>\n", + " <td id=\"T_0502a_row17_col3\" class=\"data row17 col3\" >False</td>\n", + " <td id=\"T_0502a_row17_col4\" class=\"data row17 col4\" >True</td>\n", + " <td id=\"T_0502a_row17_col5\" class=\"data row17 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row17_col6\" class=\"data row17 col6\" >{'max_threshold': {'type': 'float', 'default': 0.3}, 'top_n_correlations': {'type': 'int', 'default': 10}, 'feature_columns': {'type': 'list', 'default': None}}</td>\n", + " <td id=\"T_0502a_row17_col7\" class=\"data row17 col7\" >['tabular_data', 'data_quality', 'correlation']</td>\n", + " <td id=\"T_0502a_row17_col8\" class=\"data row17 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row18_col0\" class=\"data row18 col0\" >validmind.data_validation.IQROutliersBarPlot</td>\n", + " <td id=\"T_0502a_row18_col1\" class=\"data row18 col1\" >IQR Outliers Bar Plot</td>\n", + " <td id=\"T_0502a_row18_col2\" class=\"data row18 col2\" >Visualizes outlier distribution across percentiles in numerical data using the Interquartile Range (IQR) method....</td>\n", + " <td id=\"T_0502a_row18_col3\" class=\"data row18 col3\" >True</td>\n", + " <td id=\"T_0502a_row18_col4\" class=\"data row18 col4\" >False</td>\n", + " <td 
id=\"T_0502a_row18_col5\" class=\"data row18 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row18_col6\" class=\"data row18 col6\" >{'threshold': {'type': 'float', 'default': 1.5}, 'fig_width': {'type': 'int', 'default': 800}}</td>\n", + " <td id=\"T_0502a_row18_col7\" class=\"data row18 col7\" >['tabular_data', 'visualization', 'numerical_data']</td>\n", + " <td id=\"T_0502a_row18_col8\" class=\"data row18 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row19_col0\" class=\"data row19 col0\" >validmind.data_validation.IQROutliersTable</td>\n", + " <td id=\"T_0502a_row19_col1\" class=\"data row19 col1\" >IQR Outliers Table</td>\n", + " <td id=\"T_0502a_row19_col2\" class=\"data row19 col2\" >Determines and summarizes outliers in numerical features using the Interquartile Range method....</td>\n", + " <td id=\"T_0502a_row19_col3\" class=\"data row19 col3\" >False</td>\n", + " <td id=\"T_0502a_row19_col4\" class=\"data row19 col4\" >True</td>\n", + " <td id=\"T_0502a_row19_col5\" class=\"data row19 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row19_col6\" class=\"data row19 col6\" >{'threshold': {'type': 'float', 'default': 1.5}}</td>\n", + " <td id=\"T_0502a_row19_col7\" class=\"data row19 col7\" >['tabular_data', 'numerical_data']</td>\n", + " <td id=\"T_0502a_row19_col8\" class=\"data row19 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row20_col0\" class=\"data row20 col0\" >validmind.data_validation.IsolationForestOutliers</td>\n", + " <td id=\"T_0502a_row20_col1\" class=\"data row20 col1\" >Isolation Forest Outliers</td>\n", + " <td id=\"T_0502a_row20_col2\" class=\"data row20 col2\" >Detects outliers in a dataset using the Isolation Forest algorithm and visualizes results through scatter plots....</td>\n", + " <td id=\"T_0502a_row20_col3\" class=\"data row20 col3\" >True</td>\n", + " <td id=\"T_0502a_row20_col4\" class=\"data row20 col4\" 
>False</td>\n", + " <td id=\"T_0502a_row20_col5\" class=\"data row20 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row20_col6\" class=\"data row20 col6\" >{'random_state': {'type': 'int', 'default': 0}, 'contamination': {'type': 'float', 'default': 0.1}, 'feature_columns': {'type': 'list', 'default': None}}</td>\n", + " <td id=\"T_0502a_row20_col7\" class=\"data row20 col7\" >['tabular_data', 'anomaly_detection']</td>\n", + " <td id=\"T_0502a_row20_col8\" class=\"data row20 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row21_col0\" class=\"data row21 col0\" >validmind.data_validation.JarqueBera</td>\n", + " <td id=\"T_0502a_row21_col1\" class=\"data row21 col1\" >Jarque Bera</td>\n", + " <td id=\"T_0502a_row21_col2\" class=\"data row21 col2\" >Assesses normality of dataset features in an ML model using the Jarque-Bera test....</td>\n", + " <td id=\"T_0502a_row21_col3\" class=\"data row21 col3\" >False</td>\n", + " <td id=\"T_0502a_row21_col4\" class=\"data row21 col4\" >True</td>\n", + " <td id=\"T_0502a_row21_col5\" class=\"data row21 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row21_col6\" class=\"data row21 col6\" >{}</td>\n", + " <td id=\"T_0502a_row21_col7\" class=\"data row21 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row21_col8\" class=\"data row21 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row22_col0\" class=\"data row22 col0\" >validmind.data_validation.KPSS</td>\n", + " <td id=\"T_0502a_row22_col1\" class=\"data row22 col1\" >KPSS</td>\n", + " <td id=\"T_0502a_row22_col2\" class=\"data row22 col2\" >Assesses the stationarity of time-series data in a machine learning model using the KPSS unit root test....</td>\n", + " <td id=\"T_0502a_row22_col3\" class=\"data row22 col3\" >False</td>\n", + " <td id=\"T_0502a_row22_col4\" class=\"data row22 col4\" >True</td>\n", + " <td 
id=\"T_0502a_row22_col5\" class=\"data row22 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row22_col6\" class=\"data row22 col6\" >{}</td>\n", + " <td id=\"T_0502a_row22_col7\" class=\"data row22 col7\" >['time_series_data', 'stationarity', 'unit_root_test', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row22_col8\" class=\"data row22 col8\" >['data_validation']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row23_col0\" class=\"data row23 col0\" >validmind.data_validation.LJungBox</td>\n", + " <td id=\"T_0502a_row23_col1\" class=\"data row23 col1\" >L Jung Box</td>\n", + " <td id=\"T_0502a_row23_col2\" class=\"data row23 col2\" >Assesses autocorrelations in dataset features by performing a Ljung-Box test on each feature....</td>\n", + " <td id=\"T_0502a_row23_col3\" class=\"data row23 col3\" >False</td>\n", + " <td id=\"T_0502a_row23_col4\" class=\"data row23 col4\" >True</td>\n", + " <td id=\"T_0502a_row23_col5\" class=\"data row23 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row23_col6\" class=\"data row23 col6\" >{}</td>\n", + " <td id=\"T_0502a_row23_col7\" class=\"data row23 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row23_col8\" class=\"data row23 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row24_col0\" class=\"data row24 col0\" >validmind.data_validation.LaggedCorrelationHeatmap</td>\n", + " <td id=\"T_0502a_row24_col1\" class=\"data row24 col1\" >Lagged Correlation Heatmap</td>\n", + " <td id=\"T_0502a_row24_col2\" class=\"data row24 col2\" >Assesses and visualizes correlation between target variable and lagged independent variables in a time-series...</td>\n", + " <td id=\"T_0502a_row24_col3\" class=\"data row24 col3\" >True</td>\n", + " <td id=\"T_0502a_row24_col4\" class=\"data row24 col4\" >False</td>\n", + " <td id=\"T_0502a_row24_col5\" class=\"data row24 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row24_col6\" class=\"data 
row24 col6\" >{'num_lags': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_0502a_row24_col7\" class=\"data row24 col7\" >['time_series_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row24_col8\" class=\"data row24 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row25_col0\" class=\"data row25 col0\" >validmind.data_validation.MissingValues</td>\n", + " <td id=\"T_0502a_row25_col1\" class=\"data row25 col1\" >Missing Values</td>\n", + " <td id=\"T_0502a_row25_col2\" class=\"data row25 col2\" >Evaluates dataset quality by ensuring missing value ratio across all features does not exceed a set threshold....</td>\n", + " <td id=\"T_0502a_row25_col3\" class=\"data row25 col3\" >False</td>\n", + " <td id=\"T_0502a_row25_col4\" class=\"data row25 col4\" >True</td>\n", + " <td id=\"T_0502a_row25_col5\" class=\"data row25 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row25_col6\" class=\"data row25 col6\" >{'min_threshold': {'type': 'int', 'default': 1}}</td>\n", + " <td id=\"T_0502a_row25_col7\" class=\"data row25 col7\" >['tabular_data', 'data_quality']</td>\n", + " <td id=\"T_0502a_row25_col8\" class=\"data row25 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row26_col0\" class=\"data row26 col0\" >validmind.data_validation.MissingValuesBarPlot</td>\n", + " <td id=\"T_0502a_row26_col1\" class=\"data row26 col1\" >Missing Values Bar Plot</td>\n", + " <td id=\"T_0502a_row26_col2\" class=\"data row26 col2\" >Assesses the percentage and distribution of missing values in the dataset via a bar plot, with emphasis on...</td>\n", + " <td id=\"T_0502a_row26_col3\" class=\"data row26 col3\" >True</td>\n", + " <td id=\"T_0502a_row26_col4\" class=\"data row26 col4\" >False</td>\n", + " <td id=\"T_0502a_row26_col5\" class=\"data row26 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row26_col6\" class=\"data row26 col6\" >{'threshold': {'type': 'int', 'default': 80}, 'fig_height': 
{'type': 'int', 'default': 600}}</td>\n", + " <td id=\"T_0502a_row26_col7\" class=\"data row26 col7\" >['tabular_data', 'data_quality', 'visualization']</td>\n", + " <td id=\"T_0502a_row26_col8\" class=\"data row26 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row27_col0\" class=\"data row27 col0\" >validmind.data_validation.MutualInformation</td>\n", + " <td id=\"T_0502a_row27_col1\" class=\"data row27 col1\" >Mutual Information</td>\n", + " <td id=\"T_0502a_row27_col2\" class=\"data row27 col2\" >Calculates mutual information scores between features and target variable to evaluate feature relevance....</td>\n", + " <td id=\"T_0502a_row27_col3\" class=\"data row27 col3\" >True</td>\n", + " <td id=\"T_0502a_row27_col4\" class=\"data row27 col4\" >False</td>\n", + " <td id=\"T_0502a_row27_col5\" class=\"data row27 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row27_col6\" class=\"data row27 col6\" >{'min_threshold': {'type': 'float', 'default': 0.01}, 'task': {'type': 'str', 'default': 'classification'}}</td>\n", + " <td id=\"T_0502a_row27_col7\" class=\"data row27 col7\" >['feature_selection', 'data_analysis']</td>\n", + " <td id=\"T_0502a_row27_col8\" class=\"data row27 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row28_col0\" class=\"data row28 col0\" >validmind.data_validation.PearsonCorrelationMatrix</td>\n", + " <td id=\"T_0502a_row28_col1\" class=\"data row28 col1\" >Pearson Correlation Matrix</td>\n", + " <td id=\"T_0502a_row28_col2\" class=\"data row28 col2\" >Evaluates linear dependency between numerical variables in a dataset via a Pearson Correlation coefficient heat map....</td>\n", + " <td id=\"T_0502a_row28_col3\" class=\"data row28 col3\" >True</td>\n", + " <td id=\"T_0502a_row28_col4\" class=\"data row28 col4\" >False</td>\n", + " <td id=\"T_0502a_row28_col5\" class=\"data row28 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row28_col6\" 
class=\"data row28 col6\" >{}</td>\n", + " <td id=\"T_0502a_row28_col7\" class=\"data row28 col7\" >['tabular_data', 'numerical_data', 'correlation']</td>\n", + " <td id=\"T_0502a_row28_col8\" class=\"data row28 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row29_col0\" class=\"data row29 col0\" >validmind.data_validation.PhillipsPerronArch</td>\n", + " <td id=\"T_0502a_row29_col1\" class=\"data row29 col1\" >Phillips Perron Arch</td>\n", + " <td id=\"T_0502a_row29_col2\" class=\"data row29 col2\" >Assesses the stationarity of time series data in each feature of the ML model using the Phillips-Perron test....</td>\n", + " <td id=\"T_0502a_row29_col3\" class=\"data row29 col3\" >False</td>\n", + " <td id=\"T_0502a_row29_col4\" class=\"data row29 col4\" >True</td>\n", + " <td id=\"T_0502a_row29_col5\" class=\"data row29 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row29_col6\" class=\"data row29 col6\" >{}</td>\n", + " <td id=\"T_0502a_row29_col7\" class=\"data row29 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'unit_root_test']</td>\n", + " <td id=\"T_0502a_row29_col8\" class=\"data row29 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row30_col0\" class=\"data row30 col0\" >validmind.data_validation.ProtectedClassesDescription</td>\n", + " <td id=\"T_0502a_row30_col1\" class=\"data row30 col1\" >Protected Classes Description</td>\n", + " <td id=\"T_0502a_row30_col2\" class=\"data row30 col2\" >Visualizes the distribution of protected classes in the dataset relative to the target variable...</td>\n", + " <td id=\"T_0502a_row30_col3\" class=\"data row30 col3\" >True</td>\n", + " <td id=\"T_0502a_row30_col4\" class=\"data row30 col4\" >True</td>\n", + " <td id=\"T_0502a_row30_col5\" class=\"data row30 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row30_col6\" class=\"data row30 col6\" >{'protected_classes': {'type': '_empty', 'default': None}}</td>\n", + " 
<td id=\"T_0502a_row30_col7\" class=\"data row30 col7\" >['bias_and_fairness', 'descriptive_statistics']</td>\n", + " <td id=\"T_0502a_row30_col8\" class=\"data row30 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row31_col0\" class=\"data row31 col0\" >validmind.data_validation.RollingStatsPlot</td>\n", + " <td id=\"T_0502a_row31_col1\" class=\"data row31 col1\" >Rolling Stats Plot</td>\n", + " <td id=\"T_0502a_row31_col2\" class=\"data row31 col2\" >Evaluates the stationarity of time series data by plotting its rolling mean and standard deviation over a specified...</td>\n", + " <td id=\"T_0502a_row31_col3\" class=\"data row31 col3\" >True</td>\n", + " <td id=\"T_0502a_row31_col4\" class=\"data row31 col4\" >False</td>\n", + " <td id=\"T_0502a_row31_col5\" class=\"data row31 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row31_col6\" class=\"data row31 col6\" >{'window_size': {'type': 'int', 'default': 12}}</td>\n", + " <td id=\"T_0502a_row31_col7\" class=\"data row31 col7\" >['time_series_data', 'visualization', 'stationarity']</td>\n", + " <td id=\"T_0502a_row31_col8\" class=\"data row31 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row32_col0\" class=\"data row32 col0\" >validmind.data_validation.RunsTest</td>\n", + " <td id=\"T_0502a_row32_col1\" class=\"data row32 col1\" >Runs Test</td>\n", + " <td id=\"T_0502a_row32_col2\" class=\"data row32 col2\" >Executes Runs Test on ML model to detect non-random patterns in output data sequence....</td>\n", + " <td id=\"T_0502a_row32_col3\" class=\"data row32 col3\" >False</td>\n", + " <td id=\"T_0502a_row32_col4\" class=\"data row32 col4\" >True</td>\n", + " <td id=\"T_0502a_row32_col5\" class=\"data row32 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row32_col6\" class=\"data row32 col6\" >{}</td>\n", + " <td id=\"T_0502a_row32_col7\" class=\"data row32 col7\" >['tabular_data', 'statistical_test', 'statsmodels']</td>\n", + " 
<td id=\"T_0502a_row32_col8\" class=\"data row32 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row33_col0\" class=\"data row33 col0\" >validmind.data_validation.ScatterPlot</td>\n", + " <td id=\"T_0502a_row33_col1\" class=\"data row33 col1\" >Scatter Plot</td>\n", + " <td id=\"T_0502a_row33_col2\" class=\"data row33 col2\" >Assesses visual relationships, patterns, and outliers among features in a dataset through scatter plot matrices....</td>\n", + " <td id=\"T_0502a_row33_col3\" class=\"data row33 col3\" >True</td>\n", + " <td id=\"T_0502a_row33_col4\" class=\"data row33 col4\" >False</td>\n", + " <td id=\"T_0502a_row33_col5\" class=\"data row33 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row33_col6\" class=\"data row33 col6\" >{}</td>\n", + " <td id=\"T_0502a_row33_col7\" class=\"data row33 col7\" >['tabular_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row33_col8\" class=\"data row33 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row34_col0\" class=\"data row34 col0\" >validmind.data_validation.ScoreBandDefaultRates</td>\n", + " <td id=\"T_0502a_row34_col1\" class=\"data row34 col1\" >Score Band Default Rates</td>\n", + " <td id=\"T_0502a_row34_col2\" class=\"data row34 col2\" >Analyzes default rates and population distribution across credit score bands....</td>\n", + " <td id=\"T_0502a_row34_col3\" class=\"data row34 col3\" >False</td>\n", + " <td id=\"T_0502a_row34_col4\" class=\"data row34 col4\" >True</td>\n", + " <td id=\"T_0502a_row34_col5\" class=\"data row34 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row34_col6\" class=\"data row34 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'score_bands': {'type': 'list', 'default': None}}</td>\n", + " <td id=\"T_0502a_row34_col7\" class=\"data row34 col7\" >['visualization', 'credit_risk', 'scorecard']</td>\n", + " <td id=\"T_0502a_row34_col8\" class=\"data row34 col8\" 
>['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row35_col0\" class=\"data row35 col0\" >validmind.data_validation.SeasonalDecompose</td>\n", + " <td id=\"T_0502a_row35_col1\" class=\"data row35 col1\" >Seasonal Decompose</td>\n", + " <td id=\"T_0502a_row35_col2\" class=\"data row35 col2\" >Assesses patterns and seasonality in a time series dataset by decomposing its features into foundational components....</td>\n", + " <td id=\"T_0502a_row35_col3\" class=\"data row35 col3\" >True</td>\n", + " <td id=\"T_0502a_row35_col4\" class=\"data row35 col4\" >False</td>\n", + " <td id=\"T_0502a_row35_col5\" class=\"data row35 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row35_col6\" class=\"data row35 col6\" >{'seasonal_model': {'type': 'str', 'default': 'additive'}}</td>\n", + " <td id=\"T_0502a_row35_col7\" class=\"data row35 col7\" >['time_series_data', 'seasonality', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row35_col8\" class=\"data row35 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row36_col0\" class=\"data row36 col0\" >validmind.data_validation.ShapiroWilk</td>\n", + " <td id=\"T_0502a_row36_col1\" class=\"data row36 col1\" >Shapiro Wilk</td>\n", + " <td id=\"T_0502a_row36_col2\" class=\"data row36 col2\" >Evaluates feature-wise normality of training data using the Shapiro-Wilk test....</td>\n", + " <td id=\"T_0502a_row36_col3\" class=\"data row36 col3\" >False</td>\n", + " <td id=\"T_0502a_row36_col4\" class=\"data row36 col4\" >True</td>\n", + " <td id=\"T_0502a_row36_col5\" class=\"data row36 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row36_col6\" class=\"data row36 col6\" >{}</td>\n", + " <td id=\"T_0502a_row36_col7\" class=\"data row36 col7\" >['tabular_data', 'data_distribution', 'statistical_test']</td>\n", + " <td id=\"T_0502a_row36_col8\" class=\"data row36 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row37_col0\" class=\"data 
row37 col0\" >validmind.data_validation.Skewness</td>\n", + " <td id=\"T_0502a_row37_col1\" class=\"data row37 col1\" >Skewness</td>\n", + " <td id=\"T_0502a_row37_col2\" class=\"data row37 col2\" >Evaluates the skewness of numerical data in a dataset to check against a defined threshold, aiming to ensure data...</td>\n", + " <td id=\"T_0502a_row37_col3\" class=\"data row37 col3\" >False</td>\n", + " <td id=\"T_0502a_row37_col4\" class=\"data row37 col4\" >True</td>\n", + " <td id=\"T_0502a_row37_col5\" class=\"data row37 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row37_col6\" class=\"data row37 col6\" >{'max_threshold': {'type': '_empty', 'default': 1}}</td>\n", + " <td id=\"T_0502a_row37_col7\" class=\"data row37 col7\" >['data_quality', 'tabular_data']</td>\n", + " <td id=\"T_0502a_row37_col8\" class=\"data row37 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row38_col0\" class=\"data row38 col0\" >validmind.data_validation.SpreadPlot</td>\n", + " <td id=\"T_0502a_row38_col1\" class=\"data row38 col1\" >Spread Plot</td>\n", + " <td id=\"T_0502a_row38_col2\" class=\"data row38 col2\" >Assesses potential correlations between pairs of time series variables through visualization to enhance...</td>\n", + " <td id=\"T_0502a_row38_col3\" class=\"data row38 col3\" >True</td>\n", + " <td id=\"T_0502a_row38_col4\" class=\"data row38 col4\" >False</td>\n", + " <td id=\"T_0502a_row38_col5\" class=\"data row38 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row38_col6\" class=\"data row38 col6\" >{}</td>\n", + " <td id=\"T_0502a_row38_col7\" class=\"data row38 col7\" >['time_series_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row38_col8\" class=\"data row38 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row39_col0\" class=\"data row39 col0\" >validmind.data_validation.TabularCategoricalBarPlots</td>\n", + " <td id=\"T_0502a_row39_col1\" class=\"data row39 col1\" >Tabular 
Categorical Bar Plots</td>\n", + " <td id=\"T_0502a_row39_col2\" class=\"data row39 col2\" >Generates and visualizes bar plots for each category in categorical features to evaluate the dataset's composition....</td>\n", + " <td id=\"T_0502a_row39_col3\" class=\"data row39 col3\" >True</td>\n", + " <td id=\"T_0502a_row39_col4\" class=\"data row39 col4\" >False</td>\n", + " <td id=\"T_0502a_row39_col5\" class=\"data row39 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row39_col6\" class=\"data row39 col6\" >{}</td>\n", + " <td id=\"T_0502a_row39_col7\" class=\"data row39 col7\" >['tabular_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row39_col8\" class=\"data row39 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row40_col0\" class=\"data row40 col0\" >validmind.data_validation.TabularDateTimeHistograms</td>\n", + " <td id=\"T_0502a_row40_col1\" class=\"data row40 col1\" >Tabular Date Time Histograms</td>\n", + " <td id=\"T_0502a_row40_col2\" class=\"data row40 col2\" >Generates histograms to provide graphical insight into the distribution of time intervals in a model's datetime...</td>\n", + " <td id=\"T_0502a_row40_col3\" class=\"data row40 col3\" >True</td>\n", + " <td id=\"T_0502a_row40_col4\" class=\"data row40 col4\" >False</td>\n", + " <td id=\"T_0502a_row40_col5\" class=\"data row40 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row40_col6\" class=\"data row40 col6\" >{}</td>\n", + " <td id=\"T_0502a_row40_col7\" class=\"data row40 col7\" >['time_series_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row40_col8\" class=\"data row40 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row41_col0\" class=\"data row41 col0\" >validmind.data_validation.TabularDescriptionTables</td>\n", + " <td id=\"T_0502a_row41_col1\" class=\"data row41 col1\" >Tabular Description Tables</td>\n", + " <td id=\"T_0502a_row41_col2\" class=\"data row41 col2\" >Summarizes 
key descriptive statistics for numerical, categorical, and datetime variables in a dataset....</td>\n", + " <td id=\"T_0502a_row41_col3\" class=\"data row41 col3\" >False</td>\n", + " <td id=\"T_0502a_row41_col4\" class=\"data row41 col4\" >True</td>\n", + " <td id=\"T_0502a_row41_col5\" class=\"data row41 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row41_col6\" class=\"data row41 col6\" >{}</td>\n", + " <td id=\"T_0502a_row41_col7\" class=\"data row41 col7\" >['tabular_data']</td>\n", + " <td id=\"T_0502a_row41_col8\" class=\"data row41 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row42_col0\" class=\"data row42 col0\" >validmind.data_validation.TabularNumericalHistograms</td>\n", + " <td id=\"T_0502a_row42_col1\" class=\"data row42 col1\" >Tabular Numerical Histograms</td>\n", + " <td id=\"T_0502a_row42_col2\" class=\"data row42 col2\" >Generates histograms for each numerical feature in a dataset to provide visual insights into data distribution and...</td>\n", + " <td id=\"T_0502a_row42_col3\" class=\"data row42 col3\" >True</td>\n", + " <td id=\"T_0502a_row42_col4\" class=\"data row42 col4\" >False</td>\n", + " <td id=\"T_0502a_row42_col5\" class=\"data row42 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row42_col6\" class=\"data row42 col6\" >{}</td>\n", + " <td id=\"T_0502a_row42_col7\" class=\"data row42 col7\" >['tabular_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row42_col8\" class=\"data row42 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row43_col0\" class=\"data row43 col0\" >validmind.data_validation.TargetRateBarPlots</td>\n", + " <td id=\"T_0502a_row43_col1\" class=\"data row43 col1\" >Target Rate Bar Plots</td>\n", + " <td id=\"T_0502a_row43_col2\" class=\"data row43 col2\" >Generates bar plots visualizing the default rates of categorical features for a classification machine learning...</td>\n", + " <td 
id=\"T_0502a_row43_col3\" class=\"data row43 col3\" >True</td>\n", + " <td id=\"T_0502a_row43_col4\" class=\"data row43 col4\" >False</td>\n", + " <td id=\"T_0502a_row43_col5\" class=\"data row43 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row43_col6\" class=\"data row43 col6\" >{}</td>\n", + " <td id=\"T_0502a_row43_col7\" class=\"data row43 col7\" >['tabular_data', 'visualization', 'categorical_data']</td>\n", + " <td id=\"T_0502a_row43_col8\" class=\"data row43 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row44_col0\" class=\"data row44 col0\" >validmind.data_validation.TimeSeriesDescription</td>\n", + " <td id=\"T_0502a_row44_col1\" class=\"data row44 col1\" >Time Series Description</td>\n", + " <td id=\"T_0502a_row44_col2\" class=\"data row44 col2\" >Generates a detailed analysis for the provided time series dataset, summarizing key statistics to identify trends,...</td>\n", + " <td id=\"T_0502a_row44_col3\" class=\"data row44 col3\" >False</td>\n", + " <td id=\"T_0502a_row44_col4\" class=\"data row44 col4\" >True</td>\n", + " <td id=\"T_0502a_row44_col5\" class=\"data row44 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row44_col6\" class=\"data row44 col6\" >{}</td>\n", + " <td id=\"T_0502a_row44_col7\" class=\"data row44 col7\" >['time_series_data', 'analysis']</td>\n", + " <td id=\"T_0502a_row44_col8\" class=\"data row44 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row45_col0\" class=\"data row45 col0\" >validmind.data_validation.TimeSeriesDescriptiveStatistics</td>\n", + " <td id=\"T_0502a_row45_col1\" class=\"data row45 col1\" >Time Series Descriptive Statistics</td>\n", + " <td id=\"T_0502a_row45_col2\" class=\"data row45 col2\" >Evaluates the descriptive statistics of a time series dataset to identify trends, patterns, and data quality issues....</td>\n", + " <td id=\"T_0502a_row45_col3\" class=\"data row45 col3\" >False</td>\n", + " <td id=\"T_0502a_row45_col4\" 
class=\"data row45 col4\" >True</td>\n", + " <td id=\"T_0502a_row45_col5\" class=\"data row45 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row45_col6\" class=\"data row45 col6\" >{}</td>\n", + " <td id=\"T_0502a_row45_col7\" class=\"data row45 col7\" >['time_series_data', 'analysis']</td>\n", + " <td id=\"T_0502a_row45_col8\" class=\"data row45 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row46_col0\" class=\"data row46 col0\" >validmind.data_validation.TimeSeriesFrequency</td>\n", + " <td id=\"T_0502a_row46_col1\" class=\"data row46 col1\" >Time Series Frequency</td>\n", + " <td id=\"T_0502a_row46_col2\" class=\"data row46 col2\" >Evaluates consistency of time series data frequency and generates a frequency plot....</td>\n", + " <td id=\"T_0502a_row46_col3\" class=\"data row46 col3\" >True</td>\n", + " <td id=\"T_0502a_row46_col4\" class=\"data row46 col4\" >True</td>\n", + " <td id=\"T_0502a_row46_col5\" class=\"data row46 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row46_col6\" class=\"data row46 col6\" >{}</td>\n", + " <td id=\"T_0502a_row46_col7\" class=\"data row46 col7\" >['time_series_data']</td>\n", + " <td id=\"T_0502a_row46_col8\" class=\"data row46 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row47_col0\" class=\"data row47 col0\" >validmind.data_validation.TimeSeriesHistogram</td>\n", + " <td id=\"T_0502a_row47_col1\" class=\"data row47 col1\" >Time Series Histogram</td>\n", + " <td id=\"T_0502a_row47_col2\" class=\"data row47 col2\" >Visualizes distribution of time-series data using histograms and Kernel Density Estimation (KDE) lines....</td>\n", + " <td id=\"T_0502a_row47_col3\" class=\"data row47 col3\" >True</td>\n", + " <td id=\"T_0502a_row47_col4\" class=\"data row47 col4\" >False</td>\n", + " <td id=\"T_0502a_row47_col5\" class=\"data row47 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row47_col6\" class=\"data row47 col6\" >{'nbins': {'type': '_empty', 
'default': 30}}</td>\n", + " <td id=\"T_0502a_row47_col7\" class=\"data row47 col7\" >['data_validation', 'visualization', 'time_series_data']</td>\n", + " <td id=\"T_0502a_row47_col8\" class=\"data row47 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row48_col0\" class=\"data row48 col0\" >validmind.data_validation.TimeSeriesLinePlot</td>\n", + " <td id=\"T_0502a_row48_col1\" class=\"data row48 col1\" >Time Series Line Plot</td>\n", + " <td id=\"T_0502a_row48_col2\" class=\"data row48 col2\" >Generates and analyses time-series data through line plots revealing trends, patterns, anomalies over time....</td>\n", + " <td id=\"T_0502a_row48_col3\" class=\"data row48 col3\" >True</td>\n", + " <td id=\"T_0502a_row48_col4\" class=\"data row48 col4\" >False</td>\n", + " <td id=\"T_0502a_row48_col5\" class=\"data row48 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row48_col6\" class=\"data row48 col6\" >{}</td>\n", + " <td id=\"T_0502a_row48_col7\" class=\"data row48 col7\" >['time_series_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row48_col8\" class=\"data row48 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row49_col0\" class=\"data row49 col0\" >validmind.data_validation.TimeSeriesMissingValues</td>\n", + " <td id=\"T_0502a_row49_col1\" class=\"data row49 col1\" >Time Series Missing Values</td>\n", + " <td id=\"T_0502a_row49_col2\" class=\"data row49 col2\" >Validates time-series data quality by confirming the count of missing values is below a certain threshold....</td>\n", + " <td id=\"T_0502a_row49_col3\" class=\"data row49 col3\" >True</td>\n", + " <td id=\"T_0502a_row49_col4\" class=\"data row49 col4\" >True</td>\n", + " <td id=\"T_0502a_row49_col5\" class=\"data row49 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row49_col6\" class=\"data row49 col6\" >{'min_threshold': {'type': 'int', 'default': 1}}</td>\n", + " <td id=\"T_0502a_row49_col7\" class=\"data 
row49 col7\" >['time_series_data']</td>\n", + " <td id=\"T_0502a_row49_col8\" class=\"data row49 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row50_col0\" class=\"data row50 col0\" >validmind.data_validation.TimeSeriesOutliers</td>\n", + " <td id=\"T_0502a_row50_col1\" class=\"data row50 col1\" >Time Series Outliers</td>\n", + " <td id=\"T_0502a_row50_col2\" class=\"data row50 col2\" >Identifies and visualizes outliers in time-series data using the z-score method....</td>\n", + " <td id=\"T_0502a_row50_col3\" class=\"data row50 col3\" >False</td>\n", + " <td id=\"T_0502a_row50_col4\" class=\"data row50 col4\" >True</td>\n", + " <td id=\"T_0502a_row50_col5\" class=\"data row50 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row50_col6\" class=\"data row50 col6\" >{'zscore_threshold': {'type': 'int', 'default': 3}}</td>\n", + " <td id=\"T_0502a_row50_col7\" class=\"data row50 col7\" >['time_series_data']</td>\n", + " <td id=\"T_0502a_row50_col8\" class=\"data row50 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row51_col0\" class=\"data row51 col0\" >validmind.data_validation.TooManyZeroValues</td>\n", + " <td id=\"T_0502a_row51_col1\" class=\"data row51 col1\" >Too Many Zero Values</td>\n", + " <td id=\"T_0502a_row51_col2\" class=\"data row51 col2\" >Identifies numerical columns in a dataset that contain an excessive number of zero values, defined by a threshold...</td>\n", + " <td id=\"T_0502a_row51_col3\" class=\"data row51 col3\" >False</td>\n", + " <td id=\"T_0502a_row51_col4\" class=\"data row51 col4\" >True</td>\n", + " <td id=\"T_0502a_row51_col5\" class=\"data row51 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row51_col6\" class=\"data row51 col6\" >{'max_percent_threshold': {'type': 'float', 'default': 0.03}}</td>\n", + " <td id=\"T_0502a_row51_col7\" class=\"data row51 col7\" >['tabular_data']</td>\n", + " <td id=\"T_0502a_row51_col8\" class=\"data row51 col8\" >['regression', 
'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row52_col0\" class=\"data row52 col0\" >validmind.data_validation.UniqueRows</td>\n", + " <td id=\"T_0502a_row52_col1\" class=\"data row52 col1\" >Unique Rows</td>\n", + " <td id=\"T_0502a_row52_col2\" class=\"data row52 col2\" >Verifies the diversity of the dataset by ensuring that the count of unique rows exceeds a prescribed threshold....</td>\n", + " <td id=\"T_0502a_row52_col3\" class=\"data row52 col3\" >False</td>\n", + " <td id=\"T_0502a_row52_col4\" class=\"data row52 col4\" >True</td>\n", + " <td id=\"T_0502a_row52_col5\" class=\"data row52 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row52_col6\" class=\"data row52 col6\" >{'min_percent_threshold': {'type': 'float', 'default': 1}}</td>\n", + " <td id=\"T_0502a_row52_col7\" class=\"data row52 col7\" >['tabular_data']</td>\n", + " <td id=\"T_0502a_row52_col8\" class=\"data row52 col8\" >['regression', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row53_col0\" class=\"data row53 col0\" >validmind.data_validation.WOEBinPlots</td>\n", + " <td id=\"T_0502a_row53_col1\" class=\"data row53 col1\" >WOE Bin Plots</td>\n", + " <td id=\"T_0502a_row53_col2\" class=\"data row53 col2\" >Generates visualizations of Weight of Evidence (WoE) and Information Value (IV) for understanding predictive power...</td>\n", + " <td id=\"T_0502a_row53_col3\" class=\"data row53 col3\" >True</td>\n", + " <td id=\"T_0502a_row53_col4\" class=\"data row53 col4\" >False</td>\n", + " <td id=\"T_0502a_row53_col5\" class=\"data row53 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row53_col6\" class=\"data row53 col6\" >{'breaks_adj': {'type': 'list', 'default': None}, 'fig_height': {'type': 'int', 'default': 600}, 'fig_width': {'type': 'int', 'default': 500}}</td>\n", + " <td id=\"T_0502a_row53_col7\" class=\"data row53 col7\" >['tabular_data', 'visualization', 'categorical_data']</td>\n", + " <td id=\"T_0502a_row53_col8\" 
class=\"data row53 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row54_col0\" class=\"data row54 col0\" >validmind.data_validation.WOEBinTable</td>\n", + " <td id=\"T_0502a_row54_col1\" class=\"data row54 col1\" >WOE Bin Table</td>\n", + " <td id=\"T_0502a_row54_col2\" class=\"data row54 col2\" >Assesses the Weight of Evidence (WoE) and Information Value (IV) of each feature to evaluate its predictive power...</td>\n", + " <td id=\"T_0502a_row54_col3\" class=\"data row54 col3\" >False</td>\n", + " <td id=\"T_0502a_row54_col4\" class=\"data row54 col4\" >True</td>\n", + " <td id=\"T_0502a_row54_col5\" class=\"data row54 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row54_col6\" class=\"data row54 col6\" >{'breaks_adj': {'type': 'list', 'default': None}}</td>\n", + " <td id=\"T_0502a_row54_col7\" class=\"data row54 col7\" >['tabular_data', 'categorical_data']</td>\n", + " <td id=\"T_0502a_row54_col8\" class=\"data row54 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row55_col0\" class=\"data row55 col0\" >validmind.data_validation.ZivotAndrewsArch</td>\n", + " <td id=\"T_0502a_row55_col1\" class=\"data row55 col1\" >Zivot Andrews Arch</td>\n", + " <td id=\"T_0502a_row55_col2\" class=\"data row55 col2\" >Evaluates the order of integration and stationarity of time series data using the Zivot-Andrews unit root test....</td>\n", + " <td id=\"T_0502a_row55_col3\" class=\"data row55 col3\" >False</td>\n", + " <td id=\"T_0502a_row55_col4\" class=\"data row55 col4\" >True</td>\n", + " <td id=\"T_0502a_row55_col5\" class=\"data row55 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row55_col6\" class=\"data row55 col6\" >{}</td>\n", + " <td id=\"T_0502a_row55_col7\" class=\"data row55 col7\" >['time_series_data', 'stationarity', 'unit_root_test']</td>\n", + " <td id=\"T_0502a_row55_col8\" class=\"data row55 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td 
id=\"T_0502a_row56_col0\" class=\"data row56 col0\" >validmind.data_validation.nlp.CommonWords</td>\n", + " <td id=\"T_0502a_row56_col1\" class=\"data row56 col1\" >Common Words</td>\n", + " <td id=\"T_0502a_row56_col2\" class=\"data row56 col2\" >Assesses the most frequent non-stopwords in a text column for identifying prevalent language patterns....</td>\n", + " <td id=\"T_0502a_row56_col3\" class=\"data row56 col3\" >True</td>\n", + " <td id=\"T_0502a_row56_col4\" class=\"data row56 col4\" >False</td>\n", + " <td id=\"T_0502a_row56_col5\" class=\"data row56 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row56_col6\" class=\"data row56 col6\" >{}</td>\n", + " <td id=\"T_0502a_row56_col7\" class=\"data row56 col7\" >['nlp', 'text_data', 'visualization', 'frequency_analysis']</td>\n", + " <td id=\"T_0502a_row56_col8\" class=\"data row56 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row57_col0\" class=\"data row57 col0\" >validmind.data_validation.nlp.Hashtags</td>\n", + " <td id=\"T_0502a_row57_col1\" class=\"data row57 col1\" >Hashtags</td>\n", + " <td id=\"T_0502a_row57_col2\" class=\"data row57 col2\" >Assesses hashtag frequency in a text column, highlighting usage trends and potential dataset bias or spam....</td>\n", + " <td id=\"T_0502a_row57_col3\" class=\"data row57 col3\" >True</td>\n", + " <td id=\"T_0502a_row57_col4\" class=\"data row57 col4\" >False</td>\n", + " <td id=\"T_0502a_row57_col5\" class=\"data row57 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row57_col6\" class=\"data row57 col6\" >{'top_hashtags': {'type': 'int', 'default': 25}}</td>\n", + " <td id=\"T_0502a_row57_col7\" class=\"data row57 col7\" >['nlp', 'text_data', 'visualization', 'frequency_analysis']</td>\n", + " <td id=\"T_0502a_row57_col8\" class=\"data row57 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row58_col0\" class=\"data row58 col0\" 
>validmind.data_validation.nlp.LanguageDetection</td>\n", + " <td id=\"T_0502a_row58_col1\" class=\"data row58 col1\" >Language Detection</td>\n", + " <td id=\"T_0502a_row58_col2\" class=\"data row58 col2\" >Assesses the diversity of languages in a textual dataset by detecting and visualizing the distribution of languages....</td>\n", + " <td id=\"T_0502a_row58_col3\" class=\"data row58 col3\" >True</td>\n", + " <td id=\"T_0502a_row58_col4\" class=\"data row58 col4\" >False</td>\n", + " <td id=\"T_0502a_row58_col5\" class=\"data row58 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row58_col6\" class=\"data row58 col6\" >{}</td>\n", + " <td id=\"T_0502a_row58_col7\" class=\"data row58 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row58_col8\" class=\"data row58 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row59_col0\" class=\"data row59 col0\" >validmind.data_validation.nlp.Mentions</td>\n", + " <td id=\"T_0502a_row59_col1\" class=\"data row59 col1\" >Mentions</td>\n", + " <td id=\"T_0502a_row59_col2\" class=\"data row59 col2\" >Calculates and visualizes frequencies of '@' prefixed mentions in a text-based dataset for NLP model analysis....</td>\n", + " <td id=\"T_0502a_row59_col3\" class=\"data row59 col3\" >True</td>\n", + " <td id=\"T_0502a_row59_col4\" class=\"data row59 col4\" >False</td>\n", + " <td id=\"T_0502a_row59_col5\" class=\"data row59 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row59_col6\" class=\"data row59 col6\" >{'top_mentions': {'type': 'int', 'default': 25}}</td>\n", + " <td id=\"T_0502a_row59_col7\" class=\"data row59 col7\" >['nlp', 'text_data', 'visualization', 'frequency_analysis']</td>\n", + " <td id=\"T_0502a_row59_col8\" class=\"data row59 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row60_col0\" class=\"data row60 col0\" 
>validmind.data_validation.nlp.PolarityAndSubjectivity</td>\n", + " <td id=\"T_0502a_row60_col1\" class=\"data row60 col1\" >Polarity And Subjectivity</td>\n", + " <td id=\"T_0502a_row60_col2\" class=\"data row60 col2\" >Analyzes the polarity and subjectivity of text data within a given dataset to visualize the sentiment distribution....</td>\n", + " <td id=\"T_0502a_row60_col3\" class=\"data row60 col3\" >True</td>\n", + " <td id=\"T_0502a_row60_col4\" class=\"data row60 col4\" >True</td>\n", + " <td id=\"T_0502a_row60_col5\" class=\"data row60 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row60_col6\" class=\"data row60 col6\" >{'threshold_subjectivity': {'type': '_empty', 'default': 0.5}, 'threshold_polarity': {'type': '_empty', 'default': 0}}</td>\n", + " <td id=\"T_0502a_row60_col7\" class=\"data row60 col7\" >['nlp', 'text_data', 'data_validation']</td>\n", + " <td id=\"T_0502a_row60_col8\" class=\"data row60 col8\" >['nlp']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row61_col0\" class=\"data row61 col0\" >validmind.data_validation.nlp.Punctuations</td>\n", + " <td id=\"T_0502a_row61_col1\" class=\"data row61 col1\" >Punctuations</td>\n", + " <td id=\"T_0502a_row61_col2\" class=\"data row61 col2\" >Analyzes and visualizes the frequency distribution of punctuation usage in a given text dataset....</td>\n", + " <td id=\"T_0502a_row61_col3\" class=\"data row61 col3\" >True</td>\n", + " <td id=\"T_0502a_row61_col4\" class=\"data row61 col4\" >False</td>\n", + " <td id=\"T_0502a_row61_col5\" class=\"data row61 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row61_col6\" class=\"data row61 col6\" >{'count_mode': {'type': '_empty', 'default': 'token'}}</td>\n", + " <td id=\"T_0502a_row61_col7\" class=\"data row61 col7\" >['nlp', 'text_data', 'visualization', 'frequency_analysis']</td>\n", + " <td id=\"T_0502a_row61_col8\" class=\"data row61 col8\" >['text_classification', 'text_summarization', 'nlp']</td>\n", + " </tr>\n", + " <tr>\n", + " <td 
id=\"T_0502a_row62_col0\" class=\"data row62 col0\" >validmind.data_validation.nlp.Sentiment</td>\n", + " <td id=\"T_0502a_row62_col1\" class=\"data row62 col1\" >Sentiment</td>\n", + " <td id=\"T_0502a_row62_col2\" class=\"data row62 col2\" >Analyzes the sentiment of text data within a dataset using the VADER sentiment analysis tool....</td>\n", + " <td id=\"T_0502a_row62_col3\" class=\"data row62 col3\" >True</td>\n", + " <td id=\"T_0502a_row62_col4\" class=\"data row62 col4\" >False</td>\n", + " <td id=\"T_0502a_row62_col5\" class=\"data row62 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row62_col6\" class=\"data row62 col6\" >{}</td>\n", + " <td id=\"T_0502a_row62_col7\" class=\"data row62 col7\" >['nlp', 'text_data', 'data_validation']</td>\n", + " <td id=\"T_0502a_row62_col8\" class=\"data row62 col8\" >['nlp']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row63_col0\" class=\"data row63 col0\" >validmind.data_validation.nlp.StopWords</td>\n", + " <td id=\"T_0502a_row63_col1\" class=\"data row63 col1\" >Stop Words</td>\n", + " <td id=\"T_0502a_row63_col2\" class=\"data row63 col2\" >Evaluates and visualizes the frequency of English stop words in a text dataset against a defined threshold....</td>\n", + " <td id=\"T_0502a_row63_col3\" class=\"data row63 col3\" >True</td>\n", + " <td id=\"T_0502a_row63_col4\" class=\"data row63 col4\" >True</td>\n", + " <td id=\"T_0502a_row63_col5\" class=\"data row63 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row63_col6\" class=\"data row63 col6\" >{'min_percent_threshold': {'type': 'float', 'default': 0.5}, 'num_words': {'type': 'int', 'default': 25}}</td>\n", + " <td id=\"T_0502a_row63_col7\" class=\"data row63 col7\" >['nlp', 'text_data', 'frequency_analysis', 'visualization']</td>\n", + " <td id=\"T_0502a_row63_col8\" class=\"data row63 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row64_col0\" class=\"data row64 col0\" 
>validmind.data_validation.nlp.TextDescription</td>\n", + " <td id=\"T_0502a_row64_col1\" class=\"data row64 col1\" >Text Description</td>\n", + " <td id=\"T_0502a_row64_col2\" class=\"data row64 col2\" >Conducts comprehensive textual analysis on a dataset using NLTK to evaluate various parameters and generate...</td>\n", + " <td id=\"T_0502a_row64_col3\" class=\"data row64 col3\" >True</td>\n", + " <td id=\"T_0502a_row64_col4\" class=\"data row64 col4\" >False</td>\n", + " <td id=\"T_0502a_row64_col5\" class=\"data row64 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row64_col6\" class=\"data row64 col6\" >{'unwanted_tokens': {'type': 'set', 'default': {'s', 'mrs', 'us', \"''\", ' ', 'ms', 'dr', 'dollar', '``', 'mr', \"'s\", \"s'\"}}, 'lang': {'type': 'str', 'default': 'english'}}</td>\n", + " <td id=\"T_0502a_row64_col7\" class=\"data row64 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row64_col8\" class=\"data row64 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row65_col0\" class=\"data row65 col0\" >validmind.data_validation.nlp.Toxicity</td>\n", + " <td id=\"T_0502a_row65_col1\" class=\"data row65 col1\" >Toxicity</td>\n", + " <td id=\"T_0502a_row65_col2\" class=\"data row65 col2\" >Assesses the toxicity of text data within a dataset to visualize the distribution of toxicity scores....</td>\n", + " <td id=\"T_0502a_row65_col3\" class=\"data row65 col3\" >True</td>\n", + " <td id=\"T_0502a_row65_col4\" class=\"data row65 col4\" >False</td>\n", + " <td id=\"T_0502a_row65_col5\" class=\"data row65 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row65_col6\" class=\"data row65 col6\" >{}</td>\n", + " <td id=\"T_0502a_row65_col7\" class=\"data row65 col7\" >['nlp', 'text_data', 'data_validation']</td>\n", + " <td id=\"T_0502a_row65_col8\" class=\"data row65 col8\" >['nlp']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row66_col0\" class=\"data row66 col0\" 
>validmind.model_validation.BertScore</td>\n", + " <td id=\"T_0502a_row66_col1\" class=\"data row66 col1\" >Bert Score</td>\n", + " <td id=\"T_0502a_row66_col2\" class=\"data row66 col2\" >Assesses the quality of machine-generated text using BERTScore metrics and visualizes results through histograms...</td>\n", + " <td id=\"T_0502a_row66_col3\" class=\"data row66 col3\" >True</td>\n", + " <td id=\"T_0502a_row66_col4\" class=\"data row66 col4\" >True</td>\n", + " <td id=\"T_0502a_row66_col5\" class=\"data row66 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row66_col6\" class=\"data row66 col6\" >{'evaluation_model': {'type': '_empty', 'default': 'distilbert-base-uncased'}}</td>\n", + " <td id=\"T_0502a_row66_col7\" class=\"data row66 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row66_col8\" class=\"data row66 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row67_col0\" class=\"data row67 col0\" >validmind.model_validation.BleuScore</td>\n", + " <td id=\"T_0502a_row67_col1\" class=\"data row67 col1\" >Bleu Score</td>\n", + " <td id=\"T_0502a_row67_col2\" class=\"data row67 col2\" >Evaluates the quality of machine-generated text using BLEU metrics and visualizes the results through histograms...</td>\n", + " <td id=\"T_0502a_row67_col3\" class=\"data row67 col3\" >True</td>\n", + " <td id=\"T_0502a_row67_col4\" class=\"data row67 col4\" >True</td>\n", + " <td id=\"T_0502a_row67_col5\" class=\"data row67 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row67_col6\" class=\"data row67 col6\" >{}</td>\n", + " <td id=\"T_0502a_row67_col7\" class=\"data row67 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row67_col8\" class=\"data row67 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row68_col0\" class=\"data row68 col0\" 
>validmind.model_validation.ClusterSizeDistribution</td>\n", + " <td id=\"T_0502a_row68_col1\" class=\"data row68 col1\" >Cluster Size Distribution</td>\n", + " <td id=\"T_0502a_row68_col2\" class=\"data row68 col2\" >Assesses the performance of clustering models by comparing the distribution of cluster sizes in model predictions...</td>\n", + " <td id=\"T_0502a_row68_col3\" class=\"data row68 col3\" >True</td>\n", + " <td id=\"T_0502a_row68_col4\" class=\"data row68 col4\" >False</td>\n", + " <td id=\"T_0502a_row68_col5\" class=\"data row68 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row68_col6\" class=\"data row68 col6\" >{}</td>\n", + " <td id=\"T_0502a_row68_col7\" class=\"data row68 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row68_col8\" class=\"data row68 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row69_col0\" class=\"data row69 col0\" >validmind.model_validation.ContextualRecall</td>\n", + " <td id=\"T_0502a_row69_col1\" class=\"data row69 col1\" >Contextual Recall</td>\n", + " <td id=\"T_0502a_row69_col2\" class=\"data row69 col2\" >Evaluates a Natural Language Generation model's ability to generate contextually relevant and factually correct...</td>\n", + " <td id=\"T_0502a_row69_col3\" class=\"data row69 col3\" >True</td>\n", + " <td id=\"T_0502a_row69_col4\" class=\"data row69 col4\" >True</td>\n", + " <td id=\"T_0502a_row69_col5\" class=\"data row69 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row69_col6\" class=\"data row69 col6\" >{}</td>\n", + " <td id=\"T_0502a_row69_col7\" class=\"data row69 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row69_col8\" class=\"data row69 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row70_col0\" class=\"data row70 col0\" >validmind.model_validation.FeaturesAUC</td>\n", + " <td id=\"T_0502a_row70_col1\" class=\"data row70 col1\" 
>Features AUC</td>\n", + " <td id=\"T_0502a_row70_col2\" class=\"data row70 col2\" >Evaluates the discriminatory power of each individual feature within a binary classification model by calculating...</td>\n", + " <td id=\"T_0502a_row70_col3\" class=\"data row70 col3\" >True</td>\n", + " <td id=\"T_0502a_row70_col4\" class=\"data row70 col4\" >False</td>\n", + " <td id=\"T_0502a_row70_col5\" class=\"data row70 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row70_col6\" class=\"data row70 col6\" >{'fontsize': {'type': 'int', 'default': 12}, 'figure_height': {'type': 'int', 'default': 500}}</td>\n", + " <td id=\"T_0502a_row70_col7\" class=\"data row70 col7\" >['feature_importance', 'AUC', 'visualization']</td>\n", + " <td id=\"T_0502a_row70_col8\" class=\"data row70 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row71_col0\" class=\"data row71 col0\" >validmind.model_validation.MeteorScore</td>\n", + " <td id=\"T_0502a_row71_col1\" class=\"data row71 col1\" >Meteor Score</td>\n", + " <td id=\"T_0502a_row71_col2\" class=\"data row71 col2\" >Assesses the quality of machine-generated translations by comparing them to human-produced references using the...</td>\n", + " <td id=\"T_0502a_row71_col3\" class=\"data row71 col3\" >True</td>\n", + " <td id=\"T_0502a_row71_col4\" class=\"data row71 col4\" >True</td>\n", + " <td id=\"T_0502a_row71_col5\" class=\"data row71 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row71_col6\" class=\"data row71 col6\" >{}</td>\n", + " <td id=\"T_0502a_row71_col7\" class=\"data row71 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row71_col8\" class=\"data row71 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row72_col0\" class=\"data row72 col0\" >validmind.model_validation.ModelMetadata</td>\n", + " <td id=\"T_0502a_row72_col1\" class=\"data row72 col1\" >Model Metadata</td>\n", + " <td 
id=\"T_0502a_row72_col2\" class=\"data row72 col2\" >Compare metadata of different models and generate a summary table with the results....</td>\n", + " <td id=\"T_0502a_row72_col3\" class=\"data row72 col3\" >False</td>\n", + " <td id=\"T_0502a_row72_col4\" class=\"data row72 col4\" >True</td>\n", + " <td id=\"T_0502a_row72_col5\" class=\"data row72 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row72_col6\" class=\"data row72 col6\" >{}</td>\n", + " <td id=\"T_0502a_row72_col7\" class=\"data row72 col7\" >['model_training', 'metadata']</td>\n", + " <td id=\"T_0502a_row72_col8\" class=\"data row72 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row73_col0\" class=\"data row73 col0\" >validmind.model_validation.ModelPredictionResiduals</td>\n", + " <td id=\"T_0502a_row73_col1\" class=\"data row73 col1\" >Model Prediction Residuals</td>\n", + " <td id=\"T_0502a_row73_col2\" class=\"data row73 col2\" >Assesses normality and behavior of residuals in regression models through visualization and statistical tests....</td>\n", + " <td id=\"T_0502a_row73_col3\" class=\"data row73 col3\" >True</td>\n", + " <td id=\"T_0502a_row73_col4\" class=\"data row73 col4\" >True</td>\n", + " <td id=\"T_0502a_row73_col5\" class=\"data row73 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row73_col6\" class=\"data row73 col6\" >{'nbins': {'type': 'int', 'default': 100}, 'p_value_threshold': {'type': 'float', 'default': 0.05}, 'start_date': {'type': None, 'default': None}, 'end_date': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row73_col7\" class=\"data row73 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row73_col8\" class=\"data row73 col8\" >['residual_analysis', 'visualization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row74_col0\" class=\"data row74 col0\" >validmind.model_validation.RegardScore</td>\n", + " <td id=\"T_0502a_row74_col1\" class=\"data row74 col1\" >Regard 
Score</td>\n", + " <td id=\"T_0502a_row74_col2\" class=\"data row74 col2\" >Assesses the sentiment and potential biases in text generated by NLP models by computing and visualizing regard...</td>\n", + " <td id=\"T_0502a_row74_col3\" class=\"data row74 col3\" >True</td>\n", + " <td id=\"T_0502a_row74_col4\" class=\"data row74 col4\" >True</td>\n", + " <td id=\"T_0502a_row74_col5\" class=\"data row74 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row74_col6\" class=\"data row74 col6\" >{}</td>\n", + " <td id=\"T_0502a_row74_col7\" class=\"data row74 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row74_col8\" class=\"data row74 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row75_col0\" class=\"data row75 col0\" >validmind.model_validation.RegressionResidualsPlot</td>\n", + " <td id=\"T_0502a_row75_col1\" class=\"data row75 col1\" >Regression Residuals Plot</td>\n", + " <td id=\"T_0502a_row75_col2\" class=\"data row75 col2\" >Evaluates regression model performance using residual distribution and actual vs. 
predicted plots....</td>\n", + " <td id=\"T_0502a_row75_col3\" class=\"data row75 col3\" >True</td>\n", + " <td id=\"T_0502a_row75_col4\" class=\"data row75 col4\" >False</td>\n", + " <td id=\"T_0502a_row75_col5\" class=\"data row75 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row75_col6\" class=\"data row75 col6\" >{'bin_size': {'type': 'float', 'default': 0.1}}</td>\n", + " <td id=\"T_0502a_row75_col7\" class=\"data row75 col7\" >['model_performance', 'visualization']</td>\n", + " <td id=\"T_0502a_row75_col8\" class=\"data row75 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row76_col0\" class=\"data row76 col0\" >validmind.model_validation.RougeScore</td>\n", + " <td id=\"T_0502a_row76_col1\" class=\"data row76 col1\" >Rouge Score</td>\n", + " <td id=\"T_0502a_row76_col2\" class=\"data row76 col2\" >Assesses the quality of machine-generated text using ROUGE metrics and visualizes the results to provide...</td>\n", + " <td id=\"T_0502a_row76_col3\" class=\"data row76 col3\" >True</td>\n", + " <td id=\"T_0502a_row76_col4\" class=\"data row76 col4\" >True</td>\n", + " <td id=\"T_0502a_row76_col5\" class=\"data row76 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row76_col6\" class=\"data row76 col6\" >{'metric': {'type': 'str', 'default': 'rouge-1'}}</td>\n", + " <td id=\"T_0502a_row76_col7\" class=\"data row76 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row76_col8\" class=\"data row76 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row77_col0\" class=\"data row77 col0\" >validmind.model_validation.TimeSeriesPredictionWithCI</td>\n", + " <td id=\"T_0502a_row77_col1\" class=\"data row77 col1\" >Time Series Prediction With CI</td>\n", + " <td id=\"T_0502a_row77_col2\" class=\"data row77 col2\" >Assesses predictive accuracy and uncertainty in time series models, highlighting breaches beyond confidence...</td>\n", + " 
<td id=\"T_0502a_row77_col3\" class=\"data row77 col3\" >True</td>\n", + " <td id=\"T_0502a_row77_col4\" class=\"data row77 col4\" >True</td>\n", + " <td id=\"T_0502a_row77_col5\" class=\"data row77 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row77_col6\" class=\"data row77 col6\" >{'confidence': {'type': 'float', 'default': 0.95}}</td>\n", + " <td id=\"T_0502a_row77_col7\" class=\"data row77 col7\" >['model_predictions', 'visualization']</td>\n", + " <td id=\"T_0502a_row77_col8\" class=\"data row77 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row78_col0\" class=\"data row78 col0\" >validmind.model_validation.TimeSeriesPredictionsPlot</td>\n", + " <td id=\"T_0502a_row78_col1\" class=\"data row78 col1\" >Time Series Predictions Plot</td>\n", + " <td id=\"T_0502a_row78_col2\" class=\"data row78 col2\" >Plot actual vs predicted values for time series data and generate a visual comparison for the model....</td>\n", + " <td id=\"T_0502a_row78_col3\" class=\"data row78 col3\" >True</td>\n", + " <td id=\"T_0502a_row78_col4\" class=\"data row78 col4\" >False</td>\n", + " <td id=\"T_0502a_row78_col5\" class=\"data row78 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row78_col6\" class=\"data row78 col6\" >{}</td>\n", + " <td id=\"T_0502a_row78_col7\" class=\"data row78 col7\" >['model_predictions', 'visualization']</td>\n", + " <td id=\"T_0502a_row78_col8\" class=\"data row78 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row79_col0\" class=\"data row79 col0\" >validmind.model_validation.TimeSeriesR2SquareBySegments</td>\n", + " <td id=\"T_0502a_row79_col1\" class=\"data row79 col1\" >Time Series R2 Square By Segments</td>\n", + " <td id=\"T_0502a_row79_col2\" class=\"data row79 col2\" >Evaluates the R-Squared values of regression models over specified time segments in time series data to assess...</td>\n", + " <td 
id=\"T_0502a_row79_col3\" class=\"data row79 col3\" >True</td>\n", + " <td id=\"T_0502a_row79_col4\" class=\"data row79 col4\" >True</td>\n", + " <td id=\"T_0502a_row79_col5\" class=\"data row79 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row79_col6\" class=\"data row79 col6\" >{'segments': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row79_col7\" class=\"data row79 col7\" >['model_performance', 'sklearn']</td>\n", + " <td id=\"T_0502a_row79_col8\" class=\"data row79 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row80_col0\" class=\"data row80 col0\" >validmind.model_validation.TokenDisparity</td>\n", + " <td id=\"T_0502a_row80_col1\" class=\"data row80 col1\" >Token Disparity</td>\n", + " <td id=\"T_0502a_row80_col2\" class=\"data row80 col2\" >Evaluates the token disparity between reference and generated texts, visualizing the results through histograms and...</td>\n", + " <td id=\"T_0502a_row80_col3\" class=\"data row80 col3\" >True</td>\n", + " <td id=\"T_0502a_row80_col4\" class=\"data row80 col4\" >True</td>\n", + " <td id=\"T_0502a_row80_col5\" class=\"data row80 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row80_col6\" class=\"data row80 col6\" >{}</td>\n", + " <td id=\"T_0502a_row80_col7\" class=\"data row80 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row80_col8\" class=\"data row80 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row81_col0\" class=\"data row81 col0\" >validmind.model_validation.ToxicityScore</td>\n", + " <td id=\"T_0502a_row81_col1\" class=\"data row81 col1\" >Toxicity Score</td>\n", + " <td id=\"T_0502a_row81_col2\" class=\"data row81 col2\" >Assesses the toxicity levels of texts generated by NLP models to identify and mitigate harmful or offensive content....</td>\n", + " <td id=\"T_0502a_row81_col3\" class=\"data row81 col3\" 
>True</td>\n", + " <td id=\"T_0502a_row81_col4\" class=\"data row81 col4\" >True</td>\n", + " <td id=\"T_0502a_row81_col5\" class=\"data row81 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row81_col6\" class=\"data row81 col6\" >{}</td>\n", + " <td id=\"T_0502a_row81_col7\" class=\"data row81 col7\" >['nlp', 'text_data', 'visualization']</td>\n", + " <td id=\"T_0502a_row81_col8\" class=\"data row81 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row82_col0\" class=\"data row82 col0\" >validmind.model_validation.embeddings.ClusterDistribution</td>\n", + " <td id=\"T_0502a_row82_col1\" class=\"data row82 col1\" >Cluster Distribution</td>\n", + " <td id=\"T_0502a_row82_col2\" class=\"data row82 col2\" >Assesses the distribution of text embeddings across clusters produced by a model using KMeans clustering....</td>\n", + " <td id=\"T_0502a_row82_col3\" class=\"data row82 col3\" >True</td>\n", + " <td id=\"T_0502a_row82_col4\" class=\"data row82 col4\" >False</td>\n", + " <td id=\"T_0502a_row82_col5\" class=\"data row82 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row82_col6\" class=\"data row82 col6\" >{'num_clusters': {'type': 'int', 'default': 5}}</td>\n", + " <td id=\"T_0502a_row82_col7\" class=\"data row82 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", + " <td id=\"T_0502a_row82_col8\" class=\"data row82 col8\" >['feature_extraction']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row83_col0\" class=\"data row83 col0\" >validmind.model_validation.embeddings.CosineSimilarityComparison</td>\n", + " <td id=\"T_0502a_row83_col1\" class=\"data row83 col1\" >Cosine Similarity Comparison</td>\n", + " <td id=\"T_0502a_row83_col2\" class=\"data row83 col2\" >Assesses the similarity between embeddings generated by different models using Cosine Similarity, providing both...</td>\n", + " <td id=\"T_0502a_row83_col3\" class=\"data row83 col3\" >True</td>\n", 
+ " <td id=\"T_0502a_row83_col4\" class=\"data row83 col4\" >True</td>\n", + " <td id=\"T_0502a_row83_col5\" class=\"data row83 col5\" >['dataset', 'models']</td>\n", + " <td id=\"T_0502a_row83_col6\" class=\"data row83 col6\" >{}</td>\n", + " <td id=\"T_0502a_row83_col7\" class=\"data row83 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", + " <td id=\"T_0502a_row83_col8\" class=\"data row83 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row84_col0\" class=\"data row84 col0\" >validmind.model_validation.embeddings.CosineSimilarityDistribution</td>\n", + " <td id=\"T_0502a_row84_col1\" class=\"data row84 col1\" >Cosine Similarity Distribution</td>\n", + " <td id=\"T_0502a_row84_col2\" class=\"data row84 col2\" >Assesses the similarity between predicted text embeddings from a model using a Cosine Similarity distribution...</td>\n", + " <td id=\"T_0502a_row84_col3\" class=\"data row84 col3\" >True</td>\n", + " <td id=\"T_0502a_row84_col4\" class=\"data row84 col4\" >False</td>\n", + " <td id=\"T_0502a_row84_col5\" class=\"data row84 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row84_col6\" class=\"data row84 col6\" >{}</td>\n", + " <td id=\"T_0502a_row84_col7\" class=\"data row84 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", + " <td id=\"T_0502a_row84_col8\" class=\"data row84 col8\" >['feature_extraction']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row85_col0\" class=\"data row85 col0\" >validmind.model_validation.embeddings.CosineSimilarityHeatmap</td>\n", + " <td id=\"T_0502a_row85_col1\" class=\"data row85 col1\" >Cosine Similarity Heatmap</td>\n", + " <td id=\"T_0502a_row85_col2\" class=\"data row85 col2\" >Generates an interactive heatmap to visualize the cosine similarities among embeddings derived from a given model....</td>\n", + " <td id=\"T_0502a_row85_col3\" class=\"data row85 col3\" >True</td>\n", + " <td 
id=\"T_0502a_row85_col4\" class=\"data row85 col4\" >False</td>\n", + " <td id=\"T_0502a_row85_col5\" class=\"data row85 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row85_col6\" class=\"data row85 col6\" >{'title': {'type': '_empty', 'default': 'Cosine Similarity Matrix'}, 'color': {'type': '_empty', 'default': 'Cosine Similarity'}, 'xaxis_title': {'type': '_empty', 'default': 'Index'}, 'yaxis_title': {'type': '_empty', 'default': 'Index'}, 'color_scale': {'type': '_empty', 'default': 'Blues'}}</td>\n", + " <td id=\"T_0502a_row85_col7\" class=\"data row85 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", + " <td id=\"T_0502a_row85_col8\" class=\"data row85 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row86_col0\" class=\"data row86 col0\" >validmind.model_validation.embeddings.DescriptiveAnalytics</td>\n", + " <td id=\"T_0502a_row86_col1\" class=\"data row86 col1\" >Descriptive Analytics</td>\n", + " <td id=\"T_0502a_row86_col2\" class=\"data row86 col2\" >Evaluates statistical properties of text embeddings in an ML model via mean, median, and standard deviation...</td>\n", + " <td id=\"T_0502a_row86_col3\" class=\"data row86 col3\" >True</td>\n", + " <td id=\"T_0502a_row86_col4\" class=\"data row86 col4\" >False</td>\n", + " <td id=\"T_0502a_row86_col5\" class=\"data row86 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row86_col6\" class=\"data row86 col6\" >{}</td>\n", + " <td id=\"T_0502a_row86_col7\" class=\"data row86 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", + " <td id=\"T_0502a_row86_col8\" class=\"data row86 col8\" >['feature_extraction']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row87_col0\" class=\"data row87 col0\" >validmind.model_validation.embeddings.EmbeddingsVisualization2D</td>\n", + " <td id=\"T_0502a_row87_col1\" class=\"data row87 col1\" >Embeddings Visualization2 D</td>\n", + " 
<td id=\"T_0502a_row87_col2\" class=\"data row87 col2\" >Visualizes 2D representation of text embeddings generated by a model using t-SNE technique....</td>\n", + " <td id=\"T_0502a_row87_col3\" class=\"data row87 col3\" >True</td>\n", + " <td id=\"T_0502a_row87_col4\" class=\"data row87 col4\" >False</td>\n", + " <td id=\"T_0502a_row87_col5\" class=\"data row87 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row87_col6\" class=\"data row87 col6\" >{'cluster_column': {'type': None, 'default': None}, 'perplexity': {'type': 'int', 'default': 30}}</td>\n", + " <td id=\"T_0502a_row87_col7\" class=\"data row87 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", + " <td id=\"T_0502a_row87_col8\" class=\"data row87 col8\" >['feature_extraction']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row88_col0\" class=\"data row88 col0\" >validmind.model_validation.embeddings.EuclideanDistanceComparison</td>\n", + " <td id=\"T_0502a_row88_col1\" class=\"data row88 col1\" >Euclidean Distance Comparison</td>\n", + " <td id=\"T_0502a_row88_col2\" class=\"data row88 col2\" >Assesses and visualizes the dissimilarity between model embeddings using Euclidean distance, providing insights...</td>\n", + " <td id=\"T_0502a_row88_col3\" class=\"data row88 col3\" >True</td>\n", + " <td id=\"T_0502a_row88_col4\" class=\"data row88 col4\" >True</td>\n", + " <td id=\"T_0502a_row88_col5\" class=\"data row88 col5\" >['dataset', 'models']</td>\n", + " <td id=\"T_0502a_row88_col6\" class=\"data row88 col6\" >{}</td>\n", + " <td id=\"T_0502a_row88_col7\" class=\"data row88 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", + " <td id=\"T_0502a_row88_col8\" class=\"data row88 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row89_col0\" class=\"data row89 col0\" >validmind.model_validation.embeddings.EuclideanDistanceHeatmap</td>\n", + " <td id=\"T_0502a_row89_col1\" 
class=\"data row89 col1\" >Euclidean Distance Heatmap</td>\n", + " <td id=\"T_0502a_row89_col2\" class=\"data row89 col2\" >Generates an interactive heatmap to visualize the Euclidean distances among embeddings derived from a given model....</td>\n", + " <td id=\"T_0502a_row89_col3\" class=\"data row89 col3\" >True</td>\n", + " <td id=\"T_0502a_row89_col4\" class=\"data row89 col4\" >False</td>\n", + " <td id=\"T_0502a_row89_col5\" class=\"data row89 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row89_col6\" class=\"data row89 col6\" >{'title': {'type': '_empty', 'default': 'Euclidean Distance Matrix'}, 'color': {'type': '_empty', 'default': 'Euclidean Distance'}, 'xaxis_title': {'type': '_empty', 'default': 'Index'}, 'yaxis_title': {'type': '_empty', 'default': 'Index'}, 'color_scale': {'type': '_empty', 'default': 'Blues'}}</td>\n", + " <td id=\"T_0502a_row89_col7\" class=\"data row89 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", + " <td id=\"T_0502a_row89_col8\" class=\"data row89 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row90_col0\" class=\"data row90 col0\" >validmind.model_validation.embeddings.PCAComponentsPairwisePlots</td>\n", + " <td id=\"T_0502a_row90_col1\" class=\"data row90 col1\" >PCA Components Pairwise Plots</td>\n", + " <td id=\"T_0502a_row90_col2\" class=\"data row90 col2\" >Generates scatter plots for pairwise combinations of principal component analysis (PCA) components of model...</td>\n", + " <td id=\"T_0502a_row90_col3\" class=\"data row90 col3\" >True</td>\n", + " <td id=\"T_0502a_row90_col4\" class=\"data row90 col4\" >False</td>\n", + " <td id=\"T_0502a_row90_col5\" class=\"data row90 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row90_col6\" class=\"data row90 col6\" >{'n_components': {'type': 'int', 'default': 3}}</td>\n", + " <td id=\"T_0502a_row90_col7\" class=\"data row90 col7\" >['visualization', 
'dimensionality_reduction', 'embeddings']</td>\n", + " <td id=\"T_0502a_row90_col8\" class=\"data row90 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row91_col0\" class=\"data row91 col0\" >validmind.model_validation.embeddings.StabilityAnalysisKeyword</td>\n", + " <td id=\"T_0502a_row91_col1\" class=\"data row91 col1\" >Stability Analysis Keyword</td>\n", + " <td id=\"T_0502a_row91_col2\" class=\"data row91 col2\" >Evaluates robustness of embedding models to keyword swaps in the test dataset....</td>\n", + " <td id=\"T_0502a_row91_col3\" class=\"data row91 col3\" >True</td>\n", + " <td id=\"T_0502a_row91_col4\" class=\"data row91 col4\" >True</td>\n", + " <td id=\"T_0502a_row91_col5\" class=\"data row91 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row91_col6\" class=\"data row91 col6\" >{'keyword_dict': {'type': None, 'default': None}, 'mean_similarity_threshold': {'type': 'float', 'default': 0.7}}</td>\n", + " <td id=\"T_0502a_row91_col7\" class=\"data row91 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", + " <td id=\"T_0502a_row91_col8\" class=\"data row91 col8\" >['feature_extraction']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row92_col0\" class=\"data row92 col0\" >validmind.model_validation.embeddings.StabilityAnalysisRandomNoise</td>\n", + " <td id=\"T_0502a_row92_col1\" class=\"data row92 col1\" >Stability Analysis Random Noise</td>\n", + " <td id=\"T_0502a_row92_col2\" class=\"data row92 col2\" >Assesses the robustness of text embeddings models to random noise introduced via text perturbations....</td>\n", + " <td id=\"T_0502a_row92_col3\" class=\"data row92 col3\" >True</td>\n", + " <td id=\"T_0502a_row92_col4\" class=\"data row92 col4\" >True</td>\n", + " <td id=\"T_0502a_row92_col5\" class=\"data row92 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row92_col6\" class=\"data row92 col6\" >{'probability': {'type': 'float', 
'default': 0.02}, 'mean_similarity_threshold': {'type': 'float', 'default': 0.7}}</td>\n", + " <td id=\"T_0502a_row92_col7\" class=\"data row92 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", + " <td id=\"T_0502a_row92_col8\" class=\"data row92 col8\" >['feature_extraction']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row93_col0\" class=\"data row93 col0\" >validmind.model_validation.embeddings.StabilityAnalysisSynonyms</td>\n", + " <td id=\"T_0502a_row93_col1\" class=\"data row93 col1\" >Stability Analysis Synonyms</td>\n", + " <td id=\"T_0502a_row93_col2\" class=\"data row93 col2\" >Evaluates the stability of text embeddings models when words in test data are replaced by their synonyms randomly....</td>\n", + " <td id=\"T_0502a_row93_col3\" class=\"data row93 col3\" >True</td>\n", + " <td id=\"T_0502a_row93_col4\" class=\"data row93 col4\" >True</td>\n", + " <td id=\"T_0502a_row93_col5\" class=\"data row93 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row93_col6\" class=\"data row93 col6\" >{'probability': {'type': 'float', 'default': 0.02}, 'mean_similarity_threshold': {'type': 'float', 'default': 0.7}}</td>\n", + " <td id=\"T_0502a_row93_col7\" class=\"data row93 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", + " <td id=\"T_0502a_row93_col8\" class=\"data row93 col8\" >['feature_extraction']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row94_col0\" class=\"data row94 col0\" >validmind.model_validation.embeddings.StabilityAnalysisTranslation</td>\n", + " <td id=\"T_0502a_row94_col1\" class=\"data row94 col1\" >Stability Analysis Translation</td>\n", + " <td id=\"T_0502a_row94_col2\" class=\"data row94 col2\" >Evaluates robustness of text embeddings models to noise introduced by translating the original text to another...</td>\n", + " <td id=\"T_0502a_row94_col3\" class=\"data row94 col3\" >True</td>\n", + " <td id=\"T_0502a_row94_col4\" class=\"data row94 col4\" >True</td>\n", + " 
<td id=\"T_0502a_row94_col5\" class=\"data row94 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row94_col6\" class=\"data row94 col6\" >{'source_lang': {'type': 'str', 'default': 'en'}, 'target_lang': {'type': 'str', 'default': 'fr'}, 'mean_similarity_threshold': {'type': 'float', 'default': 0.7}}</td>\n", + " <td id=\"T_0502a_row94_col7\" class=\"data row94 col7\" >['llm', 'text_data', 'embeddings', 'visualization']</td>\n", + " <td id=\"T_0502a_row94_col8\" class=\"data row94 col8\" >['feature_extraction']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row95_col0\" class=\"data row95 col0\" >validmind.model_validation.embeddings.TSNEComponentsPairwisePlots</td>\n", + " <td id=\"T_0502a_row95_col1\" class=\"data row95 col1\" >TSNE Components Pairwise Plots</td>\n", + " <td id=\"T_0502a_row95_col2\" class=\"data row95 col2\" >Creates scatter plots for pairwise combinations of t-SNE components to visualize embeddings and highlight potential...</td>\n", + " <td id=\"T_0502a_row95_col3\" class=\"data row95 col3\" >True</td>\n", + " <td id=\"T_0502a_row95_col4\" class=\"data row95 col4\" >False</td>\n", + " <td id=\"T_0502a_row95_col5\" class=\"data row95 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row95_col6\" class=\"data row95 col6\" >{'n_components': {'type': 'int', 'default': 2}, 'perplexity': {'type': 'int', 'default': 30}, 'title': {'type': 'str', 'default': 't-SNE'}}</td>\n", + " <td id=\"T_0502a_row95_col7\" class=\"data row95 col7\" >['visualization', 'dimensionality_reduction', 'embeddings']</td>\n", + " <td id=\"T_0502a_row95_col8\" class=\"data row95 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row96_col0\" class=\"data row96 col0\" >validmind.model_validation.ragas.AnswerCorrectness</td>\n", + " <td id=\"T_0502a_row96_col1\" class=\"data row96 col1\" >Answer Correctness</td>\n", + " <td id=\"T_0502a_row96_col2\" class=\"data row96 col2\" 
>Evaluates the correctness of answers in a dataset with respect to the provided ground...</td>\n", + " <td id=\"T_0502a_row96_col3\" class=\"data row96 col3\" >True</td>\n", + " <td id=\"T_0502a_row96_col4\" class=\"data row96 col4\" >True</td>\n", + " <td id=\"T_0502a_row96_col5\" class=\"data row96 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row96_col6\" class=\"data row96 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'response_column': {'type': 'str', 'default': 'response'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row96_col7\" class=\"data row96 col7\" >['ragas', 'llm']</td>\n", + " <td id=\"T_0502a_row96_col8\" class=\"data row96 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row97_col0\" class=\"data row97 col0\" >validmind.model_validation.ragas.AspectCritic</td>\n", + " <td id=\"T_0502a_row97_col1\" class=\"data row97 col1\" >Aspect Critic</td>\n", + " <td id=\"T_0502a_row97_col2\" class=\"data row97 col2\" >Evaluates generations against the following aspects: harmfulness, maliciousness,...</td>\n", + " <td id=\"T_0502a_row97_col3\" class=\"data row97 col3\" >True</td>\n", + " <td id=\"T_0502a_row97_col4\" class=\"data row97 col4\" >True</td>\n", + " <td id=\"T_0502a_row97_col5\" class=\"data row97 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row97_col6\" class=\"data row97 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'response_column': {'type': 'str', 'default': 'response'}, 'retrieved_contexts_column': {'type': None, 'default': None}, 'aspects': {'type': None, 'default': ['coherence', 'conciseness', 'correctness', 'harmfulness', 'maliciousness']}, 'additional_aspects': {'type': None, 'default': None}, 'judge_llm': {'type': '_empty', 'default': None}, 
'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row97_col7\" class=\"data row97 col7\" >['ragas', 'llm', 'qualitative']</td>\n", + " <td id=\"T_0502a_row97_col8\" class=\"data row97 col8\" >['text_summarization', 'text_generation', 'text_qa']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row98_col0\" class=\"data row98 col0\" >validmind.model_validation.ragas.ContextEntityRecall</td>\n", + " <td id=\"T_0502a_row98_col1\" class=\"data row98 col1\" >Context Entity Recall</td>\n", + " <td id=\"T_0502a_row98_col2\" class=\"data row98 col2\" >Evaluates the context entity recall for dataset entries and visualizes the results....</td>\n", + " <td id=\"T_0502a_row98_col3\" class=\"data row98 col3\" >True</td>\n", + " <td id=\"T_0502a_row98_col4\" class=\"data row98 col4\" >True</td>\n", + " <td id=\"T_0502a_row98_col5\" class=\"data row98 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row98_col6\" class=\"data row98 col6\" >{'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row98_col7\" class=\"data row98 col7\" >['ragas', 'llm', 'retrieval_performance']</td>\n", + " <td id=\"T_0502a_row98_col8\" class=\"data row98 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row99_col0\" class=\"data row99 col0\" >validmind.model_validation.ragas.ContextPrecision</td>\n", + " <td id=\"T_0502a_row99_col1\" class=\"data row99 col1\" >Context Precision</td>\n", + " <td id=\"T_0502a_row99_col2\" class=\"data row99 col2\" >Context Precision is a metric that evaluates whether all of the ground-truth...</td>\n", + " <td id=\"T_0502a_row99_col3\" class=\"data row99 col3\" >True</td>\n", + " <td id=\"T_0502a_row99_col4\" class=\"data row99 col4\" 
>True</td>\n", + " <td id=\"T_0502a_row99_col5\" class=\"data row99 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row99_col6\" class=\"data row99 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row99_col7\" class=\"data row99 col7\" >['ragas', 'llm', 'retrieval_performance']</td>\n", + " <td id=\"T_0502a_row99_col8\" class=\"data row99 col8\" >['text_qa', 'text_generation', 'text_summarization', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row100_col0\" class=\"data row100 col0\" >validmind.model_validation.ragas.ContextPrecisionWithoutReference</td>\n", + " <td id=\"T_0502a_row100_col1\" class=\"data row100 col1\" >Context Precision Without Reference</td>\n", + " <td id=\"T_0502a_row100_col2\" class=\"data row100 col2\" >Context Precision Without Reference is a metric used to evaluate the relevance of...</td>\n", + " <td id=\"T_0502a_row100_col3\" class=\"data row100 col3\" >True</td>\n", + " <td id=\"T_0502a_row100_col4\" class=\"data row100 col4\" >True</td>\n", + " <td id=\"T_0502a_row100_col5\" class=\"data row100 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row100_col6\" class=\"data row100 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'response_column': {'type': 'str', 'default': 'response'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row100_col7\" class=\"data row100 col7\" >['ragas', 'llm', 'retrieval_performance']</td>\n", + " <td id=\"T_0502a_row100_col8\" class=\"data row100 col8\" >['text_qa', 'text_generation', 
'text_summarization', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row101_col0\" class=\"data row101 col0\" >validmind.model_validation.ragas.ContextRecall</td>\n", + " <td id=\"T_0502a_row101_col1\" class=\"data row101 col1\" >Context Recall</td>\n", + " <td id=\"T_0502a_row101_col2\" class=\"data row101 col2\" >Context recall measures the extent to which the retrieved context aligns with the...</td>\n", + " <td id=\"T_0502a_row101_col3\" class=\"data row101 col3\" >True</td>\n", + " <td id=\"T_0502a_row101_col4\" class=\"data row101 col4\" >True</td>\n", + " <td id=\"T_0502a_row101_col5\" class=\"data row101 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row101_col6\" class=\"data row101 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row101_col7\" class=\"data row101 col7\" >['ragas', 'llm', 'retrieval_performance']</td>\n", + " <td id=\"T_0502a_row101_col8\" class=\"data row101 col8\" >['text_qa', 'text_generation', 'text_summarization', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row102_col0\" class=\"data row102 col0\" >validmind.model_validation.ragas.Faithfulness</td>\n", + " <td id=\"T_0502a_row102_col1\" class=\"data row102 col1\" >Faithfulness</td>\n", + " <td id=\"T_0502a_row102_col2\" class=\"data row102 col2\" >Evaluates the faithfulness of the generated answers with respect to retrieved contexts....</td>\n", + " <td id=\"T_0502a_row102_col3\" class=\"data row102 col3\" >True</td>\n", + " <td id=\"T_0502a_row102_col4\" class=\"data row102 col4\" >True</td>\n", + " <td id=\"T_0502a_row102_col5\" class=\"data row102 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row102_col6\" 
class=\"data row102 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'response_column': {'type': 'str', 'default': 'response'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row102_col7\" class=\"data row102 col7\" >['ragas', 'llm', 'rag_performance']</td>\n", + " <td id=\"T_0502a_row102_col8\" class=\"data row102 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row103_col0\" class=\"data row103 col0\" >validmind.model_validation.ragas.NoiseSensitivity</td>\n", + " <td id=\"T_0502a_row103_col1\" class=\"data row103 col1\" >Noise Sensitivity</td>\n", + " <td id=\"T_0502a_row103_col2\" class=\"data row103 col2\" >Assesses the sensitivity of a Large Language Model (LLM) to noise in retrieved context by measuring how often it...</td>\n", + " <td id=\"T_0502a_row103_col3\" class=\"data row103 col3\" >True</td>\n", + " <td id=\"T_0502a_row103_col4\" class=\"data row103 col4\" >True</td>\n", + " <td id=\"T_0502a_row103_col5\" class=\"data row103 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row103_col6\" class=\"data row103 col6\" >{'response_column': {'type': 'str', 'default': 'response'}, 'retrieved_contexts_column': {'type': 'str', 'default': 'retrieved_contexts'}, 'reference_column': {'type': 'str', 'default': 'reference'}, 'focus': {'type': 'str', 'default': 'relevant'}, 'user_input_column': {'type': 'str', 'default': 'user_input'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row103_col7\" class=\"data row103 col7\" >['ragas', 'llm', 'rag_performance']</td>\n", + " <td id=\"T_0502a_row103_col8\" class=\"data row103 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " 
<td id=\"T_0502a_row104_col0\" class=\"data row104 col0\" >validmind.model_validation.ragas.ResponseRelevancy</td>\n", + " <td id=\"T_0502a_row104_col1\" class=\"data row104 col1\" >Response Relevancy</td>\n", + " <td id=\"T_0502a_row104_col2\" class=\"data row104 col2\" >Assesses how pertinent the generated answer is to the given prompt....</td>\n", + " <td id=\"T_0502a_row104_col3\" class=\"data row104 col3\" >True</td>\n", + " <td id=\"T_0502a_row104_col4\" class=\"data row104 col4\" >True</td>\n", + " <td id=\"T_0502a_row104_col5\" class=\"data row104 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row104_col6\" class=\"data row104 col6\" >{'user_input_column': {'type': 'str', 'default': 'user_input'}, 'retrieved_contexts_column': {'type': 'str', 'default': None}, 'response_column': {'type': 'str', 'default': 'response'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row104_col7\" class=\"data row104 col7\" >['ragas', 'llm', 'rag_performance']</td>\n", + " <td id=\"T_0502a_row104_col8\" class=\"data row104 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row105_col0\" class=\"data row105 col0\" >validmind.model_validation.ragas.SemanticSimilarity</td>\n", + " <td id=\"T_0502a_row105_col1\" class=\"data row105 col1\" >Semantic Similarity</td>\n", + " <td id=\"T_0502a_row105_col2\" class=\"data row105 col2\" >Calculates the semantic similarity between generated responses and ground truths...</td>\n", + " <td id=\"T_0502a_row105_col3\" class=\"data row105 col3\" >True</td>\n", + " <td id=\"T_0502a_row105_col4\" class=\"data row105 col4\" >True</td>\n", + " <td id=\"T_0502a_row105_col5\" class=\"data row105 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row105_col6\" class=\"data row105 col6\" >{'response_column': {'type': 'str', 'default': 'response'}, 'reference_column': {'type': 'str', 'default': 
'reference'}, 'judge_llm': {'type': '_empty', 'default': None}, 'judge_embeddings': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row105_col7\" class=\"data row105 col7\" >['ragas', 'llm']</td>\n", + " <td id=\"T_0502a_row105_col8\" class=\"data row105 col8\" >['text_qa', 'text_generation', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row106_col0\" class=\"data row106 col0\" >validmind.model_validation.sklearn.AdjustedMutualInformation</td>\n", + " <td id=\"T_0502a_row106_col1\" class=\"data row106 col1\" >Adjusted Mutual Information</td>\n", + " <td id=\"T_0502a_row106_col2\" class=\"data row106 col2\" >Evaluates clustering model performance by measuring mutual information between true and predicted labels, adjusting...</td>\n", + " <td id=\"T_0502a_row106_col3\" class=\"data row106 col3\" >False</td>\n", + " <td id=\"T_0502a_row106_col4\" class=\"data row106 col4\" >True</td>\n", + " <td id=\"T_0502a_row106_col5\" class=\"data row106 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row106_col6\" class=\"data row106 col6\" >{}</td>\n", + " <td id=\"T_0502a_row106_col7\" class=\"data row106 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_0502a_row106_col8\" class=\"data row106 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row107_col0\" class=\"data row107 col0\" >validmind.model_validation.sklearn.AdjustedRandIndex</td>\n", + " <td id=\"T_0502a_row107_col1\" class=\"data row107 col1\" >Adjusted Rand Index</td>\n", + " <td id=\"T_0502a_row107_col2\" class=\"data row107 col2\" >Measures the similarity between two data clusters using the Adjusted Rand Index (ARI) metric in clustering machine...</td>\n", + " <td id=\"T_0502a_row107_col3\" class=\"data row107 col3\" >False</td>\n", + " <td id=\"T_0502a_row107_col4\" class=\"data row107 col4\" >True</td>\n", + " <td id=\"T_0502a_row107_col5\" class=\"data row107 col5\" >['model', 
'dataset']</td>\n", + " <td id=\"T_0502a_row107_col6\" class=\"data row107 col6\" >{}</td>\n", + " <td id=\"T_0502a_row107_col7\" class=\"data row107 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_0502a_row107_col8\" class=\"data row107 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row108_col0\" class=\"data row108 col0\" >validmind.model_validation.sklearn.CalibrationCurve</td>\n", + " <td id=\"T_0502a_row108_col1\" class=\"data row108 col1\" >Calibration Curve</td>\n", + " <td id=\"T_0502a_row108_col2\" class=\"data row108 col2\" >Evaluates the calibration of probability estimates by comparing predicted probabilities against observed...</td>\n", + " <td id=\"T_0502a_row108_col3\" class=\"data row108 col3\" >True</td>\n", + " <td id=\"T_0502a_row108_col4\" class=\"data row108 col4\" >False</td>\n", + " <td id=\"T_0502a_row108_col5\" class=\"data row108 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row108_col6\" class=\"data row108 col6\" >{'n_bins': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_0502a_row108_col7\" class=\"data row108 col7\" >['sklearn', 'model_performance', 'classification']</td>\n", + " <td id=\"T_0502a_row108_col8\" class=\"data row108 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row109_col0\" class=\"data row109 col0\" >validmind.model_validation.sklearn.ClassifierPerformance</td>\n", + " <td id=\"T_0502a_row109_col1\" class=\"data row109 col1\" >Classifier Performance</td>\n", + " <td id=\"T_0502a_row109_col2\" class=\"data row109 col2\" >Evaluates performance of binary or multiclass classification models using precision, recall, F1-Score, accuracy,...</td>\n", + " <td id=\"T_0502a_row109_col3\" class=\"data row109 col3\" >False</td>\n", + " <td id=\"T_0502a_row109_col4\" class=\"data row109 col4\" >True</td>\n", + " <td id=\"T_0502a_row109_col5\" class=\"data row109 col5\" >['dataset', 'model']</td>\n", + " <td 
id=\"T_0502a_row109_col6\" class=\"data row109 col6\" >{'average': {'type': 'str', 'default': 'macro'}}</td>\n", + " <td id=\"T_0502a_row109_col7\" class=\"data row109 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_0502a_row109_col8\" class=\"data row109 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row110_col0\" class=\"data row110 col0\" >validmind.model_validation.sklearn.ClassifierThresholdOptimization</td>\n", + " <td id=\"T_0502a_row110_col1\" class=\"data row110 col1\" >Classifier Threshold Optimization</td>\n", + " <td id=\"T_0502a_row110_col2\" class=\"data row110 col2\" >Analyzes and visualizes different threshold optimization methods for binary classification models....</td>\n", + " <td id=\"T_0502a_row110_col3\" class=\"data row110 col3\" >False</td>\n", + " <td id=\"T_0502a_row110_col4\" class=\"data row110 col4\" >True</td>\n", + " <td id=\"T_0502a_row110_col5\" class=\"data row110 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row110_col6\" class=\"data row110 col6\" >{'methods': {'type': None, 'default': None}, 'target_recall': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row110_col7\" class=\"data row110 col7\" >['model_validation', 'threshold_optimization', 'classification_metrics']</td>\n", + " <td id=\"T_0502a_row110_col8\" class=\"data row110 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row111_col0\" class=\"data row111 col0\" >validmind.model_validation.sklearn.ClusterCosineSimilarity</td>\n", + " <td id=\"T_0502a_row111_col1\" class=\"data row111 col1\" >Cluster Cosine Similarity</td>\n", + " <td id=\"T_0502a_row111_col2\" class=\"data row111 col2\" >Measures the intra-cluster similarity of a clustering model using cosine similarity....</td>\n", + " <td id=\"T_0502a_row111_col3\" class=\"data row111 col3\" >False</td>\n", + " <td 
id=\"T_0502a_row111_col4\" class=\"data row111 col4\" >True</td>\n", + " <td id=\"T_0502a_row111_col5\" class=\"data row111 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row111_col6\" class=\"data row111 col6\" >{}</td>\n", + " <td id=\"T_0502a_row111_col7\" class=\"data row111 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_0502a_row111_col8\" class=\"data row111 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row112_col0\" class=\"data row112 col0\" >validmind.model_validation.sklearn.ClusterPerformanceMetrics</td>\n", + " <td id=\"T_0502a_row112_col1\" class=\"data row112 col1\" >Cluster Performance Metrics</td>\n", + " <td id=\"T_0502a_row112_col2\" class=\"data row112 col2\" >Evaluates the performance of clustering machine learning models using multiple established metrics....</td>\n", + " <td id=\"T_0502a_row112_col3\" class=\"data row112 col3\" >False</td>\n", + " <td id=\"T_0502a_row112_col4\" class=\"data row112 col4\" >True</td>\n", + " <td id=\"T_0502a_row112_col5\" class=\"data row112 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row112_col6\" class=\"data row112 col6\" >{}</td>\n", + " <td id=\"T_0502a_row112_col7\" class=\"data row112 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_0502a_row112_col8\" class=\"data row112 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row113_col0\" class=\"data row113 col0\" >validmind.model_validation.sklearn.CompletenessScore</td>\n", + " <td id=\"T_0502a_row113_col1\" class=\"data row113 col1\" >Completeness Score</td>\n", + " <td id=\"T_0502a_row113_col2\" class=\"data row113 col2\" >Evaluates a clustering model's capacity to categorize instances from a single class into the same cluster....</td>\n", + " <td id=\"T_0502a_row113_col3\" class=\"data row113 col3\" >False</td>\n", + " <td id=\"T_0502a_row113_col4\" class=\"data row113 col4\" >True</td>\n", + " <td 
id=\"T_0502a_row113_col5\" class=\"data row113 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row113_col6\" class=\"data row113 col6\" >{}</td>\n", + " <td id=\"T_0502a_row113_col7\" class=\"data row113 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_0502a_row113_col8\" class=\"data row113 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row114_col0\" class=\"data row114 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", + " <td id=\"T_0502a_row114_col1\" class=\"data row114 col1\" >Confusion Matrix</td>\n", + " <td id=\"T_0502a_row114_col2\" class=\"data row114 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", + " <td id=\"T_0502a_row114_col3\" class=\"data row114 col3\" >True</td>\n", + " <td id=\"T_0502a_row114_col4\" class=\"data row114 col4\" >False</td>\n", + " <td id=\"T_0502a_row114_col5\" class=\"data row114 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row114_col6\" class=\"data row114 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_0502a_row114_col7\" class=\"data row114 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_0502a_row114_col8\" class=\"data row114 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row115_col0\" class=\"data row115 col0\" >validmind.model_validation.sklearn.FeatureImportance</td>\n", + " <td id=\"T_0502a_row115_col1\" class=\"data row115 col1\" >Feature Importance</td>\n", + " <td id=\"T_0502a_row115_col2\" class=\"data row115 col2\" >Compute feature importance scores for a given model and generate a summary table...</td>\n", + " <td id=\"T_0502a_row115_col3\" class=\"data row115 col3\" >False</td>\n", + " <td id=\"T_0502a_row115_col4\" class=\"data row115 col4\" 
>True</td>\n", + " <td id=\"T_0502a_row115_col5\" class=\"data row115 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row115_col6\" class=\"data row115 col6\" >{'num_features': {'type': 'int', 'default': 3}}</td>\n", + " <td id=\"T_0502a_row115_col7\" class=\"data row115 col7\" >['model_explainability', 'sklearn']</td>\n", + " <td id=\"T_0502a_row115_col8\" class=\"data row115 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row116_col0\" class=\"data row116 col0\" >validmind.model_validation.sklearn.FowlkesMallowsScore</td>\n", + " <td id=\"T_0502a_row116_col1\" class=\"data row116 col1\" >Fowlkes Mallows Score</td>\n", + " <td id=\"T_0502a_row116_col2\" class=\"data row116 col2\" >Evaluates the similarity between predicted and actual cluster assignments in a model using the Fowlkes-Mallows...</td>\n", + " <td id=\"T_0502a_row116_col3\" class=\"data row116 col3\" >False</td>\n", + " <td id=\"T_0502a_row116_col4\" class=\"data row116 col4\" >True</td>\n", + " <td id=\"T_0502a_row116_col5\" class=\"data row116 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row116_col6\" class=\"data row116 col6\" >{}</td>\n", + " <td id=\"T_0502a_row116_col7\" class=\"data row116 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row116_col8\" class=\"data row116 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row117_col0\" class=\"data row117 col0\" >validmind.model_validation.sklearn.HomogeneityScore</td>\n", + " <td id=\"T_0502a_row117_col1\" class=\"data row117 col1\" >Homogeneity Score</td>\n", + " <td id=\"T_0502a_row117_col2\" class=\"data row117 col2\" >Assesses clustering homogeneity by comparing true and predicted labels, scoring from 0 (heterogeneous) to 1...</td>\n", + " <td id=\"T_0502a_row117_col3\" class=\"data row117 col3\" >False</td>\n", + " <td id=\"T_0502a_row117_col4\" class=\"data row117 col4\" >True</td>\n", + " <td 
id=\"T_0502a_row117_col5\" class=\"data row117 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row117_col6\" class=\"data row117 col6\" >{}</td>\n", + " <td id=\"T_0502a_row117_col7\" class=\"data row117 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row117_col8\" class=\"data row117 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row118_col0\" class=\"data row118 col0\" >validmind.model_validation.sklearn.HyperParametersTuning</td>\n", + " <td id=\"T_0502a_row118_col1\" class=\"data row118 col1\" >Hyper Parameters Tuning</td>\n", + " <td id=\"T_0502a_row118_col2\" class=\"data row118 col2\" >Performs exhaustive grid search over specified parameter ranges to find optimal model configurations...</td>\n", + " <td id=\"T_0502a_row118_col3\" class=\"data row118 col3\" >False</td>\n", + " <td id=\"T_0502a_row118_col4\" class=\"data row118 col4\" >True</td>\n", + " <td id=\"T_0502a_row118_col5\" class=\"data row118 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row118_col6\" class=\"data row118 col6\" >{'param_grid': {'type': 'dict', 'default': None}, 'scoring': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}, 'fit_params': {'type': 'dict', 'default': None}}</td>\n", + " <td id=\"T_0502a_row118_col7\" class=\"data row118 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row118_col8\" class=\"data row118 col8\" >['clustering', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row119_col0\" class=\"data row119 col0\" >validmind.model_validation.sklearn.KMeansClustersOptimization</td>\n", + " <td id=\"T_0502a_row119_col1\" class=\"data row119 col1\" >K Means Clusters Optimization</td>\n", + " <td id=\"T_0502a_row119_col2\" class=\"data row119 col2\" >Optimizes the number of clusters in K-means models using Elbow and Silhouette methods....</td>\n", + " <td id=\"T_0502a_row119_col3\" class=\"data row119 col3\" 
>True</td>\n", + " <td id=\"T_0502a_row119_col4\" class=\"data row119 col4\" >False</td>\n", + " <td id=\"T_0502a_row119_col5\" class=\"data row119 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row119_col6\" class=\"data row119 col6\" >{'n_clusters': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row119_col7\" class=\"data row119 col7\" >['sklearn', 'model_performance', 'kmeans']</td>\n", + " <td id=\"T_0502a_row119_col8\" class=\"data row119 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row120_col0\" class=\"data row120 col0\" >validmind.model_validation.sklearn.MinimumAccuracy</td>\n", + " <td id=\"T_0502a_row120_col1\" class=\"data row120 col1\" >Minimum Accuracy</td>\n", + " <td id=\"T_0502a_row120_col2\" class=\"data row120 col2\" >Checks if the model's prediction accuracy meets or surpasses a specified threshold....</td>\n", + " <td id=\"T_0502a_row120_col3\" class=\"data row120 col3\" >False</td>\n", + " <td id=\"T_0502a_row120_col4\" class=\"data row120 col4\" >True</td>\n", + " <td id=\"T_0502a_row120_col5\" class=\"data row120 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row120_col6\" class=\"data row120 col6\" >{'min_threshold': {'type': 'float', 'default': 0.7}}</td>\n", + " <td id=\"T_0502a_row120_col7\" class=\"data row120 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_0502a_row120_col8\" class=\"data row120 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row121_col0\" class=\"data row121 col0\" >validmind.model_validation.sklearn.MinimumF1Score</td>\n", + " <td id=\"T_0502a_row121_col1\" class=\"data row121 col1\" >Minimum F1 Score</td>\n", + " <td id=\"T_0502a_row121_col2\" class=\"data row121 col2\" >Assesses if the model's F1 score on the validation set meets a predefined minimum threshold, ensuring balanced...</td>\n", + " <td 
id=\"T_0502a_row121_col3\" class=\"data row121 col3\" >False</td>\n", + " <td id=\"T_0502a_row121_col4\" class=\"data row121 col4\" >True</td>\n", + " <td id=\"T_0502a_row121_col5\" class=\"data row121 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row121_col6\" class=\"data row121 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_0502a_row121_col7\" class=\"data row121 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_0502a_row121_col8\" class=\"data row121 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row122_col0\" class=\"data row122 col0\" >validmind.model_validation.sklearn.MinimumROCAUCScore</td>\n", + " <td id=\"T_0502a_row122_col1\" class=\"data row122 col1\" >Minimum ROCAUC Score</td>\n", + " <td id=\"T_0502a_row122_col2\" class=\"data row122 col2\" >Validates model by checking if the ROC AUC score meets or surpasses a specified threshold....</td>\n", + " <td id=\"T_0502a_row122_col3\" class=\"data row122 col3\" >False</td>\n", + " <td id=\"T_0502a_row122_col4\" class=\"data row122 col4\" >True</td>\n", + " <td id=\"T_0502a_row122_col5\" class=\"data row122 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row122_col6\" class=\"data row122 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_0502a_row122_col7\" class=\"data row122 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_0502a_row122_col8\" class=\"data row122 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row123_col0\" class=\"data row123 col0\" >validmind.model_validation.sklearn.ModelParameters</td>\n", + " <td id=\"T_0502a_row123_col1\" class=\"data row123 col1\" >Model Parameters</td>\n", + " <td id=\"T_0502a_row123_col2\" class=\"data row123 col2\" 
>Extracts and displays model parameters in a structured format for transparency and reproducibility....</td>\n", + " <td id=\"T_0502a_row123_col3\" class=\"data row123 col3\" >False</td>\n", + " <td id=\"T_0502a_row123_col4\" class=\"data row123 col4\" >True</td>\n", + " <td id=\"T_0502a_row123_col5\" class=\"data row123 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row123_col6\" class=\"data row123 col6\" >{'model_params': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row123_col7\" class=\"data row123 col7\" >['model_training', 'metadata']</td>\n", + " <td id=\"T_0502a_row123_col8\" class=\"data row123 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row124_col0\" class=\"data row124 col0\" >validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", + " <td id=\"T_0502a_row124_col1\" class=\"data row124 col1\" >Models Performance Comparison</td>\n", + " <td id=\"T_0502a_row124_col2\" class=\"data row124 col2\" >Evaluates and compares the performance of multiple Machine Learning models using various metrics like accuracy,...</td>\n", + " <td id=\"T_0502a_row124_col3\" class=\"data row124 col3\" >False</td>\n", + " <td id=\"T_0502a_row124_col4\" class=\"data row124 col4\" >True</td>\n", + " <td id=\"T_0502a_row124_col5\" class=\"data row124 col5\" >['dataset', 'models']</td>\n", + " <td id=\"T_0502a_row124_col6\" class=\"data row124 col6\" >{}</td>\n", + " <td id=\"T_0502a_row124_col7\" class=\"data row124 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'model_comparison']</td>\n", + " <td id=\"T_0502a_row124_col8\" class=\"data row124 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row125_col0\" class=\"data row125 col0\" >validmind.model_validation.sklearn.OverfitDiagnosis</td>\n", + " <td id=\"T_0502a_row125_col1\" class=\"data row125 col1\" >Overfit Diagnosis</td>\n", + " <td 
id=\"T_0502a_row125_col2\" class=\"data row125 col2\" >Assesses potential overfitting in a model's predictions, identifying regions where performance between training and...</td>\n", + " <td id=\"T_0502a_row125_col3\" class=\"data row125 col3\" >True</td>\n", + " <td id=\"T_0502a_row125_col4\" class=\"data row125 col4\" >True</td>\n", + " <td id=\"T_0502a_row125_col5\" class=\"data row125 col5\" >['model', 'datasets']</td>\n", + " <td id=\"T_0502a_row125_col6\" class=\"data row125 col6\" >{'metric': {'type': 'str', 'default': None}, 'cut_off_threshold': {'type': 'float', 'default': 0.04}}</td>\n", + " <td id=\"T_0502a_row125_col7\" class=\"data row125 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'linear_regression', 'model_diagnosis']</td>\n", + " <td id=\"T_0502a_row125_col8\" class=\"data row125 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row126_col0\" class=\"data row126 col0\" >validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", + " <td id=\"T_0502a_row126_col1\" class=\"data row126 col1\" >Permutation Feature Importance</td>\n", + " <td id=\"T_0502a_row126_col2\" class=\"data row126 col2\" >Assesses the significance of each feature in a model by evaluating the impact on model performance when feature...</td>\n", + " <td id=\"T_0502a_row126_col3\" class=\"data row126 col3\" >True</td>\n", + " <td id=\"T_0502a_row126_col4\" class=\"data row126 col4\" >False</td>\n", + " <td id=\"T_0502a_row126_col5\" class=\"data row126 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row126_col6\" class=\"data row126 col6\" >{'fontsize': {'type': None, 'default': None}, 'figure_height': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row126_col7\" class=\"data row126 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", + " <td id=\"T_0502a_row126_col8\" class=\"data row126 col8\" 
>['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row127_col0\" class=\"data row127 col0\" >validmind.model_validation.sklearn.PopulationStabilityIndex</td>\n", + " <td id=\"T_0502a_row127_col1\" class=\"data row127 col1\" >Population Stability Index</td>\n", + " <td id=\"T_0502a_row127_col2\" class=\"data row127 col2\" >Assesses the Population Stability Index (PSI) to quantify the stability of an ML model's predictions across...</td>\n", + " <td id=\"T_0502a_row127_col3\" class=\"data row127 col3\" >True</td>\n", + " <td id=\"T_0502a_row127_col4\" class=\"data row127 col4\" >True</td>\n", + " <td id=\"T_0502a_row127_col5\" class=\"data row127 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row127_col6\" class=\"data row127 col6\" >{'num_bins': {'type': 'int', 'default': 10}, 'mode': {'type': 'str', 'default': 'fixed'}}</td>\n", + " <td id=\"T_0502a_row127_col7\" class=\"data row127 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_0502a_row127_col8\" class=\"data row127 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row128_col0\" class=\"data row128 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", + " <td id=\"T_0502a_row128_col1\" class=\"data row128 col1\" >Precision Recall Curve</td>\n", + " <td id=\"T_0502a_row128_col2\" class=\"data row128 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", + " <td id=\"T_0502a_row128_col3\" class=\"data row128 col3\" >True</td>\n", + " <td id=\"T_0502a_row128_col4\" class=\"data row128 col4\" >False</td>\n", + " <td id=\"T_0502a_row128_col5\" class=\"data row128 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row128_col6\" class=\"data row128 col6\" >{}</td>\n", + " <td id=\"T_0502a_row128_col7\" class=\"data row128 
col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_0502a_row128_col8\" class=\"data row128 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row129_col0\" class=\"data row129 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", + " <td id=\"T_0502a_row129_col1\" class=\"data row129 col1\" >ROC Curve</td>\n", + " <td id=\"T_0502a_row129_col2\" class=\"data row129 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", + " <td id=\"T_0502a_row129_col3\" class=\"data row129 col3\" >True</td>\n", + " <td id=\"T_0502a_row129_col4\" class=\"data row129 col4\" >False</td>\n", + " <td id=\"T_0502a_row129_col5\" class=\"data row129 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row129_col6\" class=\"data row129 col6\" >{}</td>\n", + " <td id=\"T_0502a_row129_col7\" class=\"data row129 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_0502a_row129_col8\" class=\"data row129 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row130_col0\" class=\"data row130 col0\" >validmind.model_validation.sklearn.RegressionErrors</td>\n", + " <td id=\"T_0502a_row130_col1\" class=\"data row130 col1\" >Regression Errors</td>\n", + " <td id=\"T_0502a_row130_col2\" class=\"data row130 col2\" >Assesses the performance and error distribution of a regression model using various error metrics....</td>\n", + " <td id=\"T_0502a_row130_col3\" class=\"data row130 col3\" >False</td>\n", + " <td id=\"T_0502a_row130_col4\" class=\"data row130 col4\" >True</td>\n", + " <td id=\"T_0502a_row130_col5\" class=\"data row130 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row130_col6\" class=\"data row130 col6\" >{}</td>\n", + " <td 
id=\"T_0502a_row130_col7\" class=\"data row130 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row130_col8\" class=\"data row130 col8\" >['regression', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row131_col0\" class=\"data row131 col0\" >validmind.model_validation.sklearn.RegressionErrorsComparison</td>\n", + " <td id=\"T_0502a_row131_col1\" class=\"data row131 col1\" >Regression Errors Comparison</td>\n", + " <td id=\"T_0502a_row131_col2\" class=\"data row131 col2\" >Assesses multiple regression error metrics to compare model performance across different datasets, emphasizing...</td>\n", + " <td id=\"T_0502a_row131_col3\" class=\"data row131 col3\" >False</td>\n", + " <td id=\"T_0502a_row131_col4\" class=\"data row131 col4\" >True</td>\n", + " <td id=\"T_0502a_row131_col5\" class=\"data row131 col5\" >['datasets', 'models']</td>\n", + " <td id=\"T_0502a_row131_col6\" class=\"data row131 col6\" >{}</td>\n", + " <td id=\"T_0502a_row131_col7\" class=\"data row131 col7\" >['model_performance', 'sklearn']</td>\n", + " <td id=\"T_0502a_row131_col8\" class=\"data row131 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row132_col0\" class=\"data row132 col0\" >validmind.model_validation.sklearn.RegressionPerformance</td>\n", + " <td id=\"T_0502a_row132_col1\" class=\"data row132 col1\" >Regression Performance</td>\n", + " <td id=\"T_0502a_row132_col2\" class=\"data row132 col2\" >Evaluates the performance of a regression model using five different metrics: MAE, MSE, RMSE, MAPE, and MBD....</td>\n", + " <td id=\"T_0502a_row132_col3\" class=\"data row132 col3\" >False</td>\n", + " <td id=\"T_0502a_row132_col4\" class=\"data row132 col4\" >True</td>\n", + " <td id=\"T_0502a_row132_col5\" class=\"data row132 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row132_col6\" class=\"data row132 col6\" >{}</td>\n", + " <td id=\"T_0502a_row132_col7\" class=\"data 
row132 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row132_col8\" class=\"data row132 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row133_col0\" class=\"data row133 col0\" >validmind.model_validation.sklearn.RegressionR2Square</td>\n", + " <td id=\"T_0502a_row133_col1\" class=\"data row133 col1\" >Regression R2 Square</td>\n", + " <td id=\"T_0502a_row133_col2\" class=\"data row133 col2\" >Assesses the overall goodness-of-fit of a regression model by evaluating R-squared (R2) and Adjusted R-squared (Adj...</td>\n", + " <td id=\"T_0502a_row133_col3\" class=\"data row133 col3\" >False</td>\n", + " <td id=\"T_0502a_row133_col4\" class=\"data row133 col4\" >True</td>\n", + " <td id=\"T_0502a_row133_col5\" class=\"data row133 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row133_col6\" class=\"data row133 col6\" >{}</td>\n", + " <td id=\"T_0502a_row133_col7\" class=\"data row133 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row133_col8\" class=\"data row133 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row134_col0\" class=\"data row134 col0\" >validmind.model_validation.sklearn.RegressionR2SquareComparison</td>\n", + " <td id=\"T_0502a_row134_col1\" class=\"data row134 col1\" >Regression R2 Square Comparison</td>\n", + " <td id=\"T_0502a_row134_col2\" class=\"data row134 col2\" >Compares R-Squared and Adjusted R-Squared values for different regression models across multiple datasets to assess...</td>\n", + " <td id=\"T_0502a_row134_col3\" class=\"data row134 col3\" >False</td>\n", + " <td id=\"T_0502a_row134_col4\" class=\"data row134 col4\" >True</td>\n", + " <td id=\"T_0502a_row134_col5\" class=\"data row134 col5\" >['datasets', 'models']</td>\n", + " <td id=\"T_0502a_row134_col6\" class=\"data row134 col6\" >{}</td>\n", + " <td id=\"T_0502a_row134_col7\" class=\"data row134 col7\" >['model_performance', 'sklearn']</td>\n", + " <td 
id=\"T_0502a_row134_col8\" class=\"data row134 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row135_col0\" class=\"data row135 col0\" >validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", + " <td id=\"T_0502a_row135_col1\" class=\"data row135 col1\" >Robustness Diagnosis</td>\n", + " <td id=\"T_0502a_row135_col2\" class=\"data row135 col2\" >Assesses the robustness of a machine learning model by evaluating performance decay under noisy conditions....</td>\n", + " <td id=\"T_0502a_row135_col3\" class=\"data row135 col3\" >True</td>\n", + " <td id=\"T_0502a_row135_col4\" class=\"data row135 col4\" >True</td>\n", + " <td id=\"T_0502a_row135_col5\" class=\"data row135 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row135_col6\" class=\"data row135 col6\" >{'metric': {'type': 'str', 'default': None}, 'scaling_factor_std_dev_list': {'type': None, 'default': [0.1, 0.2, 0.3, 0.4, 0.5]}, 'performance_decay_threshold': {'type': 'float', 'default': 0.05}}</td>\n", + " <td id=\"T_0502a_row135_col7\" class=\"data row135 col7\" >['sklearn', 'model_diagnosis', 'visualization']</td>\n", + " <td id=\"T_0502a_row135_col8\" class=\"data row135 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row136_col0\" class=\"data row136 col0\" >validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", + " <td id=\"T_0502a_row136_col1\" class=\"data row136 col1\" >SHAP Global Importance</td>\n", + " <td id=\"T_0502a_row136_col2\" class=\"data row136 col2\" >Evaluates and visualizes global feature importance using SHAP values for model explanation and risk identification....</td>\n", + " <td id=\"T_0502a_row136_col3\" class=\"data row136 col3\" >False</td>\n", + " <td id=\"T_0502a_row136_col4\" class=\"data row136 col4\" >True</td>\n", + " <td id=\"T_0502a_row136_col5\" class=\"data row136 col5\" >['model', 'dataset']</td>\n", + " <td 
id=\"T_0502a_row136_col6\" class=\"data row136 col6\" >{'kernel_explainer_samples': {'type': 'int', 'default': 10}, 'tree_or_linear_explainer_samples': {'type': 'int', 'default': 200}, 'class_of_interest': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row136_col7\" class=\"data row136 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", + " <td id=\"T_0502a_row136_col8\" class=\"data row136 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row137_col0\" class=\"data row137 col0\" >validmind.model_validation.sklearn.ScoreProbabilityAlignment</td>\n", + " <td id=\"T_0502a_row137_col1\" class=\"data row137 col1\" >Score Probability Alignment</td>\n", + " <td id=\"T_0502a_row137_col2\" class=\"data row137 col2\" >Analyzes the alignment between credit scores and predicted probabilities....</td>\n", + " <td id=\"T_0502a_row137_col3\" class=\"data row137 col3\" >True</td>\n", + " <td id=\"T_0502a_row137_col4\" class=\"data row137 col4\" >True</td>\n", + " <td id=\"T_0502a_row137_col5\" class=\"data row137 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row137_col6\" class=\"data row137 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'n_bins': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_0502a_row137_col7\" class=\"data row137 col7\" >['visualization', 'credit_risk', 'calibration']</td>\n", + " <td id=\"T_0502a_row137_col8\" class=\"data row137 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row138_col0\" class=\"data row138 col0\" >validmind.model_validation.sklearn.SilhouettePlot</td>\n", + " <td id=\"T_0502a_row138_col1\" class=\"data row138 col1\" >Silhouette Plot</td>\n", + " <td id=\"T_0502a_row138_col2\" class=\"data row138 col2\" >Calculates and visualizes Silhouette Score, assessing the degree of data point suitability to its cluster in ML...</td>\n", + 
" <td id=\"T_0502a_row138_col3\" class=\"data row138 col3\" >True</td>\n", + " <td id=\"T_0502a_row138_col4\" class=\"data row138 col4\" >True</td>\n", + " <td id=\"T_0502a_row138_col5\" class=\"data row138 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row138_col6\" class=\"data row138 col6\" >{}</td>\n", + " <td id=\"T_0502a_row138_col7\" class=\"data row138 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row138_col8\" class=\"data row138 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row139_col0\" class=\"data row139 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", + " <td id=\"T_0502a_row139_col1\" class=\"data row139 col1\" >Training Test Degradation</td>\n", + " <td id=\"T_0502a_row139_col2\" class=\"data row139 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", + " <td id=\"T_0502a_row139_col3\" class=\"data row139 col3\" >False</td>\n", + " <td id=\"T_0502a_row139_col4\" class=\"data row139 col4\" >True</td>\n", + " <td id=\"T_0502a_row139_col5\" class=\"data row139 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row139_col6\" class=\"data row139 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", + " <td id=\"T_0502a_row139_col7\" class=\"data row139 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_0502a_row139_col8\" class=\"data row139 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row140_col0\" class=\"data row140 col0\" >validmind.model_validation.sklearn.VMeasure</td>\n", + " <td id=\"T_0502a_row140_col1\" class=\"data row140 col1\" >V Measure</td>\n", + " <td id=\"T_0502a_row140_col2\" class=\"data row140 col2\" >Evaluates homogeneity and completeness of a clustering model using the V Measure 
Score....</td>\n", + " <td id=\"T_0502a_row140_col3\" class=\"data row140 col3\" >False</td>\n", + " <td id=\"T_0502a_row140_col4\" class=\"data row140 col4\" >True</td>\n", + " <td id=\"T_0502a_row140_col5\" class=\"data row140 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row140_col6\" class=\"data row140 col6\" >{}</td>\n", + " <td id=\"T_0502a_row140_col7\" class=\"data row140 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_0502a_row140_col8\" class=\"data row140 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row141_col0\" class=\"data row141 col0\" >validmind.model_validation.sklearn.WeakspotsDiagnosis</td>\n", + " <td id=\"T_0502a_row141_col1\" class=\"data row141 col1\" >Weakspots Diagnosis</td>\n", + " <td id=\"T_0502a_row141_col2\" class=\"data row141 col2\" >Identifies and visualizes weak spots in a machine learning model's performance across various sections of the...</td>\n", + " <td id=\"T_0502a_row141_col3\" class=\"data row141 col3\" >True</td>\n", + " <td id=\"T_0502a_row141_col4\" class=\"data row141 col4\" >True</td>\n", + " <td id=\"T_0502a_row141_col5\" class=\"data row141 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row141_col6\" class=\"data row141 col6\" >{'features_columns': {'type': None, 'default': None}, 'metrics': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row141_col7\" class=\"data row141 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_diagnosis', 'visualization']</td>\n", + " <td id=\"T_0502a_row141_col8\" class=\"data row141 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row142_col0\" class=\"data row142 col0\" >validmind.model_validation.statsmodels.AutoARIMA</td>\n", + " <td id=\"T_0502a_row142_col1\" class=\"data row142 col1\" >Auto ARIMA</td>\n", + " <td id=\"T_0502a_row142_col2\" class=\"data 
row142 col2\" >Evaluates ARIMA models for time-series forecasting, ranking them using Bayesian and Akaike Information Criteria....</td>\n", + " <td id=\"T_0502a_row142_col3\" class=\"data row142 col3\" >False</td>\n", + " <td id=\"T_0502a_row142_col4\" class=\"data row142 col4\" >True</td>\n", + " <td id=\"T_0502a_row142_col5\" class=\"data row142 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row142_col6\" class=\"data row142 col6\" >{}</td>\n", + " <td id=\"T_0502a_row142_col7\" class=\"data row142 col7\" >['time_series_data', 'forecasting', 'model_selection', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row142_col8\" class=\"data row142 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row143_col0\" class=\"data row143 col0\" >validmind.model_validation.statsmodels.CumulativePredictionProbabilities</td>\n", + " <td id=\"T_0502a_row143_col1\" class=\"data row143 col1\" >Cumulative Prediction Probabilities</td>\n", + " <td id=\"T_0502a_row143_col2\" class=\"data row143 col2\" >Visualizes cumulative probabilities of positive and negative classes for both training and testing in classification models....</td>\n", + " <td id=\"T_0502a_row143_col3\" class=\"data row143 col3\" >True</td>\n", + " <td id=\"T_0502a_row143_col4\" class=\"data row143 col4\" >False</td>\n", + " <td id=\"T_0502a_row143_col5\" class=\"data row143 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row143_col6\" class=\"data row143 col6\" >{'title': {'type': 'str', 'default': 'Cumulative Probabilities'}}</td>\n", + " <td id=\"T_0502a_row143_col7\" class=\"data row143 col7\" >['visualization', 'credit_risk']</td>\n", + " <td id=\"T_0502a_row143_col8\" class=\"data row143 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row144_col0\" class=\"data row144 col0\" >validmind.model_validation.statsmodels.DurbinWatsonTest</td>\n", + " <td id=\"T_0502a_row144_col1\" class=\"data row144 col1\" >Durbin Watson Test</td>\n", 
+ " <td id=\"T_0502a_row144_col2\" class=\"data row144 col2\" >Assesses autocorrelation in time series data features using the Durbin-Watson statistic....</td>\n", + " <td id=\"T_0502a_row144_col3\" class=\"data row144 col3\" >False</td>\n", + " <td id=\"T_0502a_row144_col4\" class=\"data row144 col4\" >True</td>\n", + " <td id=\"T_0502a_row144_col5\" class=\"data row144 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row144_col6\" class=\"data row144 col6\" >{'threshold': {'type': None, 'default': [1.5, 2.5]}}</td>\n", + " <td id=\"T_0502a_row144_col7\" class=\"data row144 col7\" >['time_series_data', 'forecasting', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row144_col8\" class=\"data row144 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row145_col0\" class=\"data row145 col0\" >validmind.model_validation.statsmodels.GINITable</td>\n", + " <td id=\"T_0502a_row145_col1\" class=\"data row145 col1\" >GINI Table</td>\n", + " <td id=\"T_0502a_row145_col2\" class=\"data row145 col2\" >Evaluates classification model performance using AUC, GINI, and KS metrics for training and test datasets....</td>\n", + " <td id=\"T_0502a_row145_col3\" class=\"data row145 col3\" >False</td>\n", + " <td id=\"T_0502a_row145_col4\" class=\"data row145 col4\" >True</td>\n", + " <td id=\"T_0502a_row145_col5\" class=\"data row145 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row145_col6\" class=\"data row145 col6\" >{}</td>\n", + " <td id=\"T_0502a_row145_col7\" class=\"data row145 col7\" >['model_performance']</td>\n", + " <td id=\"T_0502a_row145_col8\" class=\"data row145 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row146_col0\" class=\"data row146 col0\" >validmind.model_validation.statsmodels.KolmogorovSmirnov</td>\n", + " <td id=\"T_0502a_row146_col1\" class=\"data row146 col1\" >Kolmogorov Smirnov</td>\n", + " <td id=\"T_0502a_row146_col2\" class=\"data row146 col2\" 
>Assesses whether each feature in the dataset aligns with a normal distribution using the Kolmogorov-Smirnov test....</td>\n", + " <td id=\"T_0502a_row146_col3\" class=\"data row146 col3\" >False</td>\n", + " <td id=\"T_0502a_row146_col4\" class=\"data row146 col4\" >True</td>\n", + " <td id=\"T_0502a_row146_col5\" class=\"data row146 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row146_col6\" class=\"data row146 col6\" >{'dist': {'type': 'str', 'default': 'norm'}}</td>\n", + " <td id=\"T_0502a_row146_col7\" class=\"data row146 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row146_col8\" class=\"data row146 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row147_col0\" class=\"data row147 col0\" >validmind.model_validation.statsmodels.Lilliefors</td>\n", + " <td id=\"T_0502a_row147_col1\" class=\"data row147 col1\" >Lilliefors</td>\n", + " <td id=\"T_0502a_row147_col2\" class=\"data row147 col2\" >Assesses the normality of feature distributions in an ML model's training dataset using the Lilliefors test....</td>\n", + " <td id=\"T_0502a_row147_col3\" class=\"data row147 col3\" >False</td>\n", + " <td id=\"T_0502a_row147_col4\" class=\"data row147 col4\" >True</td>\n", + " <td id=\"T_0502a_row147_col5\" class=\"data row147 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row147_col6\" class=\"data row147 col6\" >{}</td>\n", + " <td id=\"T_0502a_row147_col7\" class=\"data row147 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_0502a_row147_col8\" class=\"data row147 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row148_col0\" class=\"data row148 col0\" >validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram</td>\n", + " <td id=\"T_0502a_row148_col1\" class=\"data row148 col1\" >Prediction Probabilities Histogram</td>\n", 
+ " <td id=\"T_0502a_row148_col2\" class=\"data row148 col2\" >Assesses the predictive probability distribution for binary classification to evaluate model performance and...</td>\n", + " <td id=\"T_0502a_row148_col3\" class=\"data row148 col3\" >True</td>\n", + " <td id=\"T_0502a_row148_col4\" class=\"data row148 col4\" >False</td>\n", + " <td id=\"T_0502a_row148_col5\" class=\"data row148 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row148_col6\" class=\"data row148 col6\" >{'title': {'type': 'str', 'default': 'Histogram of Predictive Probabilities'}}</td>\n", + " <td id=\"T_0502a_row148_col7\" class=\"data row148 col7\" >['visualization', 'credit_risk']</td>\n", + " <td id=\"T_0502a_row148_col8\" class=\"data row148 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row149_col0\" class=\"data row149 col0\" >validmind.model_validation.statsmodels.RegressionCoeffs</td>\n", + " <td id=\"T_0502a_row149_col1\" class=\"data row149 col1\" >Regression Coeffs</td>\n", + " <td id=\"T_0502a_row149_col2\" class=\"data row149 col2\" >Assesses the significance and uncertainty of predictor variables in a regression model through visualization of...</td>\n", + " <td id=\"T_0502a_row149_col3\" class=\"data row149 col3\" >True</td>\n", + " <td id=\"T_0502a_row149_col4\" class=\"data row149 col4\" >True</td>\n", + " <td id=\"T_0502a_row149_col5\" class=\"data row149 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row149_col6\" class=\"data row149 col6\" >{}</td>\n", + " <td id=\"T_0502a_row149_col7\" class=\"data row149 col7\" >['tabular_data', 'visualization', 'model_training']</td>\n", + " <td id=\"T_0502a_row149_col8\" class=\"data row149 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row150_col0\" class=\"data row150 col0\" >validmind.model_validation.statsmodels.RegressionFeatureSignificance</td>\n", + " <td id=\"T_0502a_row150_col1\" class=\"data row150 col1\" >Regression Feature 
Significance</td>\n", + " <td id=\"T_0502a_row150_col2\" class=\"data row150 col2\" >Assesses and visualizes the statistical significance of features in a regression model....</td>\n", + " <td id=\"T_0502a_row150_col3\" class=\"data row150 col3\" >True</td>\n", + " <td id=\"T_0502a_row150_col4\" class=\"data row150 col4\" >False</td>\n", + " <td id=\"T_0502a_row150_col5\" class=\"data row150 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row150_col6\" class=\"data row150 col6\" >{'fontsize': {'type': 'int', 'default': 10}, 'p_threshold': {'type': 'float', 'default': 0.05}}</td>\n", + " <td id=\"T_0502a_row150_col7\" class=\"data row150 col7\" >['statistical_test', 'model_interpretation', 'visualization', 'feature_importance']</td>\n", + " <td id=\"T_0502a_row150_col8\" class=\"data row150 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row151_col0\" class=\"data row151 col0\" >validmind.model_validation.statsmodels.RegressionModelForecastPlot</td>\n", + " <td id=\"T_0502a_row151_col1\" class=\"data row151 col1\" >Regression Model Forecast Plot</td>\n", + " <td id=\"T_0502a_row151_col2\" class=\"data row151 col2\" >Generates plots to visually compare the forecasted outcomes of a regression model against actual observed values over...</td>\n", + " <td id=\"T_0502a_row151_col3\" class=\"data row151 col3\" >True</td>\n", + " <td id=\"T_0502a_row151_col4\" class=\"data row151 col4\" >False</td>\n", + " <td id=\"T_0502a_row151_col5\" class=\"data row151 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row151_col6\" class=\"data row151 col6\" >{'start_date': {'type': None, 'default': None}, 'end_date': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row151_col7\" class=\"data row151 col7\" >['time_series_data', 'forecasting', 'visualization']</td>\n", + " <td id=\"T_0502a_row151_col8\" class=\"data row151 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row152_col0\" class=\"data 
row152 col0\" >validmind.model_validation.statsmodels.RegressionModelForecastPlotLevels</td>\n", + " <td id=\"T_0502a_row152_col1\" class=\"data row152 col1\" >Regression Model Forecast Plot Levels</td>\n", + " <td id=\"T_0502a_row152_col2\" class=\"data row152 col2\" >Assesses the alignment between forecasted and observed values in regression models through visual plots...</td>\n", + " <td id=\"T_0502a_row152_col3\" class=\"data row152 col3\" >True</td>\n", + " <td id=\"T_0502a_row152_col4\" class=\"data row152 col4\" >False</td>\n", + " <td id=\"T_0502a_row152_col5\" class=\"data row152 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row152_col6\" class=\"data row152 col6\" >{}</td>\n", + " <td id=\"T_0502a_row152_col7\" class=\"data row152 col7\" >['time_series_data', 'forecasting', 'visualization']</td>\n", + " <td id=\"T_0502a_row152_col8\" class=\"data row152 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row153_col0\" class=\"data row153 col0\" >validmind.model_validation.statsmodels.RegressionModelSensitivityPlot</td>\n", + " <td id=\"T_0502a_row153_col1\" class=\"data row153 col1\" >Regression Model Sensitivity Plot</td>\n", + " <td id=\"T_0502a_row153_col2\" class=\"data row153 col2\" >Assesses the sensitivity of a regression model to changes in independent variables by applying shocks and...</td>\n", + " <td id=\"T_0502a_row153_col3\" class=\"data row153 col3\" >True</td>\n", + " <td id=\"T_0502a_row153_col4\" class=\"data row153 col4\" >False</td>\n", + " <td id=\"T_0502a_row153_col5\" class=\"data row153 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row153_col6\" class=\"data row153 col6\" >{'shocks': {'type': None, 'default': [0.1]}, 'transformation': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_0502a_row153_col7\" class=\"data row153 col7\" >['senstivity_analysis', 'visualization']</td>\n", + " <td id=\"T_0502a_row153_col8\" class=\"data row153 col8\" >['regression']</td>\n", + " 
</tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row154_col0\" class=\"data row154 col0\" >validmind.model_validation.statsmodels.RegressionModelSummary</td>\n", + " <td id=\"T_0502a_row154_col1\" class=\"data row154 col1\" >Regression Model Summary</td>\n", + " <td id=\"T_0502a_row154_col2\" class=\"data row154 col2\" >Evaluates regression model performance using metrics including R-Squared, Adjusted R-Squared, MSE, and RMSE....</td>\n", + " <td id=\"T_0502a_row154_col3\" class=\"data row154 col3\" >False</td>\n", + " <td id=\"T_0502a_row154_col4\" class=\"data row154 col4\" >True</td>\n", + " <td id=\"T_0502a_row154_col5\" class=\"data row154 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row154_col6\" class=\"data row154 col6\" >{}</td>\n", + " <td id=\"T_0502a_row154_col7\" class=\"data row154 col7\" >['model_performance', 'regression']</td>\n", + " <td id=\"T_0502a_row154_col8\" class=\"data row154 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row155_col0\" class=\"data row155 col0\" >validmind.model_validation.statsmodels.RegressionPermutationFeatureImportance</td>\n", + " <td id=\"T_0502a_row155_col1\" class=\"data row155 col1\" >Regression Permutation Feature Importance</td>\n", + " <td id=\"T_0502a_row155_col2\" class=\"data row155 col2\" >Assesses the significance of each feature in a model by evaluating the impact on model performance when feature...</td>\n", + " <td id=\"T_0502a_row155_col3\" class=\"data row155 col3\" >True</td>\n", + " <td id=\"T_0502a_row155_col4\" class=\"data row155 col4\" >False</td>\n", + " <td id=\"T_0502a_row155_col5\" class=\"data row155 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row155_col6\" class=\"data row155 col6\" >{'fontsize': {'type': 'int', 'default': 12}, 'figure_height': {'type': 'int', 'default': 500}}</td>\n", + " <td id=\"T_0502a_row155_col7\" class=\"data row155 col7\" >['statsmodels', 'feature_importance', 'visualization']</td>\n", + " <td 
id=\"T_0502a_row155_col8\" class=\"data row155 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row156_col0\" class=\"data row156 col0\" >validmind.model_validation.statsmodels.ScorecardHistogram</td>\n", + " <td id=\"T_0502a_row156_col1\" class=\"data row156 col1\" >Scorecard Histogram</td>\n", + " <td id=\"T_0502a_row156_col2\" class=\"data row156 col2\" >The Scorecard Histogram test evaluates the distribution of credit scores between default and non-default instances,...</td>\n", + " <td id=\"T_0502a_row156_col3\" class=\"data row156 col3\" >True</td>\n", + " <td id=\"T_0502a_row156_col4\" class=\"data row156 col4\" >False</td>\n", + " <td id=\"T_0502a_row156_col5\" class=\"data row156 col5\" >['dataset']</td>\n", + " <td id=\"T_0502a_row156_col6\" class=\"data row156 col6\" >{'title': {'type': 'str', 'default': 'Histogram of Scores'}, 'score_column': {'type': 'str', 'default': 'score'}}</td>\n", + " <td id=\"T_0502a_row156_col7\" class=\"data row156 col7\" >['visualization', 'credit_risk', 'logistic_regression']</td>\n", + " <td id=\"T_0502a_row156_col8\" class=\"data row156 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row157_col0\" class=\"data row157 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", + " <td id=\"T_0502a_row157_col1\" class=\"data row157 col1\" >Calibration Curve Drift</td>\n", + " <td id=\"T_0502a_row157_col2\" class=\"data row157 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", + " <td id=\"T_0502a_row157_col3\" class=\"data row157 col3\" >True</td>\n", + " <td id=\"T_0502a_row157_col4\" class=\"data row157 col4\" >True</td>\n", + " <td id=\"T_0502a_row157_col5\" class=\"data row157 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row157_col6\" class=\"data row157 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", + " <td 
id=\"T_0502a_row157_col7\" class=\"data row157 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_0502a_row157_col8\" class=\"data row157 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row158_col0\" class=\"data row158 col0\" >validmind.ongoing_monitoring.ClassDiscriminationDrift</td>\n", + " <td id=\"T_0502a_row158_col1\" class=\"data row158 col1\" >Class Discrimination Drift</td>\n", + " <td id=\"T_0502a_row158_col2\" class=\"data row158 col2\" >Compares classification discrimination metrics between reference and monitoring datasets....</td>\n", + " <td id=\"T_0502a_row158_col3\" class=\"data row158 col3\" >False</td>\n", + " <td id=\"T_0502a_row158_col4\" class=\"data row158 col4\" >True</td>\n", + " <td id=\"T_0502a_row158_col5\" class=\"data row158 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row158_col6\" class=\"data row158 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td id=\"T_0502a_row158_col7\" class=\"data row158 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_0502a_row158_col8\" class=\"data row158 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row159_col0\" class=\"data row159 col0\" >validmind.ongoing_monitoring.ClassImbalanceDrift</td>\n", + " <td id=\"T_0502a_row159_col1\" class=\"data row159 col1\" >Class Imbalance Drift</td>\n", + " <td id=\"T_0502a_row159_col2\" class=\"data row159 col2\" >Evaluates drift in class distribution between reference and monitoring datasets....</td>\n", + " <td id=\"T_0502a_row159_col3\" class=\"data row159 col3\" >True</td>\n", + " <td id=\"T_0502a_row159_col4\" class=\"data row159 col4\" >True</td>\n", + " <td id=\"T_0502a_row159_col5\" class=\"data row159 col5\" >['datasets']</td>\n", + " <td id=\"T_0502a_row159_col6\" 
class=\"data row159 col6\" >{'drift_pct_threshold': {'type': 'float', 'default': 5.0}, 'title': {'type': 'str', 'default': 'Class Distribution Drift'}}</td>\n", + " <td id=\"T_0502a_row159_col7\" class=\"data row159 col7\" >['tabular_data', 'binary_classification', 'multiclass_classification']</td>\n", + " <td id=\"T_0502a_row159_col8\" class=\"data row159 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row160_col0\" class=\"data row160 col0\" >validmind.ongoing_monitoring.ClassificationAccuracyDrift</td>\n", + " <td id=\"T_0502a_row160_col1\" class=\"data row160 col1\" >Classification Accuracy Drift</td>\n", + " <td id=\"T_0502a_row160_col2\" class=\"data row160 col2\" >Compares classification accuracy metrics between reference and monitoring datasets....</td>\n", + " <td id=\"T_0502a_row160_col3\" class=\"data row160 col3\" >False</td>\n", + " <td id=\"T_0502a_row160_col4\" class=\"data row160 col4\" >True</td>\n", + " <td id=\"T_0502a_row160_col5\" class=\"data row160 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row160_col6\" class=\"data row160 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td id=\"T_0502a_row160_col7\" class=\"data row160 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_0502a_row160_col8\" class=\"data row160 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row161_col0\" class=\"data row161 col0\" >validmind.ongoing_monitoring.ConfusionMatrixDrift</td>\n", + " <td id=\"T_0502a_row161_col1\" class=\"data row161 col1\" >Confusion Matrix Drift</td>\n", + " <td id=\"T_0502a_row161_col2\" class=\"data row161 col2\" >Compares confusion matrix metrics between reference and monitoring datasets....</td>\n", + " <td id=\"T_0502a_row161_col3\" class=\"data row161 col3\" >False</td>\n", + " <td id=\"T_0502a_row161_col4\" class=\"data row161 
col4\" >True</td>\n", + " <td id=\"T_0502a_row161_col5\" class=\"data row161 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row161_col6\" class=\"data row161 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td id=\"T_0502a_row161_col7\" class=\"data row161 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_0502a_row161_col8\" class=\"data row161 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row162_col0\" class=\"data row162 col0\" >validmind.ongoing_monitoring.CumulativePredictionProbabilitiesDrift</td>\n", + " <td id=\"T_0502a_row162_col1\" class=\"data row162 col1\" >Cumulative Prediction Probabilities Drift</td>\n", + " <td id=\"T_0502a_row162_col2\" class=\"data row162 col2\" >Compares cumulative prediction probability distributions between reference and monitoring datasets....</td>\n", + " <td id=\"T_0502a_row162_col3\" class=\"data row162 col3\" >True</td>\n", + " <td id=\"T_0502a_row162_col4\" class=\"data row162 col4\" >False</td>\n", + " <td id=\"T_0502a_row162_col5\" class=\"data row162 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row162_col6\" class=\"data row162 col6\" >{}</td>\n", + " <td id=\"T_0502a_row162_col7\" class=\"data row162 col7\" >['visualization', 'credit_risk']</td>\n", + " <td id=\"T_0502a_row162_col8\" class=\"data row162 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row163_col0\" class=\"data row163 col0\" >validmind.ongoing_monitoring.FeatureDrift</td>\n", + " <td id=\"T_0502a_row163_col1\" class=\"data row163 col1\" >Feature Drift</td>\n", + " <td id=\"T_0502a_row163_col2\" class=\"data row163 col2\" >Evaluates changes in feature distribution over time to identify potential model drift....</td>\n", + " <td id=\"T_0502a_row163_col3\" class=\"data row163 col3\" >True</td>\n", + " <td id=\"T_0502a_row163_col4\" 
class=\"data row163 col4\" >True</td>\n", + " <td id=\"T_0502a_row163_col5\" class=\"data row163 col5\" >['datasets']</td>\n", + " <td id=\"T_0502a_row163_col6\" class=\"data row163 col6\" >{'bins': {'type': '_empty', 'default': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]}, 'feature_columns': {'type': '_empty', 'default': None}, 'psi_threshold': {'type': '_empty', 'default': 0.2}}</td>\n", + " <td id=\"T_0502a_row163_col7\" class=\"data row163 col7\" >['visualization']</td>\n", + " <td id=\"T_0502a_row163_col8\" class=\"data row163 col8\" >['monitoring']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row164_col0\" class=\"data row164 col0\" >validmind.ongoing_monitoring.PredictionAcrossEachFeature</td>\n", + " <td id=\"T_0502a_row164_col1\" class=\"data row164 col1\" >Prediction Across Each Feature</td>\n", + " <td id=\"T_0502a_row164_col2\" class=\"data row164 col2\" >Assesses differences in model predictions across individual features between reference and monitoring datasets...</td>\n", + " <td id=\"T_0502a_row164_col3\" class=\"data row164 col3\" >True</td>\n", + " <td id=\"T_0502a_row164_col4\" class=\"data row164 col4\" >False</td>\n", + " <td id=\"T_0502a_row164_col5\" class=\"data row164 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row164_col6\" class=\"data row164 col6\" >{}</td>\n", + " <td id=\"T_0502a_row164_col7\" class=\"data row164 col7\" >['visualization']</td>\n", + " <td id=\"T_0502a_row164_col8\" class=\"data row164 col8\" >['monitoring']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row165_col0\" class=\"data row165 col0\" >validmind.ongoing_monitoring.PredictionCorrelation</td>\n", + " <td id=\"T_0502a_row165_col1\" class=\"data row165 col1\" >Prediction Correlation</td>\n", + " <td id=\"T_0502a_row165_col2\" class=\"data row165 col2\" >Assesses correlation changes between model predictions from reference and monitoring datasets to detect potential...</td>\n", + " <td id=\"T_0502a_row165_col3\" class=\"data 
row165 col3\" >True</td>\n", + " <td id=\"T_0502a_row165_col4\" class=\"data row165 col4\" >True</td>\n", + " <td id=\"T_0502a_row165_col5\" class=\"data row165 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row165_col6\" class=\"data row165 col6\" >{'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", + " <td id=\"T_0502a_row165_col7\" class=\"data row165 col7\" >['visualization']</td>\n", + " <td id=\"T_0502a_row165_col8\" class=\"data row165 col8\" >['monitoring']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row166_col0\" class=\"data row166 col0\" >validmind.ongoing_monitoring.PredictionProbabilitiesHistogramDrift</td>\n", + " <td id=\"T_0502a_row166_col1\" class=\"data row166 col1\" >Prediction Probabilities Histogram Drift</td>\n", + " <td id=\"T_0502a_row166_col2\" class=\"data row166 col2\" >Compares prediction probability distributions between reference and monitoring datasets....</td>\n", + " <td id=\"T_0502a_row166_col3\" class=\"data row166 col3\" >True</td>\n", + " <td id=\"T_0502a_row166_col4\" class=\"data row166 col4\" >True</td>\n", + " <td id=\"T_0502a_row166_col5\" class=\"data row166 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row166_col6\" class=\"data row166 col6\" >{'title': {'type': '_empty', 'default': 'Prediction Probabilities Histogram Drift'}, 'drift_pct_threshold': {'type': 'float', 'default': 20.0}}</td>\n", + " <td id=\"T_0502a_row166_col7\" class=\"data row166 col7\" >['visualization', 'credit_risk']</td>\n", + " <td id=\"T_0502a_row166_col8\" class=\"data row166 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row167_col0\" class=\"data row167 col0\" >validmind.ongoing_monitoring.PredictionQuantilesAcrossFeatures</td>\n", + " <td id=\"T_0502a_row167_col1\" class=\"data row167 col1\" >Prediction Quantiles Across Features</td>\n", + " <td id=\"T_0502a_row167_col2\" class=\"data row167 col2\" >Assesses differences in model prediction distributions 
across individual features between reference...</td>\n", + " <td id=\"T_0502a_row167_col3\" class=\"data row167 col3\" >True</td>\n", + " <td id=\"T_0502a_row167_col4\" class=\"data row167 col4\" >False</td>\n", + " <td id=\"T_0502a_row167_col5\" class=\"data row167 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row167_col6\" class=\"data row167 col6\" >{}</td>\n", + " <td id=\"T_0502a_row167_col7\" class=\"data row167 col7\" >['visualization']</td>\n", + " <td id=\"T_0502a_row167_col8\" class=\"data row167 col8\" >['monitoring']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row168_col0\" class=\"data row168 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", + " <td id=\"T_0502a_row168_col1\" class=\"data row168 col1\" >ROC Curve Drift</td>\n", + " <td id=\"T_0502a_row168_col2\" class=\"data row168 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", + " <td id=\"T_0502a_row168_col3\" class=\"data row168 col3\" >True</td>\n", + " <td id=\"T_0502a_row168_col4\" class=\"data row168 col4\" >False</td>\n", + " <td id=\"T_0502a_row168_col5\" class=\"data row168 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row168_col6\" class=\"data row168 col6\" >{}</td>\n", + " <td id=\"T_0502a_row168_col7\" class=\"data row168 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_0502a_row168_col8\" class=\"data row168 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row169_col0\" class=\"data row169 col0\" >validmind.ongoing_monitoring.ScoreBandsDrift</td>\n", + " <td id=\"T_0502a_row169_col1\" class=\"data row169 col1\" >Score Bands Drift</td>\n", + " <td id=\"T_0502a_row169_col2\" class=\"data row169 col2\" >Analyzes drift in population distribution and default rates across score bands....</td>\n", + " <td id=\"T_0502a_row169_col3\" class=\"data row169 col3\" >False</td>\n", + " <td 
id=\"T_0502a_row169_col4\" class=\"data row169 col4\" >True</td>\n", + " <td id=\"T_0502a_row169_col5\" class=\"data row169 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row169_col6\" class=\"data row169 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'score_bands': {'type': 'list', 'default': None}, 'drift_threshold': {'type': 'float', 'default': 20.0}}</td>\n", + " <td id=\"T_0502a_row169_col7\" class=\"data row169 col7\" >['visualization', 'credit_risk', 'scorecard']</td>\n", + " <td id=\"T_0502a_row169_col8\" class=\"data row169 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row170_col0\" class=\"data row170 col0\" >validmind.ongoing_monitoring.ScorecardHistogramDrift</td>\n", + " <td id=\"T_0502a_row170_col1\" class=\"data row170 col1\" >Scorecard Histogram Drift</td>\n", + " <td id=\"T_0502a_row170_col2\" class=\"data row170 col2\" >Compares score distributions between reference and monitoring datasets for each class....</td>\n", + " <td id=\"T_0502a_row170_col3\" class=\"data row170 col3\" >True</td>\n", + " <td id=\"T_0502a_row170_col4\" class=\"data row170 col4\" >True</td>\n", + " <td id=\"T_0502a_row170_col5\" class=\"data row170 col5\" >['datasets']</td>\n", + " <td id=\"T_0502a_row170_col6\" class=\"data row170 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'title': {'type': 'str', 'default': 'Scorecard Histogram Drift'}, 'drift_pct_threshold': {'type': 'float', 'default': 20.0}}</td>\n", + " <td id=\"T_0502a_row170_col7\" class=\"data row170 col7\" >['visualization', 'credit_risk', 'logistic_regression']</td>\n", + " <td id=\"T_0502a_row170_col8\" class=\"data row170 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row171_col0\" class=\"data row171 col0\" >validmind.ongoing_monitoring.TargetPredictionDistributionPlot</td>\n", + " <td id=\"T_0502a_row171_col1\" class=\"data row171 col1\" >Target Prediction Distribution Plot</td>\n", + " 
<td id=\"T_0502a_row171_col2\" class=\"data row171 col2\" >Assesses differences in prediction distributions between a reference dataset and a monitoring dataset to identify...</td>\n", + " <td id=\"T_0502a_row171_col3\" class=\"data row171 col3\" >True</td>\n", + " <td id=\"T_0502a_row171_col4\" class=\"data row171 col4\" >True</td>\n", + " <td id=\"T_0502a_row171_col5\" class=\"data row171 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_0502a_row171_col6\" class=\"data row171 col6\" >{'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", + " <td id=\"T_0502a_row171_col7\" class=\"data row171 col7\" >['visualization']</td>\n", + " <td id=\"T_0502a_row171_col8\" class=\"data row171 col8\" >['monitoring']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row172_col0\" class=\"data row172 col0\" >validmind.prompt_validation.Bias</td>\n", + " <td id=\"T_0502a_row172_col1\" class=\"data row172 col1\" >Bias</td>\n", + " <td id=\"T_0502a_row172_col2\" class=\"data row172 col2\" >Assesses potential bias in a Large Language Model by analyzing the distribution and order of exemplars in the...</td>\n", + " <td id=\"T_0502a_row172_col3\" class=\"data row172 col3\" >False</td>\n", + " <td id=\"T_0502a_row172_col4\" class=\"data row172 col4\" >True</td>\n", + " <td id=\"T_0502a_row172_col5\" class=\"data row172 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row172_col6\" class=\"data row172 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row172_col7\" class=\"data row172 col7\" >['llm', 'few_shot']</td>\n", + " <td id=\"T_0502a_row172_col8\" class=\"data row172 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row173_col0\" class=\"data row173 col0\" >validmind.prompt_validation.Clarity</td>\n", + " <td id=\"T_0502a_row173_col1\" class=\"data row173 col1\" >Clarity</td>\n", + " <td 
id=\"T_0502a_row173_col2\" class=\"data row173 col2\" >Evaluates and scores the clarity of prompts in a Large Language Model based on specified guidelines....</td>\n", + " <td id=\"T_0502a_row173_col3\" class=\"data row173 col3\" >False</td>\n", + " <td id=\"T_0502a_row173_col4\" class=\"data row173 col4\" >True</td>\n", + " <td id=\"T_0502a_row173_col5\" class=\"data row173 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row173_col6\" class=\"data row173 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row173_col7\" class=\"data row173 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", + " <td id=\"T_0502a_row173_col8\" class=\"data row173 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row174_col0\" class=\"data row174 col0\" >validmind.prompt_validation.Conciseness</td>\n", + " <td id=\"T_0502a_row174_col1\" class=\"data row174 col1\" >Conciseness</td>\n", + " <td id=\"T_0502a_row174_col2\" class=\"data row174 col2\" >Analyzes and grades the conciseness of prompts provided to a Large Language Model....</td>\n", + " <td id=\"T_0502a_row174_col3\" class=\"data row174 col3\" >False</td>\n", + " <td id=\"T_0502a_row174_col4\" class=\"data row174 col4\" >True</td>\n", + " <td id=\"T_0502a_row174_col5\" class=\"data row174 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row174_col6\" class=\"data row174 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row174_col7\" class=\"data row174 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", + " <td id=\"T_0502a_row174_col8\" class=\"data row174 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row175_col0\" class=\"data row175 col0\" >validmind.prompt_validation.Delimitation</td>\n", + " <td id=\"T_0502a_row175_col1\" 
class=\"data row175 col1\" >Delimitation</td>\n", + " <td id=\"T_0502a_row175_col2\" class=\"data row175 col2\" >Evaluates the proper use of delimiters in prompts provided to Large Language Models....</td>\n", + " <td id=\"T_0502a_row175_col3\" class=\"data row175 col3\" >False</td>\n", + " <td id=\"T_0502a_row175_col4\" class=\"data row175 col4\" >True</td>\n", + " <td id=\"T_0502a_row175_col5\" class=\"data row175 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row175_col6\" class=\"data row175 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row175_col7\" class=\"data row175 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", + " <td id=\"T_0502a_row175_col8\" class=\"data row175 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row176_col0\" class=\"data row176 col0\" >validmind.prompt_validation.NegativeInstruction</td>\n", + " <td id=\"T_0502a_row176_col1\" class=\"data row176 col1\" >Negative Instruction</td>\n", + " <td id=\"T_0502a_row176_col2\" class=\"data row176 col2\" >Evaluates and grades the use of affirmative, proactive language over negative instructions in LLM prompts....</td>\n", + " <td id=\"T_0502a_row176_col3\" class=\"data row176 col3\" >False</td>\n", + " <td id=\"T_0502a_row176_col4\" class=\"data row176 col4\" >True</td>\n", + " <td id=\"T_0502a_row176_col5\" class=\"data row176 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row176_col6\" class=\"data row176 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row176_col7\" class=\"data row176 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", + " <td id=\"T_0502a_row176_col8\" class=\"data row176 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row177_col0\" class=\"data row177 col0\" 
>validmind.prompt_validation.Robustness</td>\n", + " <td id=\"T_0502a_row177_col1\" class=\"data row177 col1\" >Robustness</td>\n", + " <td id=\"T_0502a_row177_col2\" class=\"data row177 col2\" >Assesses the robustness of prompts provided to a Large Language Model under varying conditions and contexts. This test...</td>\n", + " <td id=\"T_0502a_row177_col3\" class=\"data row177 col3\" >False</td>\n", + " <td id=\"T_0502a_row177_col4\" class=\"data row177 col4\" >True</td>\n", + " <td id=\"T_0502a_row177_col5\" class=\"data row177 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row177_col6\" class=\"data row177 col6\" >{'num_tests': {'type': '_empty', 'default': 10}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row177_col7\" class=\"data row177 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", + " <td id=\"T_0502a_row177_col8\" class=\"data row177 col8\" >['text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row178_col0\" class=\"data row178 col0\" >validmind.prompt_validation.Specificity</td>\n", + " <td id=\"T_0502a_row178_col1\" class=\"data row178 col1\" >Specificity</td>\n", + " <td id=\"T_0502a_row178_col2\" class=\"data row178 col2\" >Evaluates and scores the specificity of prompts provided to a Large Language Model (LLM), based on clarity, detail,...</td>\n", + " <td id=\"T_0502a_row178_col3\" class=\"data row178 col3\" >False</td>\n", + " <td id=\"T_0502a_row178_col4\" class=\"data row178 col4\" >True</td>\n", + " <td id=\"T_0502a_row178_col5\" class=\"data row178 col5\" >['model']</td>\n", + " <td id=\"T_0502a_row178_col6\" class=\"data row178 col6\" >{'min_threshold': {'type': '_empty', 'default': 7}, 'judge_llm': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_0502a_row178_col7\" class=\"data row178 col7\" >['llm', 'zero_shot', 'few_shot']</td>\n", + " <td id=\"T_0502a_row178_col8\" class=\"data row178 col8\" >['text_classification', 
'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row179_col0\" class=\"data row179 col0\" >validmind.unit_metrics.classification.Accuracy</td>\n", + " <td id=\"T_0502a_row179_col1\" class=\"data row179 col1\" >Accuracy</td>\n", + " <td id=\"T_0502a_row179_col2\" class=\"data row179 col2\" >Calculates the accuracy of a model</td>\n", + " <td id=\"T_0502a_row179_col3\" class=\"data row179 col3\" >False</td>\n", + " <td id=\"T_0502a_row179_col4\" class=\"data row179 col4\" >False</td>\n", + " <td id=\"T_0502a_row179_col5\" class=\"data row179 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row179_col6\" class=\"data row179 col6\" >{}</td>\n", + " <td id=\"T_0502a_row179_col7\" class=\"data row179 col7\" >['classification']</td>\n", + " <td id=\"T_0502a_row179_col8\" class=\"data row179 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row180_col0\" class=\"data row180 col0\" >validmind.unit_metrics.classification.F1</td>\n", + " <td id=\"T_0502a_row180_col1\" class=\"data row180 col1\" >F1</td>\n", + " <td id=\"T_0502a_row180_col2\" class=\"data row180 col2\" >Calculates the F1 score for a classification model.</td>\n", + " <td id=\"T_0502a_row180_col3\" class=\"data row180 col3\" >False</td>\n", + " <td id=\"T_0502a_row180_col4\" class=\"data row180 col4\" >False</td>\n", + " <td id=\"T_0502a_row180_col5\" class=\"data row180 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row180_col6\" class=\"data row180 col6\" >{}</td>\n", + " <td id=\"T_0502a_row180_col7\" class=\"data row180 col7\" >['classification']</td>\n", + " <td id=\"T_0502a_row180_col8\" class=\"data row180 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row181_col0\" class=\"data row181 col0\" >validmind.unit_metrics.classification.Precision</td>\n", + " <td id=\"T_0502a_row181_col1\" class=\"data row181 col1\" >Precision</td>\n", + " <td id=\"T_0502a_row181_col2\" class=\"data row181 
col2\" >Calculates the precision for a classification model.</td>\n", + " <td id=\"T_0502a_row181_col3\" class=\"data row181 col3\" >False</td>\n", + " <td id=\"T_0502a_row181_col4\" class=\"data row181 col4\" >False</td>\n", + " <td id=\"T_0502a_row181_col5\" class=\"data row181 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row181_col6\" class=\"data row181 col6\" >{}</td>\n", + " <td id=\"T_0502a_row181_col7\" class=\"data row181 col7\" >['classification']</td>\n", + " <td id=\"T_0502a_row181_col8\" class=\"data row181 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row182_col0\" class=\"data row182 col0\" >validmind.unit_metrics.classification.ROC_AUC</td>\n", + " <td id=\"T_0502a_row182_col1\" class=\"data row182 col1\" >ROC AUC</td>\n", + " <td id=\"T_0502a_row182_col2\" class=\"data row182 col2\" >Calculates the ROC AUC for a classification model.</td>\n", + " <td id=\"T_0502a_row182_col3\" class=\"data row182 col3\" >False</td>\n", + " <td id=\"T_0502a_row182_col4\" class=\"data row182 col4\" >False</td>\n", + " <td id=\"T_0502a_row182_col5\" class=\"data row182 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row182_col6\" class=\"data row182 col6\" >{}</td>\n", + " <td id=\"T_0502a_row182_col7\" class=\"data row182 col7\" >['classification']</td>\n", + " <td id=\"T_0502a_row182_col8\" class=\"data row182 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row183_col0\" class=\"data row183 col0\" >validmind.unit_metrics.classification.Recall</td>\n", + " <td id=\"T_0502a_row183_col1\" class=\"data row183 col1\" >Recall</td>\n", + " <td id=\"T_0502a_row183_col2\" class=\"data row183 col2\" >Calculates the recall for a classification model.</td>\n", + " <td id=\"T_0502a_row183_col3\" class=\"data row183 col3\" >False</td>\n", + " <td id=\"T_0502a_row183_col4\" class=\"data row183 col4\" >False</td>\n", + " <td id=\"T_0502a_row183_col5\" class=\"data row183 col5\" 
>['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row183_col6\" class=\"data row183 col6\" >{}</td>\n", + " <td id=\"T_0502a_row183_col7\" class=\"data row183 col7\" >['classification']</td>\n", + " <td id=\"T_0502a_row183_col8\" class=\"data row183 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row184_col0\" class=\"data row184 col0\" >validmind.unit_metrics.regression.AdjustedRSquaredScore</td>\n", + " <td id=\"T_0502a_row184_col1\" class=\"data row184 col1\" >Adjusted R Squared Score</td>\n", + " <td id=\"T_0502a_row184_col2\" class=\"data row184 col2\" >Calculates the adjusted R-squared score for a regression model.</td>\n", + " <td id=\"T_0502a_row184_col3\" class=\"data row184 col3\" >False</td>\n", + " <td id=\"T_0502a_row184_col4\" class=\"data row184 col4\" >False</td>\n", + " <td id=\"T_0502a_row184_col5\" class=\"data row184 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row184_col6\" class=\"data row184 col6\" >{}</td>\n", + " <td id=\"T_0502a_row184_col7\" class=\"data row184 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row184_col8\" class=\"data row184 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row185_col0\" class=\"data row185 col0\" >validmind.unit_metrics.regression.GiniCoefficient</td>\n", + " <td id=\"T_0502a_row185_col1\" class=\"data row185 col1\" >Gini Coefficient</td>\n", + " <td id=\"T_0502a_row185_col2\" class=\"data row185 col2\" >Calculates the Gini coefficient for a regression model.</td>\n", + " <td id=\"T_0502a_row185_col3\" class=\"data row185 col3\" >False</td>\n", + " <td id=\"T_0502a_row185_col4\" class=\"data row185 col4\" >False</td>\n", + " <td id=\"T_0502a_row185_col5\" class=\"data row185 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row185_col6\" class=\"data row185 col6\" >{}</td>\n", + " <td id=\"T_0502a_row185_col7\" class=\"data row185 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row185_col8\" class=\"data 
row185 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row186_col0\" class=\"data row186 col0\" >validmind.unit_metrics.regression.HuberLoss</td>\n", + " <td id=\"T_0502a_row186_col1\" class=\"data row186 col1\" >Huber Loss</td>\n", + " <td id=\"T_0502a_row186_col2\" class=\"data row186 col2\" >Calculates the Huber loss for a regression model.</td>\n", + " <td id=\"T_0502a_row186_col3\" class=\"data row186 col3\" >False</td>\n", + " <td id=\"T_0502a_row186_col4\" class=\"data row186 col4\" >False</td>\n", + " <td id=\"T_0502a_row186_col5\" class=\"data row186 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row186_col6\" class=\"data row186 col6\" >{}</td>\n", + " <td id=\"T_0502a_row186_col7\" class=\"data row186 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row186_col8\" class=\"data row186 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row187_col0\" class=\"data row187 col0\" >validmind.unit_metrics.regression.KolmogorovSmirnovStatistic</td>\n", + " <td id=\"T_0502a_row187_col1\" class=\"data row187 col1\" >Kolmogorov Smirnov Statistic</td>\n", + " <td id=\"T_0502a_row187_col2\" class=\"data row187 col2\" >Calculates the Kolmogorov-Smirnov statistic for a regression model.</td>\n", + " <td id=\"T_0502a_row187_col3\" class=\"data row187 col3\" >False</td>\n", + " <td id=\"T_0502a_row187_col4\" class=\"data row187 col4\" >False</td>\n", + " <td id=\"T_0502a_row187_col5\" class=\"data row187 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_0502a_row187_col6\" class=\"data row187 col6\" >{}</td>\n", + " <td id=\"T_0502a_row187_col7\" class=\"data row187 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row187_col8\" class=\"data row187 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row188_col0\" class=\"data row188 col0\" >validmind.unit_metrics.regression.MeanAbsoluteError</td>\n", + " <td id=\"T_0502a_row188_col1\" class=\"data row188 col1\" 
>Mean Absolute Error</td>\n", + " <td id=\"T_0502a_row188_col2\" class=\"data row188 col2\" >Calculates the mean absolute error for a regression model.</td>\n", + " <td id=\"T_0502a_row188_col3\" class=\"data row188 col3\" >False</td>\n", + " <td id=\"T_0502a_row188_col4\" class=\"data row188 col4\" >False</td>\n", + " <td id=\"T_0502a_row188_col5\" class=\"data row188 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row188_col6\" class=\"data row188 col6\" >{}</td>\n", + " <td id=\"T_0502a_row188_col7\" class=\"data row188 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row188_col8\" class=\"data row188 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row189_col0\" class=\"data row189 col0\" >validmind.unit_metrics.regression.MeanAbsolutePercentageError</td>\n", + " <td id=\"T_0502a_row189_col1\" class=\"data row189 col1\" >Mean Absolute Percentage Error</td>\n", + " <td id=\"T_0502a_row189_col2\" class=\"data row189 col2\" >Calculates the mean absolute percentage error for a regression model.</td>\n", + " <td id=\"T_0502a_row189_col3\" class=\"data row189 col3\" >False</td>\n", + " <td id=\"T_0502a_row189_col4\" class=\"data row189 col4\" >False</td>\n", + " <td id=\"T_0502a_row189_col5\" class=\"data row189 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row189_col6\" class=\"data row189 col6\" >{}</td>\n", + " <td id=\"T_0502a_row189_col7\" class=\"data row189 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row189_col8\" class=\"data row189 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row190_col0\" class=\"data row190 col0\" >validmind.unit_metrics.regression.MeanBiasDeviation</td>\n", + " <td id=\"T_0502a_row190_col1\" class=\"data row190 col1\" >Mean Bias Deviation</td>\n", + " <td id=\"T_0502a_row190_col2\" class=\"data row190 col2\" >Calculates the mean bias deviation for a regression model.</td>\n", + " <td id=\"T_0502a_row190_col3\" class=\"data row190 col3\" 
>False</td>\n", + " <td id=\"T_0502a_row190_col4\" class=\"data row190 col4\" >False</td>\n", + " <td id=\"T_0502a_row190_col5\" class=\"data row190 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row190_col6\" class=\"data row190 col6\" >{}</td>\n", + " <td id=\"T_0502a_row190_col7\" class=\"data row190 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row190_col8\" class=\"data row190 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row191_col0\" class=\"data row191 col0\" >validmind.unit_metrics.regression.MeanSquaredError</td>\n", + " <td id=\"T_0502a_row191_col1\" class=\"data row191 col1\" >Mean Squared Error</td>\n", + " <td id=\"T_0502a_row191_col2\" class=\"data row191 col2\" >Calculates the mean squared error for a regression model.</td>\n", + " <td id=\"T_0502a_row191_col3\" class=\"data row191 col3\" >False</td>\n", + " <td id=\"T_0502a_row191_col4\" class=\"data row191 col4\" >False</td>\n", + " <td id=\"T_0502a_row191_col5\" class=\"data row191 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row191_col6\" class=\"data row191 col6\" >{}</td>\n", + " <td id=\"T_0502a_row191_col7\" class=\"data row191 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row191_col8\" class=\"data row191 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row192_col0\" class=\"data row192 col0\" >validmind.unit_metrics.regression.QuantileLoss</td>\n", + " <td id=\"T_0502a_row192_col1\" class=\"data row192 col1\" >Quantile Loss</td>\n", + " <td id=\"T_0502a_row192_col2\" class=\"data row192 col2\" >Calculates the quantile loss for a regression model.</td>\n", + " <td id=\"T_0502a_row192_col3\" class=\"data row192 col3\" >False</td>\n", + " <td id=\"T_0502a_row192_col4\" class=\"data row192 col4\" >False</td>\n", + " <td id=\"T_0502a_row192_col5\" class=\"data row192 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row192_col6\" class=\"data row192 col6\" >{'quantile': {'type': 
'_empty', 'default': 0.5}}</td>\n", + " <td id=\"T_0502a_row192_col7\" class=\"data row192 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row192_col8\" class=\"data row192 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row193_col0\" class=\"data row193 col0\" >validmind.unit_metrics.regression.RSquaredScore</td>\n", + " <td id=\"T_0502a_row193_col1\" class=\"data row193 col1\" >R Squared Score</td>\n", + " <td id=\"T_0502a_row193_col2\" class=\"data row193 col2\" >Calculates the R-squared score for a regression model.</td>\n", + " <td id=\"T_0502a_row193_col3\" class=\"data row193 col3\" >False</td>\n", + " <td id=\"T_0502a_row193_col4\" class=\"data row193 col4\" >False</td>\n", + " <td id=\"T_0502a_row193_col5\" class=\"data row193 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row193_col6\" class=\"data row193 col6\" >{}</td>\n", + " <td id=\"T_0502a_row193_col7\" class=\"data row193 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row193_col8\" class=\"data row193 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_0502a_row194_col0\" class=\"data row194 col0\" >validmind.unit_metrics.regression.RootMeanSquaredError</td>\n", + " <td id=\"T_0502a_row194_col1\" class=\"data row194 col1\" >Root Mean Squared Error</td>\n", + " <td id=\"T_0502a_row194_col2\" class=\"data row194 col2\" >Calculates the root mean squared error for a regression model.</td>\n", + " <td id=\"T_0502a_row194_col3\" class=\"data row194 col3\" >False</td>\n", + " <td id=\"T_0502a_row194_col4\" class=\"data row194 col4\" >False</td>\n", + " <td id=\"T_0502a_row194_col5\" class=\"data row194 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_0502a_row194_col6\" class=\"data row194 col6\" >{}</td>\n", + " <td id=\"T_0502a_row194_col7\" class=\"data row194 col7\" >['regression']</td>\n", + " <td id=\"T_0502a_row194_col8\" class=\"data row194 col8\" >['regression']</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n" 
+ ], + "text/plain": [ + "<pandas.io.formats.style.Styler at 0x38000a670>" + ] + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" + } ], - "text/plain": [ - "<pandas.io.formats.style.Styler at 0x38000a670>" + "source": [ + "list_tests()" ] - }, - "execution_count": 2, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list_tests()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Understand tags and task types\n", - "\n", - "Use [list_tasks()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks) to view all unique task types used to classify tests in the ValidMind Library.\n", - "\n", - "Understanding `task` types helps you filter tests that match your record's (such as a model) objective. For example:\n", - "\n", - "- **classification:** Works with Classification Models and Datasets.\n", - "- **regression:** Works with Regression Models and Datasets.\n", - "- **text classification:** Works with Text Classification Models and Datasets.\n", - "- **text summarization:** Works with Text Summarization Models and Datasets." 
- ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [ + }, { - "data": { - "text/plain": [ - "['text_qa',\n", - " 'classification',\n", - " 'data_validation',\n", - " 'text_classification',\n", - " 'feature_extraction',\n", - " 'regression',\n", - " 'visualization',\n", - " 'clustering',\n", - " 'time_series_forecasting',\n", - " 'text_summarization',\n", - " 'nlp',\n", - " 'residual_analysis',\n", - " 'monitoring',\n", - " 'text_generation']" + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Understand tags and task types\n", + "\n", + "Use [list_tasks()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks) to view all unique task types used to classify tests in the ValidMind Library.\n", + "\n", + "Understanding `task` types helps you filter tests that match your record's objective. For example:\n", + "\n", + "- **classification:** Works with Classification Models and Datasets.\n", + "- **regression:** Works with Regression Models and Datasets.\n", + "- **text classification:** Works with Text Classification Models and Datasets.\n", + "- **text summarization:** Works with Text Summarization Models and Datasets." ] - }, - "execution_count": 3, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list_tasks()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Use [list_tags()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tags) to view all unique tags used to describe tests in the ValidMind Library.\n", - "\n", - "`Tags` describe what a test applies to and help you filter tests for your use case. 
Examples include:\n", - "\n", - "- **llm:** Tests that work with Large Language Models.\n", - "- **nlp:** Tests relevant for natural language processing.\n", - "- **binary_classification:** Tests for binary classification tasks.\n", - "- **forecasting:** Tests for forecasting and time-series analysis.\n", - "- **tabular_data:** Tests for tabular data like CSVs and Excel spreadsheets." - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['text_qa',\n", + " 'classification',\n", + " 'data_validation',\n", + " 'text_classification',\n", + " 'feature_extraction',\n", + " 'regression',\n", + " 'visualization',\n", + " 'clustering',\n", + " 'time_series_forecasting',\n", + " 'text_summarization',\n", + " 'nlp',\n", + " 'residual_analysis',\n", + " 'monitoring',\n", + " 'text_generation']" + ] + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "list_tasks()" + ] + }, { - "data": { - "text/plain": [ - "['senstivity_analysis',\n", - " 'calibration',\n", - " 'clustering',\n", - " 'anomaly_detection',\n", - " 'nlp',\n", - " 'classification_metrics',\n", - " 'dimensionality_reduction',\n", - " 'tabular_data',\n", - " 'time_series_data',\n", - " 'model_predictions',\n", - " 'feature_selection',\n", - " 'correlation',\n", - " 'frequency_analysis',\n", - " 'embeddings',\n", - " 'regression',\n", - " 'llm',\n", - " 'statsmodels',\n", - " 'ragas',\n", - " 'model_performance',\n", - " 'model_validation',\n", - " 'rag_performance',\n", - " 'model_training',\n", - " 'qualitative',\n", - " 'classification',\n", - " 'kmeans',\n", - " 'multiclass_classification',\n", - " 'linear_regression',\n", - " 'data_quality',\n", - " 'text_data',\n", - " 'binary_classification',\n", - " 'threshold_optimization',\n", - " 'stationarity',\n", - " 'bias_and_fairness',\n", - " 
'scorecard',\n", - " 'model_explainability',\n", - " 'model_comparison',\n", - " 'numerical_data',\n", - " 'sklearn',\n", - " 'model_selection',\n", - " 'retrieval_performance',\n", - " 'zero_shot',\n", - " 'statistical_test',\n", - " 'descriptive_statistics',\n", - " 'seasonality',\n", - " 'analysis',\n", - " 'data_validation',\n", - " 'data_distribution',\n", - " 'feature_importance',\n", - " 'metadata',\n", - " 'few_shot',\n", - " 'visualization',\n", - " 'credit_risk',\n", - " 'forecasting',\n", - " 'AUC',\n", - " 'logistic_regression',\n", - " 'model_diagnosis',\n", - " 'model_interpretation',\n", - " 'unit_root_test',\n", - " 'categorical_data',\n", - " 'data_analysis']" + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Use [list_tags()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tags) to view all unique tags used to describe tests in the ValidMind Library.\n", + "\n", + "`Tags` describe what a test applies to and help you filter tests for your use case. Examples include:\n", + "\n", + "- **llm:** Tests that work with Large Language Models.\n", + "- **nlp:** Tests relevant for natural language processing.\n", + "- **binary_classification:** Tests for binary classification tasks.\n", + "- **forecasting:** Tests for forecasting and time-series analysis.\n", + "- **tabular_data:** Tests for tabular data like CSVs and Excel spreadsheets." 
] - }, - "execution_count": 4, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list_tags()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Finally, to match each task type with its related tags, use the [list_tasks_and_tags()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks_and_tags) function:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ + }, { - "data": { - "text/html": [ - "<style type=\"text/css\">\n", - "#T_ac294 th {\n", - " text-align: left;\n", - "}\n", - "#T_ac294_row0_col0, #T_ac294_row0_col1, #T_ac294_row1_col0, #T_ac294_row1_col1, #T_ac294_row2_col0, #T_ac294_row2_col1, #T_ac294_row3_col0, #T_ac294_row3_col1, #T_ac294_row4_col0, #T_ac294_row4_col1, #T_ac294_row5_col0, #T_ac294_row5_col1, #T_ac294_row6_col0, #T_ac294_row6_col1, #T_ac294_row7_col0, #T_ac294_row7_col1, #T_ac294_row8_col0, #T_ac294_row8_col1, #T_ac294_row9_col0, #T_ac294_row9_col1, #T_ac294_row10_col0, #T_ac294_row10_col1, #T_ac294_row11_col0, #T_ac294_row11_col1, #T_ac294_row12_col0, #T_ac294_row12_col1, #T_ac294_row13_col0, #T_ac294_row13_col1 {\n", - " text-align: left;\n", - "}\n", - "</style>\n", - "<table id=\"T_ac294\">\n", - " <thead>\n", - " <tr>\n", - " <th id=\"T_ac294_level0_col0\" class=\"col_heading level0 col0\" >Task</th>\n", - " <th id=\"T_ac294_level0_col1\" class=\"col_heading level0 col1\" >Tags</th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " <tr>\n", - " <td id=\"T_ac294_row0_col0\" class=\"data row0 col0\" >regression</td>\n", - " <td id=\"T_ac294_row0_col1\" class=\"data row0 col1\" >senstivity_analysis, tabular_data, time_series_data, model_predictions, feature_selection, correlation, regression, statsmodels, model_performance, model_training, multiclass_classification, linear_regression, data_quality, text_data, model_explainability, binary_classification, stationarity, bias_and_fairness, numerical_data, sklearn, 
model_selection, statistical_test, descriptive_statistics, seasonality, analysis, data_validation, data_distribution, metadata, feature_importance, visualization, forecasting, model_diagnosis, model_interpretation, unit_root_test, categorical_data, data_analysis</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row1_col0\" class=\"data row1 col0\" >classification</td>\n", - " <td id=\"T_ac294_row1_col1\" class=\"data row1 col1\" >calibration, anomaly_detection, classification_metrics, tabular_data, time_series_data, feature_selection, correlation, statsmodels, model_performance, model_validation, model_training, classification, multiclass_classification, linear_regression, data_quality, text_data, binary_classification, threshold_optimization, bias_and_fairness, scorecard, model_comparison, numerical_data, sklearn, statistical_test, descriptive_statistics, feature_importance, data_distribution, metadata, visualization, credit_risk, AUC, logistic_regression, model_diagnosis, categorical_data, data_analysis</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row2_col0\" class=\"data row2 col0\" >text_classification</td>\n", - " <td id=\"T_ac294_row2_col1\" class=\"data row2 col1\" >model_performance, feature_importance, multiclass_classification, few_shot, frequency_analysis, zero_shot, text_data, visualization, llm, binary_classification, ragas, model_diagnosis, model_comparison, sklearn, nlp, retrieval_performance, tabular_data, time_series_data</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row3_col0\" class=\"data row3 col0\" >text_summarization</td>\n", - " <td id=\"T_ac294_row3_col1\" class=\"data row3 col1\" >qualitative, few_shot, frequency_analysis, embeddings, zero_shot, text_data, visualization, llm, rag_performance, ragas, retrieval_performance, nlp, dimensionality_reduction, tabular_data, time_series_data</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row4_col0\" class=\"data row4 col0\" >data_validation</td>\n", - " <td 
id=\"T_ac294_row4_col1\" class=\"data row4 col1\" >stationarity, statsmodels, unit_root_test, time_series_data</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row5_col0\" class=\"data row5 col0\" >time_series_forecasting</td>\n", - " <td id=\"T_ac294_row5_col1\" class=\"data row5 col1\" >model_training, data_validation, metadata, visualization, model_explainability, sklearn, model_performance, model_predictions, time_series_data</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row6_col0\" class=\"data row6 col0\" >nlp</td>\n", - " <td id=\"T_ac294_row6_col1\" class=\"data row6 col1\" >data_validation, frequency_analysis, text_data, visualization, nlp</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row7_col0\" class=\"data row7 col0\" >clustering</td>\n", - " <td id=\"T_ac294_row7_col1\" class=\"data row7 col1\" >clustering, model_performance, kmeans, sklearn</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row8_col0\" class=\"data row8 col0\" >residual_analysis</td>\n", - " <td id=\"T_ac294_row8_col1\" class=\"data row8 col1\" >regression</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row9_col0\" class=\"data row9 col0\" >visualization</td>\n", - " <td id=\"T_ac294_row9_col1\" class=\"data row9 col1\" >regression</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row10_col0\" class=\"data row10 col0\" >feature_extraction</td>\n", - " <td id=\"T_ac294_row10_col1\" class=\"data row10 col1\" >embeddings, text_data, visualization, llm</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row11_col0\" class=\"data row11 col0\" >text_qa</td>\n", - " <td id=\"T_ac294_row11_col1\" class=\"data row11 col1\" >qualitative, embeddings, visualization, llm, rag_performance, ragas, dimensionality_reduction, retrieval_performance</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row12_col0\" class=\"data row12 col0\" >text_generation</td>\n", - " <td id=\"T_ac294_row12_col1\" class=\"data row12 col1\" >qualitative, 
embeddings, visualization, llm, rag_performance, ragas, dimensionality_reduction, retrieval_performance</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_ac294_row13_col0\" class=\"data row13 col0\" >monitoring</td>\n", - " <td id=\"T_ac294_row13_col1\" class=\"data row13 col1\" >visualization</td>\n", - " </tr>\n", - " </tbody>\n", - "</table>\n" + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['senstivity_analysis',\n", + " 'calibration',\n", + " 'clustering',\n", + " 'anomaly_detection',\n", + " 'nlp',\n", + " 'classification_metrics',\n", + " 'dimensionality_reduction',\n", + " 'tabular_data',\n", + " 'time_series_data',\n", + " 'model_predictions',\n", + " 'feature_selection',\n", + " 'correlation',\n", + " 'frequency_analysis',\n", + " 'embeddings',\n", + " 'regression',\n", + " 'llm',\n", + " 'statsmodels',\n", + " 'ragas',\n", + " 'model_performance',\n", + " 'model_validation',\n", + " 'rag_performance',\n", + " 'model_training',\n", + " 'qualitative',\n", + " 'classification',\n", + " 'kmeans',\n", + " 'multiclass_classification',\n", + " 'linear_regression',\n", + " 'data_quality',\n", + " 'text_data',\n", + " 'binary_classification',\n", + " 'threshold_optimization',\n", + " 'stationarity',\n", + " 'bias_and_fairness',\n", + " 'scorecard',\n", + " 'model_explainability',\n", + " 'model_comparison',\n", + " 'numerical_data',\n", + " 'sklearn',\n", + " 'model_selection',\n", + " 'retrieval_performance',\n", + " 'zero_shot',\n", + " 'statistical_test',\n", + " 'descriptive_statistics',\n", + " 'seasonality',\n", + " 'analysis',\n", + " 'data_validation',\n", + " 'data_distribution',\n", + " 'feature_importance',\n", + " 'metadata',\n", + " 'few_shot',\n", + " 'visualization',\n", + " 'credit_risk',\n", + " 'forecasting',\n", + " 'AUC',\n", + " 'logistic_regression',\n", + " 'model_diagnosis',\n", + " 'model_interpretation',\n", + " 'unit_root_test',\n", + " 'categorical_data',\n", + " 
'data_analysis']" + ] + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" + } ], - "text/plain": [ - "<pandas.io.formats.style.Styler at 0x38000adc0>" + "source": [ + "list_tags()" ] - }, - "execution_count": 5, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list_tasks_and_tags()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Filter tests by tags and task types\n", - "\n", - "While listing all tests is useful, you’ll often want to narrow your search. The [list_tests()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) function supports `filter`, `task`, and `tags` parameters to assist in refining your results.\n", - "\n", - "Use the `filter` parameter to find tests that match a specific keyword, such as `sklearn`:" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ + }, { - "data": { - "text/html": [ - "<style type=\"text/css\">\n", - "#T_326c3 th {\n", - " text-align: left;\n", - "}\n", - "#T_326c3_row0_col0, #T_326c3_row0_col1, #T_326c3_row0_col2, #T_326c3_row0_col3, #T_326c3_row0_col4, #T_326c3_row0_col5, #T_326c3_row0_col6, #T_326c3_row0_col7, #T_326c3_row0_col8, #T_326c3_row1_col0, #T_326c3_row1_col1, #T_326c3_row1_col2, #T_326c3_row1_col3, #T_326c3_row1_col4, #T_326c3_row1_col5, #T_326c3_row1_col6, #T_326c3_row1_col7, #T_326c3_row1_col8, #T_326c3_row2_col0, #T_326c3_row2_col1, #T_326c3_row2_col2, #T_326c3_row2_col3, #T_326c3_row2_col4, #T_326c3_row2_col5, #T_326c3_row2_col6, #T_326c3_row2_col7, #T_326c3_row2_col8, #T_326c3_row3_col0, #T_326c3_row3_col1, #T_326c3_row3_col2, #T_326c3_row3_col3, #T_326c3_row3_col4, #T_326c3_row3_col5, #T_326c3_row3_col6, #T_326c3_row3_col7, #T_326c3_row3_col8, #T_326c3_row4_col0, #T_326c3_row4_col1, #T_326c3_row4_col2, #T_326c3_row4_col3, #T_326c3_row4_col4, #T_326c3_row4_col5, #T_326c3_row4_col6, #T_326c3_row4_col7, #T_326c3_row4_col8, 
#T_326c3_row5_col0, #T_326c3_row5_col1, #T_326c3_row5_col2, #T_326c3_row5_col3, #T_326c3_row5_col4, #T_326c3_row5_col5, #T_326c3_row5_col6, #T_326c3_row5_col7, #T_326c3_row5_col8, #T_326c3_row6_col0, #T_326c3_row6_col1, #T_326c3_row6_col2, #T_326c3_row6_col3, #T_326c3_row6_col4, #T_326c3_row6_col5, #T_326c3_row6_col6, #T_326c3_row6_col7, #T_326c3_row6_col8, #T_326c3_row7_col0, #T_326c3_row7_col1, #T_326c3_row7_col2, #T_326c3_row7_col3, #T_326c3_row7_col4, #T_326c3_row7_col5, #T_326c3_row7_col6, #T_326c3_row7_col7, #T_326c3_row7_col8, #T_326c3_row8_col0, #T_326c3_row8_col1, #T_326c3_row8_col2, #T_326c3_row8_col3, #T_326c3_row8_col4, #T_326c3_row8_col5, #T_326c3_row8_col6, #T_326c3_row8_col7, #T_326c3_row8_col8, #T_326c3_row9_col0, #T_326c3_row9_col1, #T_326c3_row9_col2, #T_326c3_row9_col3, #T_326c3_row9_col4, #T_326c3_row9_col5, #T_326c3_row9_col6, #T_326c3_row9_col7, #T_326c3_row9_col8, #T_326c3_row10_col0, #T_326c3_row10_col1, #T_326c3_row10_col2, #T_326c3_row10_col3, #T_326c3_row10_col4, #T_326c3_row10_col5, #T_326c3_row10_col6, #T_326c3_row10_col7, #T_326c3_row10_col8, #T_326c3_row11_col0, #T_326c3_row11_col1, #T_326c3_row11_col2, #T_326c3_row11_col3, #T_326c3_row11_col4, #T_326c3_row11_col5, #T_326c3_row11_col6, #T_326c3_row11_col7, #T_326c3_row11_col8, #T_326c3_row12_col0, #T_326c3_row12_col1, #T_326c3_row12_col2, #T_326c3_row12_col3, #T_326c3_row12_col4, #T_326c3_row12_col5, #T_326c3_row12_col6, #T_326c3_row12_col7, #T_326c3_row12_col8, #T_326c3_row13_col0, #T_326c3_row13_col1, #T_326c3_row13_col2, #T_326c3_row13_col3, #T_326c3_row13_col4, #T_326c3_row13_col5, #T_326c3_row13_col6, #T_326c3_row13_col7, #T_326c3_row13_col8, #T_326c3_row14_col0, #T_326c3_row14_col1, #T_326c3_row14_col2, #T_326c3_row14_col3, #T_326c3_row14_col4, #T_326c3_row14_col5, #T_326c3_row14_col6, #T_326c3_row14_col7, #T_326c3_row14_col8, #T_326c3_row15_col0, #T_326c3_row15_col1, #T_326c3_row15_col2, #T_326c3_row15_col3, #T_326c3_row15_col4, #T_326c3_row15_col5, #T_326c3_row15_col6, 
#T_326c3_row15_col7, #T_326c3_row15_col8, #T_326c3_row16_col0, #T_326c3_row16_col1, #T_326c3_row16_col2, #T_326c3_row16_col3, #T_326c3_row16_col4, #T_326c3_row16_col5, #T_326c3_row16_col6, #T_326c3_row16_col7, #T_326c3_row16_col8, #T_326c3_row17_col0, #T_326c3_row17_col1, #T_326c3_row17_col2, #T_326c3_row17_col3, #T_326c3_row17_col4, #T_326c3_row17_col5, #T_326c3_row17_col6, #T_326c3_row17_col7, #T_326c3_row17_col8, #T_326c3_row18_col0, #T_326c3_row18_col1, #T_326c3_row18_col2, #T_326c3_row18_col3, #T_326c3_row18_col4, #T_326c3_row18_col5, #T_326c3_row18_col6, #T_326c3_row18_col7, #T_326c3_row18_col8, #T_326c3_row19_col0, #T_326c3_row19_col1, #T_326c3_row19_col2, #T_326c3_row19_col3, #T_326c3_row19_col4, #T_326c3_row19_col5, #T_326c3_row19_col6, #T_326c3_row19_col7, #T_326c3_row19_col8, #T_326c3_row20_col0, #T_326c3_row20_col1, #T_326c3_row20_col2, #T_326c3_row20_col3, #T_326c3_row20_col4, #T_326c3_row20_col5, #T_326c3_row20_col6, #T_326c3_row20_col7, #T_326c3_row20_col8, #T_326c3_row21_col0, #T_326c3_row21_col1, #T_326c3_row21_col2, #T_326c3_row21_col3, #T_326c3_row21_col4, #T_326c3_row21_col5, #T_326c3_row21_col6, #T_326c3_row21_col7, #T_326c3_row21_col8, #T_326c3_row22_col0, #T_326c3_row22_col1, #T_326c3_row22_col2, #T_326c3_row22_col3, #T_326c3_row22_col4, #T_326c3_row22_col5, #T_326c3_row22_col6, #T_326c3_row22_col7, #T_326c3_row22_col8, #T_326c3_row23_col0, #T_326c3_row23_col1, #T_326c3_row23_col2, #T_326c3_row23_col3, #T_326c3_row23_col4, #T_326c3_row23_col5, #T_326c3_row23_col6, #T_326c3_row23_col7, #T_326c3_row23_col8, #T_326c3_row24_col0, #T_326c3_row24_col1, #T_326c3_row24_col2, #T_326c3_row24_col3, #T_326c3_row24_col4, #T_326c3_row24_col5, #T_326c3_row24_col6, #T_326c3_row24_col7, #T_326c3_row24_col8, #T_326c3_row25_col0, #T_326c3_row25_col1, #T_326c3_row25_col2, #T_326c3_row25_col3, #T_326c3_row25_col4, #T_326c3_row25_col5, #T_326c3_row25_col6, #T_326c3_row25_col7, #T_326c3_row25_col8, #T_326c3_row26_col0, #T_326c3_row26_col1, #T_326c3_row26_col2, 
#T_326c3_row26_col3, #T_326c3_row26_col4, #T_326c3_row26_col5, #T_326c3_row26_col6, #T_326c3_row26_col7, #T_326c3_row26_col8, #T_326c3_row27_col0, #T_326c3_row27_col1, #T_326c3_row27_col2, #T_326c3_row27_col3, #T_326c3_row27_col4, #T_326c3_row27_col5, #T_326c3_row27_col6, #T_326c3_row27_col7, #T_326c3_row27_col8, #T_326c3_row28_col0, #T_326c3_row28_col1, #T_326c3_row28_col2, #T_326c3_row28_col3, #T_326c3_row28_col4, #T_326c3_row28_col5, #T_326c3_row28_col6, #T_326c3_row28_col7, #T_326c3_row28_col8, #T_326c3_row29_col0, #T_326c3_row29_col1, #T_326c3_row29_col2, #T_326c3_row29_col3, #T_326c3_row29_col4, #T_326c3_row29_col5, #T_326c3_row29_col6, #T_326c3_row29_col7, #T_326c3_row29_col8, #T_326c3_row30_col0, #T_326c3_row30_col1, #T_326c3_row30_col2, #T_326c3_row30_col3, #T_326c3_row30_col4, #T_326c3_row30_col5, #T_326c3_row30_col6, #T_326c3_row30_col7, #T_326c3_row30_col8, #T_326c3_row31_col0, #T_326c3_row31_col1, #T_326c3_row31_col2, #T_326c3_row31_col3, #T_326c3_row31_col4, #T_326c3_row31_col5, #T_326c3_row31_col6, #T_326c3_row31_col7, #T_326c3_row31_col8, #T_326c3_row32_col0, #T_326c3_row32_col1, #T_326c3_row32_col2, #T_326c3_row32_col3, #T_326c3_row32_col4, #T_326c3_row32_col5, #T_326c3_row32_col6, #T_326c3_row32_col7, #T_326c3_row32_col8, #T_326c3_row33_col0, #T_326c3_row33_col1, #T_326c3_row33_col2, #T_326c3_row33_col3, #T_326c3_row33_col4, #T_326c3_row33_col5, #T_326c3_row33_col6, #T_326c3_row33_col7, #T_326c3_row33_col8, #T_326c3_row34_col0, #T_326c3_row34_col1, #T_326c3_row34_col2, #T_326c3_row34_col3, #T_326c3_row34_col4, #T_326c3_row34_col5, #T_326c3_row34_col6, #T_326c3_row34_col7, #T_326c3_row34_col8, #T_326c3_row35_col0, #T_326c3_row35_col1, #T_326c3_row35_col2, #T_326c3_row35_col3, #T_326c3_row35_col4, #T_326c3_row35_col5, #T_326c3_row35_col6, #T_326c3_row35_col7, #T_326c3_row35_col8, #T_326c3_row36_col0, #T_326c3_row36_col1, #T_326c3_row36_col2, #T_326c3_row36_col3, #T_326c3_row36_col4, #T_326c3_row36_col5, #T_326c3_row36_col6, #T_326c3_row36_col7, 
#T_326c3_row36_col8, #T_326c3_row37_col0, #T_326c3_row37_col1, #T_326c3_row37_col2, #T_326c3_row37_col3, #T_326c3_row37_col4, #T_326c3_row37_col5, #T_326c3_row37_col6, #T_326c3_row37_col7, #T_326c3_row37_col8, #T_326c3_row38_col0, #T_326c3_row38_col1, #T_326c3_row38_col2, #T_326c3_row38_col3, #T_326c3_row38_col4, #T_326c3_row38_col5, #T_326c3_row38_col6, #T_326c3_row38_col7, #T_326c3_row38_col8, #T_326c3_row39_col0, #T_326c3_row39_col1, #T_326c3_row39_col2, #T_326c3_row39_col3, #T_326c3_row39_col4, #T_326c3_row39_col5, #T_326c3_row39_col6, #T_326c3_row39_col7, #T_326c3_row39_col8, #T_326c3_row40_col0, #T_326c3_row40_col1, #T_326c3_row40_col2, #T_326c3_row40_col3, #T_326c3_row40_col4, #T_326c3_row40_col5, #T_326c3_row40_col6, #T_326c3_row40_col7, #T_326c3_row40_col8, #T_326c3_row41_col0, #T_326c3_row41_col1, #T_326c3_row41_col2, #T_326c3_row41_col3, #T_326c3_row41_col4, #T_326c3_row41_col5, #T_326c3_row41_col6, #T_326c3_row41_col7, #T_326c3_row41_col8, #T_326c3_row42_col0, #T_326c3_row42_col1, #T_326c3_row42_col2, #T_326c3_row42_col3, #T_326c3_row42_col4, #T_326c3_row42_col5, #T_326c3_row42_col6, #T_326c3_row42_col7, #T_326c3_row42_col8 {\n", - " text-align: left;\n", - "}\n", - "</style>\n", - "<table id=\"T_326c3\">\n", - " <thead>\n", - " <tr>\n", - " <th id=\"T_326c3_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", - " <th id=\"T_326c3_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", - " <th id=\"T_326c3_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", - " <th id=\"T_326c3_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", - " <th id=\"T_326c3_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", - " <th id=\"T_326c3_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", - " <th id=\"T_326c3_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", - " <th id=\"T_326c3_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", - " <th 
id=\"T_326c3_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " <tr>\n", - " <td id=\"T_326c3_row0_col0\" class=\"data row0 col0\" >validmind.model_validation.ClusterSizeDistribution</td>\n", - " <td id=\"T_326c3_row0_col1\" class=\"data row0 col1\" >Cluster Size Distribution</td>\n", - " <td id=\"T_326c3_row0_col2\" class=\"data row0 col2\" >Assesses the performance of clustering models by comparing the distribution of cluster sizes in model predictions...</td>\n", - " <td id=\"T_326c3_row0_col3\" class=\"data row0 col3\" >True</td>\n", - " <td id=\"T_326c3_row0_col4\" class=\"data row0 col4\" >False</td>\n", - " <td id=\"T_326c3_row0_col5\" class=\"data row0 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row0_col6\" class=\"data row0 col6\" >{}</td>\n", - " <td id=\"T_326c3_row0_col7\" class=\"data row0 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row0_col8\" class=\"data row0 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row1_col0\" class=\"data row1 col0\" >validmind.model_validation.TimeSeriesR2SquareBySegments</td>\n", - " <td id=\"T_326c3_row1_col1\" class=\"data row1 col1\" >Time Series R2 Square By Segments</td>\n", - " <td id=\"T_326c3_row1_col2\" class=\"data row1 col2\" >Evaluates the R-Squared values of regression models over specified time segments in time series data to assess...</td>\n", - " <td id=\"T_326c3_row1_col3\" class=\"data row1 col3\" >True</td>\n", - " <td id=\"T_326c3_row1_col4\" class=\"data row1 col4\" >True</td>\n", - " <td id=\"T_326c3_row1_col5\" class=\"data row1 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row1_col6\" class=\"data row1 col6\" >{'segments': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_326c3_row1_col7\" class=\"data row1 col7\" >['model_performance', 'sklearn']</td>\n", - " <td id=\"T_326c3_row1_col8\" class=\"data row1 col8\" >['regression', 
'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row2_col0\" class=\"data row2 col0\" >validmind.model_validation.sklearn.AdjustedMutualInformation</td>\n", - " <td id=\"T_326c3_row2_col1\" class=\"data row2 col1\" >Adjusted Mutual Information</td>\n", - " <td id=\"T_326c3_row2_col2\" class=\"data row2 col2\" >Evaluates clustering model performance by measuring mutual information between true and predicted labels, adjusting...</td>\n", - " <td id=\"T_326c3_row2_col3\" class=\"data row2 col3\" >False</td>\n", - " <td id=\"T_326c3_row2_col4\" class=\"data row2 col4\" >True</td>\n", - " <td id=\"T_326c3_row2_col5\" class=\"data row2 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row2_col6\" class=\"data row2 col6\" >{}</td>\n", - " <td id=\"T_326c3_row2_col7\" class=\"data row2 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_326c3_row2_col8\" class=\"data row2 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row3_col0\" class=\"data row3 col0\" >validmind.model_validation.sklearn.AdjustedRandIndex</td>\n", - " <td id=\"T_326c3_row3_col1\" class=\"data row3 col1\" >Adjusted Rand Index</td>\n", - " <td id=\"T_326c3_row3_col2\" class=\"data row3 col2\" >Measures the similarity between two data clusters using the Adjusted Rand Index (ARI) metric in clustering machine...</td>\n", - " <td id=\"T_326c3_row3_col3\" class=\"data row3 col3\" >False</td>\n", - " <td id=\"T_326c3_row3_col4\" class=\"data row3 col4\" >True</td>\n", - " <td id=\"T_326c3_row3_col5\" class=\"data row3 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row3_col6\" class=\"data row3 col6\" >{}</td>\n", - " <td id=\"T_326c3_row3_col7\" class=\"data row3 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_326c3_row3_col8\" class=\"data row3 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row4_col0\" class=\"data row4 col0\" 
>validmind.model_validation.sklearn.CalibrationCurve</td>\n", - " <td id=\"T_326c3_row4_col1\" class=\"data row4 col1\" >Calibration Curve</td>\n", - " <td id=\"T_326c3_row4_col2\" class=\"data row4 col2\" >Evaluates the calibration of probability estimates by comparing predicted probabilities against observed...</td>\n", - " <td id=\"T_326c3_row4_col3\" class=\"data row4 col3\" >True</td>\n", - " <td id=\"T_326c3_row4_col4\" class=\"data row4 col4\" >False</td>\n", - " <td id=\"T_326c3_row4_col5\" class=\"data row4 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row4_col6\" class=\"data row4 col6\" >{'n_bins': {'type': 'int', 'default': 10}}</td>\n", - " <td id=\"T_326c3_row4_col7\" class=\"data row4 col7\" >['sklearn', 'model_performance', 'classification']</td>\n", - " <td id=\"T_326c3_row4_col8\" class=\"data row4 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row5_col0\" class=\"data row5 col0\" >validmind.model_validation.sklearn.ClassifierPerformance</td>\n", - " <td id=\"T_326c3_row5_col1\" class=\"data row5 col1\" >Classifier Performance</td>\n", - " <td id=\"T_326c3_row5_col2\" class=\"data row5 col2\" >Evaluates performance of binary or multiclass classification models using precision, recall, F1-Score, accuracy,...</td>\n", - " <td id=\"T_326c3_row5_col3\" class=\"data row5 col3\" >False</td>\n", - " <td id=\"T_326c3_row5_col4\" class=\"data row5 col4\" >True</td>\n", - " <td id=\"T_326c3_row5_col5\" class=\"data row5 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row5_col6\" class=\"data row5 col6\" >{'average': {'type': 'str', 'default': 'macro'}}</td>\n", - " <td id=\"T_326c3_row5_col7\" class=\"data row5 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_326c3_row5_col8\" class=\"data row5 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row6_col0\" class=\"data row6 
col0\" >validmind.model_validation.sklearn.ClassifierThresholdOptimization</td>\n", - " <td id=\"T_326c3_row6_col1\" class=\"data row6 col1\" >Classifier Threshold Optimization</td>\n", - " <td id=\"T_326c3_row6_col2\" class=\"data row6 col2\" >Analyzes and visualizes different threshold optimization methods for binary classification models....</td>\n", - " <td id=\"T_326c3_row6_col3\" class=\"data row6 col3\" >False</td>\n", - " <td id=\"T_326c3_row6_col4\" class=\"data row6 col4\" >True</td>\n", - " <td id=\"T_326c3_row6_col5\" class=\"data row6 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row6_col6\" class=\"data row6 col6\" >{'methods': {'type': None, 'default': None}, 'target_recall': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_326c3_row6_col7\" class=\"data row6 col7\" >['model_validation', 'threshold_optimization', 'classification_metrics']</td>\n", - " <td id=\"T_326c3_row6_col8\" class=\"data row6 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row7_col0\" class=\"data row7 col0\" >validmind.model_validation.sklearn.ClusterCosineSimilarity</td>\n", - " <td id=\"T_326c3_row7_col1\" class=\"data row7 col1\" >Cluster Cosine Similarity</td>\n", - " <td id=\"T_326c3_row7_col2\" class=\"data row7 col2\" >Measures the intra-cluster similarity of a clustering model using cosine similarity....</td>\n", - " <td id=\"T_326c3_row7_col3\" class=\"data row7 col3\" >False</td>\n", - " <td id=\"T_326c3_row7_col4\" class=\"data row7 col4\" >True</td>\n", - " <td id=\"T_326c3_row7_col5\" class=\"data row7 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row7_col6\" class=\"data row7 col6\" >{}</td>\n", - " <td id=\"T_326c3_row7_col7\" class=\"data row7 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_326c3_row7_col8\" class=\"data row7 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row8_col0\" class=\"data row8 col0\" 
>validmind.model_validation.sklearn.ClusterPerformanceMetrics</td>\n", - " <td id=\"T_326c3_row8_col1\" class=\"data row8 col1\" >Cluster Performance Metrics</td>\n", - " <td id=\"T_326c3_row8_col2\" class=\"data row8 col2\" >Evaluates the performance of clustering machine learning models using multiple established metrics....</td>\n", - " <td id=\"T_326c3_row8_col3\" class=\"data row8 col3\" >False</td>\n", - " <td id=\"T_326c3_row8_col4\" class=\"data row8 col4\" >True</td>\n", - " <td id=\"T_326c3_row8_col5\" class=\"data row8 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row8_col6\" class=\"data row8 col6\" >{}</td>\n", - " <td id=\"T_326c3_row8_col7\" class=\"data row8 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_326c3_row8_col8\" class=\"data row8 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row9_col0\" class=\"data row9 col0\" >validmind.model_validation.sklearn.CompletenessScore</td>\n", - " <td id=\"T_326c3_row9_col1\" class=\"data row9 col1\" >Completeness Score</td>\n", - " <td id=\"T_326c3_row9_col2\" class=\"data row9 col2\" >Evaluates a clustering model's capacity to categorize instances from a single class into the same cluster....</td>\n", - " <td id=\"T_326c3_row9_col3\" class=\"data row9 col3\" >False</td>\n", - " <td id=\"T_326c3_row9_col4\" class=\"data row9 col4\" >True</td>\n", - " <td id=\"T_326c3_row9_col5\" class=\"data row9 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row9_col6\" class=\"data row9 col6\" >{}</td>\n", - " <td id=\"T_326c3_row9_col7\" class=\"data row9 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", - " <td id=\"T_326c3_row9_col8\" class=\"data row9 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row10_col0\" class=\"data row10 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", - " <td id=\"T_326c3_row10_col1\" class=\"data row10 col1\" >Confusion Matrix</td>\n", - " 
<td id=\"T_326c3_row10_col2\" class=\"data row10 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", - " <td id=\"T_326c3_row10_col3\" class=\"data row10 col3\" >True</td>\n", - " <td id=\"T_326c3_row10_col4\" class=\"data row10 col4\" >False</td>\n", - " <td id=\"T_326c3_row10_col5\" class=\"data row10 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row10_col6\" class=\"data row10 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_326c3_row10_col7\" class=\"data row10 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_326c3_row10_col8\" class=\"data row10 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row11_col0\" class=\"data row11 col0\" >validmind.model_validation.sklearn.FeatureImportance</td>\n", - " <td id=\"T_326c3_row11_col1\" class=\"data row11 col1\" >Feature Importance</td>\n", - " <td id=\"T_326c3_row11_col2\" class=\"data row11 col2\" >Compute feature importance scores for a given model and generate a summary table...</td>\n", - " <td id=\"T_326c3_row11_col3\" class=\"data row11 col3\" >False</td>\n", - " <td id=\"T_326c3_row11_col4\" class=\"data row11 col4\" >True</td>\n", - " <td id=\"T_326c3_row11_col5\" class=\"data row11 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row11_col6\" class=\"data row11 col6\" >{'num_features': {'type': 'int', 'default': 3}}</td>\n", - " <td id=\"T_326c3_row11_col7\" class=\"data row11 col7\" >['model_explainability', 'sklearn']</td>\n", - " <td id=\"T_326c3_row11_col8\" class=\"data row11 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row12_col0\" class=\"data row12 col0\" >validmind.model_validation.sklearn.FowlkesMallowsScore</td>\n", - " <td id=\"T_326c3_row12_col1\" class=\"data 
row12 col1\" >Fowlkes Mallows Score</td>\n", - " <td id=\"T_326c3_row12_col2\" class=\"data row12 col2\" >Evaluates the similarity between predicted and actual cluster assignments in a model using the Fowlkes-Mallows...</td>\n", - " <td id=\"T_326c3_row12_col3\" class=\"data row12 col3\" >False</td>\n", - " <td id=\"T_326c3_row12_col4\" class=\"data row12 col4\" >True</td>\n", - " <td id=\"T_326c3_row12_col5\" class=\"data row12 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row12_col6\" class=\"data row12 col6\" >{}</td>\n", - " <td id=\"T_326c3_row12_col7\" class=\"data row12 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row12_col8\" class=\"data row12 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row13_col0\" class=\"data row13 col0\" >validmind.model_validation.sklearn.HomogeneityScore</td>\n", - " <td id=\"T_326c3_row13_col1\" class=\"data row13 col1\" >Homogeneity Score</td>\n", - " <td id=\"T_326c3_row13_col2\" class=\"data row13 col2\" >Assesses clustering homogeneity by comparing true and predicted labels, scoring from 0 (heterogeneous) to 1...</td>\n", - " <td id=\"T_326c3_row13_col3\" class=\"data row13 col3\" >False</td>\n", - " <td id=\"T_326c3_row13_col4\" class=\"data row13 col4\" >True</td>\n", - " <td id=\"T_326c3_row13_col5\" class=\"data row13 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row13_col6\" class=\"data row13 col6\" >{}</td>\n", - " <td id=\"T_326c3_row13_col7\" class=\"data row13 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row13_col8\" class=\"data row13 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row14_col0\" class=\"data row14 col0\" >validmind.model_validation.sklearn.HyperParametersTuning</td>\n", - " <td id=\"T_326c3_row14_col1\" class=\"data row14 col1\" >Hyper Parameters Tuning</td>\n", - " <td id=\"T_326c3_row14_col2\" class=\"data row14 col2\" >Performs exhaustive grid search over 
specified parameter ranges to find optimal model configurations...</td>\n", - " <td id=\"T_326c3_row14_col3\" class=\"data row14 col3\" >False</td>\n", - " <td id=\"T_326c3_row14_col4\" class=\"data row14 col4\" >True</td>\n", - " <td id=\"T_326c3_row14_col5\" class=\"data row14 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row14_col6\" class=\"data row14 col6\" >{'param_grid': {'type': 'dict', 'default': None}, 'scoring': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}, 'fit_params': {'type': 'dict', 'default': None}}</td>\n", - " <td id=\"T_326c3_row14_col7\" class=\"data row14 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row14_col8\" class=\"data row14 col8\" >['clustering', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row15_col0\" class=\"data row15 col0\" >validmind.model_validation.sklearn.KMeansClustersOptimization</td>\n", - " <td id=\"T_326c3_row15_col1\" class=\"data row15 col1\" >K Means Clusters Optimization</td>\n", - " <td id=\"T_326c3_row15_col2\" class=\"data row15 col2\" >Optimizes the number of clusters in K-means models using Elbow and Silhouette methods....</td>\n", - " <td id=\"T_326c3_row15_col3\" class=\"data row15 col3\" >True</td>\n", - " <td id=\"T_326c3_row15_col4\" class=\"data row15 col4\" >False</td>\n", - " <td id=\"T_326c3_row15_col5\" class=\"data row15 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row15_col6\" class=\"data row15 col6\" >{'n_clusters': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_326c3_row15_col7\" class=\"data row15 col7\" >['sklearn', 'model_performance', 'kmeans']</td>\n", - " <td id=\"T_326c3_row15_col8\" class=\"data row15 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row16_col0\" class=\"data row16 col0\" >validmind.model_validation.sklearn.MinimumAccuracy</td>\n", - " <td id=\"T_326c3_row16_col1\" class=\"data row16 col1\" >Minimum Accuracy</td>\n", - " 
<td id=\"T_326c3_row16_col2\" class=\"data row16 col2\" >Checks if the model's prediction accuracy meets or surpasses a specified threshold....</td>\n", - " <td id=\"T_326c3_row16_col3\" class=\"data row16 col3\" >False</td>\n", - " <td id=\"T_326c3_row16_col4\" class=\"data row16 col4\" >True</td>\n", - " <td id=\"T_326c3_row16_col5\" class=\"data row16 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row16_col6\" class=\"data row16 col6\" >{'min_threshold': {'type': 'float', 'default': 0.7}}</td>\n", - " <td id=\"T_326c3_row16_col7\" class=\"data row16 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_326c3_row16_col8\" class=\"data row16 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row17_col0\" class=\"data row17 col0\" >validmind.model_validation.sklearn.MinimumF1Score</td>\n", - " <td id=\"T_326c3_row17_col1\" class=\"data row17 col1\" >Minimum F1 Score</td>\n", - " <td id=\"T_326c3_row17_col2\" class=\"data row17 col2\" >Assesses if the model's F1 score on the validation set meets a predefined minimum threshold, ensuring balanced...</td>\n", - " <td id=\"T_326c3_row17_col3\" class=\"data row17 col3\" >False</td>\n", - " <td id=\"T_326c3_row17_col4\" class=\"data row17 col4\" >True</td>\n", - " <td id=\"T_326c3_row17_col5\" class=\"data row17 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row17_col6\" class=\"data row17 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_326c3_row17_col7\" class=\"data row17 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_326c3_row17_col8\" class=\"data row17 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row18_col0\" class=\"data row18 col0\" >validmind.model_validation.sklearn.MinimumROCAUCScore</td>\n", - " <td 
id=\"T_326c3_row18_col1\" class=\"data row18 col1\" >Minimum ROCAUC Score</td>\n", - " <td id=\"T_326c3_row18_col2\" class=\"data row18 col2\" >Validates model by checking if the ROC AUC score meets or surpasses a specified threshold....</td>\n", - " <td id=\"T_326c3_row18_col3\" class=\"data row18 col3\" >False</td>\n", - " <td id=\"T_326c3_row18_col4\" class=\"data row18 col4\" >True</td>\n", - " <td id=\"T_326c3_row18_col5\" class=\"data row18 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row18_col6\" class=\"data row18 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_326c3_row18_col7\" class=\"data row18 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_326c3_row18_col8\" class=\"data row18 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row19_col0\" class=\"data row19 col0\" >validmind.model_validation.sklearn.ModelParameters</td>\n", - " <td id=\"T_326c3_row19_col1\" class=\"data row19 col1\" >Model Parameters</td>\n", - " <td id=\"T_326c3_row19_col2\" class=\"data row19 col2\" >Extracts and displays model parameters in a structured format for transparency and reproducibility....</td>\n", - " <td id=\"T_326c3_row19_col3\" class=\"data row19 col3\" >False</td>\n", - " <td id=\"T_326c3_row19_col4\" class=\"data row19 col4\" >True</td>\n", - " <td id=\"T_326c3_row19_col5\" class=\"data row19 col5\" >['model']</td>\n", - " <td id=\"T_326c3_row19_col6\" class=\"data row19 col6\" >{'model_params': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_326c3_row19_col7\" class=\"data row19 col7\" >['model_training', 'metadata']</td>\n", - " <td id=\"T_326c3_row19_col8\" class=\"data row19 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row20_col0\" class=\"data row20 col0\" 
>validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", - " <td id=\"T_326c3_row20_col1\" class=\"data row20 col1\" >Models Performance Comparison</td>\n", - " <td id=\"T_326c3_row20_col2\" class=\"data row20 col2\" >Evaluates and compares the performance of multiple Machine Learning models using various metrics like accuracy,...</td>\n", - " <td id=\"T_326c3_row20_col3\" class=\"data row20 col3\" >False</td>\n", - " <td id=\"T_326c3_row20_col4\" class=\"data row20 col4\" >True</td>\n", - " <td id=\"T_326c3_row20_col5\" class=\"data row20 col5\" >['dataset', 'models']</td>\n", - " <td id=\"T_326c3_row20_col6\" class=\"data row20 col6\" >{}</td>\n", - " <td id=\"T_326c3_row20_col7\" class=\"data row20 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'model_comparison']</td>\n", - " <td id=\"T_326c3_row20_col8\" class=\"data row20 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row21_col0\" class=\"data row21 col0\" >validmind.model_validation.sklearn.OverfitDiagnosis</td>\n", - " <td id=\"T_326c3_row21_col1\" class=\"data row21 col1\" >Overfit Diagnosis</td>\n", - " <td id=\"T_326c3_row21_col2\" class=\"data row21 col2\" >Assesses potential overfitting in a model's predictions, identifying regions where performance between training and...</td>\n", - " <td id=\"T_326c3_row21_col3\" class=\"data row21 col3\" >True</td>\n", - " <td id=\"T_326c3_row21_col4\" class=\"data row21 col4\" >True</td>\n", - " <td id=\"T_326c3_row21_col5\" class=\"data row21 col5\" >['model', 'datasets']</td>\n", - " <td id=\"T_326c3_row21_col6\" class=\"data row21 col6\" >{'metric': {'type': 'str', 'default': None}, 'cut_off_threshold': {'type': 'float', 'default': 0.04}}</td>\n", - " <td id=\"T_326c3_row21_col7\" class=\"data row21 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'linear_regression', 'model_diagnosis']</td>\n", - " <td 
id=\"T_326c3_row21_col8\" class=\"data row21 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row22_col0\" class=\"data row22 col0\" >validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", - " <td id=\"T_326c3_row22_col1\" class=\"data row22 col1\" >Permutation Feature Importance</td>\n", - " <td id=\"T_326c3_row22_col2\" class=\"data row22 col2\" >Assesses the significance of each feature in a model by evaluating the impact on model performance when feature...</td>\n", - " <td id=\"T_326c3_row22_col3\" class=\"data row22 col3\" >True</td>\n", - " <td id=\"T_326c3_row22_col4\" class=\"data row22 col4\" >False</td>\n", - " <td id=\"T_326c3_row22_col5\" class=\"data row22 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row22_col6\" class=\"data row22 col6\" >{'fontsize': {'type': None, 'default': None}, 'figure_height': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_326c3_row22_col7\" class=\"data row22 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", - " <td id=\"T_326c3_row22_col8\" class=\"data row22 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row23_col0\" class=\"data row23 col0\" >validmind.model_validation.sklearn.PopulationStabilityIndex</td>\n", - " <td id=\"T_326c3_row23_col1\" class=\"data row23 col1\" >Population Stability Index</td>\n", - " <td id=\"T_326c3_row23_col2\" class=\"data row23 col2\" >Assesses the Population Stability Index (PSI) to quantify the stability of an ML model's predictions across...</td>\n", - " <td id=\"T_326c3_row23_col3\" class=\"data row23 col3\" >True</td>\n", - " <td id=\"T_326c3_row23_col4\" class=\"data row23 col4\" >True</td>\n", - " <td id=\"T_326c3_row23_col5\" class=\"data row23 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row23_col6\" class=\"data row23 col6\" >{'num_bins': {'type': 
'int', 'default': 10}, 'mode': {'type': 'str', 'default': 'fixed'}}</td>\n", - " <td id=\"T_326c3_row23_col7\" class=\"data row23 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_326c3_row23_col8\" class=\"data row23 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row24_col0\" class=\"data row24 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", - " <td id=\"T_326c3_row24_col1\" class=\"data row24 col1\" >Precision Recall Curve</td>\n", - " <td id=\"T_326c3_row24_col2\" class=\"data row24 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", - " <td id=\"T_326c3_row24_col3\" class=\"data row24 col3\" >True</td>\n", - " <td id=\"T_326c3_row24_col4\" class=\"data row24 col4\" >False</td>\n", - " <td id=\"T_326c3_row24_col5\" class=\"data row24 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row24_col6\" class=\"data row24 col6\" >{}</td>\n", - " <td id=\"T_326c3_row24_col7\" class=\"data row24 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_326c3_row24_col8\" class=\"data row24 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row25_col0\" class=\"data row25 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", - " <td id=\"T_326c3_row25_col1\" class=\"data row25 col1\" >ROC Curve</td>\n", - " <td id=\"T_326c3_row25_col2\" class=\"data row25 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", - " <td id=\"T_326c3_row25_col3\" class=\"data row25 col3\" >True</td>\n", - " <td id=\"T_326c3_row25_col4\" class=\"data row25 col4\" >False</td>\n", - " <td id=\"T_326c3_row25_col5\" class=\"data row25 col5\" >['model', 
'dataset']</td>\n", - " <td id=\"T_326c3_row25_col6\" class=\"data row25 col6\" >{}</td>\n", - " <td id=\"T_326c3_row25_col7\" class=\"data row25 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_326c3_row25_col8\" class=\"data row25 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row26_col0\" class=\"data row26 col0\" >validmind.model_validation.sklearn.RegressionErrors</td>\n", - " <td id=\"T_326c3_row26_col1\" class=\"data row26 col1\" >Regression Errors</td>\n", - " <td id=\"T_326c3_row26_col2\" class=\"data row26 col2\" >Assesses the performance and error distribution of a regression model using various error metrics....</td>\n", - " <td id=\"T_326c3_row26_col3\" class=\"data row26 col3\" >False</td>\n", - " <td id=\"T_326c3_row26_col4\" class=\"data row26 col4\" >True</td>\n", - " <td id=\"T_326c3_row26_col5\" class=\"data row26 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row26_col6\" class=\"data row26 col6\" >{}</td>\n", - " <td id=\"T_326c3_row26_col7\" class=\"data row26 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row26_col8\" class=\"data row26 col8\" >['regression', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row27_col0\" class=\"data row27 col0\" >validmind.model_validation.sklearn.RegressionErrorsComparison</td>\n", - " <td id=\"T_326c3_row27_col1\" class=\"data row27 col1\" >Regression Errors Comparison</td>\n", - " <td id=\"T_326c3_row27_col2\" class=\"data row27 col2\" >Assesses multiple regression error metrics to compare model performance across different datasets, emphasizing...</td>\n", - " <td id=\"T_326c3_row27_col3\" class=\"data row27 col3\" >False</td>\n", - " <td id=\"T_326c3_row27_col4\" class=\"data row27 col4\" >True</td>\n", - " <td id=\"T_326c3_row27_col5\" class=\"data row27 col5\" >['datasets', 'models']</td>\n", - " 
<td id=\"T_326c3_row27_col6\" class=\"data row27 col6\" >{}</td>\n", - " <td id=\"T_326c3_row27_col7\" class=\"data row27 col7\" >['model_performance', 'sklearn']</td>\n", - " <td id=\"T_326c3_row27_col8\" class=\"data row27 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row28_col0\" class=\"data row28 col0\" >validmind.model_validation.sklearn.RegressionPerformance</td>\n", - " <td id=\"T_326c3_row28_col1\" class=\"data row28 col1\" >Regression Performance</td>\n", - " <td id=\"T_326c3_row28_col2\" class=\"data row28 col2\" >Evaluates the performance of a regression model using five different metrics: MAE, MSE, RMSE, MAPE, and MBD....</td>\n", - " <td id=\"T_326c3_row28_col3\" class=\"data row28 col3\" >False</td>\n", - " <td id=\"T_326c3_row28_col4\" class=\"data row28 col4\" >True</td>\n", - " <td id=\"T_326c3_row28_col5\" class=\"data row28 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row28_col6\" class=\"data row28 col6\" >{}</td>\n", - " <td id=\"T_326c3_row28_col7\" class=\"data row28 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row28_col8\" class=\"data row28 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row29_col0\" class=\"data row29 col0\" >validmind.model_validation.sklearn.RegressionR2Square</td>\n", - " <td id=\"T_326c3_row29_col1\" class=\"data row29 col1\" >Regression R2 Square</td>\n", - " <td id=\"T_326c3_row29_col2\" class=\"data row29 col2\" >Assesses the overall goodness-of-fit of a regression model by evaluating R-squared (R2) and Adjusted R-squared (Adj...</td>\n", - " <td id=\"T_326c3_row29_col3\" class=\"data row29 col3\" >False</td>\n", - " <td id=\"T_326c3_row29_col4\" class=\"data row29 col4\" >True</td>\n", - " <td id=\"T_326c3_row29_col5\" class=\"data row29 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row29_col6\" class=\"data row29 col6\" >{}</td>\n", - " <td id=\"T_326c3_row29_col7\" 
class=\"data row29 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row29_col8\" class=\"data row29 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row30_col0\" class=\"data row30 col0\" >validmind.model_validation.sklearn.RegressionR2SquareComparison</td>\n", - " <td id=\"T_326c3_row30_col1\" class=\"data row30 col1\" >Regression R2 Square Comparison</td>\n", - " <td id=\"T_326c3_row30_col2\" class=\"data row30 col2\" >Compares R-Squared and Adjusted R-Squared values for different regression models across multiple datasets to assess...</td>\n", - " <td id=\"T_326c3_row30_col3\" class=\"data row30 col3\" >False</td>\n", - " <td id=\"T_326c3_row30_col4\" class=\"data row30 col4\" >True</td>\n", - " <td id=\"T_326c3_row30_col5\" class=\"data row30 col5\" >['datasets', 'models']</td>\n", - " <td id=\"T_326c3_row30_col6\" class=\"data row30 col6\" >{}</td>\n", - " <td id=\"T_326c3_row30_col7\" class=\"data row30 col7\" >['model_performance', 'sklearn']</td>\n", - " <td id=\"T_326c3_row30_col8\" class=\"data row30 col8\" >['regression', 'time_series_forecasting']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row31_col0\" class=\"data row31 col0\" >validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", - " <td id=\"T_326c3_row31_col1\" class=\"data row31 col1\" >Robustness Diagnosis</td>\n", - " <td id=\"T_326c3_row31_col2\" class=\"data row31 col2\" >Assesses the robustness of a machine learning model by evaluating performance decay under noisy conditions....</td>\n", - " <td id=\"T_326c3_row31_col3\" class=\"data row31 col3\" >True</td>\n", - " <td id=\"T_326c3_row31_col4\" class=\"data row31 col4\" >True</td>\n", - " <td id=\"T_326c3_row31_col5\" class=\"data row31 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row31_col6\" class=\"data row31 col6\" >{'metric': {'type': 'str', 'default': None}, 'scaling_factor_std_dev_list': {'type': None, 'default': [0.1, 0.2, 0.3, 0.4, 0.5]}, 
'performance_decay_threshold': {'type': 'float', 'default': 0.05}}</td>\n", - " <td id=\"T_326c3_row31_col7\" class=\"data row31 col7\" >['sklearn', 'model_diagnosis', 'visualization']</td>\n", - " <td id=\"T_326c3_row31_col8\" class=\"data row31 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row32_col0\" class=\"data row32 col0\" >validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", - " <td id=\"T_326c3_row32_col1\" class=\"data row32 col1\" >SHAP Global Importance</td>\n", - " <td id=\"T_326c3_row32_col2\" class=\"data row32 col2\" >Evaluates and visualizes global feature importance using SHAP values for model explanation and risk identification....</td>\n", - " <td id=\"T_326c3_row32_col3\" class=\"data row32 col3\" >False</td>\n", - " <td id=\"T_326c3_row32_col4\" class=\"data row32 col4\" >True</td>\n", - " <td id=\"T_326c3_row32_col5\" class=\"data row32 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row32_col6\" class=\"data row32 col6\" >{'kernel_explainer_samples': {'type': 'int', 'default': 10}, 'tree_or_linear_explainer_samples': {'type': 'int', 'default': 200}, 'class_of_interest': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_326c3_row32_col7\" class=\"data row32 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", - " <td id=\"T_326c3_row32_col8\" class=\"data row32 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row33_col0\" class=\"data row33 col0\" >validmind.model_validation.sklearn.ScoreProbabilityAlignment</td>\n", - " <td id=\"T_326c3_row33_col1\" class=\"data row33 col1\" >Score Probability Alignment</td>\n", - " <td id=\"T_326c3_row33_col2\" class=\"data row33 col2\" >Analyzes the alignment between credit scores and predicted probabilities....</td>\n", - " <td id=\"T_326c3_row33_col3\" class=\"data row33 col3\" >True</td>\n", - " 
<td id=\"T_326c3_row33_col4\" class=\"data row33 col4\" >True</td>\n", - " <td id=\"T_326c3_row33_col5\" class=\"data row33 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row33_col6\" class=\"data row33 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'n_bins': {'type': 'int', 'default': 10}}</td>\n", - " <td id=\"T_326c3_row33_col7\" class=\"data row33 col7\" >['visualization', 'credit_risk', 'calibration']</td>\n", - " <td id=\"T_326c3_row33_col8\" class=\"data row33 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row34_col0\" class=\"data row34 col0\" >validmind.model_validation.sklearn.SilhouettePlot</td>\n", - " <td id=\"T_326c3_row34_col1\" class=\"data row34 col1\" >Silhouette Plot</td>\n", - " <td id=\"T_326c3_row34_col2\" class=\"data row34 col2\" >Calculates and visualizes Silhouette Score, assessing the degree of data point suitability to its cluster in ML...</td>\n", - " <td id=\"T_326c3_row34_col3\" class=\"data row34 col3\" >True</td>\n", - " <td id=\"T_326c3_row34_col4\" class=\"data row34 col4\" >True</td>\n", - " <td id=\"T_326c3_row34_col5\" class=\"data row34 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_326c3_row34_col6\" class=\"data row34 col6\" >{}</td>\n", - " <td id=\"T_326c3_row34_col7\" class=\"data row34 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row34_col8\" class=\"data row34 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row35_col0\" class=\"data row35 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", - " <td id=\"T_326c3_row35_col1\" class=\"data row35 col1\" >Training Test Degradation</td>\n", - " <td id=\"T_326c3_row35_col2\" class=\"data row35 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", - " <td id=\"T_326c3_row35_col3\" class=\"data row35 col3\" >False</td>\n", - " <td id=\"T_326c3_row35_col4\" 
class=\"data row35 col4\" >True</td>\n", - " <td id=\"T_326c3_row35_col5\" class=\"data row35 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row35_col6\" class=\"data row35 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", - " <td id=\"T_326c3_row35_col7\" class=\"data row35 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_326c3_row35_col8\" class=\"data row35 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row36_col0\" class=\"data row36 col0\" >validmind.model_validation.sklearn.VMeasure</td>\n", - " <td id=\"T_326c3_row36_col1\" class=\"data row36 col1\" >V Measure</td>\n", - " <td id=\"T_326c3_row36_col2\" class=\"data row36 col2\" >Evaluates homogeneity and completeness of a clustering model using the V Measure Score....</td>\n", - " <td id=\"T_326c3_row36_col3\" class=\"data row36 col3\" >False</td>\n", - " <td id=\"T_326c3_row36_col4\" class=\"data row36 col4\" >True</td>\n", - " <td id=\"T_326c3_row36_col5\" class=\"data row36 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_326c3_row36_col6\" class=\"data row36 col6\" >{}</td>\n", - " <td id=\"T_326c3_row36_col7\" class=\"data row36 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_326c3_row36_col8\" class=\"data row36 col8\" >['clustering']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row37_col0\" class=\"data row37 col0\" >validmind.model_validation.sklearn.WeakspotsDiagnosis</td>\n", - " <td id=\"T_326c3_row37_col1\" class=\"data row37 col1\" >Weakspots Diagnosis</td>\n", - " <td id=\"T_326c3_row37_col2\" class=\"data row37 col2\" >Identifies and visualizes weak spots in a machine learning model's performance across various sections of the...</td>\n", - " <td id=\"T_326c3_row37_col3\" class=\"data row37 col3\" >True</td>\n", - " <td id=\"T_326c3_row37_col4\" class=\"data row37 col4\" 
>True</td>\n", - " <td id=\"T_326c3_row37_col5\" class=\"data row37 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row37_col6\" class=\"data row37 col6\" >{'features_columns': {'type': None, 'default': None}, 'metrics': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_326c3_row37_col7\" class=\"data row37 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_diagnosis', 'visualization']</td>\n", - " <td id=\"T_326c3_row37_col8\" class=\"data row37 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row38_col0\" class=\"data row38 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", - " <td id=\"T_326c3_row38_col1\" class=\"data row38 col1\" >Calibration Curve Drift</td>\n", - " <td id=\"T_326c3_row38_col2\" class=\"data row38 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", - " <td id=\"T_326c3_row38_col3\" class=\"data row38 col3\" >True</td>\n", - " <td id=\"T_326c3_row38_col4\" class=\"data row38 col4\" >True</td>\n", - " <td id=\"T_326c3_row38_col5\" class=\"data row38 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row38_col6\" class=\"data row38 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", - " <td id=\"T_326c3_row38_col7\" class=\"data row38 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_326c3_row38_col8\" class=\"data row38 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row39_col0\" class=\"data row39 col0\" >validmind.ongoing_monitoring.ClassDiscriminationDrift</td>\n", - " <td id=\"T_326c3_row39_col1\" class=\"data row39 col1\" >Class Discrimination Drift</td>\n", - " <td id=\"T_326c3_row39_col2\" class=\"data row39 col2\" >Compares 
classification discrimination metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_326c3_row39_col3\" class=\"data row39 col3\" >False</td>\n", - " <td id=\"T_326c3_row39_col4\" class=\"data row39 col4\" >True</td>\n", - " <td id=\"T_326c3_row39_col5\" class=\"data row39 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row39_col6\" class=\"data row39 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_326c3_row39_col7\" class=\"data row39 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_326c3_row39_col8\" class=\"data row39 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row40_col0\" class=\"data row40 col0\" >validmind.ongoing_monitoring.ClassificationAccuracyDrift</td>\n", - " <td id=\"T_326c3_row40_col1\" class=\"data row40 col1\" >Classification Accuracy Drift</td>\n", - " <td id=\"T_326c3_row40_col2\" class=\"data row40 col2\" >Compares classification accuracy metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_326c3_row40_col3\" class=\"data row40 col3\" >False</td>\n", - " <td id=\"T_326c3_row40_col4\" class=\"data row40 col4\" >True</td>\n", - " <td id=\"T_326c3_row40_col5\" class=\"data row40 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row40_col6\" class=\"data row40 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_326c3_row40_col7\" class=\"data row40 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_326c3_row40_col8\" class=\"data row40 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row41_col0\" class=\"data row41 col0\" >validmind.ongoing_monitoring.ConfusionMatrixDrift</td>\n", - " <td id=\"T_326c3_row41_col1\" class=\"data row41 col1\" 
>Confusion Matrix Drift</td>\n", - " <td id=\"T_326c3_row41_col2\" class=\"data row41 col2\" >Compares confusion matrix metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_326c3_row41_col3\" class=\"data row41 col3\" >False</td>\n", - " <td id=\"T_326c3_row41_col4\" class=\"data row41 col4\" >True</td>\n", - " <td id=\"T_326c3_row41_col5\" class=\"data row41 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row41_col6\" class=\"data row41 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_326c3_row41_col7\" class=\"data row41 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_326c3_row41_col8\" class=\"data row41 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_326c3_row42_col0\" class=\"data row42 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", - " <td id=\"T_326c3_row42_col1\" class=\"data row42 col1\" >ROC Curve Drift</td>\n", - " <td id=\"T_326c3_row42_col2\" class=\"data row42 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", - " <td id=\"T_326c3_row42_col3\" class=\"data row42 col3\" >True</td>\n", - " <td id=\"T_326c3_row42_col4\" class=\"data row42 col4\" >False</td>\n", - " <td id=\"T_326c3_row42_col5\" class=\"data row42 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_326c3_row42_col6\" class=\"data row42 col6\" >{}</td>\n", - " <td id=\"T_326c3_row42_col7\" class=\"data row42 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_326c3_row42_col8\" class=\"data row42 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " </tbody>\n", - "</table>\n" + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, to match each task type with its related tags, use the 
[list_tasks_and_tags()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks_and_tags) function:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "<style type=\"text/css\">\n", + "#T_ac294 th {\n", + " text-align: left;\n", + "}\n", + "#T_ac294_row0_col0, #T_ac294_row0_col1, #T_ac294_row1_col0, #T_ac294_row1_col1, #T_ac294_row2_col0, #T_ac294_row2_col1, #T_ac294_row3_col0, #T_ac294_row3_col1, #T_ac294_row4_col0, #T_ac294_row4_col1, #T_ac294_row5_col0, #T_ac294_row5_col1, #T_ac294_row6_col0, #T_ac294_row6_col1, #T_ac294_row7_col0, #T_ac294_row7_col1, #T_ac294_row8_col0, #T_ac294_row8_col1, #T_ac294_row9_col0, #T_ac294_row9_col1, #T_ac294_row10_col0, #T_ac294_row10_col1, #T_ac294_row11_col0, #T_ac294_row11_col1, #T_ac294_row12_col0, #T_ac294_row12_col1, #T_ac294_row13_col0, #T_ac294_row13_col1 {\n", + " text-align: left;\n", + "}\n", + "</style>\n", + "<table id=\"T_ac294\">\n", + " <thead>\n", + " <tr>\n", + " <th id=\"T_ac294_level0_col0\" class=\"col_heading level0 col0\" >Task</th>\n", + " <th id=\"T_ac294_level0_col1\" class=\"col_heading level0 col1\" >Tags</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <td id=\"T_ac294_row0_col0\" class=\"data row0 col0\" >regression</td>\n", + " <td id=\"T_ac294_row0_col1\" class=\"data row0 col1\" >senstivity_analysis, tabular_data, time_series_data, model_predictions, feature_selection, correlation, regression, statsmodels, model_performance, model_training, multiclass_classification, linear_regression, data_quality, text_data, model_explainability, binary_classification, stationarity, bias_and_fairness, numerical_data, sklearn, model_selection, statistical_test, descriptive_statistics, seasonality, analysis, data_validation, data_distribution, metadata, feature_importance, visualization, forecasting, model_diagnosis, model_interpretation, unit_root_test, categorical_data, data_analysis</td>\n", + " 
</tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row1_col0\" class=\"data row1 col0\" >classification</td>\n", + " <td id=\"T_ac294_row1_col1\" class=\"data row1 col1\" >calibration, anomaly_detection, classification_metrics, tabular_data, time_series_data, feature_selection, correlation, statsmodels, model_performance, model_validation, model_training, classification, multiclass_classification, linear_regression, data_quality, text_data, binary_classification, threshold_optimization, bias_and_fairness, scorecard, model_comparison, numerical_data, sklearn, statistical_test, descriptive_statistics, feature_importance, data_distribution, metadata, visualization, credit_risk, AUC, logistic_regression, model_diagnosis, categorical_data, data_analysis</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row2_col0\" class=\"data row2 col0\" >text_classification</td>\n", + " <td id=\"T_ac294_row2_col1\" class=\"data row2 col1\" >model_performance, feature_importance, multiclass_classification, few_shot, frequency_analysis, zero_shot, text_data, visualization, llm, binary_classification, ragas, model_diagnosis, model_comparison, sklearn, nlp, retrieval_performance, tabular_data, time_series_data</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row3_col0\" class=\"data row3 col0\" >text_summarization</td>\n", + " <td id=\"T_ac294_row3_col1\" class=\"data row3 col1\" >qualitative, few_shot, frequency_analysis, embeddings, zero_shot, text_data, visualization, llm, rag_performance, ragas, retrieval_performance, nlp, dimensionality_reduction, tabular_data, time_series_data</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row4_col0\" class=\"data row4 col0\" >data_validation</td>\n", + " <td id=\"T_ac294_row4_col1\" class=\"data row4 col1\" >stationarity, statsmodels, unit_root_test, time_series_data</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row5_col0\" class=\"data row5 col0\" >time_series_forecasting</td>\n", + " <td id=\"T_ac294_row5_col1\" 
class=\"data row5 col1\" >model_training, data_validation, metadata, visualization, model_explainability, sklearn, model_performance, model_predictions, time_series_data</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row6_col0\" class=\"data row6 col0\" >nlp</td>\n", + " <td id=\"T_ac294_row6_col1\" class=\"data row6 col1\" >data_validation, frequency_analysis, text_data, visualization, nlp</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row7_col0\" class=\"data row7 col0\" >clustering</td>\n", + " <td id=\"T_ac294_row7_col1\" class=\"data row7 col1\" >clustering, model_performance, kmeans, sklearn</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row8_col0\" class=\"data row8 col0\" >residual_analysis</td>\n", + " <td id=\"T_ac294_row8_col1\" class=\"data row8 col1\" >regression</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row9_col0\" class=\"data row9 col0\" >visualization</td>\n", + " <td id=\"T_ac294_row9_col1\" class=\"data row9 col1\" >regression</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row10_col0\" class=\"data row10 col0\" >feature_extraction</td>\n", + " <td id=\"T_ac294_row10_col1\" class=\"data row10 col1\" >embeddings, text_data, visualization, llm</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row11_col0\" class=\"data row11 col0\" >text_qa</td>\n", + " <td id=\"T_ac294_row11_col1\" class=\"data row11 col1\" >qualitative, embeddings, visualization, llm, rag_performance, ragas, dimensionality_reduction, retrieval_performance</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row12_col0\" class=\"data row12 col0\" >text_generation</td>\n", + " <td id=\"T_ac294_row12_col1\" class=\"data row12 col1\" >qualitative, embeddings, visualization, llm, rag_performance, ragas, dimensionality_reduction, retrieval_performance</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_ac294_row13_col0\" class=\"data row13 col0\" >monitoring</td>\n", + " <td id=\"T_ac294_row13_col1\" class=\"data row13 
col1\" >visualization</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n" + ], + "text/plain": [ + "<pandas.io.formats.style.Styler at 0x38000adc0>" + ] + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" + } ], - "text/plain": [ - "<pandas.io.formats.style.Styler at 0x1052e6790>" + "source": [ + "list_tasks_and_tags()" ] - }, - "execution_count": 6, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list_tests(filter=\"sklearn\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Use the `task` parameter to find tests that match a specific task type, such as `classification`:" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Filter tests by tags and task types\n", + "\n", + "While listing all tests is useful, you’ll often want to narrow your search. The [list_tests()](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) function supports `filter`, `task`, and `tags` parameters to assist in refining your results.\n", + "\n", + "Use the `filter` parameter to find tests that match a specific keyword, such as `sklearn`:" + ] + }, { - "data": { - "text/html": [ - "<style type=\"text/css\">\n", - "#T_56dd5 th {\n", - " text-align: left;\n", - "}\n", - "#T_56dd5_row0_col0, #T_56dd5_row0_col1, #T_56dd5_row0_col2, #T_56dd5_row0_col3, #T_56dd5_row0_col4, #T_56dd5_row0_col5, #T_56dd5_row0_col6, #T_56dd5_row0_col7, #T_56dd5_row0_col8, #T_56dd5_row1_col0, #T_56dd5_row1_col1, #T_56dd5_row1_col2, #T_56dd5_row1_col3, #T_56dd5_row1_col4, #T_56dd5_row1_col5, #T_56dd5_row1_col6, #T_56dd5_row1_col7, #T_56dd5_row1_col8, #T_56dd5_row2_col0, #T_56dd5_row2_col1, #T_56dd5_row2_col2, #T_56dd5_row2_col3, #T_56dd5_row2_col4, #T_56dd5_row2_col5, #T_56dd5_row2_col6, #T_56dd5_row2_col7, #T_56dd5_row2_col8, #T_56dd5_row3_col0, 
#T_56dd5_row3_col1, #T_56dd5_row3_col2, #T_56dd5_row3_col3, #T_56dd5_row3_col4, #T_56dd5_row3_col5, #T_56dd5_row3_col6, #T_56dd5_row3_col7, #T_56dd5_row3_col8, #T_56dd5_row4_col0, #T_56dd5_row4_col1, #T_56dd5_row4_col2, #T_56dd5_row4_col3, #T_56dd5_row4_col4, #T_56dd5_row4_col5, #T_56dd5_row4_col6, #T_56dd5_row4_col7, #T_56dd5_row4_col8, #T_56dd5_row5_col0, #T_56dd5_row5_col1, #T_56dd5_row5_col2, #T_56dd5_row5_col3, #T_56dd5_row5_col4, #T_56dd5_row5_col5, #T_56dd5_row5_col6, #T_56dd5_row5_col7, #T_56dd5_row5_col8, #T_56dd5_row6_col0, #T_56dd5_row6_col1, #T_56dd5_row6_col2, #T_56dd5_row6_col3, #T_56dd5_row6_col4, #T_56dd5_row6_col5, #T_56dd5_row6_col6, #T_56dd5_row6_col7, #T_56dd5_row6_col8, #T_56dd5_row7_col0, #T_56dd5_row7_col1, #T_56dd5_row7_col2, #T_56dd5_row7_col3, #T_56dd5_row7_col4, #T_56dd5_row7_col5, #T_56dd5_row7_col6, #T_56dd5_row7_col7, #T_56dd5_row7_col8, #T_56dd5_row8_col0, #T_56dd5_row8_col1, #T_56dd5_row8_col2, #T_56dd5_row8_col3, #T_56dd5_row8_col4, #T_56dd5_row8_col5, #T_56dd5_row8_col6, #T_56dd5_row8_col7, #T_56dd5_row8_col8, #T_56dd5_row9_col0, #T_56dd5_row9_col1, #T_56dd5_row9_col2, #T_56dd5_row9_col3, #T_56dd5_row9_col4, #T_56dd5_row9_col5, #T_56dd5_row9_col6, #T_56dd5_row9_col7, #T_56dd5_row9_col8, #T_56dd5_row10_col0, #T_56dd5_row10_col1, #T_56dd5_row10_col2, #T_56dd5_row10_col3, #T_56dd5_row10_col4, #T_56dd5_row10_col5, #T_56dd5_row10_col6, #T_56dd5_row10_col7, #T_56dd5_row10_col8, #T_56dd5_row11_col0, #T_56dd5_row11_col1, #T_56dd5_row11_col2, #T_56dd5_row11_col3, #T_56dd5_row11_col4, #T_56dd5_row11_col5, #T_56dd5_row11_col6, #T_56dd5_row11_col7, #T_56dd5_row11_col8, #T_56dd5_row12_col0, #T_56dd5_row12_col1, #T_56dd5_row12_col2, #T_56dd5_row12_col3, #T_56dd5_row12_col4, #T_56dd5_row12_col5, #T_56dd5_row12_col6, #T_56dd5_row12_col7, #T_56dd5_row12_col8, #T_56dd5_row13_col0, #T_56dd5_row13_col1, #T_56dd5_row13_col2, #T_56dd5_row13_col3, #T_56dd5_row13_col4, #T_56dd5_row13_col5, #T_56dd5_row13_col6, #T_56dd5_row13_col7, #T_56dd5_row13_col8, 
#T_56dd5_row14_col0, #T_56dd5_row14_col1, #T_56dd5_row14_col2, #T_56dd5_row14_col3, #T_56dd5_row14_col4, #T_56dd5_row14_col5, #T_56dd5_row14_col6, #T_56dd5_row14_col7, #T_56dd5_row14_col8, #T_56dd5_row15_col0, #T_56dd5_row15_col1, #T_56dd5_row15_col2, #T_56dd5_row15_col3, #T_56dd5_row15_col4, #T_56dd5_row15_col5, #T_56dd5_row15_col6, #T_56dd5_row15_col7, #T_56dd5_row15_col8, #T_56dd5_row16_col0, #T_56dd5_row16_col1, #T_56dd5_row16_col2, #T_56dd5_row16_col3, #T_56dd5_row16_col4, #T_56dd5_row16_col5, #T_56dd5_row16_col6, #T_56dd5_row16_col7, #T_56dd5_row16_col8, #T_56dd5_row17_col0, #T_56dd5_row17_col1, #T_56dd5_row17_col2, #T_56dd5_row17_col3, #T_56dd5_row17_col4, #T_56dd5_row17_col5, #T_56dd5_row17_col6, #T_56dd5_row17_col7, #T_56dd5_row17_col8, #T_56dd5_row18_col0, #T_56dd5_row18_col1, #T_56dd5_row18_col2, #T_56dd5_row18_col3, #T_56dd5_row18_col4, #T_56dd5_row18_col5, #T_56dd5_row18_col6, #T_56dd5_row18_col7, #T_56dd5_row18_col8, #T_56dd5_row19_col0, #T_56dd5_row19_col1, #T_56dd5_row19_col2, #T_56dd5_row19_col3, #T_56dd5_row19_col4, #T_56dd5_row19_col5, #T_56dd5_row19_col6, #T_56dd5_row19_col7, #T_56dd5_row19_col8, #T_56dd5_row20_col0, #T_56dd5_row20_col1, #T_56dd5_row20_col2, #T_56dd5_row20_col3, #T_56dd5_row20_col4, #T_56dd5_row20_col5, #T_56dd5_row20_col6, #T_56dd5_row20_col7, #T_56dd5_row20_col8, #T_56dd5_row21_col0, #T_56dd5_row21_col1, #T_56dd5_row21_col2, #T_56dd5_row21_col3, #T_56dd5_row21_col4, #T_56dd5_row21_col5, #T_56dd5_row21_col6, #T_56dd5_row21_col7, #T_56dd5_row21_col8, #T_56dd5_row22_col0, #T_56dd5_row22_col1, #T_56dd5_row22_col2, #T_56dd5_row22_col3, #T_56dd5_row22_col4, #T_56dd5_row22_col5, #T_56dd5_row22_col6, #T_56dd5_row22_col7, #T_56dd5_row22_col8, #T_56dd5_row23_col0, #T_56dd5_row23_col1, #T_56dd5_row23_col2, #T_56dd5_row23_col3, #T_56dd5_row23_col4, #T_56dd5_row23_col5, #T_56dd5_row23_col6, #T_56dd5_row23_col7, #T_56dd5_row23_col8, #T_56dd5_row24_col0, #T_56dd5_row24_col1, #T_56dd5_row24_col2, #T_56dd5_row24_col3, #T_56dd5_row24_col4, 
#T_56dd5_row24_col5, #T_56dd5_row24_col6, #T_56dd5_row24_col7, #T_56dd5_row24_col8, #T_56dd5_row25_col0, #T_56dd5_row25_col1, #T_56dd5_row25_col2, #T_56dd5_row25_col3, #T_56dd5_row25_col4, #T_56dd5_row25_col5, #T_56dd5_row25_col6, #T_56dd5_row25_col7, #T_56dd5_row25_col8, #T_56dd5_row26_col0, #T_56dd5_row26_col1, #T_56dd5_row26_col2, #T_56dd5_row26_col3, #T_56dd5_row26_col4, #T_56dd5_row26_col5, #T_56dd5_row26_col6, #T_56dd5_row26_col7, #T_56dd5_row26_col8, #T_56dd5_row27_col0, #T_56dd5_row27_col1, #T_56dd5_row27_col2, #T_56dd5_row27_col3, #T_56dd5_row27_col4, #T_56dd5_row27_col5, #T_56dd5_row27_col6, #T_56dd5_row27_col7, #T_56dd5_row27_col8, #T_56dd5_row28_col0, #T_56dd5_row28_col1, #T_56dd5_row28_col2, #T_56dd5_row28_col3, #T_56dd5_row28_col4, #T_56dd5_row28_col5, #T_56dd5_row28_col6, #T_56dd5_row28_col7, #T_56dd5_row28_col8, #T_56dd5_row29_col0, #T_56dd5_row29_col1, #T_56dd5_row29_col2, #T_56dd5_row29_col3, #T_56dd5_row29_col4, #T_56dd5_row29_col5, #T_56dd5_row29_col6, #T_56dd5_row29_col7, #T_56dd5_row29_col8, #T_56dd5_row30_col0, #T_56dd5_row30_col1, #T_56dd5_row30_col2, #T_56dd5_row30_col3, #T_56dd5_row30_col4, #T_56dd5_row30_col5, #T_56dd5_row30_col6, #T_56dd5_row30_col7, #T_56dd5_row30_col8, #T_56dd5_row31_col0, #T_56dd5_row31_col1, #T_56dd5_row31_col2, #T_56dd5_row31_col3, #T_56dd5_row31_col4, #T_56dd5_row31_col5, #T_56dd5_row31_col6, #T_56dd5_row31_col7, #T_56dd5_row31_col8, #T_56dd5_row32_col0, #T_56dd5_row32_col1, #T_56dd5_row32_col2, #T_56dd5_row32_col3, #T_56dd5_row32_col4, #T_56dd5_row32_col5, #T_56dd5_row32_col6, #T_56dd5_row32_col7, #T_56dd5_row32_col8, #T_56dd5_row33_col0, #T_56dd5_row33_col1, #T_56dd5_row33_col2, #T_56dd5_row33_col3, #T_56dd5_row33_col4, #T_56dd5_row33_col5, #T_56dd5_row33_col6, #T_56dd5_row33_col7, #T_56dd5_row33_col8, #T_56dd5_row34_col0, #T_56dd5_row34_col1, #T_56dd5_row34_col2, #T_56dd5_row34_col3, #T_56dd5_row34_col4, #T_56dd5_row34_col5, #T_56dd5_row34_col6, #T_56dd5_row34_col7, #T_56dd5_row34_col8, #T_56dd5_row35_col0, 
#T_56dd5_row35_col1, #T_56dd5_row35_col2, #T_56dd5_row35_col3, #T_56dd5_row35_col4, #T_56dd5_row35_col5, #T_56dd5_row35_col6, #T_56dd5_row35_col7, #T_56dd5_row35_col8, #T_56dd5_row36_col0, #T_56dd5_row36_col1, #T_56dd5_row36_col2, #T_56dd5_row36_col3, #T_56dd5_row36_col4, #T_56dd5_row36_col5, #T_56dd5_row36_col6, #T_56dd5_row36_col7, #T_56dd5_row36_col8, #T_56dd5_row37_col0, #T_56dd5_row37_col1, #T_56dd5_row37_col2, #T_56dd5_row37_col3, #T_56dd5_row37_col4, #T_56dd5_row37_col5, #T_56dd5_row37_col6, #T_56dd5_row37_col7, #T_56dd5_row37_col8, #T_56dd5_row38_col0, #T_56dd5_row38_col1, #T_56dd5_row38_col2, #T_56dd5_row38_col3, #T_56dd5_row38_col4, #T_56dd5_row38_col5, #T_56dd5_row38_col6, #T_56dd5_row38_col7, #T_56dd5_row38_col8, #T_56dd5_row39_col0, #T_56dd5_row39_col1, #T_56dd5_row39_col2, #T_56dd5_row39_col3, #T_56dd5_row39_col4, #T_56dd5_row39_col5, #T_56dd5_row39_col6, #T_56dd5_row39_col7, #T_56dd5_row39_col8, #T_56dd5_row40_col0, #T_56dd5_row40_col1, #T_56dd5_row40_col2, #T_56dd5_row40_col3, #T_56dd5_row40_col4, #T_56dd5_row40_col5, #T_56dd5_row40_col6, #T_56dd5_row40_col7, #T_56dd5_row40_col8, #T_56dd5_row41_col0, #T_56dd5_row41_col1, #T_56dd5_row41_col2, #T_56dd5_row41_col3, #T_56dd5_row41_col4, #T_56dd5_row41_col5, #T_56dd5_row41_col6, #T_56dd5_row41_col7, #T_56dd5_row41_col8, #T_56dd5_row42_col0, #T_56dd5_row42_col1, #T_56dd5_row42_col2, #T_56dd5_row42_col3, #T_56dd5_row42_col4, #T_56dd5_row42_col5, #T_56dd5_row42_col6, #T_56dd5_row42_col7, #T_56dd5_row42_col8, #T_56dd5_row43_col0, #T_56dd5_row43_col1, #T_56dd5_row43_col2, #T_56dd5_row43_col3, #T_56dd5_row43_col4, #T_56dd5_row43_col5, #T_56dd5_row43_col6, #T_56dd5_row43_col7, #T_56dd5_row43_col8, #T_56dd5_row44_col0, #T_56dd5_row44_col1, #T_56dd5_row44_col2, #T_56dd5_row44_col3, #T_56dd5_row44_col4, #T_56dd5_row44_col5, #T_56dd5_row44_col6, #T_56dd5_row44_col7, #T_56dd5_row44_col8, #T_56dd5_row45_col0, #T_56dd5_row45_col1, #T_56dd5_row45_col2, #T_56dd5_row45_col3, #T_56dd5_row45_col4, #T_56dd5_row45_col5, 
#T_56dd5_row45_col6, #T_56dd5_row45_col7, #T_56dd5_row45_col8, #T_56dd5_row46_col0, #T_56dd5_row46_col1, #T_56dd5_row46_col2, #T_56dd5_row46_col3, #T_56dd5_row46_col4, #T_56dd5_row46_col5, #T_56dd5_row46_col6, #T_56dd5_row46_col7, #T_56dd5_row46_col8, #T_56dd5_row47_col0, #T_56dd5_row47_col1, #T_56dd5_row47_col2, #T_56dd5_row47_col3, #T_56dd5_row47_col4, #T_56dd5_row47_col5, #T_56dd5_row47_col6, #T_56dd5_row47_col7, #T_56dd5_row47_col8, #T_56dd5_row48_col0, #T_56dd5_row48_col1, #T_56dd5_row48_col2, #T_56dd5_row48_col3, #T_56dd5_row48_col4, #T_56dd5_row48_col5, #T_56dd5_row48_col6, #T_56dd5_row48_col7, #T_56dd5_row48_col8, #T_56dd5_row49_col0, #T_56dd5_row49_col1, #T_56dd5_row49_col2, #T_56dd5_row49_col3, #T_56dd5_row49_col4, #T_56dd5_row49_col5, #T_56dd5_row49_col6, #T_56dd5_row49_col7, #T_56dd5_row49_col8, #T_56dd5_row50_col0, #T_56dd5_row50_col1, #T_56dd5_row50_col2, #T_56dd5_row50_col3, #T_56dd5_row50_col4, #T_56dd5_row50_col5, #T_56dd5_row50_col6, #T_56dd5_row50_col7, #T_56dd5_row50_col8, #T_56dd5_row51_col0, #T_56dd5_row51_col1, #T_56dd5_row51_col2, #T_56dd5_row51_col3, #T_56dd5_row51_col4, #T_56dd5_row51_col5, #T_56dd5_row51_col6, #T_56dd5_row51_col7, #T_56dd5_row51_col8, #T_56dd5_row52_col0, #T_56dd5_row52_col1, #T_56dd5_row52_col2, #T_56dd5_row52_col3, #T_56dd5_row52_col4, #T_56dd5_row52_col5, #T_56dd5_row52_col6, #T_56dd5_row52_col7, #T_56dd5_row52_col8, #T_56dd5_row53_col0, #T_56dd5_row53_col1, #T_56dd5_row53_col2, #T_56dd5_row53_col3, #T_56dd5_row53_col4, #T_56dd5_row53_col5, #T_56dd5_row53_col6, #T_56dd5_row53_col7, #T_56dd5_row53_col8, #T_56dd5_row54_col0, #T_56dd5_row54_col1, #T_56dd5_row54_col2, #T_56dd5_row54_col3, #T_56dd5_row54_col4, #T_56dd5_row54_col5, #T_56dd5_row54_col6, #T_56dd5_row54_col7, #T_56dd5_row54_col8, #T_56dd5_row55_col0, #T_56dd5_row55_col1, #T_56dd5_row55_col2, #T_56dd5_row55_col3, #T_56dd5_row55_col4, #T_56dd5_row55_col5, #T_56dd5_row55_col6, #T_56dd5_row55_col7, #T_56dd5_row55_col8, #T_56dd5_row56_col0, #T_56dd5_row56_col1, 
#T_56dd5_row56_col2, #T_56dd5_row56_col3, #T_56dd5_row56_col4, #T_56dd5_row56_col5, #T_56dd5_row56_col6, #T_56dd5_row56_col7, #T_56dd5_row56_col8, #T_56dd5_row57_col0, #T_56dd5_row57_col1, #T_56dd5_row57_col2, #T_56dd5_row57_col3, #T_56dd5_row57_col4, #T_56dd5_row57_col5, #T_56dd5_row57_col6, #T_56dd5_row57_col7, #T_56dd5_row57_col8, #T_56dd5_row58_col0, #T_56dd5_row58_col1, #T_56dd5_row58_col2, #T_56dd5_row58_col3, #T_56dd5_row58_col4, #T_56dd5_row58_col5, #T_56dd5_row58_col6, #T_56dd5_row58_col7, #T_56dd5_row58_col8, #T_56dd5_row59_col0, #T_56dd5_row59_col1, #T_56dd5_row59_col2, #T_56dd5_row59_col3, #T_56dd5_row59_col4, #T_56dd5_row59_col5, #T_56dd5_row59_col6, #T_56dd5_row59_col7, #T_56dd5_row59_col8, #T_56dd5_row60_col0, #T_56dd5_row60_col1, #T_56dd5_row60_col2, #T_56dd5_row60_col3, #T_56dd5_row60_col4, #T_56dd5_row60_col5, #T_56dd5_row60_col6, #T_56dd5_row60_col7, #T_56dd5_row60_col8, #T_56dd5_row61_col0, #T_56dd5_row61_col1, #T_56dd5_row61_col2, #T_56dd5_row61_col3, #T_56dd5_row61_col4, #T_56dd5_row61_col5, #T_56dd5_row61_col6, #T_56dd5_row61_col7, #T_56dd5_row61_col8, #T_56dd5_row62_col0, #T_56dd5_row62_col1, #T_56dd5_row62_col2, #T_56dd5_row62_col3, #T_56dd5_row62_col4, #T_56dd5_row62_col5, #T_56dd5_row62_col6, #T_56dd5_row62_col7, #T_56dd5_row62_col8, #T_56dd5_row63_col0, #T_56dd5_row63_col1, #T_56dd5_row63_col2, #T_56dd5_row63_col3, #T_56dd5_row63_col4, #T_56dd5_row63_col5, #T_56dd5_row63_col6, #T_56dd5_row63_col7, #T_56dd5_row63_col8, #T_56dd5_row64_col0, #T_56dd5_row64_col1, #T_56dd5_row64_col2, #T_56dd5_row64_col3, #T_56dd5_row64_col4, #T_56dd5_row64_col5, #T_56dd5_row64_col6, #T_56dd5_row64_col7, #T_56dd5_row64_col8, #T_56dd5_row65_col0, #T_56dd5_row65_col1, #T_56dd5_row65_col2, #T_56dd5_row65_col3, #T_56dd5_row65_col4, #T_56dd5_row65_col5, #T_56dd5_row65_col6, #T_56dd5_row65_col7, #T_56dd5_row65_col8, #T_56dd5_row66_col0, #T_56dd5_row66_col1, #T_56dd5_row66_col2, #T_56dd5_row66_col3, #T_56dd5_row66_col4, #T_56dd5_row66_col5, #T_56dd5_row66_col6, 
#T_56dd5_row66_col7, #T_56dd5_row66_col8, #T_56dd5_row67_col0, #T_56dd5_row67_col1, #T_56dd5_row67_col2, #T_56dd5_row67_col3, #T_56dd5_row67_col4, #T_56dd5_row67_col5, #T_56dd5_row67_col6, #T_56dd5_row67_col7, #T_56dd5_row67_col8, #T_56dd5_row68_col0, #T_56dd5_row68_col1, #T_56dd5_row68_col2, #T_56dd5_row68_col3, #T_56dd5_row68_col4, #T_56dd5_row68_col5, #T_56dd5_row68_col6, #T_56dd5_row68_col7, #T_56dd5_row68_col8, #T_56dd5_row69_col0, #T_56dd5_row69_col1, #T_56dd5_row69_col2, #T_56dd5_row69_col3, #T_56dd5_row69_col4, #T_56dd5_row69_col5, #T_56dd5_row69_col6, #T_56dd5_row69_col7, #T_56dd5_row69_col8, #T_56dd5_row70_col0, #T_56dd5_row70_col1, #T_56dd5_row70_col2, #T_56dd5_row70_col3, #T_56dd5_row70_col4, #T_56dd5_row70_col5, #T_56dd5_row70_col6, #T_56dd5_row70_col7, #T_56dd5_row70_col8, #T_56dd5_row71_col0, #T_56dd5_row71_col1, #T_56dd5_row71_col2, #T_56dd5_row71_col3, #T_56dd5_row71_col4, #T_56dd5_row71_col5, #T_56dd5_row71_col6, #T_56dd5_row71_col7, #T_56dd5_row71_col8, #T_56dd5_row72_col0, #T_56dd5_row72_col1, #T_56dd5_row72_col2, #T_56dd5_row72_col3, #T_56dd5_row72_col4, #T_56dd5_row72_col5, #T_56dd5_row72_col6, #T_56dd5_row72_col7, #T_56dd5_row72_col8, #T_56dd5_row73_col0, #T_56dd5_row73_col1, #T_56dd5_row73_col2, #T_56dd5_row73_col3, #T_56dd5_row73_col4, #T_56dd5_row73_col5, #T_56dd5_row73_col6, #T_56dd5_row73_col7, #T_56dd5_row73_col8, #T_56dd5_row74_col0, #T_56dd5_row74_col1, #T_56dd5_row74_col2, #T_56dd5_row74_col3, #T_56dd5_row74_col4, #T_56dd5_row74_col5, #T_56dd5_row74_col6, #T_56dd5_row74_col7, #T_56dd5_row74_col8, #T_56dd5_row75_col0, #T_56dd5_row75_col1, #T_56dd5_row75_col2, #T_56dd5_row75_col3, #T_56dd5_row75_col4, #T_56dd5_row75_col5, #T_56dd5_row75_col6, #T_56dd5_row75_col7, #T_56dd5_row75_col8 {\n", - " text-align: left;\n", - "}\n", - "</style>\n", - "<table id=\"T_56dd5\">\n", - " <thead>\n", - " <tr>\n", - " <th id=\"T_56dd5_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", - " <th id=\"T_56dd5_level0_col1\" class=\"col_heading 
level0 col1\" >Name</th>\n", - " <th id=\"T_56dd5_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", - " <th id=\"T_56dd5_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", - " <th id=\"T_56dd5_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", - " <th id=\"T_56dd5_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", - " <th id=\"T_56dd5_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", - " <th id=\"T_56dd5_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", - " <th id=\"T_56dd5_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " <tr>\n", - " <td id=\"T_56dd5_row0_col0\" class=\"data row0 col0\" >validmind.data_validation.BivariateScatterPlots</td>\n", - " <td id=\"T_56dd5_row0_col1\" class=\"data row0 col1\" >Bivariate Scatter Plots</td>\n", - " <td id=\"T_56dd5_row0_col2\" class=\"data row0 col2\" >Generates bivariate scatterplots to visually inspect relationships between pairs of numerical predictor variables...</td>\n", - " <td id=\"T_56dd5_row0_col3\" class=\"data row0 col3\" >True</td>\n", - " <td id=\"T_56dd5_row0_col4\" class=\"data row0 col4\" >False</td>\n", - " <td id=\"T_56dd5_row0_col5\" class=\"data row0 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row0_col6\" class=\"data row0 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row0_col7\" class=\"data row0 col7\" >['tabular_data', 'numerical_data', 'visualization']</td>\n", - " <td id=\"T_56dd5_row0_col8\" class=\"data row0 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row1_col0\" class=\"data row1 col0\" >validmind.data_validation.ChiSquaredFeaturesTable</td>\n", - " <td id=\"T_56dd5_row1_col1\" class=\"data row1 col1\" >Chi Squared Features Table</td>\n", - " <td id=\"T_56dd5_row1_col2\" class=\"data row1 col2\" >Assesses the statistical association between categorical features and a target variable using 
the Chi-Squared test....</td>\n", - " <td id=\"T_56dd5_row1_col3\" class=\"data row1 col3\" >False</td>\n", - " <td id=\"T_56dd5_row1_col4\" class=\"data row1 col4\" >True</td>\n", - " <td id=\"T_56dd5_row1_col5\" class=\"data row1 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row1_col6\" class=\"data row1 col6\" >{'p_threshold': {'type': '_empty', 'default': 0.05}}</td>\n", - " <td id=\"T_56dd5_row1_col7\" class=\"data row1 col7\" >['tabular_data', 'categorical_data', 'statistical_test']</td>\n", - " <td id=\"T_56dd5_row1_col8\" class=\"data row1 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row2_col0\" class=\"data row2 col0\" >validmind.data_validation.ClassImbalance</td>\n", - " <td id=\"T_56dd5_row2_col1\" class=\"data row2 col1\" >Class Imbalance</td>\n", - " <td id=\"T_56dd5_row2_col2\" class=\"data row2 col2\" >Evaluates and quantifies class distribution imbalance in a dataset used by a machine learning model....</td>\n", - " <td id=\"T_56dd5_row2_col3\" class=\"data row2 col3\" >True</td>\n", - " <td id=\"T_56dd5_row2_col4\" class=\"data row2 col4\" >True</td>\n", - " <td id=\"T_56dd5_row2_col5\" class=\"data row2 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row2_col6\" class=\"data row2 col6\" >{'min_percent_threshold': {'type': 'int', 'default': 10}}</td>\n", - " <td id=\"T_56dd5_row2_col7\" class=\"data row2 col7\" >['tabular_data', 'binary_classification', 'multiclass_classification', 'data_quality']</td>\n", - " <td id=\"T_56dd5_row2_col8\" class=\"data row2 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row3_col0\" class=\"data row3 col0\" >validmind.data_validation.DatasetDescription</td>\n", - " <td id=\"T_56dd5_row3_col1\" class=\"data row3 col1\" >Dataset Description</td>\n", - " <td id=\"T_56dd5_row3_col2\" class=\"data row3 col2\" >Provides comprehensive analysis and statistical summaries of each column in a machine learning model's dataset....</td>\n", - " <td 
id=\"T_56dd5_row3_col3\" class=\"data row3 col3\" >False</td>\n", - " <td id=\"T_56dd5_row3_col4\" class=\"data row3 col4\" >True</td>\n", - " <td id=\"T_56dd5_row3_col5\" class=\"data row3 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row3_col6\" class=\"data row3 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row3_col7\" class=\"data row3 col7\" >['tabular_data', 'time_series_data', 'text_data']</td>\n", - " <td id=\"T_56dd5_row3_col8\" class=\"data row3 col8\" >['classification', 'regression', 'text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row4_col0\" class=\"data row4 col0\" >validmind.data_validation.DatasetSplit</td>\n", - " <td id=\"T_56dd5_row4_col1\" class=\"data row4 col1\" >Dataset Split</td>\n", - " <td id=\"T_56dd5_row4_col2\" class=\"data row4 col2\" >Evaluates and visualizes the distribution proportions among training, testing, and validation datasets of an ML...</td>\n", - " <td id=\"T_56dd5_row4_col3\" class=\"data row4 col3\" >False</td>\n", - " <td id=\"T_56dd5_row4_col4\" class=\"data row4 col4\" >True</td>\n", - " <td id=\"T_56dd5_row4_col5\" class=\"data row4 col5\" >['datasets']</td>\n", - " <td id=\"T_56dd5_row4_col6\" class=\"data row4 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row4_col7\" class=\"data row4 col7\" >['tabular_data', 'time_series_data', 'text_data']</td>\n", - " <td id=\"T_56dd5_row4_col8\" class=\"data row4 col8\" >['classification', 'regression', 'text_classification', 'text_summarization']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row5_col0\" class=\"data row5 col0\" >validmind.data_validation.DescriptiveStatistics</td>\n", - " <td id=\"T_56dd5_row5_col1\" class=\"data row5 col1\" >Descriptive Statistics</td>\n", - " <td id=\"T_56dd5_row5_col2\" class=\"data row5 col2\" >Performs a detailed descriptive statistical analysis of both numerical and categorical data within a model's...</td>\n", - " <td id=\"T_56dd5_row5_col3\" class=\"data row5 col3\" 
>False</td>\n", - " <td id=\"T_56dd5_row5_col4\" class=\"data row5 col4\" >True</td>\n", - " <td id=\"T_56dd5_row5_col5\" class=\"data row5 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row5_col6\" class=\"data row5 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row5_col7\" class=\"data row5 col7\" >['tabular_data', 'time_series_data', 'data_quality']</td>\n", - " <td id=\"T_56dd5_row5_col8\" class=\"data row5 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row6_col0\" class=\"data row6 col0\" >validmind.data_validation.Duplicates</td>\n", - " <td id=\"T_56dd5_row6_col1\" class=\"data row6 col1\" >Duplicates</td>\n", - " <td id=\"T_56dd5_row6_col2\" class=\"data row6 col2\" >Tests dataset for duplicate entries, ensuring model reliability via data quality verification....</td>\n", - " <td id=\"T_56dd5_row6_col3\" class=\"data row6 col3\" >False</td>\n", - " <td id=\"T_56dd5_row6_col4\" class=\"data row6 col4\" >True</td>\n", - " <td id=\"T_56dd5_row6_col5\" class=\"data row6 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row6_col6\" class=\"data row6 col6\" >{'min_threshold': {'type': '_empty', 'default': 1}}</td>\n", - " <td id=\"T_56dd5_row6_col7\" class=\"data row6 col7\" >['tabular_data', 'data_quality', 'text_data']</td>\n", - " <td id=\"T_56dd5_row6_col8\" class=\"data row6 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row7_col0\" class=\"data row7 col0\" >validmind.data_validation.FeatureTargetCorrelationPlot</td>\n", - " <td id=\"T_56dd5_row7_col1\" class=\"data row7 col1\" >Feature Target Correlation Plot</td>\n", - " <td id=\"T_56dd5_row7_col2\" class=\"data row7 col2\" >Visualizes the correlation between input features and the model's target output in a color-coded horizontal bar...</td>\n", - " <td id=\"T_56dd5_row7_col3\" class=\"data row7 col3\" >True</td>\n", - " <td id=\"T_56dd5_row7_col4\" class=\"data row7 col4\" >False</td>\n", - " <td 
id=\"T_56dd5_row7_col5\" class=\"data row7 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row7_col6\" class=\"data row7 col6\" >{'fig_height': {'type': '_empty', 'default': 600}}</td>\n", - " <td id=\"T_56dd5_row7_col7\" class=\"data row7 col7\" >['tabular_data', 'visualization', 'correlation']</td>\n", - " <td id=\"T_56dd5_row7_col8\" class=\"data row7 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row8_col0\" class=\"data row8 col0\" >validmind.data_validation.HighCardinality</td>\n", - " <td id=\"T_56dd5_row8_col1\" class=\"data row8 col1\" >High Cardinality</td>\n", - " <td id=\"T_56dd5_row8_col2\" class=\"data row8 col2\" >Assesses the number of unique values in categorical columns to detect high cardinality and potential overfitting....</td>\n", - " <td id=\"T_56dd5_row8_col3\" class=\"data row8 col3\" >False</td>\n", - " <td id=\"T_56dd5_row8_col4\" class=\"data row8 col4\" >True</td>\n", - " <td id=\"T_56dd5_row8_col5\" class=\"data row8 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row8_col6\" class=\"data row8 col6\" >{'num_threshold': {'type': 'int', 'default': 100}, 'percent_threshold': {'type': 'float', 'default': 0.1}, 'threshold_type': {'type': 'str', 'default': 'percent'}}</td>\n", - " <td id=\"T_56dd5_row8_col7\" class=\"data row8 col7\" >['tabular_data', 'data_quality', 'categorical_data']</td>\n", - " <td id=\"T_56dd5_row8_col8\" class=\"data row8 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row9_col0\" class=\"data row9 col0\" >validmind.data_validation.HighPearsonCorrelation</td>\n", - " <td id=\"T_56dd5_row9_col1\" class=\"data row9 col1\" >High Pearson Correlation</td>\n", - " <td id=\"T_56dd5_row9_col2\" class=\"data row9 col2\" >Identifies highly correlated feature pairs in a dataset suggesting feature redundancy or multicollinearity....</td>\n", - " <td id=\"T_56dd5_row9_col3\" class=\"data row9 col3\" >False</td>\n", - " <td 
id=\"T_56dd5_row9_col4\" class=\"data row9 col4\" >True</td>\n", - " <td id=\"T_56dd5_row9_col5\" class=\"data row9 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row9_col6\" class=\"data row9 col6\" >{'max_threshold': {'type': 'float', 'default': 0.3}, 'top_n_correlations': {'type': 'int', 'default': 10}, 'feature_columns': {'type': 'list', 'default': None}}</td>\n", - " <td id=\"T_56dd5_row9_col7\" class=\"data row9 col7\" >['tabular_data', 'data_quality', 'correlation']</td>\n", - " <td id=\"T_56dd5_row9_col8\" class=\"data row9 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row10_col0\" class=\"data row10 col0\" >validmind.data_validation.IQROutliersBarPlot</td>\n", - " <td id=\"T_56dd5_row10_col1\" class=\"data row10 col1\" >IQR Outliers Bar Plot</td>\n", - " <td id=\"T_56dd5_row10_col2\" class=\"data row10 col2\" >Visualizes outlier distribution across percentiles in numerical data using the Interquartile Range (IQR) method....</td>\n", - " <td id=\"T_56dd5_row10_col3\" class=\"data row10 col3\" >True</td>\n", - " <td id=\"T_56dd5_row10_col4\" class=\"data row10 col4\" >False</td>\n", - " <td id=\"T_56dd5_row10_col5\" class=\"data row10 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row10_col6\" class=\"data row10 col6\" >{'threshold': {'type': 'float', 'default': 1.5}, 'fig_width': {'type': 'int', 'default': 800}}</td>\n", - " <td id=\"T_56dd5_row10_col7\" class=\"data row10 col7\" >['tabular_data', 'visualization', 'numerical_data']</td>\n", - " <td id=\"T_56dd5_row10_col8\" class=\"data row10 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row11_col0\" class=\"data row11 col0\" >validmind.data_validation.IQROutliersTable</td>\n", - " <td id=\"T_56dd5_row11_col1\" class=\"data row11 col1\" >IQR Outliers Table</td>\n", - " <td id=\"T_56dd5_row11_col2\" class=\"data row11 col2\" >Determines and summarizes outliers in numerical features using the 
Interquartile Range method....</td>\n", - " <td id=\"T_56dd5_row11_col3\" class=\"data row11 col3\" >False</td>\n", - " <td id=\"T_56dd5_row11_col4\" class=\"data row11 col4\" >True</td>\n", - " <td id=\"T_56dd5_row11_col5\" class=\"data row11 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row11_col6\" class=\"data row11 col6\" >{'threshold': {'type': 'float', 'default': 1.5}}</td>\n", - " <td id=\"T_56dd5_row11_col7\" class=\"data row11 col7\" >['tabular_data', 'numerical_data']</td>\n", - " <td id=\"T_56dd5_row11_col8\" class=\"data row11 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row12_col0\" class=\"data row12 col0\" >validmind.data_validation.IsolationForestOutliers</td>\n", - " <td id=\"T_56dd5_row12_col1\" class=\"data row12 col1\" >Isolation Forest Outliers</td>\n", - " <td id=\"T_56dd5_row12_col2\" class=\"data row12 col2\" >Detects outliers in a dataset using the Isolation Forest algorithm and visualizes results through scatter plots....</td>\n", - " <td id=\"T_56dd5_row12_col3\" class=\"data row12 col3\" >True</td>\n", - " <td id=\"T_56dd5_row12_col4\" class=\"data row12 col4\" >False</td>\n", - " <td id=\"T_56dd5_row12_col5\" class=\"data row12 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row12_col6\" class=\"data row12 col6\" >{'random_state': {'type': 'int', 'default': 0}, 'contamination': {'type': 'float', 'default': 0.1}, 'feature_columns': {'type': 'list', 'default': None}}</td>\n", - " <td id=\"T_56dd5_row12_col7\" class=\"data row12 col7\" >['tabular_data', 'anomaly_detection']</td>\n", - " <td id=\"T_56dd5_row12_col8\" class=\"data row12 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row13_col0\" class=\"data row13 col0\" >validmind.data_validation.JarqueBera</td>\n", - " <td id=\"T_56dd5_row13_col1\" class=\"data row13 col1\" >Jarque Bera</td>\n", - " <td id=\"T_56dd5_row13_col2\" class=\"data row13 col2\" >Assesses normality of dataset features 
in an ML model using the Jarque-Bera test....</td>\n", - " <td id=\"T_56dd5_row13_col3\" class=\"data row13 col3\" >False</td>\n", - " <td id=\"T_56dd5_row13_col4\" class=\"data row13 col4\" >True</td>\n", - " <td id=\"T_56dd5_row13_col5\" class=\"data row13 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row13_col6\" class=\"data row13 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row13_col7\" class=\"data row13 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_56dd5_row13_col8\" class=\"data row13 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row14_col0\" class=\"data row14 col0\" >validmind.data_validation.MissingValues</td>\n", - " <td id=\"T_56dd5_row14_col1\" class=\"data row14 col1\" >Missing Values</td>\n", - " <td id=\"T_56dd5_row14_col2\" class=\"data row14 col2\" >Evaluates dataset quality by ensuring missing value ratio across all features does not exceed a set threshold....</td>\n", - " <td id=\"T_56dd5_row14_col3\" class=\"data row14 col3\" >False</td>\n", - " <td id=\"T_56dd5_row14_col4\" class=\"data row14 col4\" >True</td>\n", - " <td id=\"T_56dd5_row14_col5\" class=\"data row14 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row14_col6\" class=\"data row14 col6\" >{'min_threshold': {'type': 'int', 'default': 1}}</td>\n", - " <td id=\"T_56dd5_row14_col7\" class=\"data row14 col7\" >['tabular_data', 'data_quality']</td>\n", - " <td id=\"T_56dd5_row14_col8\" class=\"data row14 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row15_col0\" class=\"data row15 col0\" >validmind.data_validation.MissingValuesBarPlot</td>\n", - " <td id=\"T_56dd5_row15_col1\" class=\"data row15 col1\" >Missing Values Bar Plot</td>\n", - " <td id=\"T_56dd5_row15_col2\" class=\"data row15 col2\" >Assesses the percentage and distribution of missing values in the dataset via a bar plot, with emphasis on...</td>\n", - " <td 
id=\"T_56dd5_row15_col3\" class=\"data row15 col3\" >True</td>\n", - " <td id=\"T_56dd5_row15_col4\" class=\"data row15 col4\" >False</td>\n", - " <td id=\"T_56dd5_row15_col5\" class=\"data row15 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row15_col6\" class=\"data row15 col6\" >{'threshold': {'type': 'int', 'default': 80}, 'fig_height': {'type': 'int', 'default': 600}}</td>\n", - " <td id=\"T_56dd5_row15_col7\" class=\"data row15 col7\" >['tabular_data', 'data_quality', 'visualization']</td>\n", - " <td id=\"T_56dd5_row15_col8\" class=\"data row15 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row16_col0\" class=\"data row16 col0\" >validmind.data_validation.MutualInformation</td>\n", - " <td id=\"T_56dd5_row16_col1\" class=\"data row16 col1\" >Mutual Information</td>\n", - " <td id=\"T_56dd5_row16_col2\" class=\"data row16 col2\" >Calculates mutual information scores between features and target variable to evaluate feature relevance....</td>\n", - " <td id=\"T_56dd5_row16_col3\" class=\"data row16 col3\" >True</td>\n", - " <td id=\"T_56dd5_row16_col4\" class=\"data row16 col4\" >False</td>\n", - " <td id=\"T_56dd5_row16_col5\" class=\"data row16 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row16_col6\" class=\"data row16 col6\" >{'min_threshold': {'type': 'float', 'default': 0.01}, 'task': {'type': 'str', 'default': 'classification'}}</td>\n", - " <td id=\"T_56dd5_row16_col7\" class=\"data row16 col7\" >['feature_selection', 'data_analysis']</td>\n", - " <td id=\"T_56dd5_row16_col8\" class=\"data row16 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row17_col0\" class=\"data row17 col0\" >validmind.data_validation.PearsonCorrelationMatrix</td>\n", - " <td id=\"T_56dd5_row17_col1\" class=\"data row17 col1\" >Pearson Correlation Matrix</td>\n", - " <td id=\"T_56dd5_row17_col2\" class=\"data row17 col2\" >Evaluates linear dependency between numerical 
variables in a dataset via a Pearson Correlation coefficient heat map....</td>\n", - " <td id=\"T_56dd5_row17_col3\" class=\"data row17 col3\" >True</td>\n", - " <td id=\"T_56dd5_row17_col4\" class=\"data row17 col4\" >False</td>\n", - " <td id=\"T_56dd5_row17_col5\" class=\"data row17 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row17_col6\" class=\"data row17 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row17_col7\" class=\"data row17 col7\" >['tabular_data', 'numerical_data', 'correlation']</td>\n", - " <td id=\"T_56dd5_row17_col8\" class=\"data row17 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row18_col0\" class=\"data row18 col0\" >validmind.data_validation.ProtectedClassesDescription</td>\n", - " <td id=\"T_56dd5_row18_col1\" class=\"data row18 col1\" >Protected Classes Description</td>\n", - " <td id=\"T_56dd5_row18_col2\" class=\"data row18 col2\" >Visualizes the distribution of protected classes in the dataset relative to the target variable...</td>\n", - " <td id=\"T_56dd5_row18_col3\" class=\"data row18 col3\" >True</td>\n", - " <td id=\"T_56dd5_row18_col4\" class=\"data row18 col4\" >True</td>\n", - " <td id=\"T_56dd5_row18_col5\" class=\"data row18 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row18_col6\" class=\"data row18 col6\" >{'protected_classes': {'type': '_empty', 'default': None}}</td>\n", - " <td id=\"T_56dd5_row18_col7\" class=\"data row18 col7\" >['bias_and_fairness', 'descriptive_statistics']</td>\n", - " <td id=\"T_56dd5_row18_col8\" class=\"data row18 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row19_col0\" class=\"data row19 col0\" >validmind.data_validation.RunsTest</td>\n", - " <td id=\"T_56dd5_row19_col1\" class=\"data row19 col1\" >Runs Test</td>\n", - " <td id=\"T_56dd5_row19_col2\" class=\"data row19 col2\" >Executes Runs Test on ML model to detect non-random patterns in output data sequence....</td>\n", - " <td 
id=\"T_56dd5_row19_col3\" class=\"data row19 col3\" >False</td>\n", - " <td id=\"T_56dd5_row19_col4\" class=\"data row19 col4\" >True</td>\n", - " <td id=\"T_56dd5_row19_col5\" class=\"data row19 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row19_col6\" class=\"data row19 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row19_col7\" class=\"data row19 col7\" >['tabular_data', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_56dd5_row19_col8\" class=\"data row19 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row20_col0\" class=\"data row20 col0\" >validmind.data_validation.ScatterPlot</td>\n", - " <td id=\"T_56dd5_row20_col1\" class=\"data row20 col1\" >Scatter Plot</td>\n", - " <td id=\"T_56dd5_row20_col2\" class=\"data row20 col2\" >Assesses visual relationships, patterns, and outliers among features in a dataset through scatter plot matrices....</td>\n", - " <td id=\"T_56dd5_row20_col3\" class=\"data row20 col3\" >True</td>\n", - " <td id=\"T_56dd5_row20_col4\" class=\"data row20 col4\" >False</td>\n", - " <td id=\"T_56dd5_row20_col5\" class=\"data row20 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row20_col6\" class=\"data row20 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row20_col7\" class=\"data row20 col7\" >['tabular_data', 'visualization']</td>\n", - " <td id=\"T_56dd5_row20_col8\" class=\"data row20 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row21_col0\" class=\"data row21 col0\" >validmind.data_validation.ScoreBandDefaultRates</td>\n", - " <td id=\"T_56dd5_row21_col1\" class=\"data row21 col1\" >Score Band Default Rates</td>\n", - " <td id=\"T_56dd5_row21_col2\" class=\"data row21 col2\" >Analyzes default rates and population distribution across credit score bands....</td>\n", - " <td id=\"T_56dd5_row21_col3\" class=\"data row21 col3\" >False</td>\n", - " <td id=\"T_56dd5_row21_col4\" class=\"data row21 col4\" >True</td>\n", - " <td 
id=\"T_56dd5_row21_col5\" class=\"data row21 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row21_col6\" class=\"data row21 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'score_bands': {'type': 'list', 'default': None}}</td>\n", - " <td id=\"T_56dd5_row21_col7\" class=\"data row21 col7\" >['visualization', 'credit_risk', 'scorecard']</td>\n", - " <td id=\"T_56dd5_row21_col8\" class=\"data row21 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row22_col0\" class=\"data row22 col0\" >validmind.data_validation.ShapiroWilk</td>\n", - " <td id=\"T_56dd5_row22_col1\" class=\"data row22 col1\" >Shapiro Wilk</td>\n", - " <td id=\"T_56dd5_row22_col2\" class=\"data row22 col2\" >Evaluates feature-wise normality of training data using the Shapiro-Wilk test....</td>\n", - " <td id=\"T_56dd5_row22_col3\" class=\"data row22 col3\" >False</td>\n", - " <td id=\"T_56dd5_row22_col4\" class=\"data row22 col4\" >True</td>\n", - " <td id=\"T_56dd5_row22_col5\" class=\"data row22 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row22_col6\" class=\"data row22 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row22_col7\" class=\"data row22 col7\" >['tabular_data', 'data_distribution', 'statistical_test']</td>\n", - " <td id=\"T_56dd5_row22_col8\" class=\"data row22 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row23_col0\" class=\"data row23 col0\" >validmind.data_validation.Skewness</td>\n", - " <td id=\"T_56dd5_row23_col1\" class=\"data row23 col1\" >Skewness</td>\n", - " <td id=\"T_56dd5_row23_col2\" class=\"data row23 col2\" >Evaluates the skewness of numerical data in a dataset to check against a defined threshold, aiming to ensure data...</td>\n", - " <td id=\"T_56dd5_row23_col3\" class=\"data row23 col3\" >False</td>\n", - " <td id=\"T_56dd5_row23_col4\" class=\"data row23 col4\" >True</td>\n", - " <td id=\"T_56dd5_row23_col5\" class=\"data row23 col5\" >['dataset']</td>\n", 
- " <td id=\"T_56dd5_row23_col6\" class=\"data row23 col6\" >{'max_threshold': {'type': '_empty', 'default': 1}}</td>\n", - " <td id=\"T_56dd5_row23_col7\" class=\"data row23 col7\" >['data_quality', 'tabular_data']</td>\n", - " <td id=\"T_56dd5_row23_col8\" class=\"data row23 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row24_col0\" class=\"data row24 col0\" >validmind.data_validation.TabularCategoricalBarPlots</td>\n", - " <td id=\"T_56dd5_row24_col1\" class=\"data row24 col1\" >Tabular Categorical Bar Plots</td>\n", - " <td id=\"T_56dd5_row24_col2\" class=\"data row24 col2\" >Generates and visualizes bar plots for each category in categorical features to evaluate the dataset's composition....</td>\n", - " <td id=\"T_56dd5_row24_col3\" class=\"data row24 col3\" >True</td>\n", - " <td id=\"T_56dd5_row24_col4\" class=\"data row24 col4\" >False</td>\n", - " <td id=\"T_56dd5_row24_col5\" class=\"data row24 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row24_col6\" class=\"data row24 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row24_col7\" class=\"data row24 col7\" >['tabular_data', 'visualization']</td>\n", - " <td id=\"T_56dd5_row24_col8\" class=\"data row24 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row25_col0\" class=\"data row25 col0\" >validmind.data_validation.TabularDateTimeHistograms</td>\n", - " <td id=\"T_56dd5_row25_col1\" class=\"data row25 col1\" >Tabular Date Time Histograms</td>\n", - " <td id=\"T_56dd5_row25_col2\" class=\"data row25 col2\" >Generates histograms to provide graphical insight into the distribution of time intervals in a model's datetime...</td>\n", - " <td id=\"T_56dd5_row25_col3\" class=\"data row25 col3\" >True</td>\n", - " <td id=\"T_56dd5_row25_col4\" class=\"data row25 col4\" >False</td>\n", - " <td id=\"T_56dd5_row25_col5\" class=\"data row25 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row25_col6\" class=\"data row25 
col6\" >{}</td>\n", - " <td id=\"T_56dd5_row25_col7\" class=\"data row25 col7\" >['time_series_data', 'visualization']</td>\n", - " <td id=\"T_56dd5_row25_col8\" class=\"data row25 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row26_col0\" class=\"data row26 col0\" >validmind.data_validation.TabularDescriptionTables</td>\n", - " <td id=\"T_56dd5_row26_col1\" class=\"data row26 col1\" >Tabular Description Tables</td>\n", - " <td id=\"T_56dd5_row26_col2\" class=\"data row26 col2\" >Summarizes key descriptive statistics for numerical, categorical, and datetime variables in a dataset....</td>\n", - " <td id=\"T_56dd5_row26_col3\" class=\"data row26 col3\" >False</td>\n", - " <td id=\"T_56dd5_row26_col4\" class=\"data row26 col4\" >True</td>\n", - " <td id=\"T_56dd5_row26_col5\" class=\"data row26 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row26_col6\" class=\"data row26 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row26_col7\" class=\"data row26 col7\" >['tabular_data']</td>\n", - " <td id=\"T_56dd5_row26_col8\" class=\"data row26 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row27_col0\" class=\"data row27 col0\" >validmind.data_validation.TabularNumericalHistograms</td>\n", - " <td id=\"T_56dd5_row27_col1\" class=\"data row27 col1\" >Tabular Numerical Histograms</td>\n", - " <td id=\"T_56dd5_row27_col2\" class=\"data row27 col2\" >Generates histograms for each numerical feature in a dataset to provide visual insights into data distribution and...</td>\n", - " <td id=\"T_56dd5_row27_col3\" class=\"data row27 col3\" >True</td>\n", - " <td id=\"T_56dd5_row27_col4\" class=\"data row27 col4\" >False</td>\n", - " <td id=\"T_56dd5_row27_col5\" class=\"data row27 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row27_col6\" class=\"data row27 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row27_col7\" class=\"data row27 col7\" >['tabular_data', 'visualization']</td>\n", - " 
<td id=\"T_56dd5_row27_col8\" class=\"data row27 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row28_col0\" class=\"data row28 col0\" >validmind.data_validation.TargetRateBarPlots</td>\n", - " <td id=\"T_56dd5_row28_col1\" class=\"data row28 col1\" >Target Rate Bar Plots</td>\n", - " <td id=\"T_56dd5_row28_col2\" class=\"data row28 col2\" >Generates bar plots visualizing the default rates of categorical features for a classification machine learning...</td>\n", - " <td id=\"T_56dd5_row28_col3\" class=\"data row28 col3\" >True</td>\n", - " <td id=\"T_56dd5_row28_col4\" class=\"data row28 col4\" >False</td>\n", - " <td id=\"T_56dd5_row28_col5\" class=\"data row28 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row28_col6\" class=\"data row28 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row28_col7\" class=\"data row28 col7\" >['tabular_data', 'visualization', 'categorical_data']</td>\n", - " <td id=\"T_56dd5_row28_col8\" class=\"data row28 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row29_col0\" class=\"data row29 col0\" >validmind.data_validation.TooManyZeroValues</td>\n", - " <td id=\"T_56dd5_row29_col1\" class=\"data row29 col1\" >Too Many Zero Values</td>\n", - " <td id=\"T_56dd5_row29_col2\" class=\"data row29 col2\" >Identifies numerical columns in a dataset that contain an excessive number of zero values, defined by a threshold...</td>\n", - " <td id=\"T_56dd5_row29_col3\" class=\"data row29 col3\" >False</td>\n", - " <td id=\"T_56dd5_row29_col4\" class=\"data row29 col4\" >True</td>\n", - " <td id=\"T_56dd5_row29_col5\" class=\"data row29 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row29_col6\" class=\"data row29 col6\" >{'max_percent_threshold': {'type': 'float', 'default': 0.03}}</td>\n", - " <td id=\"T_56dd5_row29_col7\" class=\"data row29 col7\" >['tabular_data']</td>\n", - " <td id=\"T_56dd5_row29_col8\" class=\"data row29 col8\" >['regression', 
'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row30_col0\" class=\"data row30 col0\" >validmind.data_validation.UniqueRows</td>\n", - " <td id=\"T_56dd5_row30_col1\" class=\"data row30 col1\" >Unique Rows</td>\n", - " <td id=\"T_56dd5_row30_col2\" class=\"data row30 col2\" >Verifies the diversity of the dataset by ensuring that the count of unique rows exceeds a prescribed threshold....</td>\n", - " <td id=\"T_56dd5_row30_col3\" class=\"data row30 col3\" >False</td>\n", - " <td id=\"T_56dd5_row30_col4\" class=\"data row30 col4\" >True</td>\n", - " <td id=\"T_56dd5_row30_col5\" class=\"data row30 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row30_col6\" class=\"data row30 col6\" >{'min_percent_threshold': {'type': 'float', 'default': 1}}</td>\n", - " <td id=\"T_56dd5_row30_col7\" class=\"data row30 col7\" >['tabular_data']</td>\n", - " <td id=\"T_56dd5_row30_col8\" class=\"data row30 col8\" >['regression', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row31_col0\" class=\"data row31 col0\" >validmind.data_validation.WOEBinPlots</td>\n", - " <td id=\"T_56dd5_row31_col1\" class=\"data row31 col1\" >WOE Bin Plots</td>\n", - " <td id=\"T_56dd5_row31_col2\" class=\"data row31 col2\" >Generates visualizations of Weight of Evidence (WoE) and Information Value (IV) for understanding predictive power...</td>\n", - " <td id=\"T_56dd5_row31_col3\" class=\"data row31 col3\" >True</td>\n", - " <td id=\"T_56dd5_row31_col4\" class=\"data row31 col4\" >False</td>\n", - " <td id=\"T_56dd5_row31_col5\" class=\"data row31 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row31_col6\" class=\"data row31 col6\" >{'breaks_adj': {'type': 'list', 'default': None}, 'fig_height': {'type': 'int', 'default': 600}, 'fig_width': {'type': 'int', 'default': 500}}</td>\n", - " <td id=\"T_56dd5_row31_col7\" class=\"data row31 col7\" >['tabular_data', 'visualization', 'categorical_data']</td>\n", - " <td id=\"T_56dd5_row31_col8\" 
class=\"data row31 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row32_col0\" class=\"data row32 col0\" >validmind.data_validation.WOEBinTable</td>\n", - " <td id=\"T_56dd5_row32_col1\" class=\"data row32 col1\" >WOE Bin Table</td>\n", - " <td id=\"T_56dd5_row32_col2\" class=\"data row32 col2\" >Assesses the Weight of Evidence (WoE) and Information Value (IV) of each feature to evaluate its predictive power...</td>\n", - " <td id=\"T_56dd5_row32_col3\" class=\"data row32 col3\" >False</td>\n", - " <td id=\"T_56dd5_row32_col4\" class=\"data row32 col4\" >True</td>\n", - " <td id=\"T_56dd5_row32_col5\" class=\"data row32 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row32_col6\" class=\"data row32 col6\" >{'breaks_adj': {'type': 'list', 'default': None}}</td>\n", - " <td id=\"T_56dd5_row32_col7\" class=\"data row32 col7\" >['tabular_data', 'categorical_data']</td>\n", - " <td id=\"T_56dd5_row32_col8\" class=\"data row32 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row33_col0\" class=\"data row33 col0\" >validmind.model_validation.FeaturesAUC</td>\n", - " <td id=\"T_56dd5_row33_col1\" class=\"data row33 col1\" >Features AUC</td>\n", - " <td id=\"T_56dd5_row33_col2\" class=\"data row33 col2\" >Evaluates the discriminatory power of each individual feature within a binary classification model by calculating...</td>\n", - " <td id=\"T_56dd5_row33_col3\" class=\"data row33 col3\" >True</td>\n", - " <td id=\"T_56dd5_row33_col4\" class=\"data row33 col4\" >False</td>\n", - " <td id=\"T_56dd5_row33_col5\" class=\"data row33 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row33_col6\" class=\"data row33 col6\" >{'fontsize': {'type': 'int', 'default': 12}, 'figure_height': {'type': 'int', 'default': 500}}</td>\n", - " <td id=\"T_56dd5_row33_col7\" class=\"data row33 col7\" >['feature_importance', 'AUC', 'visualization']</td>\n", - " <td id=\"T_56dd5_row33_col8\" class=\"data row33 col8\" 
>['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row34_col0\" class=\"data row34 col0\" >validmind.model_validation.sklearn.CalibrationCurve</td>\n", - " <td id=\"T_56dd5_row34_col1\" class=\"data row34 col1\" >Calibration Curve</td>\n", - " <td id=\"T_56dd5_row34_col2\" class=\"data row34 col2\" >Evaluates the calibration of probability estimates by comparing predicted probabilities against observed...</td>\n", - " <td id=\"T_56dd5_row34_col3\" class=\"data row34 col3\" >True</td>\n", - " <td id=\"T_56dd5_row34_col4\" class=\"data row34 col4\" >False</td>\n", - " <td id=\"T_56dd5_row34_col5\" class=\"data row34 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row34_col6\" class=\"data row34 col6\" >{'n_bins': {'type': 'int', 'default': 10}}</td>\n", - " <td id=\"T_56dd5_row34_col7\" class=\"data row34 col7\" >['sklearn', 'model_performance', 'classification']</td>\n", - " <td id=\"T_56dd5_row34_col8\" class=\"data row34 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row35_col0\" class=\"data row35 col0\" >validmind.model_validation.sklearn.ClassifierPerformance</td>\n", - " <td id=\"T_56dd5_row35_col1\" class=\"data row35 col1\" >Classifier Performance</td>\n", - " <td id=\"T_56dd5_row35_col2\" class=\"data row35 col2\" >Evaluates performance of binary or multiclass classification models using precision, recall, F1-Score, accuracy,...</td>\n", - " <td id=\"T_56dd5_row35_col3\" class=\"data row35 col3\" >False</td>\n", - " <td id=\"T_56dd5_row35_col4\" class=\"data row35 col4\" >True</td>\n", - " <td id=\"T_56dd5_row35_col5\" class=\"data row35 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row35_col6\" class=\"data row35 col6\" >{'average': {'type': 'str', 'default': 'macro'}}</td>\n", - " <td id=\"T_56dd5_row35_col7\" class=\"data row35 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row35_col8\" 
class=\"data row35 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row36_col0\" class=\"data row36 col0\" >validmind.model_validation.sklearn.ClassifierThresholdOptimization</td>\n", - " <td id=\"T_56dd5_row36_col1\" class=\"data row36 col1\" >Classifier Threshold Optimization</td>\n", - " <td id=\"T_56dd5_row36_col2\" class=\"data row36 col2\" >Analyzes and visualizes different threshold optimization methods for binary classification models....</td>\n", - " <td id=\"T_56dd5_row36_col3\" class=\"data row36 col3\" >False</td>\n", - " <td id=\"T_56dd5_row36_col4\" class=\"data row36 col4\" >True</td>\n", - " <td id=\"T_56dd5_row36_col5\" class=\"data row36 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row36_col6\" class=\"data row36 col6\" >{'methods': {'type': None, 'default': None}, 'target_recall': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_56dd5_row36_col7\" class=\"data row36 col7\" >['model_validation', 'threshold_optimization', 'classification_metrics']</td>\n", - " <td id=\"T_56dd5_row36_col8\" class=\"data row36 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row37_col0\" class=\"data row37 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", - " <td id=\"T_56dd5_row37_col1\" class=\"data row37 col1\" >Confusion Matrix</td>\n", - " <td id=\"T_56dd5_row37_col2\" class=\"data row37 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", - " <td id=\"T_56dd5_row37_col3\" class=\"data row37 col3\" >True</td>\n", - " <td id=\"T_56dd5_row37_col4\" class=\"data row37 col4\" >False</td>\n", - " <td id=\"T_56dd5_row37_col5\" class=\"data row37 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row37_col6\" class=\"data row37 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_56dd5_row37_col7\" class=\"data row37 col7\" 
>['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_56dd5_row37_col8\" class=\"data row37 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row38_col0\" class=\"data row38 col0\" >validmind.model_validation.sklearn.HyperParametersTuning</td>\n", - " <td id=\"T_56dd5_row38_col1\" class=\"data row38 col1\" >Hyper Parameters Tuning</td>\n", - " <td id=\"T_56dd5_row38_col2\" class=\"data row38 col2\" >Performs exhaustive grid search over specified parameter ranges to find optimal model configurations...</td>\n", - " <td id=\"T_56dd5_row38_col3\" class=\"data row38 col3\" >False</td>\n", - " <td id=\"T_56dd5_row38_col4\" class=\"data row38 col4\" >True</td>\n", - " <td id=\"T_56dd5_row38_col5\" class=\"data row38 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row38_col6\" class=\"data row38 col6\" >{'param_grid': {'type': 'dict', 'default': None}, 'scoring': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}, 'fit_params': {'type': 'dict', 'default': None}}</td>\n", - " <td id=\"T_56dd5_row38_col7\" class=\"data row38 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row38_col8\" class=\"data row38 col8\" >['clustering', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row39_col0\" class=\"data row39 col0\" >validmind.model_validation.sklearn.MinimumAccuracy</td>\n", - " <td id=\"T_56dd5_row39_col1\" class=\"data row39 col1\" >Minimum Accuracy</td>\n", - " <td id=\"T_56dd5_row39_col2\" class=\"data row39 col2\" >Checks if the model's prediction accuracy meets or surpasses a specified threshold....</td>\n", - " <td id=\"T_56dd5_row39_col3\" class=\"data row39 col3\" >False</td>\n", - " <td id=\"T_56dd5_row39_col4\" class=\"data row39 col4\" >True</td>\n", - " <td id=\"T_56dd5_row39_col5\" class=\"data row39 col5\" >['dataset', 'model']</td>\n", - " <td 
id=\"T_56dd5_row39_col6\" class=\"data row39 col6\" >{'min_threshold': {'type': 'float', 'default': 0.7}}</td>\n", - " <td id=\"T_56dd5_row39_col7\" class=\"data row39 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row39_col8\" class=\"data row39 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row40_col0\" class=\"data row40 col0\" >validmind.model_validation.sklearn.MinimumF1Score</td>\n", - " <td id=\"T_56dd5_row40_col1\" class=\"data row40 col1\" >Minimum F1 Score</td>\n", - " <td id=\"T_56dd5_row40_col2\" class=\"data row40 col2\" >Assesses if the model's F1 score on the validation set meets a predefined minimum threshold, ensuring balanced...</td>\n", - " <td id=\"T_56dd5_row40_col3\" class=\"data row40 col3\" >False</td>\n", - " <td id=\"T_56dd5_row40_col4\" class=\"data row40 col4\" >True</td>\n", - " <td id=\"T_56dd5_row40_col5\" class=\"data row40 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row40_col6\" class=\"data row40 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_56dd5_row40_col7\" class=\"data row40 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row40_col8\" class=\"data row40 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row41_col0\" class=\"data row41 col0\" >validmind.model_validation.sklearn.MinimumROCAUCScore</td>\n", - " <td id=\"T_56dd5_row41_col1\" class=\"data row41 col1\" >Minimum ROCAUC Score</td>\n", - " <td id=\"T_56dd5_row41_col2\" class=\"data row41 col2\" >Validates model by checking if the ROC AUC score meets or surpasses a specified threshold....</td>\n", - " <td id=\"T_56dd5_row41_col3\" class=\"data row41 col3\" >False</td>\n", - " <td id=\"T_56dd5_row41_col4\" class=\"data row41 col4\" >True</td>\n", - " 
<td id=\"T_56dd5_row41_col5\" class=\"data row41 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row41_col6\" class=\"data row41 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_56dd5_row41_col7\" class=\"data row41 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row41_col8\" class=\"data row41 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row42_col0\" class=\"data row42 col0\" >validmind.model_validation.sklearn.ModelParameters</td>\n", - " <td id=\"T_56dd5_row42_col1\" class=\"data row42 col1\" >Model Parameters</td>\n", - " <td id=\"T_56dd5_row42_col2\" class=\"data row42 col2\" >Extracts and displays model parameters in a structured format for transparency and reproducibility....</td>\n", - " <td id=\"T_56dd5_row42_col3\" class=\"data row42 col3\" >False</td>\n", - " <td id=\"T_56dd5_row42_col4\" class=\"data row42 col4\" >True</td>\n", - " <td id=\"T_56dd5_row42_col5\" class=\"data row42 col5\" >['model']</td>\n", - " <td id=\"T_56dd5_row42_col6\" class=\"data row42 col6\" >{'model_params': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_56dd5_row42_col7\" class=\"data row42 col7\" >['model_training', 'metadata']</td>\n", - " <td id=\"T_56dd5_row42_col8\" class=\"data row42 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row43_col0\" class=\"data row43 col0\" >validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", - " <td id=\"T_56dd5_row43_col1\" class=\"data row43 col1\" >Models Performance Comparison</td>\n", - " <td id=\"T_56dd5_row43_col2\" class=\"data row43 col2\" >Evaluates and compares the performance of multiple Machine Learning models using various metrics like accuracy,...</td>\n", - " <td id=\"T_56dd5_row43_col3\" class=\"data row43 col3\" >False</td>\n", - " <td id=\"T_56dd5_row43_col4\" 
class=\"data row43 col4\" >True</td>\n", - " <td id=\"T_56dd5_row43_col5\" class=\"data row43 col5\" >['dataset', 'models']</td>\n", - " <td id=\"T_56dd5_row43_col6\" class=\"data row43 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row43_col7\" class=\"data row43 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'model_comparison']</td>\n", - " <td id=\"T_56dd5_row43_col8\" class=\"data row43 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row44_col0\" class=\"data row44 col0\" >validmind.model_validation.sklearn.OverfitDiagnosis</td>\n", - " <td id=\"T_56dd5_row44_col1\" class=\"data row44 col1\" >Overfit Diagnosis</td>\n", - " <td id=\"T_56dd5_row44_col2\" class=\"data row44 col2\" >Assesses potential overfitting in a model's predictions, identifying regions where performance between training and...</td>\n", - " <td id=\"T_56dd5_row44_col3\" class=\"data row44 col3\" >True</td>\n", - " <td id=\"T_56dd5_row44_col4\" class=\"data row44 col4\" >True</td>\n", - " <td id=\"T_56dd5_row44_col5\" class=\"data row44 col5\" >['model', 'datasets']</td>\n", - " <td id=\"T_56dd5_row44_col6\" class=\"data row44 col6\" >{'metric': {'type': 'str', 'default': None}, 'cut_off_threshold': {'type': 'float', 'default': 0.04}}</td>\n", - " <td id=\"T_56dd5_row44_col7\" class=\"data row44 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'linear_regression', 'model_diagnosis']</td>\n", - " <td id=\"T_56dd5_row44_col8\" class=\"data row44 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row45_col0\" class=\"data row45 col0\" >validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", - " <td id=\"T_56dd5_row45_col1\" class=\"data row45 col1\" >Permutation Feature Importance</td>\n", - " <td id=\"T_56dd5_row45_col2\" class=\"data row45 col2\" >Assesses the significance of each feature in a model by 
evaluating the impact on model performance when feature...</td>\n", - " <td id=\"T_56dd5_row45_col3\" class=\"data row45 col3\" >True</td>\n", - " <td id=\"T_56dd5_row45_col4\" class=\"data row45 col4\" >False</td>\n", - " <td id=\"T_56dd5_row45_col5\" class=\"data row45 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row45_col6\" class=\"data row45 col6\" >{'fontsize': {'type': None, 'default': None}, 'figure_height': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_56dd5_row45_col7\" class=\"data row45 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", - " <td id=\"T_56dd5_row45_col8\" class=\"data row45 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row46_col0\" class=\"data row46 col0\" >validmind.model_validation.sklearn.PopulationStabilityIndex</td>\n", - " <td id=\"T_56dd5_row46_col1\" class=\"data row46 col1\" >Population Stability Index</td>\n", - " <td id=\"T_56dd5_row46_col2\" class=\"data row46 col2\" >Assesses the Population Stability Index (PSI) to quantify the stability of an ML model's predictions across...</td>\n", - " <td id=\"T_56dd5_row46_col3\" class=\"data row46 col3\" >True</td>\n", - " <td id=\"T_56dd5_row46_col4\" class=\"data row46 col4\" >True</td>\n", - " <td id=\"T_56dd5_row46_col5\" class=\"data row46 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row46_col6\" class=\"data row46 col6\" >{'num_bins': {'type': 'int', 'default': 10}, 'mode': {'type': 'str', 'default': 'fixed'}}</td>\n", - " <td id=\"T_56dd5_row46_col7\" class=\"data row46 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row46_col8\" class=\"data row46 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row47_col0\" class=\"data row47 col0\" 
>validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", - " <td id=\"T_56dd5_row47_col1\" class=\"data row47 col1\" >Precision Recall Curve</td>\n", - " <td id=\"T_56dd5_row47_col2\" class=\"data row47 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", - " <td id=\"T_56dd5_row47_col3\" class=\"data row47 col3\" >True</td>\n", - " <td id=\"T_56dd5_row47_col4\" class=\"data row47 col4\" >False</td>\n", - " <td id=\"T_56dd5_row47_col5\" class=\"data row47 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row47_col6\" class=\"data row47 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row47_col7\" class=\"data row47 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_56dd5_row47_col8\" class=\"data row47 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row48_col0\" class=\"data row48 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", - " <td id=\"T_56dd5_row48_col1\" class=\"data row48 col1\" >ROC Curve</td>\n", - " <td id=\"T_56dd5_row48_col2\" class=\"data row48 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", - " <td id=\"T_56dd5_row48_col3\" class=\"data row48 col3\" >True</td>\n", - " <td id=\"T_56dd5_row48_col4\" class=\"data row48 col4\" >False</td>\n", - " <td id=\"T_56dd5_row48_col5\" class=\"data row48 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row48_col6\" class=\"data row48 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row48_col7\" class=\"data row48 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_56dd5_row48_col8\" class=\"data row48 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row49_col0\" 
class=\"data row49 col0\" >validmind.model_validation.sklearn.RegressionErrors</td>\n", - " <td id=\"T_56dd5_row49_col1\" class=\"data row49 col1\" >Regression Errors</td>\n", - " <td id=\"T_56dd5_row49_col2\" class=\"data row49 col2\" >Assesses the performance and error distribution of a regression model using various error metrics....</td>\n", - " <td id=\"T_56dd5_row49_col3\" class=\"data row49 col3\" >False</td>\n", - " <td id=\"T_56dd5_row49_col4\" class=\"data row49 col4\" >True</td>\n", - " <td id=\"T_56dd5_row49_col5\" class=\"data row49 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row49_col6\" class=\"data row49 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row49_col7\" class=\"data row49 col7\" >['sklearn', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row49_col8\" class=\"data row49 col8\" >['regression', 'classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row50_col0\" class=\"data row50 col0\" >validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", - " <td id=\"T_56dd5_row50_col1\" class=\"data row50 col1\" >Robustness Diagnosis</td>\n", - " <td id=\"T_56dd5_row50_col2\" class=\"data row50 col2\" >Assesses the robustness of a machine learning model by evaluating performance decay under noisy conditions....</td>\n", - " <td id=\"T_56dd5_row50_col3\" class=\"data row50 col3\" >True</td>\n", - " <td id=\"T_56dd5_row50_col4\" class=\"data row50 col4\" >True</td>\n", - " <td id=\"T_56dd5_row50_col5\" class=\"data row50 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row50_col6\" class=\"data row50 col6\" >{'metric': {'type': 'str', 'default': None}, 'scaling_factor_std_dev_list': {'type': None, 'default': [0.1, 0.2, 0.3, 0.4, 0.5]}, 'performance_decay_threshold': {'type': 'float', 'default': 0.05}}</td>\n", - " <td id=\"T_56dd5_row50_col7\" class=\"data row50 col7\" >['sklearn', 'model_diagnosis', 'visualization']</td>\n", - " <td id=\"T_56dd5_row50_col8\" class=\"data row50 col8\" >['classification', 
'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row51_col0\" class=\"data row51 col0\" >validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", - " <td id=\"T_56dd5_row51_col1\" class=\"data row51 col1\" >SHAP Global Importance</td>\n", - " <td id=\"T_56dd5_row51_col2\" class=\"data row51 col2\" >Evaluates and visualizes global feature importance using SHAP values for model explanation and risk identification....</td>\n", - " <td id=\"T_56dd5_row51_col3\" class=\"data row51 col3\" >False</td>\n", - " <td id=\"T_56dd5_row51_col4\" class=\"data row51 col4\" >True</td>\n", - " <td id=\"T_56dd5_row51_col5\" class=\"data row51 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row51_col6\" class=\"data row51 col6\" >{'kernel_explainer_samples': {'type': 'int', 'default': 10}, 'tree_or_linear_explainer_samples': {'type': 'int', 'default': 200}, 'class_of_interest': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_56dd5_row51_col7\" class=\"data row51 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", - " <td id=\"T_56dd5_row51_col8\" class=\"data row51 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row52_col0\" class=\"data row52 col0\" >validmind.model_validation.sklearn.ScoreProbabilityAlignment</td>\n", - " <td id=\"T_56dd5_row52_col1\" class=\"data row52 col1\" >Score Probability Alignment</td>\n", - " <td id=\"T_56dd5_row52_col2\" class=\"data row52 col2\" >Analyzes the alignment between credit scores and predicted probabilities....</td>\n", - " <td id=\"T_56dd5_row52_col3\" class=\"data row52 col3\" >True</td>\n", - " <td id=\"T_56dd5_row52_col4\" class=\"data row52 col4\" >True</td>\n", - " <td id=\"T_56dd5_row52_col5\" class=\"data row52 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row52_col6\" class=\"data row52 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 
'n_bins': {'type': 'int', 'default': 10}}</td>\n", - " <td id=\"T_56dd5_row52_col7\" class=\"data row52 col7\" >['visualization', 'credit_risk', 'calibration']</td>\n", - " <td id=\"T_56dd5_row52_col8\" class=\"data row52 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row53_col0\" class=\"data row53 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", - " <td id=\"T_56dd5_row53_col1\" class=\"data row53 col1\" >Training Test Degradation</td>\n", - " <td id=\"T_56dd5_row53_col2\" class=\"data row53 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", - " <td id=\"T_56dd5_row53_col3\" class=\"data row53 col3\" >False</td>\n", - " <td id=\"T_56dd5_row53_col4\" class=\"data row53 col4\" >True</td>\n", - " <td id=\"T_56dd5_row53_col5\" class=\"data row53 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row53_col6\" class=\"data row53 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", - " <td id=\"T_56dd5_row53_col7\" class=\"data row53 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_56dd5_row53_col8\" class=\"data row53 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row54_col0\" class=\"data row54 col0\" >validmind.model_validation.sklearn.WeakspotsDiagnosis</td>\n", - " <td id=\"T_56dd5_row54_col1\" class=\"data row54 col1\" >Weakspots Diagnosis</td>\n", - " <td id=\"T_56dd5_row54_col2\" class=\"data row54 col2\" >Identifies and visualizes weak spots in a machine learning model's performance across various sections of the...</td>\n", - " <td id=\"T_56dd5_row54_col3\" class=\"data row54 col3\" >True</td>\n", - " <td id=\"T_56dd5_row54_col4\" class=\"data row54 col4\" >True</td>\n", - " <td id=\"T_56dd5_row54_col5\" class=\"data row54 col5\" >['datasets', 
'model']</td>\n", - " <td id=\"T_56dd5_row54_col6\" class=\"data row54 col6\" >{'features_columns': {'type': None, 'default': None}, 'metrics': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}}</td>\n", - " <td id=\"T_56dd5_row54_col7\" class=\"data row54 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_diagnosis', 'visualization']</td>\n", - " <td id=\"T_56dd5_row54_col8\" class=\"data row54 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row55_col0\" class=\"data row55 col0\" >validmind.model_validation.statsmodels.CumulativePredictionProbabilities</td>\n", - " <td id=\"T_56dd5_row55_col1\" class=\"data row55 col1\" >Cumulative Prediction Probabilities</td>\n", - " <td id=\"T_56dd5_row55_col2\" class=\"data row55 col2\" >Visualizes cumulative probabilities of positive and negative classes for both training and testing in classification models....</td>\n", - " <td id=\"T_56dd5_row55_col3\" class=\"data row55 col3\" >True</td>\n", - " <td id=\"T_56dd5_row55_col4\" class=\"data row55 col4\" >False</td>\n", - " <td id=\"T_56dd5_row55_col5\" class=\"data row55 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row55_col6\" class=\"data row55 col6\" >{'title': {'type': 'str', 'default': 'Cumulative Probabilities'}}</td>\n", - " <td id=\"T_56dd5_row55_col7\" class=\"data row55 col7\" >['visualization', 'credit_risk']</td>\n", - " <td id=\"T_56dd5_row55_col8\" class=\"data row55 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row56_col0\" class=\"data row56 col0\" >validmind.model_validation.statsmodels.GINITable</td>\n", - " <td id=\"T_56dd5_row56_col1\" class=\"data row56 col1\" >GINI Table</td>\n", - " <td id=\"T_56dd5_row56_col2\" class=\"data row56 col2\" >Evaluates classification model performance using AUC, GINI, and KS metrics for training and test datasets....</td>\n", - " <td 
id=\"T_56dd5_row56_col3\" class=\"data row56 col3\" >False</td>\n", - " <td id=\"T_56dd5_row56_col4\" class=\"data row56 col4\" >True</td>\n", - " <td id=\"T_56dd5_row56_col5\" class=\"data row56 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row56_col6\" class=\"data row56 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row56_col7\" class=\"data row56 col7\" >['model_performance']</td>\n", - " <td id=\"T_56dd5_row56_col8\" class=\"data row56 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row57_col0\" class=\"data row57 col0\" >validmind.model_validation.statsmodels.KolmogorovSmirnov</td>\n", - " <td id=\"T_56dd5_row57_col1\" class=\"data row57 col1\" >Kolmogorov Smirnov</td>\n", - " <td id=\"T_56dd5_row57_col2\" class=\"data row57 col2\" >Assesses whether each feature in the dataset aligns with a normal distribution using the Kolmogorov-Smirnov test....</td>\n", - " <td id=\"T_56dd5_row57_col3\" class=\"data row57 col3\" >False</td>\n", - " <td id=\"T_56dd5_row57_col4\" class=\"data row57 col4\" >True</td>\n", - " <td id=\"T_56dd5_row57_col5\" class=\"data row57 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row57_col6\" class=\"data row57 col6\" >{'dist': {'type': 'str', 'default': 'norm'}}</td>\n", - " <td id=\"T_56dd5_row57_col7\" class=\"data row57 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_56dd5_row57_col8\" class=\"data row57 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row58_col0\" class=\"data row58 col0\" >validmind.model_validation.statsmodels.Lilliefors</td>\n", - " <td id=\"T_56dd5_row58_col1\" class=\"data row58 col1\" >Lilliefors</td>\n", - " <td id=\"T_56dd5_row58_col2\" class=\"data row58 col2\" >Assesses the normality of feature distributions in an ML model's training dataset using the Lilliefors test....</td>\n", - " <td id=\"T_56dd5_row58_col3\" class=\"data row58 col3\" 
>False</td>\n", - " <td id=\"T_56dd5_row58_col4\" class=\"data row58 col4\" >True</td>\n", - " <td id=\"T_56dd5_row58_col5\" class=\"data row58 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row58_col6\" class=\"data row58 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row58_col7\" class=\"data row58 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", - " <td id=\"T_56dd5_row58_col8\" class=\"data row58 col8\" >['classification', 'regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row59_col0\" class=\"data row59 col0\" >validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram</td>\n", - " <td id=\"T_56dd5_row59_col1\" class=\"data row59 col1\" >Prediction Probabilities Histogram</td>\n", - " <td id=\"T_56dd5_row59_col2\" class=\"data row59 col2\" >Assesses the predictive probability distribution for binary classification to evaluate model performance and...</td>\n", - " <td id=\"T_56dd5_row59_col3\" class=\"data row59 col3\" >True</td>\n", - " <td id=\"T_56dd5_row59_col4\" class=\"data row59 col4\" >False</td>\n", - " <td id=\"T_56dd5_row59_col5\" class=\"data row59 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row59_col6\" class=\"data row59 col6\" >{'title': {'type': 'str', 'default': 'Histogram of Predictive Probabilities'}}</td>\n", - " <td id=\"T_56dd5_row59_col7\" class=\"data row59 col7\" >['visualization', 'credit_risk']</td>\n", - " <td id=\"T_56dd5_row59_col8\" class=\"data row59 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row60_col0\" class=\"data row60 col0\" >validmind.model_validation.statsmodels.ScorecardHistogram</td>\n", - " <td id=\"T_56dd5_row60_col1\" class=\"data row60 col1\" >Scorecard Histogram</td>\n", - " <td id=\"T_56dd5_row60_col2\" class=\"data row60 col2\" >The Scorecard Histogram test evaluates the distribution of credit scores between default and non-default instances,...</td>\n", - " <td id=\"T_56dd5_row60_col3\" 
class=\"data row60 col3\" >True</td>\n", - " <td id=\"T_56dd5_row60_col4\" class=\"data row60 col4\" >False</td>\n", - " <td id=\"T_56dd5_row60_col5\" class=\"data row60 col5\" >['dataset']</td>\n", - " <td id=\"T_56dd5_row60_col6\" class=\"data row60 col6\" >{'title': {'type': 'str', 'default': 'Histogram of Scores'}, 'score_column': {'type': 'str', 'default': 'score'}}</td>\n", - " <td id=\"T_56dd5_row60_col7\" class=\"data row60 col7\" >['visualization', 'credit_risk', 'logistic_regression']</td>\n", - " <td id=\"T_56dd5_row60_col8\" class=\"data row60 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row61_col0\" class=\"data row61 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", - " <td id=\"T_56dd5_row61_col1\" class=\"data row61 col1\" >Calibration Curve Drift</td>\n", - " <td id=\"T_56dd5_row61_col2\" class=\"data row61 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", - " <td id=\"T_56dd5_row61_col3\" class=\"data row61 col3\" >True</td>\n", - " <td id=\"T_56dd5_row61_col4\" class=\"data row61 col4\" >True</td>\n", - " <td id=\"T_56dd5_row61_col5\" class=\"data row61 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row61_col6\" class=\"data row61 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", - " <td id=\"T_56dd5_row61_col7\" class=\"data row61 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_56dd5_row61_col8\" class=\"data row61 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row62_col0\" class=\"data row62 col0\" >validmind.ongoing_monitoring.ClassDiscriminationDrift</td>\n", - " <td id=\"T_56dd5_row62_col1\" class=\"data row62 col1\" >Class Discrimination Drift</td>\n", - " <td id=\"T_56dd5_row62_col2\" class=\"data row62 col2\" >Compares 
classification discrimination metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_56dd5_row62_col3\" class=\"data row62 col3\" >False</td>\n", - " <td id=\"T_56dd5_row62_col4\" class=\"data row62 col4\" >True</td>\n", - " <td id=\"T_56dd5_row62_col5\" class=\"data row62 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row62_col6\" class=\"data row62 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_56dd5_row62_col7\" class=\"data row62 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row62_col8\" class=\"data row62 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row63_col0\" class=\"data row63 col0\" >validmind.ongoing_monitoring.ClassImbalanceDrift</td>\n", - " <td id=\"T_56dd5_row63_col1\" class=\"data row63 col1\" >Class Imbalance Drift</td>\n", - " <td id=\"T_56dd5_row63_col2\" class=\"data row63 col2\" >Evaluates drift in class distribution between reference and monitoring datasets....</td>\n", - " <td id=\"T_56dd5_row63_col3\" class=\"data row63 col3\" >True</td>\n", - " <td id=\"T_56dd5_row63_col4\" class=\"data row63 col4\" >True</td>\n", - " <td id=\"T_56dd5_row63_col5\" class=\"data row63 col5\" >['datasets']</td>\n", - " <td id=\"T_56dd5_row63_col6\" class=\"data row63 col6\" >{'drift_pct_threshold': {'type': 'float', 'default': 5.0}, 'title': {'type': 'str', 'default': 'Class Distribution Drift'}}</td>\n", - " <td id=\"T_56dd5_row63_col7\" class=\"data row63 col7\" >['tabular_data', 'binary_classification', 'multiclass_classification']</td>\n", - " <td id=\"T_56dd5_row63_col8\" class=\"data row63 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row64_col0\" class=\"data row64 col0\" >validmind.ongoing_monitoring.ClassificationAccuracyDrift</td>\n", - " <td id=\"T_56dd5_row64_col1\" class=\"data row64 col1\" 
>Classification Accuracy Drift</td>\n", - " <td id=\"T_56dd5_row64_col2\" class=\"data row64 col2\" >Compares classification accuracy metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_56dd5_row64_col3\" class=\"data row64 col3\" >False</td>\n", - " <td id=\"T_56dd5_row64_col4\" class=\"data row64 col4\" >True</td>\n", - " <td id=\"T_56dd5_row64_col5\" class=\"data row64 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row64_col6\" class=\"data row64 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_56dd5_row64_col7\" class=\"data row64 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row64_col8\" class=\"data row64 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row65_col0\" class=\"data row65 col0\" >validmind.ongoing_monitoring.ConfusionMatrixDrift</td>\n", - " <td id=\"T_56dd5_row65_col1\" class=\"data row65 col1\" >Confusion Matrix Drift</td>\n", - " <td id=\"T_56dd5_row65_col2\" class=\"data row65 col2\" >Compares confusion matrix metrics between reference and monitoring datasets....</td>\n", - " <td id=\"T_56dd5_row65_col3\" class=\"data row65 col3\" >False</td>\n", - " <td id=\"T_56dd5_row65_col4\" class=\"data row65 col4\" >True</td>\n", - " <td id=\"T_56dd5_row65_col5\" class=\"data row65 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row65_col6\" class=\"data row65 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", - " <td id=\"T_56dd5_row65_col7\" class=\"data row65 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", - " <td id=\"T_56dd5_row65_col8\" class=\"data row65 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row66_col0\" class=\"data row66 col0\" 
>validmind.ongoing_monitoring.CumulativePredictionProbabilitiesDrift</td>\n", - " <td id=\"T_56dd5_row66_col1\" class=\"data row66 col1\" >Cumulative Prediction Probabilities Drift</td>\n", - " <td id=\"T_56dd5_row66_col2\" class=\"data row66 col2\" >Compares cumulative prediction probability distributions between reference and monitoring datasets....</td>\n", - " <td id=\"T_56dd5_row66_col3\" class=\"data row66 col3\" >True</td>\n", - " <td id=\"T_56dd5_row66_col4\" class=\"data row66 col4\" >False</td>\n", - " <td id=\"T_56dd5_row66_col5\" class=\"data row66 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row66_col6\" class=\"data row66 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row66_col7\" class=\"data row66 col7\" >['visualization', 'credit_risk']</td>\n", - " <td id=\"T_56dd5_row66_col8\" class=\"data row66 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row67_col0\" class=\"data row67 col0\" >validmind.ongoing_monitoring.PredictionProbabilitiesHistogramDrift</td>\n", - " <td id=\"T_56dd5_row67_col1\" class=\"data row67 col1\" >Prediction Probabilities Histogram Drift</td>\n", - " <td id=\"T_56dd5_row67_col2\" class=\"data row67 col2\" >Compares prediction probability distributions between reference and monitoring datasets....</td>\n", - " <td id=\"T_56dd5_row67_col3\" class=\"data row67 col3\" >True</td>\n", - " <td id=\"T_56dd5_row67_col4\" class=\"data row67 col4\" >True</td>\n", - " <td id=\"T_56dd5_row67_col5\" class=\"data row67 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row67_col6\" class=\"data row67 col6\" >{'title': {'type': '_empty', 'default': 'Prediction Probabilities Histogram Drift'}, 'drift_pct_threshold': {'type': 'float', 'default': 20.0}}</td>\n", - " <td id=\"T_56dd5_row67_col7\" class=\"data row67 col7\" >['visualization', 'credit_risk']</td>\n", - " <td id=\"T_56dd5_row67_col8\" class=\"data row67 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td 
id=\"T_56dd5_row68_col0\" class=\"data row68 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", - " <td id=\"T_56dd5_row68_col1\" class=\"data row68 col1\" >ROC Curve Drift</td>\n", - " <td id=\"T_56dd5_row68_col2\" class=\"data row68 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", - " <td id=\"T_56dd5_row68_col3\" class=\"data row68 col3\" >True</td>\n", - " <td id=\"T_56dd5_row68_col4\" class=\"data row68 col4\" >False</td>\n", - " <td id=\"T_56dd5_row68_col5\" class=\"data row68 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row68_col6\" class=\"data row68 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row68_col7\" class=\"data row68 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_56dd5_row68_col8\" class=\"data row68 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row69_col0\" class=\"data row69 col0\" >validmind.ongoing_monitoring.ScoreBandsDrift</td>\n", - " <td id=\"T_56dd5_row69_col1\" class=\"data row69 col1\" >Score Bands Drift</td>\n", - " <td id=\"T_56dd5_row69_col2\" class=\"data row69 col2\" >Analyzes drift in population distribution and default rates across score bands....</td>\n", - " <td id=\"T_56dd5_row69_col3\" class=\"data row69 col3\" >False</td>\n", - " <td id=\"T_56dd5_row69_col4\" class=\"data row69 col4\" >True</td>\n", - " <td id=\"T_56dd5_row69_col5\" class=\"data row69 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_56dd5_row69_col6\" class=\"data row69 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'score_bands': {'type': 'list', 'default': None}, 'drift_threshold': {'type': 'float', 'default': 20.0}}</td>\n", - " <td id=\"T_56dd5_row69_col7\" class=\"data row69 col7\" >['visualization', 'credit_risk', 'scorecard']</td>\n", - " <td id=\"T_56dd5_row69_col8\" class=\"data row69 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td 
id=\"T_56dd5_row70_col0\" class=\"data row70 col0\" >validmind.ongoing_monitoring.ScorecardHistogramDrift</td>\n", - " <td id=\"T_56dd5_row70_col1\" class=\"data row70 col1\" >Scorecard Histogram Drift</td>\n", - " <td id=\"T_56dd5_row70_col2\" class=\"data row70 col2\" >Compares score distributions between reference and monitoring datasets for each class....</td>\n", - " <td id=\"T_56dd5_row70_col3\" class=\"data row70 col3\" >True</td>\n", - " <td id=\"T_56dd5_row70_col4\" class=\"data row70 col4\" >True</td>\n", - " <td id=\"T_56dd5_row70_col5\" class=\"data row70 col5\" >['datasets']</td>\n", - " <td id=\"T_56dd5_row70_col6\" class=\"data row70 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'title': {'type': 'str', 'default': 'Scorecard Histogram Drift'}, 'drift_pct_threshold': {'type': 'float', 'default': 20.0}}</td>\n", - " <td id=\"T_56dd5_row70_col7\" class=\"data row70 col7\" >['visualization', 'credit_risk', 'logistic_regression']</td>\n", - " <td id=\"T_56dd5_row70_col8\" class=\"data row70 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row71_col0\" class=\"data row71 col0\" >validmind.unit_metrics.classification.Accuracy</td>\n", - " <td id=\"T_56dd5_row71_col1\" class=\"data row71 col1\" >Accuracy</td>\n", - " <td id=\"T_56dd5_row71_col2\" class=\"data row71 col2\" >Calculates the accuracy of a model</td>\n", - " <td id=\"T_56dd5_row71_col3\" class=\"data row71 col3\" >False</td>\n", - " <td id=\"T_56dd5_row71_col4\" class=\"data row71 col4\" >False</td>\n", - " <td id=\"T_56dd5_row71_col5\" class=\"data row71 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_56dd5_row71_col6\" class=\"data row71 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row71_col7\" class=\"data row71 col7\" >['classification']</td>\n", - " <td id=\"T_56dd5_row71_col8\" class=\"data row71 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row72_col0\" class=\"data row72 col0\" 
>validmind.unit_metrics.classification.F1</td>\n", - " <td id=\"T_56dd5_row72_col1\" class=\"data row72 col1\" >F1</td>\n", - " <td id=\"T_56dd5_row72_col2\" class=\"data row72 col2\" >Calculates the F1 score for a classification model.</td>\n", - " <td id=\"T_56dd5_row72_col3\" class=\"data row72 col3\" >False</td>\n", - " <td id=\"T_56dd5_row72_col4\" class=\"data row72 col4\" >False</td>\n", - " <td id=\"T_56dd5_row72_col5\" class=\"data row72 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row72_col6\" class=\"data row72 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row72_col7\" class=\"data row72 col7\" >['classification']</td>\n", - " <td id=\"T_56dd5_row72_col8\" class=\"data row72 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row73_col0\" class=\"data row73 col0\" >validmind.unit_metrics.classification.Precision</td>\n", - " <td id=\"T_56dd5_row73_col1\" class=\"data row73 col1\" >Precision</td>\n", - " <td id=\"T_56dd5_row73_col2\" class=\"data row73 col2\" >Calculates the precision for a classification model.</td>\n", - " <td id=\"T_56dd5_row73_col3\" class=\"data row73 col3\" >False</td>\n", - " <td id=\"T_56dd5_row73_col4\" class=\"data row73 col4\" >False</td>\n", - " <td id=\"T_56dd5_row73_col5\" class=\"data row73 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row73_col6\" class=\"data row73 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row73_col7\" class=\"data row73 col7\" >['classification']</td>\n", - " <td id=\"T_56dd5_row73_col8\" class=\"data row73 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row74_col0\" class=\"data row74 col0\" >validmind.unit_metrics.classification.ROC_AUC</td>\n", - " <td id=\"T_56dd5_row74_col1\" class=\"data row74 col1\" >ROC AUC</td>\n", - " <td id=\"T_56dd5_row74_col2\" class=\"data row74 col2\" >Calculates the ROC AUC for a classification model.</td>\n", - " <td id=\"T_56dd5_row74_col3\" class=\"data row74 col3\" >False</td>\n", 
- " <td id=\"T_56dd5_row74_col4\" class=\"data row74 col4\" >False</td>\n", - " <td id=\"T_56dd5_row74_col5\" class=\"data row74 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row74_col6\" class=\"data row74 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row74_col7\" class=\"data row74 col7\" >['classification']</td>\n", - " <td id=\"T_56dd5_row74_col8\" class=\"data row74 col8\" >['classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_56dd5_row75_col0\" class=\"data row75 col0\" >validmind.unit_metrics.classification.Recall</td>\n", - " <td id=\"T_56dd5_row75_col1\" class=\"data row75 col1\" >Recall</td>\n", - " <td id=\"T_56dd5_row75_col2\" class=\"data row75 col2\" >Calculates the recall for a classification model.</td>\n", - " <td id=\"T_56dd5_row75_col3\" class=\"data row75 col3\" >False</td>\n", - " <td id=\"T_56dd5_row75_col4\" class=\"data row75 col4\" >False</td>\n", - " <td id=\"T_56dd5_row75_col5\" class=\"data row75 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_56dd5_row75_col6\" class=\"data row75 col6\" >{}</td>\n", - " <td id=\"T_56dd5_row75_col7\" class=\"data row75 col7\" >['classification']</td>\n", - " <td id=\"T_56dd5_row75_col8\" class=\"data row75 col8\" >['classification']</td>\n", - " </tr>\n", - " </tbody>\n", - "</table>\n" + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "<style type=\"text/css\">\n", + "#T_326c3 th {\n", + " text-align: left;\n", + "}\n", + "#T_326c3_row0_col0, #T_326c3_row0_col1, #T_326c3_row0_col2, #T_326c3_row0_col3, #T_326c3_row0_col4, #T_326c3_row0_col5, #T_326c3_row0_col6, #T_326c3_row0_col7, #T_326c3_row0_col8, #T_326c3_row1_col0, #T_326c3_row1_col1, #T_326c3_row1_col2, #T_326c3_row1_col3, #T_326c3_row1_col4, #T_326c3_row1_col5, #T_326c3_row1_col6, #T_326c3_row1_col7, #T_326c3_row1_col8, #T_326c3_row2_col0, #T_326c3_row2_col1, #T_326c3_row2_col2, #T_326c3_row2_col3, #T_326c3_row2_col4, #T_326c3_row2_col5, 
#T_326c3_row2_col6, #T_326c3_row2_col7, #T_326c3_row2_col8, #T_326c3_row3_col0, #T_326c3_row3_col1, #T_326c3_row3_col2, #T_326c3_row3_col3, #T_326c3_row3_col4, #T_326c3_row3_col5, #T_326c3_row3_col6, #T_326c3_row3_col7, #T_326c3_row3_col8, #T_326c3_row4_col0, #T_326c3_row4_col1, #T_326c3_row4_col2, #T_326c3_row4_col3, #T_326c3_row4_col4, #T_326c3_row4_col5, #T_326c3_row4_col6, #T_326c3_row4_col7, #T_326c3_row4_col8, #T_326c3_row5_col0, #T_326c3_row5_col1, #T_326c3_row5_col2, #T_326c3_row5_col3, #T_326c3_row5_col4, #T_326c3_row5_col5, #T_326c3_row5_col6, #T_326c3_row5_col7, #T_326c3_row5_col8, #T_326c3_row6_col0, #T_326c3_row6_col1, #T_326c3_row6_col2, #T_326c3_row6_col3, #T_326c3_row6_col4, #T_326c3_row6_col5, #T_326c3_row6_col6, #T_326c3_row6_col7, #T_326c3_row6_col8, #T_326c3_row7_col0, #T_326c3_row7_col1, #T_326c3_row7_col2, #T_326c3_row7_col3, #T_326c3_row7_col4, #T_326c3_row7_col5, #T_326c3_row7_col6, #T_326c3_row7_col7, #T_326c3_row7_col8, #T_326c3_row8_col0, #T_326c3_row8_col1, #T_326c3_row8_col2, #T_326c3_row8_col3, #T_326c3_row8_col4, #T_326c3_row8_col5, #T_326c3_row8_col6, #T_326c3_row8_col7, #T_326c3_row8_col8, #T_326c3_row9_col0, #T_326c3_row9_col1, #T_326c3_row9_col2, #T_326c3_row9_col3, #T_326c3_row9_col4, #T_326c3_row9_col5, #T_326c3_row9_col6, #T_326c3_row9_col7, #T_326c3_row9_col8, #T_326c3_row10_col0, #T_326c3_row10_col1, #T_326c3_row10_col2, #T_326c3_row10_col3, #T_326c3_row10_col4, #T_326c3_row10_col5, #T_326c3_row10_col6, #T_326c3_row10_col7, #T_326c3_row10_col8, #T_326c3_row11_col0, #T_326c3_row11_col1, #T_326c3_row11_col2, #T_326c3_row11_col3, #T_326c3_row11_col4, #T_326c3_row11_col5, #T_326c3_row11_col6, #T_326c3_row11_col7, #T_326c3_row11_col8, #T_326c3_row12_col0, #T_326c3_row12_col1, #T_326c3_row12_col2, #T_326c3_row12_col3, #T_326c3_row12_col4, #T_326c3_row12_col5, #T_326c3_row12_col6, #T_326c3_row12_col7, #T_326c3_row12_col8, #T_326c3_row13_col0, #T_326c3_row13_col1, #T_326c3_row13_col2, #T_326c3_row13_col3, #T_326c3_row13_col4, 
#T_326c3_row13_col5, #T_326c3_row13_col6, #T_326c3_row13_col7, #T_326c3_row13_col8, #T_326c3_row14_col0, #T_326c3_row14_col1, #T_326c3_row14_col2, #T_326c3_row14_col3, #T_326c3_row14_col4, #T_326c3_row14_col5, #T_326c3_row14_col6, #T_326c3_row14_col7, #T_326c3_row14_col8, #T_326c3_row15_col0, #T_326c3_row15_col1, #T_326c3_row15_col2, #T_326c3_row15_col3, #T_326c3_row15_col4, #T_326c3_row15_col5, #T_326c3_row15_col6, #T_326c3_row15_col7, #T_326c3_row15_col8, #T_326c3_row16_col0, #T_326c3_row16_col1, #T_326c3_row16_col2, #T_326c3_row16_col3, #T_326c3_row16_col4, #T_326c3_row16_col5, #T_326c3_row16_col6, #T_326c3_row16_col7, #T_326c3_row16_col8, #T_326c3_row17_col0, #T_326c3_row17_col1, #T_326c3_row17_col2, #T_326c3_row17_col3, #T_326c3_row17_col4, #T_326c3_row17_col5, #T_326c3_row17_col6, #T_326c3_row17_col7, #T_326c3_row17_col8, #T_326c3_row18_col0, #T_326c3_row18_col1, #T_326c3_row18_col2, #T_326c3_row18_col3, #T_326c3_row18_col4, #T_326c3_row18_col5, #T_326c3_row18_col6, #T_326c3_row18_col7, #T_326c3_row18_col8, #T_326c3_row19_col0, #T_326c3_row19_col1, #T_326c3_row19_col2, #T_326c3_row19_col3, #T_326c3_row19_col4, #T_326c3_row19_col5, #T_326c3_row19_col6, #T_326c3_row19_col7, #T_326c3_row19_col8, #T_326c3_row20_col0, #T_326c3_row20_col1, #T_326c3_row20_col2, #T_326c3_row20_col3, #T_326c3_row20_col4, #T_326c3_row20_col5, #T_326c3_row20_col6, #T_326c3_row20_col7, #T_326c3_row20_col8, #T_326c3_row21_col0, #T_326c3_row21_col1, #T_326c3_row21_col2, #T_326c3_row21_col3, #T_326c3_row21_col4, #T_326c3_row21_col5, #T_326c3_row21_col6, #T_326c3_row21_col7, #T_326c3_row21_col8, #T_326c3_row22_col0, #T_326c3_row22_col1, #T_326c3_row22_col2, #T_326c3_row22_col3, #T_326c3_row22_col4, #T_326c3_row22_col5, #T_326c3_row22_col6, #T_326c3_row22_col7, #T_326c3_row22_col8, #T_326c3_row23_col0, #T_326c3_row23_col1, #T_326c3_row23_col2, #T_326c3_row23_col3, #T_326c3_row23_col4, #T_326c3_row23_col5, #T_326c3_row23_col6, #T_326c3_row23_col7, #T_326c3_row23_col8, #T_326c3_row24_col0, 
#T_326c3_row24_col1, #T_326c3_row24_col2, #T_326c3_row24_col3, #T_326c3_row24_col4, #T_326c3_row24_col5, #T_326c3_row24_col6, #T_326c3_row24_col7, #T_326c3_row24_col8, #T_326c3_row25_col0, #T_326c3_row25_col1, #T_326c3_row25_col2, #T_326c3_row25_col3, #T_326c3_row25_col4, #T_326c3_row25_col5, #T_326c3_row25_col6, #T_326c3_row25_col7, #T_326c3_row25_col8, #T_326c3_row26_col0, #T_326c3_row26_col1, #T_326c3_row26_col2, #T_326c3_row26_col3, #T_326c3_row26_col4, #T_326c3_row26_col5, #T_326c3_row26_col6, #T_326c3_row26_col7, #T_326c3_row26_col8, #T_326c3_row27_col0, #T_326c3_row27_col1, #T_326c3_row27_col2, #T_326c3_row27_col3, #T_326c3_row27_col4, #T_326c3_row27_col5, #T_326c3_row27_col6, #T_326c3_row27_col7, #T_326c3_row27_col8, #T_326c3_row28_col0, #T_326c3_row28_col1, #T_326c3_row28_col2, #T_326c3_row28_col3, #T_326c3_row28_col4, #T_326c3_row28_col5, #T_326c3_row28_col6, #T_326c3_row28_col7, #T_326c3_row28_col8, #T_326c3_row29_col0, #T_326c3_row29_col1, #T_326c3_row29_col2, #T_326c3_row29_col3, #T_326c3_row29_col4, #T_326c3_row29_col5, #T_326c3_row29_col6, #T_326c3_row29_col7, #T_326c3_row29_col8, #T_326c3_row30_col0, #T_326c3_row30_col1, #T_326c3_row30_col2, #T_326c3_row30_col3, #T_326c3_row30_col4, #T_326c3_row30_col5, #T_326c3_row30_col6, #T_326c3_row30_col7, #T_326c3_row30_col8, #T_326c3_row31_col0, #T_326c3_row31_col1, #T_326c3_row31_col2, #T_326c3_row31_col3, #T_326c3_row31_col4, #T_326c3_row31_col5, #T_326c3_row31_col6, #T_326c3_row31_col7, #T_326c3_row31_col8, #T_326c3_row32_col0, #T_326c3_row32_col1, #T_326c3_row32_col2, #T_326c3_row32_col3, #T_326c3_row32_col4, #T_326c3_row32_col5, #T_326c3_row32_col6, #T_326c3_row32_col7, #T_326c3_row32_col8, #T_326c3_row33_col0, #T_326c3_row33_col1, #T_326c3_row33_col2, #T_326c3_row33_col3, #T_326c3_row33_col4, #T_326c3_row33_col5, #T_326c3_row33_col6, #T_326c3_row33_col7, #T_326c3_row33_col8, #T_326c3_row34_col0, #T_326c3_row34_col1, #T_326c3_row34_col2, #T_326c3_row34_col3, #T_326c3_row34_col4, #T_326c3_row34_col5, 
#T_326c3_row34_col6, #T_326c3_row34_col7, #T_326c3_row34_col8, #T_326c3_row35_col0, #T_326c3_row35_col1, #T_326c3_row35_col2, #T_326c3_row35_col3, #T_326c3_row35_col4, #T_326c3_row35_col5, #T_326c3_row35_col6, #T_326c3_row35_col7, #T_326c3_row35_col8, #T_326c3_row36_col0, #T_326c3_row36_col1, #T_326c3_row36_col2, #T_326c3_row36_col3, #T_326c3_row36_col4, #T_326c3_row36_col5, #T_326c3_row36_col6, #T_326c3_row36_col7, #T_326c3_row36_col8, #T_326c3_row37_col0, #T_326c3_row37_col1, #T_326c3_row37_col2, #T_326c3_row37_col3, #T_326c3_row37_col4, #T_326c3_row37_col5, #T_326c3_row37_col6, #T_326c3_row37_col7, #T_326c3_row37_col8, #T_326c3_row38_col0, #T_326c3_row38_col1, #T_326c3_row38_col2, #T_326c3_row38_col3, #T_326c3_row38_col4, #T_326c3_row38_col5, #T_326c3_row38_col6, #T_326c3_row38_col7, #T_326c3_row38_col8, #T_326c3_row39_col0, #T_326c3_row39_col1, #T_326c3_row39_col2, #T_326c3_row39_col3, #T_326c3_row39_col4, #T_326c3_row39_col5, #T_326c3_row39_col6, #T_326c3_row39_col7, #T_326c3_row39_col8, #T_326c3_row40_col0, #T_326c3_row40_col1, #T_326c3_row40_col2, #T_326c3_row40_col3, #T_326c3_row40_col4, #T_326c3_row40_col5, #T_326c3_row40_col6, #T_326c3_row40_col7, #T_326c3_row40_col8, #T_326c3_row41_col0, #T_326c3_row41_col1, #T_326c3_row41_col2, #T_326c3_row41_col3, #T_326c3_row41_col4, #T_326c3_row41_col5, #T_326c3_row41_col6, #T_326c3_row41_col7, #T_326c3_row41_col8, #T_326c3_row42_col0, #T_326c3_row42_col1, #T_326c3_row42_col2, #T_326c3_row42_col3, #T_326c3_row42_col4, #T_326c3_row42_col5, #T_326c3_row42_col6, #T_326c3_row42_col7, #T_326c3_row42_col8 {\n", + " text-align: left;\n", + "}\n", + "</style>\n", + "<table id=\"T_326c3\">\n", + " <thead>\n", + " <tr>\n", + " <th id=\"T_326c3_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", + " <th id=\"T_326c3_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", + " <th id=\"T_326c3_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", + " <th id=\"T_326c3_level0_col3\" 
class=\"col_heading level0 col3\" >Has Figure</th>\n", + " <th id=\"T_326c3_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", + " <th id=\"T_326c3_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", + " <th id=\"T_326c3_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", + " <th id=\"T_326c3_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", + " <th id=\"T_326c3_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <td id=\"T_326c3_row0_col0\" class=\"data row0 col0\" >validmind.model_validation.ClusterSizeDistribution</td>\n", + " <td id=\"T_326c3_row0_col1\" class=\"data row0 col1\" >Cluster Size Distribution</td>\n", + " <td id=\"T_326c3_row0_col2\" class=\"data row0 col2\" >Assesses the performance of clustering models by comparing the distribution of cluster sizes in model predictions...</td>\n", + " <td id=\"T_326c3_row0_col3\" class=\"data row0 col3\" >True</td>\n", + " <td id=\"T_326c3_row0_col4\" class=\"data row0 col4\" >False</td>\n", + " <td id=\"T_326c3_row0_col5\" class=\"data row0 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row0_col6\" class=\"data row0 col6\" >{}</td>\n", + " <td id=\"T_326c3_row0_col7\" class=\"data row0 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row0_col8\" class=\"data row0 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row1_col0\" class=\"data row1 col0\" >validmind.model_validation.TimeSeriesR2SquareBySegments</td>\n", + " <td id=\"T_326c3_row1_col1\" class=\"data row1 col1\" >Time Series R2 Square By Segments</td>\n", + " <td id=\"T_326c3_row1_col2\" class=\"data row1 col2\" >Evaluates the R-Squared values of regression models over specified time segments in time series data to assess...</td>\n", + " <td id=\"T_326c3_row1_col3\" class=\"data row1 col3\" >True</td>\n", + " <td id=\"T_326c3_row1_col4\" class=\"data 
row1 col4\" >True</td>\n", + " <td id=\"T_326c3_row1_col5\" class=\"data row1 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row1_col6\" class=\"data row1 col6\" >{'segments': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_326c3_row1_col7\" class=\"data row1 col7\" >['model_performance', 'sklearn']</td>\n", + " <td id=\"T_326c3_row1_col8\" class=\"data row1 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row2_col0\" class=\"data row2 col0\" >validmind.model_validation.sklearn.AdjustedMutualInformation</td>\n", + " <td id=\"T_326c3_row2_col1\" class=\"data row2 col1\" >Adjusted Mutual Information</td>\n", + " <td id=\"T_326c3_row2_col2\" class=\"data row2 col2\" >Evaluates clustering model performance by measuring mutual information between true and predicted labels, adjusting...</td>\n", + " <td id=\"T_326c3_row2_col3\" class=\"data row2 col3\" >False</td>\n", + " <td id=\"T_326c3_row2_col4\" class=\"data row2 col4\" >True</td>\n", + " <td id=\"T_326c3_row2_col5\" class=\"data row2 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row2_col6\" class=\"data row2 col6\" >{}</td>\n", + " <td id=\"T_326c3_row2_col7\" class=\"data row2 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_326c3_row2_col8\" class=\"data row2 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row3_col0\" class=\"data row3 col0\" >validmind.model_validation.sklearn.AdjustedRandIndex</td>\n", + " <td id=\"T_326c3_row3_col1\" class=\"data row3 col1\" >Adjusted Rand Index</td>\n", + " <td id=\"T_326c3_row3_col2\" class=\"data row3 col2\" >Measures the similarity between two data clusters using the Adjusted Rand Index (ARI) metric in clustering machine...</td>\n", + " <td id=\"T_326c3_row3_col3\" class=\"data row3 col3\" >False</td>\n", + " <td id=\"T_326c3_row3_col4\" class=\"data row3 col4\" >True</td>\n", + " <td id=\"T_326c3_row3_col5\" class=\"data 
row3 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row3_col6\" class=\"data row3 col6\" >{}</td>\n", + " <td id=\"T_326c3_row3_col7\" class=\"data row3 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_326c3_row3_col8\" class=\"data row3 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row4_col0\" class=\"data row4 col0\" >validmind.model_validation.sklearn.CalibrationCurve</td>\n", + " <td id=\"T_326c3_row4_col1\" class=\"data row4 col1\" >Calibration Curve</td>\n", + " <td id=\"T_326c3_row4_col2\" class=\"data row4 col2\" >Evaluates the calibration of probability estimates by comparing predicted probabilities against observed...</td>\n", + " <td id=\"T_326c3_row4_col3\" class=\"data row4 col3\" >True</td>\n", + " <td id=\"T_326c3_row4_col4\" class=\"data row4 col4\" >False</td>\n", + " <td id=\"T_326c3_row4_col5\" class=\"data row4 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row4_col6\" class=\"data row4 col6\" >{'n_bins': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_326c3_row4_col7\" class=\"data row4 col7\" >['sklearn', 'model_performance', 'classification']</td>\n", + " <td id=\"T_326c3_row4_col8\" class=\"data row4 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row5_col0\" class=\"data row5 col0\" >validmind.model_validation.sklearn.ClassifierPerformance</td>\n", + " <td id=\"T_326c3_row5_col1\" class=\"data row5 col1\" >Classifier Performance</td>\n", + " <td id=\"T_326c3_row5_col2\" class=\"data row5 col2\" >Evaluates performance of binary or multiclass classification models using precision, recall, F1-Score, accuracy,...</td>\n", + " <td id=\"T_326c3_row5_col3\" class=\"data row5 col3\" >False</td>\n", + " <td id=\"T_326c3_row5_col4\" class=\"data row5 col4\" >True</td>\n", + " <td id=\"T_326c3_row5_col5\" class=\"data row5 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row5_col6\" class=\"data row5 col6\" 
>{'average': {'type': 'str', 'default': 'macro'}}</td>\n", + " <td id=\"T_326c3_row5_col7\" class=\"data row5 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_326c3_row5_col8\" class=\"data row5 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row6_col0\" class=\"data row6 col0\" >validmind.model_validation.sklearn.ClassifierThresholdOptimization</td>\n", + " <td id=\"T_326c3_row6_col1\" class=\"data row6 col1\" >Classifier Threshold Optimization</td>\n", + " <td id=\"T_326c3_row6_col2\" class=\"data row6 col2\" >Analyzes and visualizes different threshold optimization methods for binary classification models....</td>\n", + " <td id=\"T_326c3_row6_col3\" class=\"data row6 col3\" >False</td>\n", + " <td id=\"T_326c3_row6_col4\" class=\"data row6 col4\" >True</td>\n", + " <td id=\"T_326c3_row6_col5\" class=\"data row6 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row6_col6\" class=\"data row6 col6\" >{'methods': {'type': None, 'default': None}, 'target_recall': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_326c3_row6_col7\" class=\"data row6 col7\" >['model_validation', 'threshold_optimization', 'classification_metrics']</td>\n", + " <td id=\"T_326c3_row6_col8\" class=\"data row6 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row7_col0\" class=\"data row7 col0\" >validmind.model_validation.sklearn.ClusterCosineSimilarity</td>\n", + " <td id=\"T_326c3_row7_col1\" class=\"data row7 col1\" >Cluster Cosine Similarity</td>\n", + " <td id=\"T_326c3_row7_col2\" class=\"data row7 col2\" >Measures the intra-cluster similarity of a clustering model using cosine similarity....</td>\n", + " <td id=\"T_326c3_row7_col3\" class=\"data row7 col3\" >False</td>\n", + " <td id=\"T_326c3_row7_col4\" class=\"data row7 col4\" >True</td>\n", + " <td id=\"T_326c3_row7_col5\" class=\"data row7 col5\" 
>['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row7_col6\" class=\"data row7 col6\" >{}</td>\n", + " <td id=\"T_326c3_row7_col7\" class=\"data row7 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_326c3_row7_col8\" class=\"data row7 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row8_col0\" class=\"data row8 col0\" >validmind.model_validation.sklearn.ClusterPerformanceMetrics</td>\n", + " <td id=\"T_326c3_row8_col1\" class=\"data row8 col1\" >Cluster Performance Metrics</td>\n", + " <td id=\"T_326c3_row8_col2\" class=\"data row8 col2\" >Evaluates the performance of clustering machine learning models using multiple established metrics....</td>\n", + " <td id=\"T_326c3_row8_col3\" class=\"data row8 col3\" >False</td>\n", + " <td id=\"T_326c3_row8_col4\" class=\"data row8 col4\" >True</td>\n", + " <td id=\"T_326c3_row8_col5\" class=\"data row8 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row8_col6\" class=\"data row8 col6\" >{}</td>\n", + " <td id=\"T_326c3_row8_col7\" class=\"data row8 col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_326c3_row8_col8\" class=\"data row8 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row9_col0\" class=\"data row9 col0\" >validmind.model_validation.sklearn.CompletenessScore</td>\n", + " <td id=\"T_326c3_row9_col1\" class=\"data row9 col1\" >Completeness Score</td>\n", + " <td id=\"T_326c3_row9_col2\" class=\"data row9 col2\" >Evaluates a clustering model's capacity to categorize instances from a single class into the same cluster....</td>\n", + " <td id=\"T_326c3_row9_col3\" class=\"data row9 col3\" >False</td>\n", + " <td id=\"T_326c3_row9_col4\" class=\"data row9 col4\" >True</td>\n", + " <td id=\"T_326c3_row9_col5\" class=\"data row9 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row9_col6\" class=\"data row9 col6\" >{}</td>\n", + " <td id=\"T_326c3_row9_col7\" class=\"data row9 
col7\" >['sklearn', 'model_performance', 'clustering']</td>\n", + " <td id=\"T_326c3_row9_col8\" class=\"data row9 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row10_col0\" class=\"data row10 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", + " <td id=\"T_326c3_row10_col1\" class=\"data row10 col1\" >Confusion Matrix</td>\n", + " <td id=\"T_326c3_row10_col2\" class=\"data row10 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", + " <td id=\"T_326c3_row10_col3\" class=\"data row10 col3\" >True</td>\n", + " <td id=\"T_326c3_row10_col4\" class=\"data row10 col4\" >False</td>\n", + " <td id=\"T_326c3_row10_col5\" class=\"data row10 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row10_col6\" class=\"data row10 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_326c3_row10_col7\" class=\"data row10 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_326c3_row10_col8\" class=\"data row10 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row11_col0\" class=\"data row11 col0\" >validmind.model_validation.sklearn.FeatureImportance</td>\n", + " <td id=\"T_326c3_row11_col1\" class=\"data row11 col1\" >Feature Importance</td>\n", + " <td id=\"T_326c3_row11_col2\" class=\"data row11 col2\" >Compute feature importance scores for a given model and generate a summary table...</td>\n", + " <td id=\"T_326c3_row11_col3\" class=\"data row11 col3\" >False</td>\n", + " <td id=\"T_326c3_row11_col4\" class=\"data row11 col4\" >True</td>\n", + " <td id=\"T_326c3_row11_col5\" class=\"data row11 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row11_col6\" class=\"data row11 col6\" >{'num_features': {'type': 'int', 'default': 3}}</td>\n", + " <td 
id=\"T_326c3_row11_col7\" class=\"data row11 col7\" >['model_explainability', 'sklearn']</td>\n", + " <td id=\"T_326c3_row11_col8\" class=\"data row11 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row12_col0\" class=\"data row12 col0\" >validmind.model_validation.sklearn.FowlkesMallowsScore</td>\n", + " <td id=\"T_326c3_row12_col1\" class=\"data row12 col1\" >Fowlkes Mallows Score</td>\n", + " <td id=\"T_326c3_row12_col2\" class=\"data row12 col2\" >Evaluates the similarity between predicted and actual cluster assignments in a model using the Fowlkes-Mallows...</td>\n", + " <td id=\"T_326c3_row12_col3\" class=\"data row12 col3\" >False</td>\n", + " <td id=\"T_326c3_row12_col4\" class=\"data row12 col4\" >True</td>\n", + " <td id=\"T_326c3_row12_col5\" class=\"data row12 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row12_col6\" class=\"data row12 col6\" >{}</td>\n", + " <td id=\"T_326c3_row12_col7\" class=\"data row12 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row12_col8\" class=\"data row12 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row13_col0\" class=\"data row13 col0\" >validmind.model_validation.sklearn.HomogeneityScore</td>\n", + " <td id=\"T_326c3_row13_col1\" class=\"data row13 col1\" >Homogeneity Score</td>\n", + " <td id=\"T_326c3_row13_col2\" class=\"data row13 col2\" >Assesses clustering homogeneity by comparing true and predicted labels, scoring from 0 (heterogeneous) to 1...</td>\n", + " <td id=\"T_326c3_row13_col3\" class=\"data row13 col3\" >False</td>\n", + " <td id=\"T_326c3_row13_col4\" class=\"data row13 col4\" >True</td>\n", + " <td id=\"T_326c3_row13_col5\" class=\"data row13 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row13_col6\" class=\"data row13 col6\" >{}</td>\n", + " <td id=\"T_326c3_row13_col7\" class=\"data row13 col7\" >['sklearn', 'model_performance']</td>\n", + " <td 
id=\"T_326c3_row13_col8\" class=\"data row13 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row14_col0\" class=\"data row14 col0\" >validmind.model_validation.sklearn.HyperParametersTuning</td>\n", + " <td id=\"T_326c3_row14_col1\" class=\"data row14 col1\" >Hyper Parameters Tuning</td>\n", + " <td id=\"T_326c3_row14_col2\" class=\"data row14 col2\" >Performs exhaustive grid search over specified parameter ranges to find optimal model configurations...</td>\n", + " <td id=\"T_326c3_row14_col3\" class=\"data row14 col3\" >False</td>\n", + " <td id=\"T_326c3_row14_col4\" class=\"data row14 col4\" >True</td>\n", + " <td id=\"T_326c3_row14_col5\" class=\"data row14 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row14_col6\" class=\"data row14 col6\" >{'param_grid': {'type': 'dict', 'default': None}, 'scoring': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}, 'fit_params': {'type': 'dict', 'default': None}}</td>\n", + " <td id=\"T_326c3_row14_col7\" class=\"data row14 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row14_col8\" class=\"data row14 col8\" >['clustering', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row15_col0\" class=\"data row15 col0\" >validmind.model_validation.sklearn.KMeansClustersOptimization</td>\n", + " <td id=\"T_326c3_row15_col1\" class=\"data row15 col1\" >K Means Clusters Optimization</td>\n", + " <td id=\"T_326c3_row15_col2\" class=\"data row15 col2\" >Optimizes the number of clusters in K-means models using Elbow and Silhouette methods....</td>\n", + " <td id=\"T_326c3_row15_col3\" class=\"data row15 col3\" >True</td>\n", + " <td id=\"T_326c3_row15_col4\" class=\"data row15 col4\" >False</td>\n", + " <td id=\"T_326c3_row15_col5\" class=\"data row15 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row15_col6\" class=\"data row15 col6\" >{'n_clusters': {'type': None, 'default': None}}</td>\n", + " <td 
id=\"T_326c3_row15_col7\" class=\"data row15 col7\" >['sklearn', 'model_performance', 'kmeans']</td>\n", + " <td id=\"T_326c3_row15_col8\" class=\"data row15 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row16_col0\" class=\"data row16 col0\" >validmind.model_validation.sklearn.MinimumAccuracy</td>\n", + " <td id=\"T_326c3_row16_col1\" class=\"data row16 col1\" >Minimum Accuracy</td>\n", + " <td id=\"T_326c3_row16_col2\" class=\"data row16 col2\" >Checks if the model's prediction accuracy meets or surpasses a specified threshold....</td>\n", + " <td id=\"T_326c3_row16_col3\" class=\"data row16 col3\" >False</td>\n", + " <td id=\"T_326c3_row16_col4\" class=\"data row16 col4\" >True</td>\n", + " <td id=\"T_326c3_row16_col5\" class=\"data row16 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row16_col6\" class=\"data row16 col6\" >{'min_threshold': {'type': 'float', 'default': 0.7}}</td>\n", + " <td id=\"T_326c3_row16_col7\" class=\"data row16 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_326c3_row16_col8\" class=\"data row16 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row17_col0\" class=\"data row17 col0\" >validmind.model_validation.sklearn.MinimumF1Score</td>\n", + " <td id=\"T_326c3_row17_col1\" class=\"data row17 col1\" >Minimum F1 Score</td>\n", + " <td id=\"T_326c3_row17_col2\" class=\"data row17 col2\" >Assesses if the model's F1 score on the validation set meets a predefined minimum threshold, ensuring balanced...</td>\n", + " <td id=\"T_326c3_row17_col3\" class=\"data row17 col3\" >False</td>\n", + " <td id=\"T_326c3_row17_col4\" class=\"data row17 col4\" >True</td>\n", + " <td id=\"T_326c3_row17_col5\" class=\"data row17 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row17_col6\" class=\"data row17 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", 
+ " <td id=\"T_326c3_row17_col7\" class=\"data row17 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_326c3_row17_col8\" class=\"data row17 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row18_col0\" class=\"data row18 col0\" >validmind.model_validation.sklearn.MinimumROCAUCScore</td>\n", + " <td id=\"T_326c3_row18_col1\" class=\"data row18 col1\" >Minimum ROCAUC Score</td>\n", + " <td id=\"T_326c3_row18_col2\" class=\"data row18 col2\" >Validates model by checking if the ROC AUC score meets or surpasses a specified threshold....</td>\n", + " <td id=\"T_326c3_row18_col3\" class=\"data row18 col3\" >False</td>\n", + " <td id=\"T_326c3_row18_col4\" class=\"data row18 col4\" >True</td>\n", + " <td id=\"T_326c3_row18_col5\" class=\"data row18 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row18_col6\" class=\"data row18 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_326c3_row18_col7\" class=\"data row18 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_326c3_row18_col8\" class=\"data row18 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row19_col0\" class=\"data row19 col0\" >validmind.model_validation.sklearn.ModelParameters</td>\n", + " <td id=\"T_326c3_row19_col1\" class=\"data row19 col1\" >Model Parameters</td>\n", + " <td id=\"T_326c3_row19_col2\" class=\"data row19 col2\" >Extracts and displays model parameters in a structured format for transparency and reproducibility....</td>\n", + " <td id=\"T_326c3_row19_col3\" class=\"data row19 col3\" >False</td>\n", + " <td id=\"T_326c3_row19_col4\" class=\"data row19 col4\" >True</td>\n", + " <td id=\"T_326c3_row19_col5\" class=\"data row19 col5\" >['model']</td>\n", + " <td id=\"T_326c3_row19_col6\" class=\"data 
row19 col6\" >{'model_params': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_326c3_row19_col7\" class=\"data row19 col7\" >['model_training', 'metadata']</td>\n", + " <td id=\"T_326c3_row19_col8\" class=\"data row19 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row20_col0\" class=\"data row20 col0\" >validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", + " <td id=\"T_326c3_row20_col1\" class=\"data row20 col1\" >Models Performance Comparison</td>\n", + " <td id=\"T_326c3_row20_col2\" class=\"data row20 col2\" >Evaluates and compares the performance of multiple Machine Learning models using various metrics like accuracy,...</td>\n", + " <td id=\"T_326c3_row20_col3\" class=\"data row20 col3\" >False</td>\n", + " <td id=\"T_326c3_row20_col4\" class=\"data row20 col4\" >True</td>\n", + " <td id=\"T_326c3_row20_col5\" class=\"data row20 col5\" >['dataset', 'models']</td>\n", + " <td id=\"T_326c3_row20_col6\" class=\"data row20 col6\" >{}</td>\n", + " <td id=\"T_326c3_row20_col7\" class=\"data row20 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'model_comparison']</td>\n", + " <td id=\"T_326c3_row20_col8\" class=\"data row20 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row21_col0\" class=\"data row21 col0\" >validmind.model_validation.sklearn.OverfitDiagnosis</td>\n", + " <td id=\"T_326c3_row21_col1\" class=\"data row21 col1\" >Overfit Diagnosis</td>\n", + " <td id=\"T_326c3_row21_col2\" class=\"data row21 col2\" >Assesses potential overfitting in a model's predictions, identifying regions where performance between training and...</td>\n", + " <td id=\"T_326c3_row21_col3\" class=\"data row21 col3\" >True</td>\n", + " <td id=\"T_326c3_row21_col4\" class=\"data row21 col4\" >True</td>\n", + " <td id=\"T_326c3_row21_col5\" class=\"data row21 col5\" >['model', 'datasets']</td>\n", + " 
<td id=\"T_326c3_row21_col6\" class=\"data row21 col6\" >{'metric': {'type': 'str', 'default': None}, 'cut_off_threshold': {'type': 'float', 'default': 0.04}}</td>\n", + " <td id=\"T_326c3_row21_col7\" class=\"data row21 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'linear_regression', 'model_diagnosis']</td>\n", + " <td id=\"T_326c3_row21_col8\" class=\"data row21 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row22_col0\" class=\"data row22 col0\" >validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", + " <td id=\"T_326c3_row22_col1\" class=\"data row22 col1\" >Permutation Feature Importance</td>\n", + " <td id=\"T_326c3_row22_col2\" class=\"data row22 col2\" >Assesses the significance of each feature in a model by evaluating the impact on model performance when feature...</td>\n", + " <td id=\"T_326c3_row22_col3\" class=\"data row22 col3\" >True</td>\n", + " <td id=\"T_326c3_row22_col4\" class=\"data row22 col4\" >False</td>\n", + " <td id=\"T_326c3_row22_col5\" class=\"data row22 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row22_col6\" class=\"data row22 col6\" >{'fontsize': {'type': None, 'default': None}, 'figure_height': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_326c3_row22_col7\" class=\"data row22 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", + " <td id=\"T_326c3_row22_col8\" class=\"data row22 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row23_col0\" class=\"data row23 col0\" >validmind.model_validation.sklearn.PopulationStabilityIndex</td>\n", + " <td id=\"T_326c3_row23_col1\" class=\"data row23 col1\" >Population Stability Index</td>\n", + " <td id=\"T_326c3_row23_col2\" class=\"data row23 col2\" >Assesses the Population Stability Index (PSI) to quantify the stability of an ML model's 
predictions across...</td>\n", + " <td id=\"T_326c3_row23_col3\" class=\"data row23 col3\" >True</td>\n", + " <td id=\"T_326c3_row23_col4\" class=\"data row23 col4\" >True</td>\n", + " <td id=\"T_326c3_row23_col5\" class=\"data row23 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row23_col6\" class=\"data row23 col6\" >{'num_bins': {'type': 'int', 'default': 10}, 'mode': {'type': 'str', 'default': 'fixed'}}</td>\n", + " <td id=\"T_326c3_row23_col7\" class=\"data row23 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_326c3_row23_col8\" class=\"data row23 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row24_col0\" class=\"data row24 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", + " <td id=\"T_326c3_row24_col1\" class=\"data row24 col1\" >Precision Recall Curve</td>\n", + " <td id=\"T_326c3_row24_col2\" class=\"data row24 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", + " <td id=\"T_326c3_row24_col3\" class=\"data row24 col3\" >True</td>\n", + " <td id=\"T_326c3_row24_col4\" class=\"data row24 col4\" >False</td>\n", + " <td id=\"T_326c3_row24_col5\" class=\"data row24 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row24_col6\" class=\"data row24 col6\" >{}</td>\n", + " <td id=\"T_326c3_row24_col7\" class=\"data row24 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_326c3_row24_col8\" class=\"data row24 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row25_col0\" class=\"data row25 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", + " <td id=\"T_326c3_row25_col1\" class=\"data row25 col1\" >ROC Curve</td>\n", + " <td id=\"T_326c3_row25_col2\" class=\"data row25 col2\" 
>Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", + " <td id=\"T_326c3_row25_col3\" class=\"data row25 col3\" >True</td>\n", + " <td id=\"T_326c3_row25_col4\" class=\"data row25 col4\" >False</td>\n", + " <td id=\"T_326c3_row25_col5\" class=\"data row25 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row25_col6\" class=\"data row25 col6\" >{}</td>\n", + " <td id=\"T_326c3_row25_col7\" class=\"data row25 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_326c3_row25_col8\" class=\"data row25 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row26_col0\" class=\"data row26 col0\" >validmind.model_validation.sklearn.RegressionErrors</td>\n", + " <td id=\"T_326c3_row26_col1\" class=\"data row26 col1\" >Regression Errors</td>\n", + " <td id=\"T_326c3_row26_col2\" class=\"data row26 col2\" >Assesses the performance and error distribution of a regression model using various error metrics....</td>\n", + " <td id=\"T_326c3_row26_col3\" class=\"data row26 col3\" >False</td>\n", + " <td id=\"T_326c3_row26_col4\" class=\"data row26 col4\" >True</td>\n", + " <td id=\"T_326c3_row26_col5\" class=\"data row26 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row26_col6\" class=\"data row26 col6\" >{}</td>\n", + " <td id=\"T_326c3_row26_col7\" class=\"data row26 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row26_col8\" class=\"data row26 col8\" >['regression', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row27_col0\" class=\"data row27 col0\" >validmind.model_validation.sklearn.RegressionErrorsComparison</td>\n", + " <td id=\"T_326c3_row27_col1\" class=\"data row27 col1\" >Regression Errors Comparison</td>\n", + " <td id=\"T_326c3_row27_col2\" class=\"data row27 col2\" >Assesses multiple 
regression error metrics to compare model performance across different datasets, emphasizing...</td>\n", + " <td id=\"T_326c3_row27_col3\" class=\"data row27 col3\" >False</td>\n", + " <td id=\"T_326c3_row27_col4\" class=\"data row27 col4\" >True</td>\n", + " <td id=\"T_326c3_row27_col5\" class=\"data row27 col5\" >['datasets', 'models']</td>\n", + " <td id=\"T_326c3_row27_col6\" class=\"data row27 col6\" >{}</td>\n", + " <td id=\"T_326c3_row27_col7\" class=\"data row27 col7\" >['model_performance', 'sklearn']</td>\n", + " <td id=\"T_326c3_row27_col8\" class=\"data row27 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row28_col0\" class=\"data row28 col0\" >validmind.model_validation.sklearn.RegressionPerformance</td>\n", + " <td id=\"T_326c3_row28_col1\" class=\"data row28 col1\" >Regression Performance</td>\n", + " <td id=\"T_326c3_row28_col2\" class=\"data row28 col2\" >Evaluates the performance of a regression model using five different metrics: MAE, MSE, RMSE, MAPE, and MBD....</td>\n", + " <td id=\"T_326c3_row28_col3\" class=\"data row28 col3\" >False</td>\n", + " <td id=\"T_326c3_row28_col4\" class=\"data row28 col4\" >True</td>\n", + " <td id=\"T_326c3_row28_col5\" class=\"data row28 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row28_col6\" class=\"data row28 col6\" >{}</td>\n", + " <td id=\"T_326c3_row28_col7\" class=\"data row28 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row28_col8\" class=\"data row28 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row29_col0\" class=\"data row29 col0\" >validmind.model_validation.sklearn.RegressionR2Square</td>\n", + " <td id=\"T_326c3_row29_col1\" class=\"data row29 col1\" >Regression R2 Square</td>\n", + " <td id=\"T_326c3_row29_col2\" class=\"data row29 col2\" >Assesses the overall goodness-of-fit of a regression model by evaluating R-squared (R2) and Adjusted R-squared (Adj...</td>\n", 
+ " <td id=\"T_326c3_row29_col3\" class=\"data row29 col3\" >False</td>\n", + " <td id=\"T_326c3_row29_col4\" class=\"data row29 col4\" >True</td>\n", + " <td id=\"T_326c3_row29_col5\" class=\"data row29 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row29_col6\" class=\"data row29 col6\" >{}</td>\n", + " <td id=\"T_326c3_row29_col7\" class=\"data row29 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row29_col8\" class=\"data row29 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row30_col0\" class=\"data row30 col0\" >validmind.model_validation.sklearn.RegressionR2SquareComparison</td>\n", + " <td id=\"T_326c3_row30_col1\" class=\"data row30 col1\" >Regression R2 Square Comparison</td>\n", + " <td id=\"T_326c3_row30_col2\" class=\"data row30 col2\" >Compares R-Squared and Adjusted R-Squared values for different regression models across multiple datasets to assess...</td>\n", + " <td id=\"T_326c3_row30_col3\" class=\"data row30 col3\" >False</td>\n", + " <td id=\"T_326c3_row30_col4\" class=\"data row30 col4\" >True</td>\n", + " <td id=\"T_326c3_row30_col5\" class=\"data row30 col5\" >['datasets', 'models']</td>\n", + " <td id=\"T_326c3_row30_col6\" class=\"data row30 col6\" >{}</td>\n", + " <td id=\"T_326c3_row30_col7\" class=\"data row30 col7\" >['model_performance', 'sklearn']</td>\n", + " <td id=\"T_326c3_row30_col8\" class=\"data row30 col8\" >['regression', 'time_series_forecasting']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row31_col0\" class=\"data row31 col0\" >validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", + " <td id=\"T_326c3_row31_col1\" class=\"data row31 col1\" >Robustness Diagnosis</td>\n", + " <td id=\"T_326c3_row31_col2\" class=\"data row31 col2\" >Assesses the robustness of a machine learning model by evaluating performance decay under noisy conditions....</td>\n", + " <td id=\"T_326c3_row31_col3\" class=\"data row31 col3\" >True</td>\n", + " <td 
id=\"T_326c3_row31_col4\" class=\"data row31 col4\" >True</td>\n", + " <td id=\"T_326c3_row31_col5\" class=\"data row31 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row31_col6\" class=\"data row31 col6\" >{'metric': {'type': 'str', 'default': None}, 'scaling_factor_std_dev_list': {'type': None, 'default': [0.1, 0.2, 0.3, 0.4, 0.5]}, 'performance_decay_threshold': {'type': 'float', 'default': 0.05}}</td>\n", + " <td id=\"T_326c3_row31_col7\" class=\"data row31 col7\" >['sklearn', 'model_diagnosis', 'visualization']</td>\n", + " <td id=\"T_326c3_row31_col8\" class=\"data row31 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row32_col0\" class=\"data row32 col0\" >validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", + " <td id=\"T_326c3_row32_col1\" class=\"data row32 col1\" >SHAP Global Importance</td>\n", + " <td id=\"T_326c3_row32_col2\" class=\"data row32 col2\" >Evaluates and visualizes global feature importance using SHAP values for model explanation and risk identification....</td>\n", + " <td id=\"T_326c3_row32_col3\" class=\"data row32 col3\" >False</td>\n", + " <td id=\"T_326c3_row32_col4\" class=\"data row32 col4\" >True</td>\n", + " <td id=\"T_326c3_row32_col5\" class=\"data row32 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row32_col6\" class=\"data row32 col6\" >{'kernel_explainer_samples': {'type': 'int', 'default': 10}, 'tree_or_linear_explainer_samples': {'type': 'int', 'default': 200}, 'class_of_interest': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_326c3_row32_col7\" class=\"data row32 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", + " <td id=\"T_326c3_row32_col8\" class=\"data row32 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row33_col0\" class=\"data row33 col0\" 
>validmind.model_validation.sklearn.ScoreProbabilityAlignment</td>\n", + " <td id=\"T_326c3_row33_col1\" class=\"data row33 col1\" >Score Probability Alignment</td>\n", + " <td id=\"T_326c3_row33_col2\" class=\"data row33 col2\" >Analyzes the alignment between credit scores and predicted probabilities....</td>\n", + " <td id=\"T_326c3_row33_col3\" class=\"data row33 col3\" >True</td>\n", + " <td id=\"T_326c3_row33_col4\" class=\"data row33 col4\" >True</td>\n", + " <td id=\"T_326c3_row33_col5\" class=\"data row33 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row33_col6\" class=\"data row33 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'n_bins': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_326c3_row33_col7\" class=\"data row33 col7\" >['visualization', 'credit_risk', 'calibration']</td>\n", + " <td id=\"T_326c3_row33_col8\" class=\"data row33 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row34_col0\" class=\"data row34 col0\" >validmind.model_validation.sklearn.SilhouettePlot</td>\n", + " <td id=\"T_326c3_row34_col1\" class=\"data row34 col1\" >Silhouette Plot</td>\n", + " <td id=\"T_326c3_row34_col2\" class=\"data row34 col2\" >Calculates and visualizes Silhouette Score, assessing the degree of data point suitability to its cluster in ML...</td>\n", + " <td id=\"T_326c3_row34_col3\" class=\"data row34 col3\" >True</td>\n", + " <td id=\"T_326c3_row34_col4\" class=\"data row34 col4\" >True</td>\n", + " <td id=\"T_326c3_row34_col5\" class=\"data row34 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_326c3_row34_col6\" class=\"data row34 col6\" >{}</td>\n", + " <td id=\"T_326c3_row34_col7\" class=\"data row34 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row34_col8\" class=\"data row34 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row35_col0\" class=\"data row35 col0\" 
>validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", + " <td id=\"T_326c3_row35_col1\" class=\"data row35 col1\" >Training Test Degradation</td>\n", + " <td id=\"T_326c3_row35_col2\" class=\"data row35 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", + " <td id=\"T_326c3_row35_col3\" class=\"data row35 col3\" >False</td>\n", + " <td id=\"T_326c3_row35_col4\" class=\"data row35 col4\" >True</td>\n", + " <td id=\"T_326c3_row35_col5\" class=\"data row35 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row35_col6\" class=\"data row35 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", + " <td id=\"T_326c3_row35_col7\" class=\"data row35 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_326c3_row35_col8\" class=\"data row35 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row36_col0\" class=\"data row36 col0\" >validmind.model_validation.sklearn.VMeasure</td>\n", + " <td id=\"T_326c3_row36_col1\" class=\"data row36 col1\" >V Measure</td>\n", + " <td id=\"T_326c3_row36_col2\" class=\"data row36 col2\" >Evaluates homogeneity and completeness of a clustering model using the V Measure Score....</td>\n", + " <td id=\"T_326c3_row36_col3\" class=\"data row36 col3\" >False</td>\n", + " <td id=\"T_326c3_row36_col4\" class=\"data row36 col4\" >True</td>\n", + " <td id=\"T_326c3_row36_col5\" class=\"data row36 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_326c3_row36_col6\" class=\"data row36 col6\" >{}</td>\n", + " <td id=\"T_326c3_row36_col7\" class=\"data row36 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_326c3_row36_col8\" class=\"data row36 col8\" >['clustering']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row37_col0\" class=\"data row37 col0\" 
>validmind.model_validation.sklearn.WeakspotsDiagnosis</td>\n", + " <td id=\"T_326c3_row37_col1\" class=\"data row37 col1\" >Weakspots Diagnosis</td>\n", + " <td id=\"T_326c3_row37_col2\" class=\"data row37 col2\" >Identifies and visualizes weak spots in a machine learning model's performance across various sections of the...</td>\n", + " <td id=\"T_326c3_row37_col3\" class=\"data row37 col3\" >True</td>\n", + " <td id=\"T_326c3_row37_col4\" class=\"data row37 col4\" >True</td>\n", + " <td id=\"T_326c3_row37_col5\" class=\"data row37 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row37_col6\" class=\"data row37 col6\" >{'features_columns': {'type': None, 'default': None}, 'metrics': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_326c3_row37_col7\" class=\"data row37 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_diagnosis', 'visualization']</td>\n", + " <td id=\"T_326c3_row37_col8\" class=\"data row37 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row38_col0\" class=\"data row38 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", + " <td id=\"T_326c3_row38_col1\" class=\"data row38 col1\" >Calibration Curve Drift</td>\n", + " <td id=\"T_326c3_row38_col2\" class=\"data row38 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", + " <td id=\"T_326c3_row38_col3\" class=\"data row38 col3\" >True</td>\n", + " <td id=\"T_326c3_row38_col4\" class=\"data row38 col4\" >True</td>\n", + " <td id=\"T_326c3_row38_col5\" class=\"data row38 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row38_col6\" class=\"data row38 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", + " <td id=\"T_326c3_row38_col7\" class=\"data row38 col7\" >['sklearn', 'binary_classification', 
'model_performance', 'visualization']</td>\n", + " <td id=\"T_326c3_row38_col8\" class=\"data row38 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row39_col0\" class=\"data row39 col0\" >validmind.ongoing_monitoring.ClassDiscriminationDrift</td>\n", + " <td id=\"T_326c3_row39_col1\" class=\"data row39 col1\" >Class Discrimination Drift</td>\n", + " <td id=\"T_326c3_row39_col2\" class=\"data row39 col2\" >Compares classification discrimination metrics between reference and monitoring datasets....</td>\n", + " <td id=\"T_326c3_row39_col3\" class=\"data row39 col3\" >False</td>\n", + " <td id=\"T_326c3_row39_col4\" class=\"data row39 col4\" >True</td>\n", + " <td id=\"T_326c3_row39_col5\" class=\"data row39 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row39_col6\" class=\"data row39 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td id=\"T_326c3_row39_col7\" class=\"data row39 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_326c3_row39_col8\" class=\"data row39 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row40_col0\" class=\"data row40 col0\" >validmind.ongoing_monitoring.ClassificationAccuracyDrift</td>\n", + " <td id=\"T_326c3_row40_col1\" class=\"data row40 col1\" >Classification Accuracy Drift</td>\n", + " <td id=\"T_326c3_row40_col2\" class=\"data row40 col2\" >Compares classification accuracy metrics between reference and monitoring datasets....</td>\n", + " <td id=\"T_326c3_row40_col3\" class=\"data row40 col3\" >False</td>\n", + " <td id=\"T_326c3_row40_col4\" class=\"data row40 col4\" >True</td>\n", + " <td id=\"T_326c3_row40_col5\" class=\"data row40 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row40_col6\" class=\"data row40 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " 
<td id=\"T_326c3_row40_col7\" class=\"data row40 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_326c3_row40_col8\" class=\"data row40 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row41_col0\" class=\"data row41 col0\" >validmind.ongoing_monitoring.ConfusionMatrixDrift</td>\n", + " <td id=\"T_326c3_row41_col1\" class=\"data row41 col1\" >Confusion Matrix Drift</td>\n", + " <td id=\"T_326c3_row41_col2\" class=\"data row41 col2\" >Compares confusion matrix metrics between reference and monitoring datasets....</td>\n", + " <td id=\"T_326c3_row41_col3\" class=\"data row41 col3\" >False</td>\n", + " <td id=\"T_326c3_row41_col4\" class=\"data row41 col4\" >True</td>\n", + " <td id=\"T_326c3_row41_col5\" class=\"data row41 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row41_col6\" class=\"data row41 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td id=\"T_326c3_row41_col7\" class=\"data row41 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_326c3_row41_col8\" class=\"data row41 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_326c3_row42_col0\" class=\"data row42 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", + " <td id=\"T_326c3_row42_col1\" class=\"data row42 col1\" >ROC Curve Drift</td>\n", + " <td id=\"T_326c3_row42_col2\" class=\"data row42 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", + " <td id=\"T_326c3_row42_col3\" class=\"data row42 col3\" >True</td>\n", + " <td id=\"T_326c3_row42_col4\" class=\"data row42 col4\" >False</td>\n", + " <td id=\"T_326c3_row42_col5\" class=\"data row42 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_326c3_row42_col6\" class=\"data row42 col6\" >{}</td>\n", + " <td 
id=\"T_326c3_row42_col7\" class=\"data row42 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_326c3_row42_col8\" class=\"data row42 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n" + ], + "text/plain": [ + "<pandas.io.formats.style.Styler at 0x1052e6790>" + ] + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" + } ], - "text/plain": [ - "<pandas.io.formats.style.Styler at 0x10516c880>" + "source": [ + "list_tests(filter=\"sklearn\")" ] - }, - "execution_count": 7, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list_tests(task=\"classification\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Use the `tags` parameter to find tests based on their tags, such as `model_performance` or `visualization`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Use the `task` parameter to find tests that match a specific task type, such as `classification`:" + ] + }, { - "data": { - "text/html": [ - "<style type=\"text/css\">\n", - "#T_4d8bf th {\n", - " text-align: left;\n", - "}\n", - "#T_4d8bf_row0_col0, #T_4d8bf_row0_col1, #T_4d8bf_row0_col2, #T_4d8bf_row0_col3, #T_4d8bf_row0_col4, #T_4d8bf_row0_col5, #T_4d8bf_row0_col6, #T_4d8bf_row0_col7, #T_4d8bf_row0_col8, #T_4d8bf_row1_col0, #T_4d8bf_row1_col1, #T_4d8bf_row1_col2, #T_4d8bf_row1_col3, #T_4d8bf_row1_col4, #T_4d8bf_row1_col5, #T_4d8bf_row1_col6, #T_4d8bf_row1_col7, #T_4d8bf_row1_col8, #T_4d8bf_row2_col0, #T_4d8bf_row2_col1, #T_4d8bf_row2_col2, #T_4d8bf_row2_col3, #T_4d8bf_row2_col4, #T_4d8bf_row2_col5, #T_4d8bf_row2_col6, #T_4d8bf_row2_col7, #T_4d8bf_row2_col8, #T_4d8bf_row3_col0, #T_4d8bf_row3_col1, #T_4d8bf_row3_col2, #T_4d8bf_row3_col3, #T_4d8bf_row3_col4, #T_4d8bf_row3_col5, #T_4d8bf_row3_col6, 
#T_4d8bf_row3_col7, #T_4d8bf_row3_col8, #T_4d8bf_row4_col0, #T_4d8bf_row4_col1, #T_4d8bf_row4_col2, #T_4d8bf_row4_col3, #T_4d8bf_row4_col4, #T_4d8bf_row4_col5, #T_4d8bf_row4_col6, #T_4d8bf_row4_col7, #T_4d8bf_row4_col8, #T_4d8bf_row5_col0, #T_4d8bf_row5_col1, #T_4d8bf_row5_col2, #T_4d8bf_row5_col3, #T_4d8bf_row5_col4, #T_4d8bf_row5_col5, #T_4d8bf_row5_col6, #T_4d8bf_row5_col7, #T_4d8bf_row5_col8, #T_4d8bf_row6_col0, #T_4d8bf_row6_col1, #T_4d8bf_row6_col2, #T_4d8bf_row6_col3, #T_4d8bf_row6_col4, #T_4d8bf_row6_col5, #T_4d8bf_row6_col6, #T_4d8bf_row6_col7, #T_4d8bf_row6_col8 {\n", - " text-align: left;\n", - "}\n", - "</style>\n", - "<table id=\"T_4d8bf\">\n", - " <thead>\n", - " <tr>\n", - " <th id=\"T_4d8bf_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", - " <th id=\"T_4d8bf_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", - " <th id=\"T_4d8bf_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", - " <th id=\"T_4d8bf_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", - " <th id=\"T_4d8bf_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", - " <th id=\"T_4d8bf_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", - " <th id=\"T_4d8bf_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", - " <th id=\"T_4d8bf_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", - " <th id=\"T_4d8bf_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " <tr>\n", - " <td id=\"T_4d8bf_row0_col0\" class=\"data row0 col0\" >validmind.model_validation.RegressionResidualsPlot</td>\n", - " <td id=\"T_4d8bf_row0_col1\" class=\"data row0 col1\" >Regression Residuals Plot</td>\n", - " <td id=\"T_4d8bf_row0_col2\" class=\"data row0 col2\" >Evaluates regression model performance using residual distribution and actual vs. 
predicted plots....</td>\n", - " <td id=\"T_4d8bf_row0_col3\" class=\"data row0 col3\" >True</td>\n", - " <td id=\"T_4d8bf_row0_col4\" class=\"data row0 col4\" >False</td>\n", - " <td id=\"T_4d8bf_row0_col5\" class=\"data row0 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_4d8bf_row0_col6\" class=\"data row0 col6\" >{'bin_size': {'type': 'float', 'default': 0.1}}</td>\n", - " <td id=\"T_4d8bf_row0_col7\" class=\"data row0 col7\" >['model_performance', 'visualization']</td>\n", - " <td id=\"T_4d8bf_row0_col8\" class=\"data row0 col8\" >['regression']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_4d8bf_row1_col0\" class=\"data row1 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", - " <td id=\"T_4d8bf_row1_col1\" class=\"data row1 col1\" >Confusion Matrix</td>\n", - " <td id=\"T_4d8bf_row1_col2\" class=\"data row1 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", - " <td id=\"T_4d8bf_row1_col3\" class=\"data row1 col3\" >True</td>\n", - " <td id=\"T_4d8bf_row1_col4\" class=\"data row1 col4\" >False</td>\n", - " <td id=\"T_4d8bf_row1_col5\" class=\"data row1 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_4d8bf_row1_col6\" class=\"data row1 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_4d8bf_row1_col7\" class=\"data row1 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_4d8bf_row1_col8\" class=\"data row1 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_4d8bf_row2_col0\" class=\"data row2 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", - " <td id=\"T_4d8bf_row2_col1\" class=\"data row2 col1\" >Precision Recall Curve</td>\n", - " <td id=\"T_4d8bf_row2_col2\" class=\"data row2 col2\" >Evaluates the precision-recall trade-off for binary classification models and 
visualizes the Precision-Recall curve....</td>\n", - " <td id=\"T_4d8bf_row2_col3\" class=\"data row2 col3\" >True</td>\n", - " <td id=\"T_4d8bf_row2_col4\" class=\"data row2 col4\" >False</td>\n", - " <td id=\"T_4d8bf_row2_col5\" class=\"data row2 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_4d8bf_row2_col6\" class=\"data row2 col6\" >{}</td>\n", - " <td id=\"T_4d8bf_row2_col7\" class=\"data row2 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_4d8bf_row2_col8\" class=\"data row2 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_4d8bf_row3_col0\" class=\"data row3 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", - " <td id=\"T_4d8bf_row3_col1\" class=\"data row3 col1\" >ROC Curve</td>\n", - " <td id=\"T_4d8bf_row3_col2\" class=\"data row3 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", - " <td id=\"T_4d8bf_row3_col3\" class=\"data row3 col3\" >True</td>\n", - " <td id=\"T_4d8bf_row3_col4\" class=\"data row3 col4\" >False</td>\n", - " <td id=\"T_4d8bf_row3_col5\" class=\"data row3 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_4d8bf_row3_col6\" class=\"data row3 col6\" >{}</td>\n", - " <td id=\"T_4d8bf_row3_col7\" class=\"data row3 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_4d8bf_row3_col8\" class=\"data row3 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_4d8bf_row4_col0\" class=\"data row4 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", - " <td id=\"T_4d8bf_row4_col1\" class=\"data row4 col1\" >Training Test Degradation</td>\n", - " <td id=\"T_4d8bf_row4_col2\" class=\"data row4 col2\" >Tests if model performance degradation between training and test datasets exceeds a 
predefined threshold....</td>\n", - " <td id=\"T_4d8bf_row4_col3\" class=\"data row4 col3\" >False</td>\n", - " <td id=\"T_4d8bf_row4_col4\" class=\"data row4 col4\" >True</td>\n", - " <td id=\"T_4d8bf_row4_col5\" class=\"data row4 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_4d8bf_row4_col6\" class=\"data row4 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", - " <td id=\"T_4d8bf_row4_col7\" class=\"data row4 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_4d8bf_row4_col8\" class=\"data row4 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_4d8bf_row5_col0\" class=\"data row5 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", - " <td id=\"T_4d8bf_row5_col1\" class=\"data row5 col1\" >Calibration Curve Drift</td>\n", - " <td id=\"T_4d8bf_row5_col2\" class=\"data row5 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", - " <td id=\"T_4d8bf_row5_col3\" class=\"data row5 col3\" >True</td>\n", - " <td id=\"T_4d8bf_row5_col4\" class=\"data row5 col4\" >True</td>\n", - " <td id=\"T_4d8bf_row5_col5\" class=\"data row5 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_4d8bf_row5_col6\" class=\"data row5 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", - " <td id=\"T_4d8bf_row5_col7\" class=\"data row5 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_4d8bf_row5_col8\" class=\"data row5 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_4d8bf_row6_col0\" class=\"data row6 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", - " <td id=\"T_4d8bf_row6_col1\" class=\"data row6 col1\" >ROC Curve Drift</td>\n", - " <td id=\"T_4d8bf_row6_col2\" class=\"data 
row6 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", - " <td id=\"T_4d8bf_row6_col3\" class=\"data row6 col3\" >True</td>\n", - " <td id=\"T_4d8bf_row6_col4\" class=\"data row6 col4\" >False</td>\n", - " <td id=\"T_4d8bf_row6_col5\" class=\"data row6 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_4d8bf_row6_col6\" class=\"data row6 col6\" >{}</td>\n", - " <td id=\"T_4d8bf_row6_col7\" class=\"data row6 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_4d8bf_row6_col8\" class=\"data row6 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " </tbody>\n", - "</table>\n" + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "<style type=\"text/css\">\n", + "#T_56dd5 th {\n", + " text-align: left;\n", + "}\n", + "#T_56dd5_row0_col0, #T_56dd5_row0_col1, #T_56dd5_row0_col2, #T_56dd5_row0_col3, #T_56dd5_row0_col4, #T_56dd5_row0_col5, #T_56dd5_row0_col6, #T_56dd5_row0_col7, #T_56dd5_row0_col8, #T_56dd5_row1_col0, #T_56dd5_row1_col1, #T_56dd5_row1_col2, #T_56dd5_row1_col3, #T_56dd5_row1_col4, #T_56dd5_row1_col5, #T_56dd5_row1_col6, #T_56dd5_row1_col7, #T_56dd5_row1_col8, #T_56dd5_row2_col0, #T_56dd5_row2_col1, #T_56dd5_row2_col2, #T_56dd5_row2_col3, #T_56dd5_row2_col4, #T_56dd5_row2_col5, #T_56dd5_row2_col6, #T_56dd5_row2_col7, #T_56dd5_row2_col8, #T_56dd5_row3_col0, #T_56dd5_row3_col1, #T_56dd5_row3_col2, #T_56dd5_row3_col3, #T_56dd5_row3_col4, #T_56dd5_row3_col5, #T_56dd5_row3_col6, #T_56dd5_row3_col7, #T_56dd5_row3_col8, #T_56dd5_row4_col0, #T_56dd5_row4_col1, #T_56dd5_row4_col2, #T_56dd5_row4_col3, #T_56dd5_row4_col4, #T_56dd5_row4_col5, #T_56dd5_row4_col6, #T_56dd5_row4_col7, #T_56dd5_row4_col8, #T_56dd5_row5_col0, #T_56dd5_row5_col1, #T_56dd5_row5_col2, #T_56dd5_row5_col3, #T_56dd5_row5_col4, #T_56dd5_row5_col5, #T_56dd5_row5_col6, #T_56dd5_row5_col7, #T_56dd5_row5_col8, #T_56dd5_row6_col0, 
#T_56dd5_row6_col1, #T_56dd5_row6_col2, #T_56dd5_row6_col3, #T_56dd5_row6_col4, #T_56dd5_row6_col5, #T_56dd5_row6_col6, #T_56dd5_row6_col7, #T_56dd5_row6_col8, #T_56dd5_row7_col0, #T_56dd5_row7_col1, #T_56dd5_row7_col2, #T_56dd5_row7_col3, #T_56dd5_row7_col4, #T_56dd5_row7_col5, #T_56dd5_row7_col6, #T_56dd5_row7_col7, #T_56dd5_row7_col8, #T_56dd5_row8_col0, #T_56dd5_row8_col1, #T_56dd5_row8_col2, #T_56dd5_row8_col3, #T_56dd5_row8_col4, #T_56dd5_row8_col5, #T_56dd5_row8_col6, #T_56dd5_row8_col7, #T_56dd5_row8_col8, #T_56dd5_row9_col0, #T_56dd5_row9_col1, #T_56dd5_row9_col2, #T_56dd5_row9_col3, #T_56dd5_row9_col4, #T_56dd5_row9_col5, #T_56dd5_row9_col6, #T_56dd5_row9_col7, #T_56dd5_row9_col8, #T_56dd5_row10_col0, #T_56dd5_row10_col1, #T_56dd5_row10_col2, #T_56dd5_row10_col3, #T_56dd5_row10_col4, #T_56dd5_row10_col5, #T_56dd5_row10_col6, #T_56dd5_row10_col7, #T_56dd5_row10_col8, #T_56dd5_row11_col0, #T_56dd5_row11_col1, #T_56dd5_row11_col2, #T_56dd5_row11_col3, #T_56dd5_row11_col4, #T_56dd5_row11_col5, #T_56dd5_row11_col6, #T_56dd5_row11_col7, #T_56dd5_row11_col8, #T_56dd5_row12_col0, #T_56dd5_row12_col1, #T_56dd5_row12_col2, #T_56dd5_row12_col3, #T_56dd5_row12_col4, #T_56dd5_row12_col5, #T_56dd5_row12_col6, #T_56dd5_row12_col7, #T_56dd5_row12_col8, #T_56dd5_row13_col0, #T_56dd5_row13_col1, #T_56dd5_row13_col2, #T_56dd5_row13_col3, #T_56dd5_row13_col4, #T_56dd5_row13_col5, #T_56dd5_row13_col6, #T_56dd5_row13_col7, #T_56dd5_row13_col8, #T_56dd5_row14_col0, #T_56dd5_row14_col1, #T_56dd5_row14_col2, #T_56dd5_row14_col3, #T_56dd5_row14_col4, #T_56dd5_row14_col5, #T_56dd5_row14_col6, #T_56dd5_row14_col7, #T_56dd5_row14_col8, #T_56dd5_row15_col0, #T_56dd5_row15_col1, #T_56dd5_row15_col2, #T_56dd5_row15_col3, #T_56dd5_row15_col4, #T_56dd5_row15_col5, #T_56dd5_row15_col6, #T_56dd5_row15_col7, #T_56dd5_row15_col8, #T_56dd5_row16_col0, #T_56dd5_row16_col1, #T_56dd5_row16_col2, #T_56dd5_row16_col3, #T_56dd5_row16_col4, #T_56dd5_row16_col5, #T_56dd5_row16_col6, 
#T_56dd5_row16_col7, #T_56dd5_row16_col8, #T_56dd5_row17_col0, #T_56dd5_row17_col1, #T_56dd5_row17_col2, #T_56dd5_row17_col3, #T_56dd5_row17_col4, #T_56dd5_row17_col5, #T_56dd5_row17_col6, #T_56dd5_row17_col7, #T_56dd5_row17_col8, #T_56dd5_row18_col0, #T_56dd5_row18_col1, #T_56dd5_row18_col2, #T_56dd5_row18_col3, #T_56dd5_row18_col4, #T_56dd5_row18_col5, #T_56dd5_row18_col6, #T_56dd5_row18_col7, #T_56dd5_row18_col8, #T_56dd5_row19_col0, #T_56dd5_row19_col1, #T_56dd5_row19_col2, #T_56dd5_row19_col3, #T_56dd5_row19_col4, #T_56dd5_row19_col5, #T_56dd5_row19_col6, #T_56dd5_row19_col7, #T_56dd5_row19_col8, #T_56dd5_row20_col0, #T_56dd5_row20_col1, #T_56dd5_row20_col2, #T_56dd5_row20_col3, #T_56dd5_row20_col4, #T_56dd5_row20_col5, #T_56dd5_row20_col6, #T_56dd5_row20_col7, #T_56dd5_row20_col8, #T_56dd5_row21_col0, #T_56dd5_row21_col1, #T_56dd5_row21_col2, #T_56dd5_row21_col3, #T_56dd5_row21_col4, #T_56dd5_row21_col5, #T_56dd5_row21_col6, #T_56dd5_row21_col7, #T_56dd5_row21_col8, #T_56dd5_row22_col0, #T_56dd5_row22_col1, #T_56dd5_row22_col2, #T_56dd5_row22_col3, #T_56dd5_row22_col4, #T_56dd5_row22_col5, #T_56dd5_row22_col6, #T_56dd5_row22_col7, #T_56dd5_row22_col8, #T_56dd5_row23_col0, #T_56dd5_row23_col1, #T_56dd5_row23_col2, #T_56dd5_row23_col3, #T_56dd5_row23_col4, #T_56dd5_row23_col5, #T_56dd5_row23_col6, #T_56dd5_row23_col7, #T_56dd5_row23_col8, #T_56dd5_row24_col0, #T_56dd5_row24_col1, #T_56dd5_row24_col2, #T_56dd5_row24_col3, #T_56dd5_row24_col4, #T_56dd5_row24_col5, #T_56dd5_row24_col6, #T_56dd5_row24_col7, #T_56dd5_row24_col8, #T_56dd5_row25_col0, #T_56dd5_row25_col1, #T_56dd5_row25_col2, #T_56dd5_row25_col3, #T_56dd5_row25_col4, #T_56dd5_row25_col5, #T_56dd5_row25_col6, #T_56dd5_row25_col7, #T_56dd5_row25_col8, #T_56dd5_row26_col0, #T_56dd5_row26_col1, #T_56dd5_row26_col2, #T_56dd5_row26_col3, #T_56dd5_row26_col4, #T_56dd5_row26_col5, #T_56dd5_row26_col6, #T_56dd5_row26_col7, #T_56dd5_row26_col8, #T_56dd5_row27_col0, #T_56dd5_row27_col1, #T_56dd5_row27_col2, 
#T_56dd5_row27_col3, #T_56dd5_row27_col4, #T_56dd5_row27_col5, #T_56dd5_row27_col6, #T_56dd5_row27_col7, #T_56dd5_row27_col8, #T_56dd5_row28_col0, #T_56dd5_row28_col1, #T_56dd5_row28_col2, #T_56dd5_row28_col3, #T_56dd5_row28_col4, #T_56dd5_row28_col5, #T_56dd5_row28_col6, #T_56dd5_row28_col7, #T_56dd5_row28_col8, #T_56dd5_row29_col0, #T_56dd5_row29_col1, #T_56dd5_row29_col2, #T_56dd5_row29_col3, #T_56dd5_row29_col4, #T_56dd5_row29_col5, #T_56dd5_row29_col6, #T_56dd5_row29_col7, #T_56dd5_row29_col8, #T_56dd5_row30_col0, #T_56dd5_row30_col1, #T_56dd5_row30_col2, #T_56dd5_row30_col3, #T_56dd5_row30_col4, #T_56dd5_row30_col5, #T_56dd5_row30_col6, #T_56dd5_row30_col7, #T_56dd5_row30_col8, #T_56dd5_row31_col0, #T_56dd5_row31_col1, #T_56dd5_row31_col2, #T_56dd5_row31_col3, #T_56dd5_row31_col4, #T_56dd5_row31_col5, #T_56dd5_row31_col6, #T_56dd5_row31_col7, #T_56dd5_row31_col8, #T_56dd5_row32_col0, #T_56dd5_row32_col1, #T_56dd5_row32_col2, #T_56dd5_row32_col3, #T_56dd5_row32_col4, #T_56dd5_row32_col5, #T_56dd5_row32_col6, #T_56dd5_row32_col7, #T_56dd5_row32_col8, #T_56dd5_row33_col0, #T_56dd5_row33_col1, #T_56dd5_row33_col2, #T_56dd5_row33_col3, #T_56dd5_row33_col4, #T_56dd5_row33_col5, #T_56dd5_row33_col6, #T_56dd5_row33_col7, #T_56dd5_row33_col8, #T_56dd5_row34_col0, #T_56dd5_row34_col1, #T_56dd5_row34_col2, #T_56dd5_row34_col3, #T_56dd5_row34_col4, #T_56dd5_row34_col5, #T_56dd5_row34_col6, #T_56dd5_row34_col7, #T_56dd5_row34_col8, #T_56dd5_row35_col0, #T_56dd5_row35_col1, #T_56dd5_row35_col2, #T_56dd5_row35_col3, #T_56dd5_row35_col4, #T_56dd5_row35_col5, #T_56dd5_row35_col6, #T_56dd5_row35_col7, #T_56dd5_row35_col8, #T_56dd5_row36_col0, #T_56dd5_row36_col1, #T_56dd5_row36_col2, #T_56dd5_row36_col3, #T_56dd5_row36_col4, #T_56dd5_row36_col5, #T_56dd5_row36_col6, #T_56dd5_row36_col7, #T_56dd5_row36_col8, #T_56dd5_row37_col0, #T_56dd5_row37_col1, #T_56dd5_row37_col2, #T_56dd5_row37_col3, #T_56dd5_row37_col4, #T_56dd5_row37_col5, #T_56dd5_row37_col6, #T_56dd5_row37_col7, 
#T_56dd5_row37_col8, #T_56dd5_row38_col0, #T_56dd5_row38_col1, #T_56dd5_row38_col2, #T_56dd5_row38_col3, #T_56dd5_row38_col4, #T_56dd5_row38_col5, #T_56dd5_row38_col6, #T_56dd5_row38_col7, #T_56dd5_row38_col8, #T_56dd5_row39_col0, #T_56dd5_row39_col1, #T_56dd5_row39_col2, #T_56dd5_row39_col3, #T_56dd5_row39_col4, #T_56dd5_row39_col5, #T_56dd5_row39_col6, #T_56dd5_row39_col7, #T_56dd5_row39_col8, #T_56dd5_row40_col0, #T_56dd5_row40_col1, #T_56dd5_row40_col2, #T_56dd5_row40_col3, #T_56dd5_row40_col4, #T_56dd5_row40_col5, #T_56dd5_row40_col6, #T_56dd5_row40_col7, #T_56dd5_row40_col8, #T_56dd5_row41_col0, #T_56dd5_row41_col1, #T_56dd5_row41_col2, #T_56dd5_row41_col3, #T_56dd5_row41_col4, #T_56dd5_row41_col5, #T_56dd5_row41_col6, #T_56dd5_row41_col7, #T_56dd5_row41_col8, #T_56dd5_row42_col0, #T_56dd5_row42_col1, #T_56dd5_row42_col2, #T_56dd5_row42_col3, #T_56dd5_row42_col4, #T_56dd5_row42_col5, #T_56dd5_row42_col6, #T_56dd5_row42_col7, #T_56dd5_row42_col8, #T_56dd5_row43_col0, #T_56dd5_row43_col1, #T_56dd5_row43_col2, #T_56dd5_row43_col3, #T_56dd5_row43_col4, #T_56dd5_row43_col5, #T_56dd5_row43_col6, #T_56dd5_row43_col7, #T_56dd5_row43_col8, #T_56dd5_row44_col0, #T_56dd5_row44_col1, #T_56dd5_row44_col2, #T_56dd5_row44_col3, #T_56dd5_row44_col4, #T_56dd5_row44_col5, #T_56dd5_row44_col6, #T_56dd5_row44_col7, #T_56dd5_row44_col8, #T_56dd5_row45_col0, #T_56dd5_row45_col1, #T_56dd5_row45_col2, #T_56dd5_row45_col3, #T_56dd5_row45_col4, #T_56dd5_row45_col5, #T_56dd5_row45_col6, #T_56dd5_row45_col7, #T_56dd5_row45_col8, #T_56dd5_row46_col0, #T_56dd5_row46_col1, #T_56dd5_row46_col2, #T_56dd5_row46_col3, #T_56dd5_row46_col4, #T_56dd5_row46_col5, #T_56dd5_row46_col6, #T_56dd5_row46_col7, #T_56dd5_row46_col8, #T_56dd5_row47_col0, #T_56dd5_row47_col1, #T_56dd5_row47_col2, #T_56dd5_row47_col3, #T_56dd5_row47_col4, #T_56dd5_row47_col5, #T_56dd5_row47_col6, #T_56dd5_row47_col7, #T_56dd5_row47_col8, #T_56dd5_row48_col0, #T_56dd5_row48_col1, #T_56dd5_row48_col2, #T_56dd5_row48_col3, 
#T_56dd5_row48_col4, #T_56dd5_row48_col5, #T_56dd5_row48_col6, #T_56dd5_row48_col7, #T_56dd5_row48_col8, #T_56dd5_row49_col0, #T_56dd5_row49_col1, #T_56dd5_row49_col2, #T_56dd5_row49_col3, #T_56dd5_row49_col4, #T_56dd5_row49_col5, #T_56dd5_row49_col6, #T_56dd5_row49_col7, #T_56dd5_row49_col8, #T_56dd5_row50_col0, #T_56dd5_row50_col1, #T_56dd5_row50_col2, #T_56dd5_row50_col3, #T_56dd5_row50_col4, #T_56dd5_row50_col5, #T_56dd5_row50_col6, #T_56dd5_row50_col7, #T_56dd5_row50_col8, #T_56dd5_row51_col0, #T_56dd5_row51_col1, #T_56dd5_row51_col2, #T_56dd5_row51_col3, #T_56dd5_row51_col4, #T_56dd5_row51_col5, #T_56dd5_row51_col6, #T_56dd5_row51_col7, #T_56dd5_row51_col8, #T_56dd5_row52_col0, #T_56dd5_row52_col1, #T_56dd5_row52_col2, #T_56dd5_row52_col3, #T_56dd5_row52_col4, #T_56dd5_row52_col5, #T_56dd5_row52_col6, #T_56dd5_row52_col7, #T_56dd5_row52_col8, #T_56dd5_row53_col0, #T_56dd5_row53_col1, #T_56dd5_row53_col2, #T_56dd5_row53_col3, #T_56dd5_row53_col4, #T_56dd5_row53_col5, #T_56dd5_row53_col6, #T_56dd5_row53_col7, #T_56dd5_row53_col8, #T_56dd5_row54_col0, #T_56dd5_row54_col1, #T_56dd5_row54_col2, #T_56dd5_row54_col3, #T_56dd5_row54_col4, #T_56dd5_row54_col5, #T_56dd5_row54_col6, #T_56dd5_row54_col7, #T_56dd5_row54_col8, #T_56dd5_row55_col0, #T_56dd5_row55_col1, #T_56dd5_row55_col2, #T_56dd5_row55_col3, #T_56dd5_row55_col4, #T_56dd5_row55_col5, #T_56dd5_row55_col6, #T_56dd5_row55_col7, #T_56dd5_row55_col8, #T_56dd5_row56_col0, #T_56dd5_row56_col1, #T_56dd5_row56_col2, #T_56dd5_row56_col3, #T_56dd5_row56_col4, #T_56dd5_row56_col5, #T_56dd5_row56_col6, #T_56dd5_row56_col7, #T_56dd5_row56_col8, #T_56dd5_row57_col0, #T_56dd5_row57_col1, #T_56dd5_row57_col2, #T_56dd5_row57_col3, #T_56dd5_row57_col4, #T_56dd5_row57_col5, #T_56dd5_row57_col6, #T_56dd5_row57_col7, #T_56dd5_row57_col8, #T_56dd5_row58_col0, #T_56dd5_row58_col1, #T_56dd5_row58_col2, #T_56dd5_row58_col3, #T_56dd5_row58_col4, #T_56dd5_row58_col5, #T_56dd5_row58_col6, #T_56dd5_row58_col7, #T_56dd5_row58_col8, 
#T_56dd5_row59_col0, #T_56dd5_row59_col1, #T_56dd5_row59_col2, #T_56dd5_row59_col3, #T_56dd5_row59_col4, #T_56dd5_row59_col5, #T_56dd5_row59_col6, #T_56dd5_row59_col7, #T_56dd5_row59_col8, #T_56dd5_row60_col0, #T_56dd5_row60_col1, #T_56dd5_row60_col2, #T_56dd5_row60_col3, #T_56dd5_row60_col4, #T_56dd5_row60_col5, #T_56dd5_row60_col6, #T_56dd5_row60_col7, #T_56dd5_row60_col8, #T_56dd5_row61_col0, #T_56dd5_row61_col1, #T_56dd5_row61_col2, #T_56dd5_row61_col3, #T_56dd5_row61_col4, #T_56dd5_row61_col5, #T_56dd5_row61_col6, #T_56dd5_row61_col7, #T_56dd5_row61_col8, #T_56dd5_row62_col0, #T_56dd5_row62_col1, #T_56dd5_row62_col2, #T_56dd5_row62_col3, #T_56dd5_row62_col4, #T_56dd5_row62_col5, #T_56dd5_row62_col6, #T_56dd5_row62_col7, #T_56dd5_row62_col8, #T_56dd5_row63_col0, #T_56dd5_row63_col1, #T_56dd5_row63_col2, #T_56dd5_row63_col3, #T_56dd5_row63_col4, #T_56dd5_row63_col5, #T_56dd5_row63_col6, #T_56dd5_row63_col7, #T_56dd5_row63_col8, #T_56dd5_row64_col0, #T_56dd5_row64_col1, #T_56dd5_row64_col2, #T_56dd5_row64_col3, #T_56dd5_row64_col4, #T_56dd5_row64_col5, #T_56dd5_row64_col6, #T_56dd5_row64_col7, #T_56dd5_row64_col8, #T_56dd5_row65_col0, #T_56dd5_row65_col1, #T_56dd5_row65_col2, #T_56dd5_row65_col3, #T_56dd5_row65_col4, #T_56dd5_row65_col5, #T_56dd5_row65_col6, #T_56dd5_row65_col7, #T_56dd5_row65_col8, #T_56dd5_row66_col0, #T_56dd5_row66_col1, #T_56dd5_row66_col2, #T_56dd5_row66_col3, #T_56dd5_row66_col4, #T_56dd5_row66_col5, #T_56dd5_row66_col6, #T_56dd5_row66_col7, #T_56dd5_row66_col8, #T_56dd5_row67_col0, #T_56dd5_row67_col1, #T_56dd5_row67_col2, #T_56dd5_row67_col3, #T_56dd5_row67_col4, #T_56dd5_row67_col5, #T_56dd5_row67_col6, #T_56dd5_row67_col7, #T_56dd5_row67_col8, #T_56dd5_row68_col0, #T_56dd5_row68_col1, #T_56dd5_row68_col2, #T_56dd5_row68_col3, #T_56dd5_row68_col4, #T_56dd5_row68_col5, #T_56dd5_row68_col6, #T_56dd5_row68_col7, #T_56dd5_row68_col8, #T_56dd5_row69_col0, #T_56dd5_row69_col1, #T_56dd5_row69_col2, #T_56dd5_row69_col3, #T_56dd5_row69_col4, 
#T_56dd5_row69_col5, #T_56dd5_row69_col6, #T_56dd5_row69_col7, #T_56dd5_row69_col8, #T_56dd5_row70_col0, #T_56dd5_row70_col1, #T_56dd5_row70_col2, #T_56dd5_row70_col3, #T_56dd5_row70_col4, #T_56dd5_row70_col5, #T_56dd5_row70_col6, #T_56dd5_row70_col7, #T_56dd5_row70_col8, #T_56dd5_row71_col0, #T_56dd5_row71_col1, #T_56dd5_row71_col2, #T_56dd5_row71_col3, #T_56dd5_row71_col4, #T_56dd5_row71_col5, #T_56dd5_row71_col6, #T_56dd5_row71_col7, #T_56dd5_row71_col8, #T_56dd5_row72_col0, #T_56dd5_row72_col1, #T_56dd5_row72_col2, #T_56dd5_row72_col3, #T_56dd5_row72_col4, #T_56dd5_row72_col5, #T_56dd5_row72_col6, #T_56dd5_row72_col7, #T_56dd5_row72_col8, #T_56dd5_row73_col0, #T_56dd5_row73_col1, #T_56dd5_row73_col2, #T_56dd5_row73_col3, #T_56dd5_row73_col4, #T_56dd5_row73_col5, #T_56dd5_row73_col6, #T_56dd5_row73_col7, #T_56dd5_row73_col8, #T_56dd5_row74_col0, #T_56dd5_row74_col1, #T_56dd5_row74_col2, #T_56dd5_row74_col3, #T_56dd5_row74_col4, #T_56dd5_row74_col5, #T_56dd5_row74_col6, #T_56dd5_row74_col7, #T_56dd5_row74_col8, #T_56dd5_row75_col0, #T_56dd5_row75_col1, #T_56dd5_row75_col2, #T_56dd5_row75_col3, #T_56dd5_row75_col4, #T_56dd5_row75_col5, #T_56dd5_row75_col6, #T_56dd5_row75_col7, #T_56dd5_row75_col8 {\n", + " text-align: left;\n", + "}\n", + "</style>\n", + "<table id=\"T_56dd5\">\n", + " <thead>\n", + " <tr>\n", + " <th id=\"T_56dd5_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", + " <th id=\"T_56dd5_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", + " <th id=\"T_56dd5_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", + " <th id=\"T_56dd5_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", + " <th id=\"T_56dd5_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", + " <th id=\"T_56dd5_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", + " <th id=\"T_56dd5_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", + " <th id=\"T_56dd5_level0_col7\" 
class=\"col_heading level0 col7\" >Tags</th>\n", + " <th id=\"T_56dd5_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <td id=\"T_56dd5_row0_col0\" class=\"data row0 col0\" >validmind.data_validation.BivariateScatterPlots</td>\n", + " <td id=\"T_56dd5_row0_col1\" class=\"data row0 col1\" >Bivariate Scatter Plots</td>\n", + " <td id=\"T_56dd5_row0_col2\" class=\"data row0 col2\" >Generates bivariate scatterplots to visually inspect relationships between pairs of numerical predictor variables...</td>\n", + " <td id=\"T_56dd5_row0_col3\" class=\"data row0 col3\" >True</td>\n", + " <td id=\"T_56dd5_row0_col4\" class=\"data row0 col4\" >False</td>\n", + " <td id=\"T_56dd5_row0_col5\" class=\"data row0 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row0_col6\" class=\"data row0 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row0_col7\" class=\"data row0 col7\" >['tabular_data', 'numerical_data', 'visualization']</td>\n", + " <td id=\"T_56dd5_row0_col8\" class=\"data row0 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row1_col0\" class=\"data row1 col0\" >validmind.data_validation.ChiSquaredFeaturesTable</td>\n", + " <td id=\"T_56dd5_row1_col1\" class=\"data row1 col1\" >Chi Squared Features Table</td>\n", + " <td id=\"T_56dd5_row1_col2\" class=\"data row1 col2\" >Assesses the statistical association between categorical features and a target variable using the Chi-Squared test....</td>\n", + " <td id=\"T_56dd5_row1_col3\" class=\"data row1 col3\" >False</td>\n", + " <td id=\"T_56dd5_row1_col4\" class=\"data row1 col4\" >True</td>\n", + " <td id=\"T_56dd5_row1_col5\" class=\"data row1 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row1_col6\" class=\"data row1 col6\" >{'p_threshold': {'type': '_empty', 'default': 0.05}}</td>\n", + " <td id=\"T_56dd5_row1_col7\" class=\"data row1 col7\" >['tabular_data', 'categorical_data', 'statistical_test']</td>\n", + " <td 
id=\"T_56dd5_row1_col8\" class=\"data row1 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row2_col0\" class=\"data row2 col0\" >validmind.data_validation.ClassImbalance</td>\n", + " <td id=\"T_56dd5_row2_col1\" class=\"data row2 col1\" >Class Imbalance</td>\n", + " <td id=\"T_56dd5_row2_col2\" class=\"data row2 col2\" >Evaluates and quantifies class distribution imbalance in a dataset used by a machine learning model....</td>\n", + " <td id=\"T_56dd5_row2_col3\" class=\"data row2 col3\" >True</td>\n", + " <td id=\"T_56dd5_row2_col4\" class=\"data row2 col4\" >True</td>\n", + " <td id=\"T_56dd5_row2_col5\" class=\"data row2 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row2_col6\" class=\"data row2 col6\" >{'min_percent_threshold': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_56dd5_row2_col7\" class=\"data row2 col7\" >['tabular_data', 'binary_classification', 'multiclass_classification', 'data_quality']</td>\n", + " <td id=\"T_56dd5_row2_col8\" class=\"data row2 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row3_col0\" class=\"data row3 col0\" >validmind.data_validation.DatasetDescription</td>\n", + " <td id=\"T_56dd5_row3_col1\" class=\"data row3 col1\" >Dataset Description</td>\n", + " <td id=\"T_56dd5_row3_col2\" class=\"data row3 col2\" >Provides comprehensive analysis and statistical summaries of each column in a machine learning model's dataset....</td>\n", + " <td id=\"T_56dd5_row3_col3\" class=\"data row3 col3\" >False</td>\n", + " <td id=\"T_56dd5_row3_col4\" class=\"data row3 col4\" >True</td>\n", + " <td id=\"T_56dd5_row3_col5\" class=\"data row3 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row3_col6\" class=\"data row3 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row3_col7\" class=\"data row3 col7\" >['tabular_data', 'time_series_data', 'text_data']</td>\n", + " <td id=\"T_56dd5_row3_col8\" class=\"data row3 col8\" >['classification', 'regression', 
'text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row4_col0\" class=\"data row4 col0\" >validmind.data_validation.DatasetSplit</td>\n", + " <td id=\"T_56dd5_row4_col1\" class=\"data row4 col1\" >Dataset Split</td>\n", + " <td id=\"T_56dd5_row4_col2\" class=\"data row4 col2\" >Evaluates and visualizes the distribution proportions among training, testing, and validation datasets of an ML...</td>\n", + " <td id=\"T_56dd5_row4_col3\" class=\"data row4 col3\" >False</td>\n", + " <td id=\"T_56dd5_row4_col4\" class=\"data row4 col4\" >True</td>\n", + " <td id=\"T_56dd5_row4_col5\" class=\"data row4 col5\" >['datasets']</td>\n", + " <td id=\"T_56dd5_row4_col6\" class=\"data row4 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row4_col7\" class=\"data row4 col7\" >['tabular_data', 'time_series_data', 'text_data']</td>\n", + " <td id=\"T_56dd5_row4_col8\" class=\"data row4 col8\" >['classification', 'regression', 'text_classification', 'text_summarization']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row5_col0\" class=\"data row5 col0\" >validmind.data_validation.DescriptiveStatistics</td>\n", + " <td id=\"T_56dd5_row5_col1\" class=\"data row5 col1\" >Descriptive Statistics</td>\n", + " <td id=\"T_56dd5_row5_col2\" class=\"data row5 col2\" >Performs a detailed descriptive statistical analysis of both numerical and categorical data within a model's...</td>\n", + " <td id=\"T_56dd5_row5_col3\" class=\"data row5 col3\" >False</td>\n", + " <td id=\"T_56dd5_row5_col4\" class=\"data row5 col4\" >True</td>\n", + " <td id=\"T_56dd5_row5_col5\" class=\"data row5 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row5_col6\" class=\"data row5 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row5_col7\" class=\"data row5 col7\" >['tabular_data', 'time_series_data', 'data_quality']</td>\n", + " <td id=\"T_56dd5_row5_col8\" class=\"data row5 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td 
id=\"T_56dd5_row6_col0\" class=\"data row6 col0\" >validmind.data_validation.Duplicates</td>\n", + " <td id=\"T_56dd5_row6_col1\" class=\"data row6 col1\" >Duplicates</td>\n", + " <td id=\"T_56dd5_row6_col2\" class=\"data row6 col2\" >Tests dataset for duplicate entries, ensuring model reliability via data quality verification....</td>\n", + " <td id=\"T_56dd5_row6_col3\" class=\"data row6 col3\" >False</td>\n", + " <td id=\"T_56dd5_row6_col4\" class=\"data row6 col4\" >True</td>\n", + " <td id=\"T_56dd5_row6_col5\" class=\"data row6 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row6_col6\" class=\"data row6 col6\" >{'min_threshold': {'type': '_empty', 'default': 1}}</td>\n", + " <td id=\"T_56dd5_row6_col7\" class=\"data row6 col7\" >['tabular_data', 'data_quality', 'text_data']</td>\n", + " <td id=\"T_56dd5_row6_col8\" class=\"data row6 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row7_col0\" class=\"data row7 col0\" >validmind.data_validation.FeatureTargetCorrelationPlot</td>\n", + " <td id=\"T_56dd5_row7_col1\" class=\"data row7 col1\" >Feature Target Correlation Plot</td>\n", + " <td id=\"T_56dd5_row7_col2\" class=\"data row7 col2\" >Visualizes the correlation between input features and the model's target output in a color-coded horizontal bar...</td>\n", + " <td id=\"T_56dd5_row7_col3\" class=\"data row7 col3\" >True</td>\n", + " <td id=\"T_56dd5_row7_col4\" class=\"data row7 col4\" >False</td>\n", + " <td id=\"T_56dd5_row7_col5\" class=\"data row7 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row7_col6\" class=\"data row7 col6\" >{'fig_height': {'type': '_empty', 'default': 600}}</td>\n", + " <td id=\"T_56dd5_row7_col7\" class=\"data row7 col7\" >['tabular_data', 'visualization', 'correlation']</td>\n", + " <td id=\"T_56dd5_row7_col8\" class=\"data row7 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row8_col0\" class=\"data row8 col0\" 
>validmind.data_validation.HighCardinality</td>\n", + " <td id=\"T_56dd5_row8_col1\" class=\"data row8 col1\" >High Cardinality</td>\n", + " <td id=\"T_56dd5_row8_col2\" class=\"data row8 col2\" >Assesses the number of unique values in categorical columns to detect high cardinality and potential overfitting....</td>\n", + " <td id=\"T_56dd5_row8_col3\" class=\"data row8 col3\" >False</td>\n", + " <td id=\"T_56dd5_row8_col4\" class=\"data row8 col4\" >True</td>\n", + " <td id=\"T_56dd5_row8_col5\" class=\"data row8 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row8_col6\" class=\"data row8 col6\" >{'num_threshold': {'type': 'int', 'default': 100}, 'percent_threshold': {'type': 'float', 'default': 0.1}, 'threshold_type': {'type': 'str', 'default': 'percent'}}</td>\n", + " <td id=\"T_56dd5_row8_col7\" class=\"data row8 col7\" >['tabular_data', 'data_quality', 'categorical_data']</td>\n", + " <td id=\"T_56dd5_row8_col8\" class=\"data row8 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row9_col0\" class=\"data row9 col0\" >validmind.data_validation.HighPearsonCorrelation</td>\n", + " <td id=\"T_56dd5_row9_col1\" class=\"data row9 col1\" >High Pearson Correlation</td>\n", + " <td id=\"T_56dd5_row9_col2\" class=\"data row9 col2\" >Identifies highly correlated feature pairs in a dataset suggesting feature redundancy or multicollinearity....</td>\n", + " <td id=\"T_56dd5_row9_col3\" class=\"data row9 col3\" >False</td>\n", + " <td id=\"T_56dd5_row9_col4\" class=\"data row9 col4\" >True</td>\n", + " <td id=\"T_56dd5_row9_col5\" class=\"data row9 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row9_col6\" class=\"data row9 col6\" >{'max_threshold': {'type': 'float', 'default': 0.3}, 'top_n_correlations': {'type': 'int', 'default': 10}, 'feature_columns': {'type': 'list', 'default': None}}</td>\n", + " <td id=\"T_56dd5_row9_col7\" class=\"data row9 col7\" >['tabular_data', 'data_quality', 'correlation']</td>\n", + " <td 
id=\"T_56dd5_row9_col8\" class=\"data row9 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row10_col0\" class=\"data row10 col0\" >validmind.data_validation.IQROutliersBarPlot</td>\n", + " <td id=\"T_56dd5_row10_col1\" class=\"data row10 col1\" >IQR Outliers Bar Plot</td>\n", + " <td id=\"T_56dd5_row10_col2\" class=\"data row10 col2\" >Visualizes outlier distribution across percentiles in numerical data using the Interquartile Range (IQR) method....</td>\n", + " <td id=\"T_56dd5_row10_col3\" class=\"data row10 col3\" >True</td>\n", + " <td id=\"T_56dd5_row10_col4\" class=\"data row10 col4\" >False</td>\n", + " <td id=\"T_56dd5_row10_col5\" class=\"data row10 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row10_col6\" class=\"data row10 col6\" >{'threshold': {'type': 'float', 'default': 1.5}, 'fig_width': {'type': 'int', 'default': 800}}</td>\n", + " <td id=\"T_56dd5_row10_col7\" class=\"data row10 col7\" >['tabular_data', 'visualization', 'numerical_data']</td>\n", + " <td id=\"T_56dd5_row10_col8\" class=\"data row10 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row11_col0\" class=\"data row11 col0\" >validmind.data_validation.IQROutliersTable</td>\n", + " <td id=\"T_56dd5_row11_col1\" class=\"data row11 col1\" >IQR Outliers Table</td>\n", + " <td id=\"T_56dd5_row11_col2\" class=\"data row11 col2\" >Determines and summarizes outliers in numerical features using the Interquartile Range method....</td>\n", + " <td id=\"T_56dd5_row11_col3\" class=\"data row11 col3\" >False</td>\n", + " <td id=\"T_56dd5_row11_col4\" class=\"data row11 col4\" >True</td>\n", + " <td id=\"T_56dd5_row11_col5\" class=\"data row11 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row11_col6\" class=\"data row11 col6\" >{'threshold': {'type': 'float', 'default': 1.5}}</td>\n", + " <td id=\"T_56dd5_row11_col7\" class=\"data row11 col7\" >['tabular_data', 'numerical_data']</td>\n", + " <td 
id=\"T_56dd5_row11_col8\" class=\"data row11 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row12_col0\" class=\"data row12 col0\" >validmind.data_validation.IsolationForestOutliers</td>\n", + " <td id=\"T_56dd5_row12_col1\" class=\"data row12 col1\" >Isolation Forest Outliers</td>\n", + " <td id=\"T_56dd5_row12_col2\" class=\"data row12 col2\" >Detects outliers in a dataset using the Isolation Forest algorithm and visualizes results through scatter plots....</td>\n", + " <td id=\"T_56dd5_row12_col3\" class=\"data row12 col3\" >True</td>\n", + " <td id=\"T_56dd5_row12_col4\" class=\"data row12 col4\" >False</td>\n", + " <td id=\"T_56dd5_row12_col5\" class=\"data row12 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row12_col6\" class=\"data row12 col6\" >{'random_state': {'type': 'int', 'default': 0}, 'contamination': {'type': 'float', 'default': 0.1}, 'feature_columns': {'type': 'list', 'default': None}}</td>\n", + " <td id=\"T_56dd5_row12_col7\" class=\"data row12 col7\" >['tabular_data', 'anomaly_detection']</td>\n", + " <td id=\"T_56dd5_row12_col8\" class=\"data row12 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row13_col0\" class=\"data row13 col0\" >validmind.data_validation.JarqueBera</td>\n", + " <td id=\"T_56dd5_row13_col1\" class=\"data row13 col1\" >Jarque Bera</td>\n", + " <td id=\"T_56dd5_row13_col2\" class=\"data row13 col2\" >Assesses normality of dataset features in an ML model using the Jarque-Bera test....</td>\n", + " <td id=\"T_56dd5_row13_col3\" class=\"data row13 col3\" >False</td>\n", + " <td id=\"T_56dd5_row13_col4\" class=\"data row13 col4\" >True</td>\n", + " <td id=\"T_56dd5_row13_col5\" class=\"data row13 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row13_col6\" class=\"data row13 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row13_col7\" class=\"data row13 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", + 
" <td id=\"T_56dd5_row13_col8\" class=\"data row13 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row14_col0\" class=\"data row14 col0\" >validmind.data_validation.MissingValues</td>\n", + " <td id=\"T_56dd5_row14_col1\" class=\"data row14 col1\" >Missing Values</td>\n", + " <td id=\"T_56dd5_row14_col2\" class=\"data row14 col2\" >Evaluates dataset quality by ensuring missing value ratio across all features does not exceed a set threshold....</td>\n", + " <td id=\"T_56dd5_row14_col3\" class=\"data row14 col3\" >False</td>\n", + " <td id=\"T_56dd5_row14_col4\" class=\"data row14 col4\" >True</td>\n", + " <td id=\"T_56dd5_row14_col5\" class=\"data row14 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row14_col6\" class=\"data row14 col6\" >{'min_threshold': {'type': 'int', 'default': 1}}</td>\n", + " <td id=\"T_56dd5_row14_col7\" class=\"data row14 col7\" >['tabular_data', 'data_quality']</td>\n", + " <td id=\"T_56dd5_row14_col8\" class=\"data row14 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row15_col0\" class=\"data row15 col0\" >validmind.data_validation.MissingValuesBarPlot</td>\n", + " <td id=\"T_56dd5_row15_col1\" class=\"data row15 col1\" >Missing Values Bar Plot</td>\n", + " <td id=\"T_56dd5_row15_col2\" class=\"data row15 col2\" >Assesses the percentage and distribution of missing values in the dataset via a bar plot, with emphasis on...</td>\n", + " <td id=\"T_56dd5_row15_col3\" class=\"data row15 col3\" >True</td>\n", + " <td id=\"T_56dd5_row15_col4\" class=\"data row15 col4\" >False</td>\n", + " <td id=\"T_56dd5_row15_col5\" class=\"data row15 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row15_col6\" class=\"data row15 col6\" >{'threshold': {'type': 'int', 'default': 80}, 'fig_height': {'type': 'int', 'default': 600}}</td>\n", + " <td id=\"T_56dd5_row15_col7\" class=\"data row15 col7\" >['tabular_data', 'data_quality', 'visualization']</td>\n", 
+ " <td id=\"T_56dd5_row15_col8\" class=\"data row15 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row16_col0\" class=\"data row16 col0\" >validmind.data_validation.MutualInformation</td>\n", + " <td id=\"T_56dd5_row16_col1\" class=\"data row16 col1\" >Mutual Information</td>\n", + " <td id=\"T_56dd5_row16_col2\" class=\"data row16 col2\" >Calculates mutual information scores between features and target variable to evaluate feature relevance....</td>\n", + " <td id=\"T_56dd5_row16_col3\" class=\"data row16 col3\" >True</td>\n", + " <td id=\"T_56dd5_row16_col4\" class=\"data row16 col4\" >False</td>\n", + " <td id=\"T_56dd5_row16_col5\" class=\"data row16 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row16_col6\" class=\"data row16 col6\" >{'min_threshold': {'type': 'float', 'default': 0.01}, 'task': {'type': 'str', 'default': 'classification'}}</td>\n", + " <td id=\"T_56dd5_row16_col7\" class=\"data row16 col7\" >['feature_selection', 'data_analysis']</td>\n", + " <td id=\"T_56dd5_row16_col8\" class=\"data row16 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row17_col0\" class=\"data row17 col0\" >validmind.data_validation.PearsonCorrelationMatrix</td>\n", + " <td id=\"T_56dd5_row17_col1\" class=\"data row17 col1\" >Pearson Correlation Matrix</td>\n", + " <td id=\"T_56dd5_row17_col2\" class=\"data row17 col2\" >Evaluates linear dependency between numerical variables in a dataset via a Pearson Correlation coefficient heat map....</td>\n", + " <td id=\"T_56dd5_row17_col3\" class=\"data row17 col3\" >True</td>\n", + " <td id=\"T_56dd5_row17_col4\" class=\"data row17 col4\" >False</td>\n", + " <td id=\"T_56dd5_row17_col5\" class=\"data row17 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row17_col6\" class=\"data row17 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row17_col7\" class=\"data row17 col7\" >['tabular_data', 'numerical_data', 'correlation']</td>\n", + " 
<td id=\"T_56dd5_row17_col8\" class=\"data row17 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row18_col0\" class=\"data row18 col0\" >validmind.data_validation.ProtectedClassesDescription</td>\n", + " <td id=\"T_56dd5_row18_col1\" class=\"data row18 col1\" >Protected Classes Description</td>\n", + " <td id=\"T_56dd5_row18_col2\" class=\"data row18 col2\" >Visualizes the distribution of protected classes in the dataset relative to the target variable...</td>\n", + " <td id=\"T_56dd5_row18_col3\" class=\"data row18 col3\" >True</td>\n", + " <td id=\"T_56dd5_row18_col4\" class=\"data row18 col4\" >True</td>\n", + " <td id=\"T_56dd5_row18_col5\" class=\"data row18 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row18_col6\" class=\"data row18 col6\" >{'protected_classes': {'type': '_empty', 'default': None}}</td>\n", + " <td id=\"T_56dd5_row18_col7\" class=\"data row18 col7\" >['bias_and_fairness', 'descriptive_statistics']</td>\n", + " <td id=\"T_56dd5_row18_col8\" class=\"data row18 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row19_col0\" class=\"data row19 col0\" >validmind.data_validation.RunsTest</td>\n", + " <td id=\"T_56dd5_row19_col1\" class=\"data row19 col1\" >Runs Test</td>\n", + " <td id=\"T_56dd5_row19_col2\" class=\"data row19 col2\" >Executes Runs Test on ML model to detect non-random patterns in output data sequence....</td>\n", + " <td id=\"T_56dd5_row19_col3\" class=\"data row19 col3\" >False</td>\n", + " <td id=\"T_56dd5_row19_col4\" class=\"data row19 col4\" >True</td>\n", + " <td id=\"T_56dd5_row19_col5\" class=\"data row19 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row19_col6\" class=\"data row19 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row19_col7\" class=\"data row19 col7\" >['tabular_data', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_56dd5_row19_col8\" class=\"data row19 col8\" >['classification', 
'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row20_col0\" class=\"data row20 col0\" >validmind.data_validation.ScatterPlot</td>\n", + " <td id=\"T_56dd5_row20_col1\" class=\"data row20 col1\" >Scatter Plot</td>\n", + " <td id=\"T_56dd5_row20_col2\" class=\"data row20 col2\" >Assesses visual relationships, patterns, and outliers among features in a dataset through scatter plot matrices....</td>\n", + " <td id=\"T_56dd5_row20_col3\" class=\"data row20 col3\" >True</td>\n", + " <td id=\"T_56dd5_row20_col4\" class=\"data row20 col4\" >False</td>\n", + " <td id=\"T_56dd5_row20_col5\" class=\"data row20 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row20_col6\" class=\"data row20 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row20_col7\" class=\"data row20 col7\" >['tabular_data', 'visualization']</td>\n", + " <td id=\"T_56dd5_row20_col8\" class=\"data row20 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row21_col0\" class=\"data row21 col0\" >validmind.data_validation.ScoreBandDefaultRates</td>\n", + " <td id=\"T_56dd5_row21_col1\" class=\"data row21 col1\" >Score Band Default Rates</td>\n", + " <td id=\"T_56dd5_row21_col2\" class=\"data row21 col2\" >Analyzes default rates and population distribution across credit score bands....</td>\n", + " <td id=\"T_56dd5_row21_col3\" class=\"data row21 col3\" >False</td>\n", + " <td id=\"T_56dd5_row21_col4\" class=\"data row21 col4\" >True</td>\n", + " <td id=\"T_56dd5_row21_col5\" class=\"data row21 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row21_col6\" class=\"data row21 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'score_bands': {'type': 'list', 'default': None}}</td>\n", + " <td id=\"T_56dd5_row21_col7\" class=\"data row21 col7\" >['visualization', 'credit_risk', 'scorecard']</td>\n", + " <td id=\"T_56dd5_row21_col8\" class=\"data row21 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td 
id=\"T_56dd5_row22_col0\" class=\"data row22 col0\" >validmind.data_validation.ShapiroWilk</td>\n", + " <td id=\"T_56dd5_row22_col1\" class=\"data row22 col1\" >Shapiro Wilk</td>\n", + " <td id=\"T_56dd5_row22_col2\" class=\"data row22 col2\" >Evaluates feature-wise normality of training data using the Shapiro-Wilk test....</td>\n", + " <td id=\"T_56dd5_row22_col3\" class=\"data row22 col3\" >False</td>\n", + " <td id=\"T_56dd5_row22_col4\" class=\"data row22 col4\" >True</td>\n", + " <td id=\"T_56dd5_row22_col5\" class=\"data row22 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row22_col6\" class=\"data row22 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row22_col7\" class=\"data row22 col7\" >['tabular_data', 'data_distribution', 'statistical_test']</td>\n", + " <td id=\"T_56dd5_row22_col8\" class=\"data row22 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row23_col0\" class=\"data row23 col0\" >validmind.data_validation.Skewness</td>\n", + " <td id=\"T_56dd5_row23_col1\" class=\"data row23 col1\" >Skewness</td>\n", + " <td id=\"T_56dd5_row23_col2\" class=\"data row23 col2\" >Evaluates the skewness of numerical data in a dataset to check against a defined threshold, aiming to ensure data...</td>\n", + " <td id=\"T_56dd5_row23_col3\" class=\"data row23 col3\" >False</td>\n", + " <td id=\"T_56dd5_row23_col4\" class=\"data row23 col4\" >True</td>\n", + " <td id=\"T_56dd5_row23_col5\" class=\"data row23 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row23_col6\" class=\"data row23 col6\" >{'max_threshold': {'type': '_empty', 'default': 1}}</td>\n", + " <td id=\"T_56dd5_row23_col7\" class=\"data row23 col7\" >['data_quality', 'tabular_data']</td>\n", + " <td id=\"T_56dd5_row23_col8\" class=\"data row23 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row24_col0\" class=\"data row24 col0\" >validmind.data_validation.TabularCategoricalBarPlots</td>\n", + " <td 
id=\"T_56dd5_row24_col1\" class=\"data row24 col1\" >Tabular Categorical Bar Plots</td>\n", + " <td id=\"T_56dd5_row24_col2\" class=\"data row24 col2\" >Generates and visualizes bar plots for each category in categorical features to evaluate the dataset's composition....</td>\n", + " <td id=\"T_56dd5_row24_col3\" class=\"data row24 col3\" >True</td>\n", + " <td id=\"T_56dd5_row24_col4\" class=\"data row24 col4\" >False</td>\n", + " <td id=\"T_56dd5_row24_col5\" class=\"data row24 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row24_col6\" class=\"data row24 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row24_col7\" class=\"data row24 col7\" >['tabular_data', 'visualization']</td>\n", + " <td id=\"T_56dd5_row24_col8\" class=\"data row24 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row25_col0\" class=\"data row25 col0\" >validmind.data_validation.TabularDateTimeHistograms</td>\n", + " <td id=\"T_56dd5_row25_col1\" class=\"data row25 col1\" >Tabular Date Time Histograms</td>\n", + " <td id=\"T_56dd5_row25_col2\" class=\"data row25 col2\" >Generates histograms to provide graphical insight into the distribution of time intervals in a model's datetime...</td>\n", + " <td id=\"T_56dd5_row25_col3\" class=\"data row25 col3\" >True</td>\n", + " <td id=\"T_56dd5_row25_col4\" class=\"data row25 col4\" >False</td>\n", + " <td id=\"T_56dd5_row25_col5\" class=\"data row25 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row25_col6\" class=\"data row25 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row25_col7\" class=\"data row25 col7\" >['time_series_data', 'visualization']</td>\n", + " <td id=\"T_56dd5_row25_col8\" class=\"data row25 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row26_col0\" class=\"data row26 col0\" >validmind.data_validation.TabularDescriptionTables</td>\n", + " <td id=\"T_56dd5_row26_col1\" class=\"data row26 col1\" >Tabular Description Tables</td>\n", + " <td 
id=\"T_56dd5_row26_col2\" class=\"data row26 col2\" >Summarizes key descriptive statistics for numerical, categorical, and datetime variables in a dataset....</td>\n", + " <td id=\"T_56dd5_row26_col3\" class=\"data row26 col3\" >False</td>\n", + " <td id=\"T_56dd5_row26_col4\" class=\"data row26 col4\" >True</td>\n", + " <td id=\"T_56dd5_row26_col5\" class=\"data row26 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row26_col6\" class=\"data row26 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row26_col7\" class=\"data row26 col7\" >['tabular_data']</td>\n", + " <td id=\"T_56dd5_row26_col8\" class=\"data row26 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row27_col0\" class=\"data row27 col0\" >validmind.data_validation.TabularNumericalHistograms</td>\n", + " <td id=\"T_56dd5_row27_col1\" class=\"data row27 col1\" >Tabular Numerical Histograms</td>\n", + " <td id=\"T_56dd5_row27_col2\" class=\"data row27 col2\" >Generates histograms for each numerical feature in a dataset to provide visual insights into data distribution and...</td>\n", + " <td id=\"T_56dd5_row27_col3\" class=\"data row27 col3\" >True</td>\n", + " <td id=\"T_56dd5_row27_col4\" class=\"data row27 col4\" >False</td>\n", + " <td id=\"T_56dd5_row27_col5\" class=\"data row27 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row27_col6\" class=\"data row27 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row27_col7\" class=\"data row27 col7\" >['tabular_data', 'visualization']</td>\n", + " <td id=\"T_56dd5_row27_col8\" class=\"data row27 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row28_col0\" class=\"data row28 col0\" >validmind.data_validation.TargetRateBarPlots</td>\n", + " <td id=\"T_56dd5_row28_col1\" class=\"data row28 col1\" >Target Rate Bar Plots</td>\n", + " <td id=\"T_56dd5_row28_col2\" class=\"data row28 col2\" >Generates bar plots visualizing the default rates of categorical features for a 
classification machine learning...</td>\n", + " <td id=\"T_56dd5_row28_col3\" class=\"data row28 col3\" >True</td>\n", + " <td id=\"T_56dd5_row28_col4\" class=\"data row28 col4\" >False</td>\n", + " <td id=\"T_56dd5_row28_col5\" class=\"data row28 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row28_col6\" class=\"data row28 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row28_col7\" class=\"data row28 col7\" >['tabular_data', 'visualization', 'categorical_data']</td>\n", + " <td id=\"T_56dd5_row28_col8\" class=\"data row28 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row29_col0\" class=\"data row29 col0\" >validmind.data_validation.TooManyZeroValues</td>\n", + " <td id=\"T_56dd5_row29_col1\" class=\"data row29 col1\" >Too Many Zero Values</td>\n", + " <td id=\"T_56dd5_row29_col2\" class=\"data row29 col2\" >Identifies numerical columns in a dataset that contain an excessive number of zero values, defined by a threshold...</td>\n", + " <td id=\"T_56dd5_row29_col3\" class=\"data row29 col3\" >False</td>\n", + " <td id=\"T_56dd5_row29_col4\" class=\"data row29 col4\" >True</td>\n", + " <td id=\"T_56dd5_row29_col5\" class=\"data row29 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row29_col6\" class=\"data row29 col6\" >{'max_percent_threshold': {'type': 'float', 'default': 0.03}}</td>\n", + " <td id=\"T_56dd5_row29_col7\" class=\"data row29 col7\" >['tabular_data']</td>\n", + " <td id=\"T_56dd5_row29_col8\" class=\"data row29 col8\" >['regression', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row30_col0\" class=\"data row30 col0\" >validmind.data_validation.UniqueRows</td>\n", + " <td id=\"T_56dd5_row30_col1\" class=\"data row30 col1\" >Unique Rows</td>\n", + " <td id=\"T_56dd5_row30_col2\" class=\"data row30 col2\" >Verifies the diversity of the dataset by ensuring that the count of unique rows exceeds a prescribed threshold....</td>\n", + " <td id=\"T_56dd5_row30_col3\" class=\"data row30 col3\" 
>False</td>\n", + " <td id=\"T_56dd5_row30_col4\" class=\"data row30 col4\" >True</td>\n", + " <td id=\"T_56dd5_row30_col5\" class=\"data row30 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row30_col6\" class=\"data row30 col6\" >{'min_percent_threshold': {'type': 'float', 'default': 1}}</td>\n", + " <td id=\"T_56dd5_row30_col7\" class=\"data row30 col7\" >['tabular_data']</td>\n", + " <td id=\"T_56dd5_row30_col8\" class=\"data row30 col8\" >['regression', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row31_col0\" class=\"data row31 col0\" >validmind.data_validation.WOEBinPlots</td>\n", + " <td id=\"T_56dd5_row31_col1\" class=\"data row31 col1\" >WOE Bin Plots</td>\n", + " <td id=\"T_56dd5_row31_col2\" class=\"data row31 col2\" >Generates visualizations of Weight of Evidence (WoE) and Information Value (IV) for understanding predictive power...</td>\n", + " <td id=\"T_56dd5_row31_col3\" class=\"data row31 col3\" >True</td>\n", + " <td id=\"T_56dd5_row31_col4\" class=\"data row31 col4\" >False</td>\n", + " <td id=\"T_56dd5_row31_col5\" class=\"data row31 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row31_col6\" class=\"data row31 col6\" >{'breaks_adj': {'type': 'list', 'default': None}, 'fig_height': {'type': 'int', 'default': 600}, 'fig_width': {'type': 'int', 'default': 500}}</td>\n", + " <td id=\"T_56dd5_row31_col7\" class=\"data row31 col7\" >['tabular_data', 'visualization', 'categorical_data']</td>\n", + " <td id=\"T_56dd5_row31_col8\" class=\"data row31 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row32_col0\" class=\"data row32 col0\" >validmind.data_validation.WOEBinTable</td>\n", + " <td id=\"T_56dd5_row32_col1\" class=\"data row32 col1\" >WOE Bin Table</td>\n", + " <td id=\"T_56dd5_row32_col2\" class=\"data row32 col2\" >Assesses the Weight of Evidence (WoE) and Information Value (IV) of each feature to evaluate its predictive power...</td>\n", + " <td id=\"T_56dd5_row32_col3\" 
class=\"data row32 col3\" >False</td>\n", + " <td id=\"T_56dd5_row32_col4\" class=\"data row32 col4\" >True</td>\n", + " <td id=\"T_56dd5_row32_col5\" class=\"data row32 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row32_col6\" class=\"data row32 col6\" >{'breaks_adj': {'type': 'list', 'default': None}}</td>\n", + " <td id=\"T_56dd5_row32_col7\" class=\"data row32 col7\" >['tabular_data', 'categorical_data']</td>\n", + " <td id=\"T_56dd5_row32_col8\" class=\"data row32 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row33_col0\" class=\"data row33 col0\" >validmind.model_validation.FeaturesAUC</td>\n", + " <td id=\"T_56dd5_row33_col1\" class=\"data row33 col1\" >Features AUC</td>\n", + " <td id=\"T_56dd5_row33_col2\" class=\"data row33 col2\" >Evaluates the discriminatory power of each individual feature within a binary classification model by calculating...</td>\n", + " <td id=\"T_56dd5_row33_col3\" class=\"data row33 col3\" >True</td>\n", + " <td id=\"T_56dd5_row33_col4\" class=\"data row33 col4\" >False</td>\n", + " <td id=\"T_56dd5_row33_col5\" class=\"data row33 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row33_col6\" class=\"data row33 col6\" >{'fontsize': {'type': 'int', 'default': 12}, 'figure_height': {'type': 'int', 'default': 500}}</td>\n", + " <td id=\"T_56dd5_row33_col7\" class=\"data row33 col7\" >['feature_importance', 'AUC', 'visualization']</td>\n", + " <td id=\"T_56dd5_row33_col8\" class=\"data row33 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row34_col0\" class=\"data row34 col0\" >validmind.model_validation.sklearn.CalibrationCurve</td>\n", + " <td id=\"T_56dd5_row34_col1\" class=\"data row34 col1\" >Calibration Curve</td>\n", + " <td id=\"T_56dd5_row34_col2\" class=\"data row34 col2\" >Evaluates the calibration of probability estimates by comparing predicted probabilities against observed...</td>\n", + " <td id=\"T_56dd5_row34_col3\" class=\"data row34 col3\" 
>True</td>\n", + " <td id=\"T_56dd5_row34_col4\" class=\"data row34 col4\" >False</td>\n", + " <td id=\"T_56dd5_row34_col5\" class=\"data row34 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row34_col6\" class=\"data row34 col6\" >{'n_bins': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_56dd5_row34_col7\" class=\"data row34 col7\" >['sklearn', 'model_performance', 'classification']</td>\n", + " <td id=\"T_56dd5_row34_col8\" class=\"data row34 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row35_col0\" class=\"data row35 col0\" >validmind.model_validation.sklearn.ClassifierPerformance</td>\n", + " <td id=\"T_56dd5_row35_col1\" class=\"data row35 col1\" >Classifier Performance</td>\n", + " <td id=\"T_56dd5_row35_col2\" class=\"data row35 col2\" >Evaluates performance of binary or multiclass classification models using precision, recall, F1-Score, accuracy,...</td>\n", + " <td id=\"T_56dd5_row35_col3\" class=\"data row35 col3\" >False</td>\n", + " <td id=\"T_56dd5_row35_col4\" class=\"data row35 col4\" >True</td>\n", + " <td id=\"T_56dd5_row35_col5\" class=\"data row35 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row35_col6\" class=\"data row35 col6\" >{'average': {'type': 'str', 'default': 'macro'}}</td>\n", + " <td id=\"T_56dd5_row35_col7\" class=\"data row35 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row35_col8\" class=\"data row35 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row36_col0\" class=\"data row36 col0\" >validmind.model_validation.sklearn.ClassifierThresholdOptimization</td>\n", + " <td id=\"T_56dd5_row36_col1\" class=\"data row36 col1\" >Classifier Threshold Optimization</td>\n", + " <td id=\"T_56dd5_row36_col2\" class=\"data row36 col2\" >Analyzes and visualizes different threshold optimization methods for binary classification 
models....</td>\n", + " <td id=\"T_56dd5_row36_col3\" class=\"data row36 col3\" >False</td>\n", + " <td id=\"T_56dd5_row36_col4\" class=\"data row36 col4\" >True</td>\n", + " <td id=\"T_56dd5_row36_col5\" class=\"data row36 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row36_col6\" class=\"data row36 col6\" >{'methods': {'type': None, 'default': None}, 'target_recall': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_56dd5_row36_col7\" class=\"data row36 col7\" >['model_validation', 'threshold_optimization', 'classification_metrics']</td>\n", + " <td id=\"T_56dd5_row36_col8\" class=\"data row36 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row37_col0\" class=\"data row37 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", + " <td id=\"T_56dd5_row37_col1\" class=\"data row37 col1\" >Confusion Matrix</td>\n", + " <td id=\"T_56dd5_row37_col2\" class=\"data row37 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", + " <td id=\"T_56dd5_row37_col3\" class=\"data row37 col3\" >True</td>\n", + " <td id=\"T_56dd5_row37_col4\" class=\"data row37 col4\" >False</td>\n", + " <td id=\"T_56dd5_row37_col5\" class=\"data row37 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row37_col6\" class=\"data row37 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_56dd5_row37_col7\" class=\"data row37 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_56dd5_row37_col8\" class=\"data row37 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row38_col0\" class=\"data row38 col0\" >validmind.model_validation.sklearn.HyperParametersTuning</td>\n", + " <td id=\"T_56dd5_row38_col1\" class=\"data row38 col1\" >Hyper Parameters Tuning</td>\n", + " <td 
id=\"T_56dd5_row38_col2\" class=\"data row38 col2\" >Performs exhaustive grid search over specified parameter ranges to find optimal model configurations...</td>\n", + " <td id=\"T_56dd5_row38_col3\" class=\"data row38 col3\" >False</td>\n", + " <td id=\"T_56dd5_row38_col4\" class=\"data row38 col4\" >True</td>\n", + " <td id=\"T_56dd5_row38_col5\" class=\"data row38 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row38_col6\" class=\"data row38 col6\" >{'param_grid': {'type': 'dict', 'default': None}, 'scoring': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}, 'fit_params': {'type': 'dict', 'default': None}}</td>\n", + " <td id=\"T_56dd5_row38_col7\" class=\"data row38 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row38_col8\" class=\"data row38 col8\" >['clustering', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row39_col0\" class=\"data row39 col0\" >validmind.model_validation.sklearn.MinimumAccuracy</td>\n", + " <td id=\"T_56dd5_row39_col1\" class=\"data row39 col1\" >Minimum Accuracy</td>\n", + " <td id=\"T_56dd5_row39_col2\" class=\"data row39 col2\" >Checks if the model's prediction accuracy meets or surpasses a specified threshold....</td>\n", + " <td id=\"T_56dd5_row39_col3\" class=\"data row39 col3\" >False</td>\n", + " <td id=\"T_56dd5_row39_col4\" class=\"data row39 col4\" >True</td>\n", + " <td id=\"T_56dd5_row39_col5\" class=\"data row39 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row39_col6\" class=\"data row39 col6\" >{'min_threshold': {'type': 'float', 'default': 0.7}}</td>\n", + " <td id=\"T_56dd5_row39_col7\" class=\"data row39 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row39_col8\" class=\"data row39 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row40_col0\" class=\"data row40 col0\" 
>validmind.model_validation.sklearn.MinimumF1Score</td>\n", + " <td id=\"T_56dd5_row40_col1\" class=\"data row40 col1\" >Minimum F1 Score</td>\n", + " <td id=\"T_56dd5_row40_col2\" class=\"data row40 col2\" >Assesses if the model's F1 score on the validation set meets a predefined minimum threshold, ensuring balanced...</td>\n", + " <td id=\"T_56dd5_row40_col3\" class=\"data row40 col3\" >False</td>\n", + " <td id=\"T_56dd5_row40_col4\" class=\"data row40 col4\" >True</td>\n", + " <td id=\"T_56dd5_row40_col5\" class=\"data row40 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row40_col6\" class=\"data row40 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_56dd5_row40_col7\" class=\"data row40 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row40_col8\" class=\"data row40 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row41_col0\" class=\"data row41 col0\" >validmind.model_validation.sklearn.MinimumROCAUCScore</td>\n", + " <td id=\"T_56dd5_row41_col1\" class=\"data row41 col1\" >Minimum ROCAUC Score</td>\n", + " <td id=\"T_56dd5_row41_col2\" class=\"data row41 col2\" >Validates model by checking if the ROC AUC score meets or surpasses a specified threshold....</td>\n", + " <td id=\"T_56dd5_row41_col3\" class=\"data row41 col3\" >False</td>\n", + " <td id=\"T_56dd5_row41_col4\" class=\"data row41 col4\" >True</td>\n", + " <td id=\"T_56dd5_row41_col5\" class=\"data row41 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row41_col6\" class=\"data row41 col6\" >{'min_threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_56dd5_row41_col7\" class=\"data row41 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row41_col8\" class=\"data row41 col8\" >['classification', 
'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row42_col0\" class=\"data row42 col0\" >validmind.model_validation.sklearn.ModelParameters</td>\n", + " <td id=\"T_56dd5_row42_col1\" class=\"data row42 col1\" >Model Parameters</td>\n", + " <td id=\"T_56dd5_row42_col2\" class=\"data row42 col2\" >Extracts and displays model parameters in a structured format for transparency and reproducibility....</td>\n", + " <td id=\"T_56dd5_row42_col3\" class=\"data row42 col3\" >False</td>\n", + " <td id=\"T_56dd5_row42_col4\" class=\"data row42 col4\" >True</td>\n", + " <td id=\"T_56dd5_row42_col5\" class=\"data row42 col5\" >['model']</td>\n", + " <td id=\"T_56dd5_row42_col6\" class=\"data row42 col6\" >{'model_params': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_56dd5_row42_col7\" class=\"data row42 col7\" >['model_training', 'metadata']</td>\n", + " <td id=\"T_56dd5_row42_col8\" class=\"data row42 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row43_col0\" class=\"data row43 col0\" >validmind.model_validation.sklearn.ModelsPerformanceComparison</td>\n", + " <td id=\"T_56dd5_row43_col1\" class=\"data row43 col1\" >Models Performance Comparison</td>\n", + " <td id=\"T_56dd5_row43_col2\" class=\"data row43 col2\" >Evaluates and compares the performance of multiple Machine Learning models using various metrics like accuracy,...</td>\n", + " <td id=\"T_56dd5_row43_col3\" class=\"data row43 col3\" >False</td>\n", + " <td id=\"T_56dd5_row43_col4\" class=\"data row43 col4\" >True</td>\n", + " <td id=\"T_56dd5_row43_col5\" class=\"data row43 col5\" >['dataset', 'models']</td>\n", + " <td id=\"T_56dd5_row43_col6\" class=\"data row43 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row43_col7\" class=\"data row43 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'model_comparison']</td>\n", + " <td id=\"T_56dd5_row43_col8\" class=\"data row43 col8\" 
>['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row44_col0\" class=\"data row44 col0\" >validmind.model_validation.sklearn.OverfitDiagnosis</td>\n", + " <td id=\"T_56dd5_row44_col1\" class=\"data row44 col1\" >Overfit Diagnosis</td>\n", + " <td id=\"T_56dd5_row44_col2\" class=\"data row44 col2\" >Assesses potential overfitting in a model's predictions, identifying regions where performance between training and...</td>\n", + " <td id=\"T_56dd5_row44_col3\" class=\"data row44 col3\" >True</td>\n", + " <td id=\"T_56dd5_row44_col4\" class=\"data row44 col4\" >True</td>\n", + " <td id=\"T_56dd5_row44_col5\" class=\"data row44 col5\" >['model', 'datasets']</td>\n", + " <td id=\"T_56dd5_row44_col6\" class=\"data row44 col6\" >{'metric': {'type': 'str', 'default': None}, 'cut_off_threshold': {'type': 'float', 'default': 0.04}}</td>\n", + " <td id=\"T_56dd5_row44_col7\" class=\"data row44 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'linear_regression', 'model_diagnosis']</td>\n", + " <td id=\"T_56dd5_row44_col8\" class=\"data row44 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row45_col0\" class=\"data row45 col0\" >validmind.model_validation.sklearn.PermutationFeatureImportance</td>\n", + " <td id=\"T_56dd5_row45_col1\" class=\"data row45 col1\" >Permutation Feature Importance</td>\n", + " <td id=\"T_56dd5_row45_col2\" class=\"data row45 col2\" >Assesses the significance of each feature in a model by evaluating the impact on model performance when feature...</td>\n", + " <td id=\"T_56dd5_row45_col3\" class=\"data row45 col3\" >True</td>\n", + " <td id=\"T_56dd5_row45_col4\" class=\"data row45 col4\" >False</td>\n", + " <td id=\"T_56dd5_row45_col5\" class=\"data row45 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row45_col6\" class=\"data row45 col6\" >{'fontsize': {'type': None, 'default': None}, 'figure_height': {'type': None, 
'default': None}}</td>\n", + " <td id=\"T_56dd5_row45_col7\" class=\"data row45 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", + " <td id=\"T_56dd5_row45_col8\" class=\"data row45 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row46_col0\" class=\"data row46 col0\" >validmind.model_validation.sklearn.PopulationStabilityIndex</td>\n", + " <td id=\"T_56dd5_row46_col1\" class=\"data row46 col1\" >Population Stability Index</td>\n", + " <td id=\"T_56dd5_row46_col2\" class=\"data row46 col2\" >Assesses the Population Stability Index (PSI) to quantify the stability of an ML model's predictions across...</td>\n", + " <td id=\"T_56dd5_row46_col3\" class=\"data row46 col3\" >True</td>\n", + " <td id=\"T_56dd5_row46_col4\" class=\"data row46 col4\" >True</td>\n", + " <td id=\"T_56dd5_row46_col5\" class=\"data row46 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row46_col6\" class=\"data row46 col6\" >{'num_bins': {'type': 'int', 'default': 10}, 'mode': {'type': 'str', 'default': 'fixed'}}</td>\n", + " <td id=\"T_56dd5_row46_col7\" class=\"data row46 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row46_col8\" class=\"data row46 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row47_col0\" class=\"data row47 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", + " <td id=\"T_56dd5_row47_col1\" class=\"data row47 col1\" >Precision Recall Curve</td>\n", + " <td id=\"T_56dd5_row47_col2\" class=\"data row47 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", + " <td id=\"T_56dd5_row47_col3\" class=\"data row47 col3\" >True</td>\n", + " <td id=\"T_56dd5_row47_col4\" class=\"data row47 col4\" 
>False</td>\n", + " <td id=\"T_56dd5_row47_col5\" class=\"data row47 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row47_col6\" class=\"data row47 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row47_col7\" class=\"data row47 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_56dd5_row47_col8\" class=\"data row47 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row48_col0\" class=\"data row48 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", + " <td id=\"T_56dd5_row48_col1\" class=\"data row48 col1\" >ROC Curve</td>\n", + " <td id=\"T_56dd5_row48_col2\" class=\"data row48 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", + " <td id=\"T_56dd5_row48_col3\" class=\"data row48 col3\" >True</td>\n", + " <td id=\"T_56dd5_row48_col4\" class=\"data row48 col4\" >False</td>\n", + " <td id=\"T_56dd5_row48_col5\" class=\"data row48 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row48_col6\" class=\"data row48 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row48_col7\" class=\"data row48 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_56dd5_row48_col8\" class=\"data row48 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row49_col0\" class=\"data row49 col0\" >validmind.model_validation.sklearn.RegressionErrors</td>\n", + " <td id=\"T_56dd5_row49_col1\" class=\"data row49 col1\" >Regression Errors</td>\n", + " <td id=\"T_56dd5_row49_col2\" class=\"data row49 col2\" >Assesses the performance and error distribution of a regression model using various error metrics....</td>\n", + " <td id=\"T_56dd5_row49_col3\" class=\"data row49 col3\" >False</td>\n", + " <td id=\"T_56dd5_row49_col4\" class=\"data row49 col4\" 
>True</td>\n", + " <td id=\"T_56dd5_row49_col5\" class=\"data row49 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row49_col6\" class=\"data row49 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row49_col7\" class=\"data row49 col7\" >['sklearn', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row49_col8\" class=\"data row49 col8\" >['regression', 'classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row50_col0\" class=\"data row50 col0\" >validmind.model_validation.sklearn.RobustnessDiagnosis</td>\n", + " <td id=\"T_56dd5_row50_col1\" class=\"data row50 col1\" >Robustness Diagnosis</td>\n", + " <td id=\"T_56dd5_row50_col2\" class=\"data row50 col2\" >Assesses the robustness of a machine learning model by evaluating performance decay under noisy conditions....</td>\n", + " <td id=\"T_56dd5_row50_col3\" class=\"data row50 col3\" >True</td>\n", + " <td id=\"T_56dd5_row50_col4\" class=\"data row50 col4\" >True</td>\n", + " <td id=\"T_56dd5_row50_col5\" class=\"data row50 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row50_col6\" class=\"data row50 col6\" >{'metric': {'type': 'str', 'default': None}, 'scaling_factor_std_dev_list': {'type': None, 'default': [0.1, 0.2, 0.3, 0.4, 0.5]}, 'performance_decay_threshold': {'type': 'float', 'default': 0.05}}</td>\n", + " <td id=\"T_56dd5_row50_col7\" class=\"data row50 col7\" >['sklearn', 'model_diagnosis', 'visualization']</td>\n", + " <td id=\"T_56dd5_row50_col8\" class=\"data row50 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row51_col0\" class=\"data row51 col0\" >validmind.model_validation.sklearn.SHAPGlobalImportance</td>\n", + " <td id=\"T_56dd5_row51_col1\" class=\"data row51 col1\" >SHAP Global Importance</td>\n", + " <td id=\"T_56dd5_row51_col2\" class=\"data row51 col2\" >Evaluates and visualizes global feature importance using SHAP values for model explanation and risk identification....</td>\n", + " <td 
id=\"T_56dd5_row51_col3\" class=\"data row51 col3\" >False</td>\n", + " <td id=\"T_56dd5_row51_col4\" class=\"data row51 col4\" >True</td>\n", + " <td id=\"T_56dd5_row51_col5\" class=\"data row51 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row51_col6\" class=\"data row51 col6\" >{'kernel_explainer_samples': {'type': 'int', 'default': 10}, 'tree_or_linear_explainer_samples': {'type': 'int', 'default': 200}, 'class_of_interest': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_56dd5_row51_col7\" class=\"data row51 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'feature_importance', 'visualization']</td>\n", + " <td id=\"T_56dd5_row51_col8\" class=\"data row51 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row52_col0\" class=\"data row52 col0\" >validmind.model_validation.sklearn.ScoreProbabilityAlignment</td>\n", + " <td id=\"T_56dd5_row52_col1\" class=\"data row52 col1\" >Score Probability Alignment</td>\n", + " <td id=\"T_56dd5_row52_col2\" class=\"data row52 col2\" >Analyzes the alignment between credit scores and predicted probabilities....</td>\n", + " <td id=\"T_56dd5_row52_col3\" class=\"data row52 col3\" >True</td>\n", + " <td id=\"T_56dd5_row52_col4\" class=\"data row52 col4\" >True</td>\n", + " <td id=\"T_56dd5_row52_col5\" class=\"data row52 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row52_col6\" class=\"data row52 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'n_bins': {'type': 'int', 'default': 10}}</td>\n", + " <td id=\"T_56dd5_row52_col7\" class=\"data row52 col7\" >['visualization', 'credit_risk', 'calibration']</td>\n", + " <td id=\"T_56dd5_row52_col8\" class=\"data row52 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row53_col0\" class=\"data row53 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", + " <td id=\"T_56dd5_row53_col1\" class=\"data 
row53 col1\" >Training Test Degradation</td>\n", + " <td id=\"T_56dd5_row53_col2\" class=\"data row53 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", + " <td id=\"T_56dd5_row53_col3\" class=\"data row53 col3\" >False</td>\n", + " <td id=\"T_56dd5_row53_col4\" class=\"data row53 col4\" >True</td>\n", + " <td id=\"T_56dd5_row53_col5\" class=\"data row53 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row53_col6\" class=\"data row53 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", + " <td id=\"T_56dd5_row53_col7\" class=\"data row53 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_56dd5_row53_col8\" class=\"data row53 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row54_col0\" class=\"data row54 col0\" >validmind.model_validation.sklearn.WeakspotsDiagnosis</td>\n", + " <td id=\"T_56dd5_row54_col1\" class=\"data row54 col1\" >Weakspots Diagnosis</td>\n", + " <td id=\"T_56dd5_row54_col2\" class=\"data row54 col2\" >Identifies and visualizes weak spots in a machine learning model's performance across various sections of the...</td>\n", + " <td id=\"T_56dd5_row54_col3\" class=\"data row54 col3\" >True</td>\n", + " <td id=\"T_56dd5_row54_col4\" class=\"data row54 col4\" >True</td>\n", + " <td id=\"T_56dd5_row54_col5\" class=\"data row54 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row54_col6\" class=\"data row54 col6\" >{'features_columns': {'type': None, 'default': None}, 'metrics': {'type': None, 'default': None}, 'thresholds': {'type': None, 'default': None}}</td>\n", + " <td id=\"T_56dd5_row54_col7\" class=\"data row54 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_diagnosis', 'visualization']</td>\n", + " <td id=\"T_56dd5_row54_col8\" class=\"data row54 col8\" 
>['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row55_col0\" class=\"data row55 col0\" >validmind.model_validation.statsmodels.CumulativePredictionProbabilities</td>\n", + " <td id=\"T_56dd5_row55_col1\" class=\"data row55 col1\" >Cumulative Prediction Probabilities</td>\n", + " <td id=\"T_56dd5_row55_col2\" class=\"data row55 col2\" >Visualizes cumulative probabilities of positive and negative classes for both training and testing in classification models....</td>\n", + " <td id=\"T_56dd5_row55_col3\" class=\"data row55 col3\" >True</td>\n", + " <td id=\"T_56dd5_row55_col4\" class=\"data row55 col4\" >False</td>\n", + " <td id=\"T_56dd5_row55_col5\" class=\"data row55 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row55_col6\" class=\"data row55 col6\" >{'title': {'type': 'str', 'default': 'Cumulative Probabilities'}}</td>\n", + " <td id=\"T_56dd5_row55_col7\" class=\"data row55 col7\" >['visualization', 'credit_risk']</td>\n", + " <td id=\"T_56dd5_row55_col8\" class=\"data row55 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row56_col0\" class=\"data row56 col0\" >validmind.model_validation.statsmodels.GINITable</td>\n", + " <td id=\"T_56dd5_row56_col1\" class=\"data row56 col1\" >GINI Table</td>\n", + " <td id=\"T_56dd5_row56_col2\" class=\"data row56 col2\" >Evaluates classification model performance using AUC, GINI, and KS metrics for training and test datasets....</td>\n", + " <td id=\"T_56dd5_row56_col3\" class=\"data row56 col3\" >False</td>\n", + " <td id=\"T_56dd5_row56_col4\" class=\"data row56 col4\" >True</td>\n", + " <td id=\"T_56dd5_row56_col5\" class=\"data row56 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row56_col6\" class=\"data row56 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row56_col7\" class=\"data row56 col7\" >['model_performance']</td>\n", + " <td id=\"T_56dd5_row56_col8\" class=\"data row56 col8\" >['classification']</td>\n", 
+ " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row57_col0\" class=\"data row57 col0\" >validmind.model_validation.statsmodels.KolmogorovSmirnov</td>\n", + " <td id=\"T_56dd5_row57_col1\" class=\"data row57 col1\" >Kolmogorov Smirnov</td>\n", + " <td id=\"T_56dd5_row57_col2\" class=\"data row57 col2\" >Assesses whether each feature in the dataset aligns with a normal distribution using the Kolmogorov-Smirnov test....</td>\n", + " <td id=\"T_56dd5_row57_col3\" class=\"data row57 col3\" >False</td>\n", + " <td id=\"T_56dd5_row57_col4\" class=\"data row57 col4\" >True</td>\n", + " <td id=\"T_56dd5_row57_col5\" class=\"data row57 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row57_col6\" class=\"data row57 col6\" >{'dist': {'type': 'str', 'default': 'norm'}}</td>\n", + " <td id=\"T_56dd5_row57_col7\" class=\"data row57 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_56dd5_row57_col8\" class=\"data row57 col8\" >['classification', 'regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row58_col0\" class=\"data row58 col0\" >validmind.model_validation.statsmodels.Lilliefors</td>\n", + " <td id=\"T_56dd5_row58_col1\" class=\"data row58 col1\" >Lilliefors</td>\n", + " <td id=\"T_56dd5_row58_col2\" class=\"data row58 col2\" >Assesses the normality of feature distributions in an ML model's training dataset using the Lilliefors test....</td>\n", + " <td id=\"T_56dd5_row58_col3\" class=\"data row58 col3\" >False</td>\n", + " <td id=\"T_56dd5_row58_col4\" class=\"data row58 col4\" >True</td>\n", + " <td id=\"T_56dd5_row58_col5\" class=\"data row58 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row58_col6\" class=\"data row58 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row58_col7\" class=\"data row58 col7\" >['tabular_data', 'data_distribution', 'statistical_test', 'statsmodels']</td>\n", + " <td id=\"T_56dd5_row58_col8\" class=\"data row58 col8\" >['classification', 'regression']</td>\n", + " 
</tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row59_col0\" class=\"data row59 col0\" >validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram</td>\n", + " <td id=\"T_56dd5_row59_col1\" class=\"data row59 col1\" >Prediction Probabilities Histogram</td>\n", + " <td id=\"T_56dd5_row59_col2\" class=\"data row59 col2\" >Assesses the predictive probability distribution for binary classification to evaluate model performance and...</td>\n", + " <td id=\"T_56dd5_row59_col3\" class=\"data row59 col3\" >True</td>\n", + " <td id=\"T_56dd5_row59_col4\" class=\"data row59 col4\" >False</td>\n", + " <td id=\"T_56dd5_row59_col5\" class=\"data row59 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row59_col6\" class=\"data row59 col6\" >{'title': {'type': 'str', 'default': 'Histogram of Predictive Probabilities'}}</td>\n", + " <td id=\"T_56dd5_row59_col7\" class=\"data row59 col7\" >['visualization', 'credit_risk']</td>\n", + " <td id=\"T_56dd5_row59_col8\" class=\"data row59 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row60_col0\" class=\"data row60 col0\" >validmind.model_validation.statsmodels.ScorecardHistogram</td>\n", + " <td id=\"T_56dd5_row60_col1\" class=\"data row60 col1\" >Scorecard Histogram</td>\n", + " <td id=\"T_56dd5_row60_col2\" class=\"data row60 col2\" >The Scorecard Histogram test evaluates the distribution of credit scores between default and non-default instances,...</td>\n", + " <td id=\"T_56dd5_row60_col3\" class=\"data row60 col3\" >True</td>\n", + " <td id=\"T_56dd5_row60_col4\" class=\"data row60 col4\" >False</td>\n", + " <td id=\"T_56dd5_row60_col5\" class=\"data row60 col5\" >['dataset']</td>\n", + " <td id=\"T_56dd5_row60_col6\" class=\"data row60 col6\" >{'title': {'type': 'str', 'default': 'Histogram of Scores'}, 'score_column': {'type': 'str', 'default': 'score'}}</td>\n", + " <td id=\"T_56dd5_row60_col7\" class=\"data row60 col7\" >['visualization', 'credit_risk', 
'logistic_regression']</td>\n", + " <td id=\"T_56dd5_row60_col8\" class=\"data row60 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row61_col0\" class=\"data row61 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", + " <td id=\"T_56dd5_row61_col1\" class=\"data row61 col1\" >Calibration Curve Drift</td>\n", + " <td id=\"T_56dd5_row61_col2\" class=\"data row61 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", + " <td id=\"T_56dd5_row61_col3\" class=\"data row61 col3\" >True</td>\n", + " <td id=\"T_56dd5_row61_col4\" class=\"data row61 col4\" >True</td>\n", + " <td id=\"T_56dd5_row61_col5\" class=\"data row61 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row61_col6\" class=\"data row61 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", + " <td id=\"T_56dd5_row61_col7\" class=\"data row61 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_56dd5_row61_col8\" class=\"data row61 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row62_col0\" class=\"data row62 col0\" >validmind.ongoing_monitoring.ClassDiscriminationDrift</td>\n", + " <td id=\"T_56dd5_row62_col1\" class=\"data row62 col1\" >Class Discrimination Drift</td>\n", + " <td id=\"T_56dd5_row62_col2\" class=\"data row62 col2\" >Compares classification discrimination metrics between reference and monitoring datasets....</td>\n", + " <td id=\"T_56dd5_row62_col3\" class=\"data row62 col3\" >False</td>\n", + " <td id=\"T_56dd5_row62_col4\" class=\"data row62 col4\" >True</td>\n", + " <td id=\"T_56dd5_row62_col5\" class=\"data row62 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row62_col6\" class=\"data row62 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td 
id=\"T_56dd5_row62_col7\" class=\"data row62 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row62_col8\" class=\"data row62 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row63_col0\" class=\"data row63 col0\" >validmind.ongoing_monitoring.ClassImbalanceDrift</td>\n", + " <td id=\"T_56dd5_row63_col1\" class=\"data row63 col1\" >Class Imbalance Drift</td>\n", + " <td id=\"T_56dd5_row63_col2\" class=\"data row63 col2\" >Evaluates drift in class distribution between reference and monitoring datasets....</td>\n", + " <td id=\"T_56dd5_row63_col3\" class=\"data row63 col3\" >True</td>\n", + " <td id=\"T_56dd5_row63_col4\" class=\"data row63 col4\" >True</td>\n", + " <td id=\"T_56dd5_row63_col5\" class=\"data row63 col5\" >['datasets']</td>\n", + " <td id=\"T_56dd5_row63_col6\" class=\"data row63 col6\" >{'drift_pct_threshold': {'type': 'float', 'default': 5.0}, 'title': {'type': 'str', 'default': 'Class Distribution Drift'}}</td>\n", + " <td id=\"T_56dd5_row63_col7\" class=\"data row63 col7\" >['tabular_data', 'binary_classification', 'multiclass_classification']</td>\n", + " <td id=\"T_56dd5_row63_col8\" class=\"data row63 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row64_col0\" class=\"data row64 col0\" >validmind.ongoing_monitoring.ClassificationAccuracyDrift</td>\n", + " <td id=\"T_56dd5_row64_col1\" class=\"data row64 col1\" >Classification Accuracy Drift</td>\n", + " <td id=\"T_56dd5_row64_col2\" class=\"data row64 col2\" >Compares classification accuracy metrics between reference and monitoring datasets....</td>\n", + " <td id=\"T_56dd5_row64_col3\" class=\"data row64 col3\" >False</td>\n", + " <td id=\"T_56dd5_row64_col4\" class=\"data row64 col4\" >True</td>\n", + " <td id=\"T_56dd5_row64_col5\" class=\"data row64 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row64_col6\" 
class=\"data row64 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td id=\"T_56dd5_row64_col7\" class=\"data row64 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row64_col8\" class=\"data row64 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row65_col0\" class=\"data row65 col0\" >validmind.ongoing_monitoring.ConfusionMatrixDrift</td>\n", + " <td id=\"T_56dd5_row65_col1\" class=\"data row65 col1\" >Confusion Matrix Drift</td>\n", + " <td id=\"T_56dd5_row65_col2\" class=\"data row65 col2\" >Compares confusion matrix metrics between reference and monitoring datasets....</td>\n", + " <td id=\"T_56dd5_row65_col3\" class=\"data row65 col3\" >False</td>\n", + " <td id=\"T_56dd5_row65_col4\" class=\"data row65 col4\" >True</td>\n", + " <td id=\"T_56dd5_row65_col5\" class=\"data row65 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row65_col6\" class=\"data row65 col6\" >{'drift_pct_threshold': {'type': '_empty', 'default': 20}}</td>\n", + " <td id=\"T_56dd5_row65_col7\" class=\"data row65 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance']</td>\n", + " <td id=\"T_56dd5_row65_col8\" class=\"data row65 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row66_col0\" class=\"data row66 col0\" >validmind.ongoing_monitoring.CumulativePredictionProbabilitiesDrift</td>\n", + " <td id=\"T_56dd5_row66_col1\" class=\"data row66 col1\" >Cumulative Prediction Probabilities Drift</td>\n", + " <td id=\"T_56dd5_row66_col2\" class=\"data row66 col2\" >Compares cumulative prediction probability distributions between reference and monitoring datasets....</td>\n", + " <td id=\"T_56dd5_row66_col3\" class=\"data row66 col3\" >True</td>\n", + " <td id=\"T_56dd5_row66_col4\" class=\"data row66 col4\" >False</td>\n", + " 
<td id=\"T_56dd5_row66_col5\" class=\"data row66 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row66_col6\" class=\"data row66 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row66_col7\" class=\"data row66 col7\" >['visualization', 'credit_risk']</td>\n", + " <td id=\"T_56dd5_row66_col8\" class=\"data row66 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row67_col0\" class=\"data row67 col0\" >validmind.ongoing_monitoring.PredictionProbabilitiesHistogramDrift</td>\n", + " <td id=\"T_56dd5_row67_col1\" class=\"data row67 col1\" >Prediction Probabilities Histogram Drift</td>\n", + " <td id=\"T_56dd5_row67_col2\" class=\"data row67 col2\" >Compares prediction probability distributions between reference and monitoring datasets....</td>\n", + " <td id=\"T_56dd5_row67_col3\" class=\"data row67 col3\" >True</td>\n", + " <td id=\"T_56dd5_row67_col4\" class=\"data row67 col4\" >True</td>\n", + " <td id=\"T_56dd5_row67_col5\" class=\"data row67 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row67_col6\" class=\"data row67 col6\" >{'title': {'type': '_empty', 'default': 'Prediction Probabilities Histogram Drift'}, 'drift_pct_threshold': {'type': 'float', 'default': 20.0}}</td>\n", + " <td id=\"T_56dd5_row67_col7\" class=\"data row67 col7\" >['visualization', 'credit_risk']</td>\n", + " <td id=\"T_56dd5_row67_col8\" class=\"data row67 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row68_col0\" class=\"data row68 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", + " <td id=\"T_56dd5_row68_col1\" class=\"data row68 col1\" >ROC Curve Drift</td>\n", + " <td id=\"T_56dd5_row68_col2\" class=\"data row68 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", + " <td id=\"T_56dd5_row68_col3\" class=\"data row68 col3\" >True</td>\n", + " <td id=\"T_56dd5_row68_col4\" class=\"data row68 col4\" >False</td>\n", + " <td id=\"T_56dd5_row68_col5\" class=\"data 
row68 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row68_col6\" class=\"data row68 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row68_col7\" class=\"data row68 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_56dd5_row68_col8\" class=\"data row68 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row69_col0\" class=\"data row69 col0\" >validmind.ongoing_monitoring.ScoreBandsDrift</td>\n", + " <td id=\"T_56dd5_row69_col1\" class=\"data row69 col1\" >Score Bands Drift</td>\n", + " <td id=\"T_56dd5_row69_col2\" class=\"data row69 col2\" >Analyzes drift in population distribution and default rates across score bands....</td>\n", + " <td id=\"T_56dd5_row69_col3\" class=\"data row69 col3\" >False</td>\n", + " <td id=\"T_56dd5_row69_col4\" class=\"data row69 col4\" >True</td>\n", + " <td id=\"T_56dd5_row69_col5\" class=\"data row69 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_56dd5_row69_col6\" class=\"data row69 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'score_bands': {'type': 'list', 'default': None}, 'drift_threshold': {'type': 'float', 'default': 20.0}}</td>\n", + " <td id=\"T_56dd5_row69_col7\" class=\"data row69 col7\" >['visualization', 'credit_risk', 'scorecard']</td>\n", + " <td id=\"T_56dd5_row69_col8\" class=\"data row69 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row70_col0\" class=\"data row70 col0\" >validmind.ongoing_monitoring.ScorecardHistogramDrift</td>\n", + " <td id=\"T_56dd5_row70_col1\" class=\"data row70 col1\" >Scorecard Histogram Drift</td>\n", + " <td id=\"T_56dd5_row70_col2\" class=\"data row70 col2\" >Compares score distributions between reference and monitoring datasets for each class....</td>\n", + " <td id=\"T_56dd5_row70_col3\" class=\"data row70 col3\" >True</td>\n", + " <td id=\"T_56dd5_row70_col4\" class=\"data row70 col4\" >True</td>\n", + " <td 
id=\"T_56dd5_row70_col5\" class=\"data row70 col5\" >['datasets']</td>\n", + " <td id=\"T_56dd5_row70_col6\" class=\"data row70 col6\" >{'score_column': {'type': 'str', 'default': 'score'}, 'title': {'type': 'str', 'default': 'Scorecard Histogram Drift'}, 'drift_pct_threshold': {'type': 'float', 'default': 20.0}}</td>\n", + " <td id=\"T_56dd5_row70_col7\" class=\"data row70 col7\" >['visualization', 'credit_risk', 'logistic_regression']</td>\n", + " <td id=\"T_56dd5_row70_col8\" class=\"data row70 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row71_col0\" class=\"data row71 col0\" >validmind.unit_metrics.classification.Accuracy</td>\n", + " <td id=\"T_56dd5_row71_col1\" class=\"data row71 col1\" >Accuracy</td>\n", + " <td id=\"T_56dd5_row71_col2\" class=\"data row71 col2\" >Calculates the accuracy of a model</td>\n", + " <td id=\"T_56dd5_row71_col3\" class=\"data row71 col3\" >False</td>\n", + " <td id=\"T_56dd5_row71_col4\" class=\"data row71 col4\" >False</td>\n", + " <td id=\"T_56dd5_row71_col5\" class=\"data row71 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_56dd5_row71_col6\" class=\"data row71 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row71_col7\" class=\"data row71 col7\" >['classification']</td>\n", + " <td id=\"T_56dd5_row71_col8\" class=\"data row71 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row72_col0\" class=\"data row72 col0\" >validmind.unit_metrics.classification.F1</td>\n", + " <td id=\"T_56dd5_row72_col1\" class=\"data row72 col1\" >F1</td>\n", + " <td id=\"T_56dd5_row72_col2\" class=\"data row72 col2\" >Calculates the F1 score for a classification model.</td>\n", + " <td id=\"T_56dd5_row72_col3\" class=\"data row72 col3\" >False</td>\n", + " <td id=\"T_56dd5_row72_col4\" class=\"data row72 col4\" >False</td>\n", + " <td id=\"T_56dd5_row72_col5\" class=\"data row72 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row72_col6\" class=\"data row72 col6\" 
>{}</td>\n", + " <td id=\"T_56dd5_row72_col7\" class=\"data row72 col7\" >['classification']</td>\n", + " <td id=\"T_56dd5_row72_col8\" class=\"data row72 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row73_col0\" class=\"data row73 col0\" >validmind.unit_metrics.classification.Precision</td>\n", + " <td id=\"T_56dd5_row73_col1\" class=\"data row73 col1\" >Precision</td>\n", + " <td id=\"T_56dd5_row73_col2\" class=\"data row73 col2\" >Calculates the precision for a classification model.</td>\n", + " <td id=\"T_56dd5_row73_col3\" class=\"data row73 col3\" >False</td>\n", + " <td id=\"T_56dd5_row73_col4\" class=\"data row73 col4\" >False</td>\n", + " <td id=\"T_56dd5_row73_col5\" class=\"data row73 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row73_col6\" class=\"data row73 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row73_col7\" class=\"data row73 col7\" >['classification']</td>\n", + " <td id=\"T_56dd5_row73_col8\" class=\"data row73 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row74_col0\" class=\"data row74 col0\" >validmind.unit_metrics.classification.ROC_AUC</td>\n", + " <td id=\"T_56dd5_row74_col1\" class=\"data row74 col1\" >ROC AUC</td>\n", + " <td id=\"T_56dd5_row74_col2\" class=\"data row74 col2\" >Calculates the ROC AUC for a classification model.</td>\n", + " <td id=\"T_56dd5_row74_col3\" class=\"data row74 col3\" >False</td>\n", + " <td id=\"T_56dd5_row74_col4\" class=\"data row74 col4\" >False</td>\n", + " <td id=\"T_56dd5_row74_col5\" class=\"data row74 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row74_col6\" class=\"data row74 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row74_col7\" class=\"data row74 col7\" >['classification']</td>\n", + " <td id=\"T_56dd5_row74_col8\" class=\"data row74 col8\" >['classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_56dd5_row75_col0\" class=\"data row75 col0\" 
>validmind.unit_metrics.classification.Recall</td>\n", + " <td id=\"T_56dd5_row75_col1\" class=\"data row75 col1\" >Recall</td>\n", + " <td id=\"T_56dd5_row75_col2\" class=\"data row75 col2\" >Calculates the recall for a classification model.</td>\n", + " <td id=\"T_56dd5_row75_col3\" class=\"data row75 col3\" >False</td>\n", + " <td id=\"T_56dd5_row75_col4\" class=\"data row75 col4\" >False</td>\n", + " <td id=\"T_56dd5_row75_col5\" class=\"data row75 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_56dd5_row75_col6\" class=\"data row75 col6\" >{}</td>\n", + " <td id=\"T_56dd5_row75_col7\" class=\"data row75 col7\" >['classification']</td>\n", + " <td id=\"T_56dd5_row75_col8\" class=\"data row75 col8\" >['classification']</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n" + ], + "text/plain": [ + "<pandas.io.formats.style.Styler at 0x10516c880>" + ] + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" + } ], - "text/plain": [ - "<pandas.io.formats.style.Styler at 0x36a280f40>" + "source": [ + "list_tests(task=\"classification\")" ] - }, - "execution_count": 8, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "list_tests(tags=[\"model_performance\", \"visualization\"])" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Use `filter`, `task`, and `tags` together to create more specific queries.\n", - "\n", - "For example, apply all three to find tests compatible with `sklearn` models, designed for `classification` tasks:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Use the `tags` parameter to find tests based on their tags, such as `model_performance` or `visualization`:" + ] + }, { - "data": { - "text/html": [ - "<style type=\"text/css\">\n", - "#T_36394 th {\n", - " text-align: left;\n", - "}\n", - "#T_36394_row0_col0, #T_36394_row0_col1, 
#T_36394_row0_col2, #T_36394_row0_col3, #T_36394_row0_col4, #T_36394_row0_col5, #T_36394_row0_col6, #T_36394_row0_col7, #T_36394_row0_col8, #T_36394_row1_col0, #T_36394_row1_col1, #T_36394_row1_col2, #T_36394_row1_col3, #T_36394_row1_col4, #T_36394_row1_col5, #T_36394_row1_col6, #T_36394_row1_col7, #T_36394_row1_col8, #T_36394_row2_col0, #T_36394_row2_col1, #T_36394_row2_col2, #T_36394_row2_col3, #T_36394_row2_col4, #T_36394_row2_col5, #T_36394_row2_col6, #T_36394_row2_col7, #T_36394_row2_col8, #T_36394_row3_col0, #T_36394_row3_col1, #T_36394_row3_col2, #T_36394_row3_col3, #T_36394_row3_col4, #T_36394_row3_col5, #T_36394_row3_col6, #T_36394_row3_col7, #T_36394_row3_col8, #T_36394_row4_col0, #T_36394_row4_col1, #T_36394_row4_col2, #T_36394_row4_col3, #T_36394_row4_col4, #T_36394_row4_col5, #T_36394_row4_col6, #T_36394_row4_col7, #T_36394_row4_col8, #T_36394_row5_col0, #T_36394_row5_col1, #T_36394_row5_col2, #T_36394_row5_col3, #T_36394_row5_col4, #T_36394_row5_col5, #T_36394_row5_col6, #T_36394_row5_col7, #T_36394_row5_col8 {\n", - " text-align: left;\n", - "}\n", - "</style>\n", - "<table id=\"T_36394\">\n", - " <thead>\n", - " <tr>\n", - " <th id=\"T_36394_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", - " <th id=\"T_36394_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", - " <th id=\"T_36394_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", - " <th id=\"T_36394_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", - " <th id=\"T_36394_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", - " <th id=\"T_36394_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", - " <th id=\"T_36394_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", - " <th id=\"T_36394_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", - " <th id=\"T_36394_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", - " </tr>\n", - " </thead>\n", - " <tbody>\n", - " 
<tr>\n", - " <td id=\"T_36394_row0_col0\" class=\"data row0 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", - " <td id=\"T_36394_row0_col1\" class=\"data row0 col1\" >Confusion Matrix</td>\n", - " <td id=\"T_36394_row0_col2\" class=\"data row0 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", - " <td id=\"T_36394_row0_col3\" class=\"data row0 col3\" >True</td>\n", - " <td id=\"T_36394_row0_col4\" class=\"data row0 col4\" >False</td>\n", - " <td id=\"T_36394_row0_col5\" class=\"data row0 col5\" >['dataset', 'model']</td>\n", - " <td id=\"T_36394_row0_col6\" class=\"data row0 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", - " <td id=\"T_36394_row0_col7\" class=\"data row0 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_36394_row0_col8\" class=\"data row0 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_36394_row1_col0\" class=\"data row1 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", - " <td id=\"T_36394_row1_col1\" class=\"data row1 col1\" >Precision Recall Curve</td>\n", - " <td id=\"T_36394_row1_col2\" class=\"data row1 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", - " <td id=\"T_36394_row1_col3\" class=\"data row1 col3\" >True</td>\n", - " <td id=\"T_36394_row1_col4\" class=\"data row1 col4\" >False</td>\n", - " <td id=\"T_36394_row1_col5\" class=\"data row1 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_36394_row1_col6\" class=\"data row1 col6\" >{}</td>\n", - " <td id=\"T_36394_row1_col7\" class=\"data row1 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_36394_row1_col8\" class=\"data row1 col8\" >['classification', 
'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_36394_row2_col0\" class=\"data row2 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", - " <td id=\"T_36394_row2_col1\" class=\"data row2 col1\" >ROC Curve</td>\n", - " <td id=\"T_36394_row2_col2\" class=\"data row2 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", - " <td id=\"T_36394_row2_col3\" class=\"data row2 col3\" >True</td>\n", - " <td id=\"T_36394_row2_col4\" class=\"data row2 col4\" >False</td>\n", - " <td id=\"T_36394_row2_col5\" class=\"data row2 col5\" >['model', 'dataset']</td>\n", - " <td id=\"T_36394_row2_col6\" class=\"data row2 col6\" >{}</td>\n", - " <td id=\"T_36394_row2_col7\" class=\"data row2 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_36394_row2_col8\" class=\"data row2 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_36394_row3_col0\" class=\"data row3 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", - " <td id=\"T_36394_row3_col1\" class=\"data row3 col1\" >Training Test Degradation</td>\n", - " <td id=\"T_36394_row3_col2\" class=\"data row3 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", - " <td id=\"T_36394_row3_col3\" class=\"data row3 col3\" >False</td>\n", - " <td id=\"T_36394_row3_col4\" class=\"data row3 col4\" >True</td>\n", - " <td id=\"T_36394_row3_col5\" class=\"data row3 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_36394_row3_col6\" class=\"data row3 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", - " <td id=\"T_36394_row3_col7\" class=\"data row3 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", - " <td 
id=\"T_36394_row3_col8\" class=\"data row3 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_36394_row4_col0\" class=\"data row4 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", - " <td id=\"T_36394_row4_col1\" class=\"data row4 col1\" >Calibration Curve Drift</td>\n", - " <td id=\"T_36394_row4_col2\" class=\"data row4 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", - " <td id=\"T_36394_row4_col3\" class=\"data row4 col3\" >True</td>\n", - " <td id=\"T_36394_row4_col4\" class=\"data row4 col4\" >True</td>\n", - " <td id=\"T_36394_row4_col5\" class=\"data row4 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_36394_row4_col6\" class=\"data row4 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", - " <td id=\"T_36394_row4_col7\" class=\"data row4 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_36394_row4_col8\" class=\"data row4 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " <tr>\n", - " <td id=\"T_36394_row5_col0\" class=\"data row5 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", - " <td id=\"T_36394_row5_col1\" class=\"data row5 col1\" >ROC Curve Drift</td>\n", - " <td id=\"T_36394_row5_col2\" class=\"data row5 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", - " <td id=\"T_36394_row5_col3\" class=\"data row5 col3\" >True</td>\n", - " <td id=\"T_36394_row5_col4\" class=\"data row5 col4\" >False</td>\n", - " <td id=\"T_36394_row5_col5\" class=\"data row5 col5\" >['datasets', 'model']</td>\n", - " <td id=\"T_36394_row5_col6\" class=\"data row5 col6\" >{}</td>\n", - " <td id=\"T_36394_row5_col7\" class=\"data row5 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", - " <td id=\"T_36394_row5_col8\" 
class=\"data row5 col8\" >['classification', 'text_classification']</td>\n", - " </tr>\n", - " </tbody>\n", - "</table>\n" + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "<style type=\"text/css\">\n", + "#T_4d8bf th {\n", + " text-align: left;\n", + "}\n", + "#T_4d8bf_row0_col0, #T_4d8bf_row0_col1, #T_4d8bf_row0_col2, #T_4d8bf_row0_col3, #T_4d8bf_row0_col4, #T_4d8bf_row0_col5, #T_4d8bf_row0_col6, #T_4d8bf_row0_col7, #T_4d8bf_row0_col8, #T_4d8bf_row1_col0, #T_4d8bf_row1_col1, #T_4d8bf_row1_col2, #T_4d8bf_row1_col3, #T_4d8bf_row1_col4, #T_4d8bf_row1_col5, #T_4d8bf_row1_col6, #T_4d8bf_row1_col7, #T_4d8bf_row1_col8, #T_4d8bf_row2_col0, #T_4d8bf_row2_col1, #T_4d8bf_row2_col2, #T_4d8bf_row2_col3, #T_4d8bf_row2_col4, #T_4d8bf_row2_col5, #T_4d8bf_row2_col6, #T_4d8bf_row2_col7, #T_4d8bf_row2_col8, #T_4d8bf_row3_col0, #T_4d8bf_row3_col1, #T_4d8bf_row3_col2, #T_4d8bf_row3_col3, #T_4d8bf_row3_col4, #T_4d8bf_row3_col5, #T_4d8bf_row3_col6, #T_4d8bf_row3_col7, #T_4d8bf_row3_col8, #T_4d8bf_row4_col0, #T_4d8bf_row4_col1, #T_4d8bf_row4_col2, #T_4d8bf_row4_col3, #T_4d8bf_row4_col4, #T_4d8bf_row4_col5, #T_4d8bf_row4_col6, #T_4d8bf_row4_col7, #T_4d8bf_row4_col8, #T_4d8bf_row5_col0, #T_4d8bf_row5_col1, #T_4d8bf_row5_col2, #T_4d8bf_row5_col3, #T_4d8bf_row5_col4, #T_4d8bf_row5_col5, #T_4d8bf_row5_col6, #T_4d8bf_row5_col7, #T_4d8bf_row5_col8, #T_4d8bf_row6_col0, #T_4d8bf_row6_col1, #T_4d8bf_row6_col2, #T_4d8bf_row6_col3, #T_4d8bf_row6_col4, #T_4d8bf_row6_col5, #T_4d8bf_row6_col6, #T_4d8bf_row6_col7, #T_4d8bf_row6_col8 {\n", + " text-align: left;\n", + "}\n", + "</style>\n", + "<table id=\"T_4d8bf\">\n", + " <thead>\n", + " <tr>\n", + " <th id=\"T_4d8bf_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", + " <th id=\"T_4d8bf_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", + " <th id=\"T_4d8bf_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", + " <th id=\"T_4d8bf_level0_col3\" 
class=\"col_heading level0 col3\" >Has Figure</th>\n", + " <th id=\"T_4d8bf_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", + " <th id=\"T_4d8bf_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", + " <th id=\"T_4d8bf_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", + " <th id=\"T_4d8bf_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", + " <th id=\"T_4d8bf_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <td id=\"T_4d8bf_row0_col0\" class=\"data row0 col0\" >validmind.model_validation.RegressionResidualsPlot</td>\n", + " <td id=\"T_4d8bf_row0_col1\" class=\"data row0 col1\" >Regression Residuals Plot</td>\n", + " <td id=\"T_4d8bf_row0_col2\" class=\"data row0 col2\" >Evaluates regression model performance using residual distribution and actual vs. predicted plots....</td>\n", + " <td id=\"T_4d8bf_row0_col3\" class=\"data row0 col3\" >True</td>\n", + " <td id=\"T_4d8bf_row0_col4\" class=\"data row0 col4\" >False</td>\n", + " <td id=\"T_4d8bf_row0_col5\" class=\"data row0 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_4d8bf_row0_col6\" class=\"data row0 col6\" >{'bin_size': {'type': 'float', 'default': 0.1}}</td>\n", + " <td id=\"T_4d8bf_row0_col7\" class=\"data row0 col7\" >['model_performance', 'visualization']</td>\n", + " <td id=\"T_4d8bf_row0_col8\" class=\"data row0 col8\" >['regression']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_4d8bf_row1_col0\" class=\"data row1 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", + " <td id=\"T_4d8bf_row1_col1\" class=\"data row1 col1\" >Confusion Matrix</td>\n", + " <td id=\"T_4d8bf_row1_col2\" class=\"data row1 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", + " <td id=\"T_4d8bf_row1_col3\" class=\"data row1 col3\" >True</td>\n", + " <td id=\"T_4d8bf_row1_col4\" 
class=\"data row1 col4\" >False</td>\n", + " <td id=\"T_4d8bf_row1_col5\" class=\"data row1 col5\" >['dataset', 'model']</td>\n", + " <td id=\"T_4d8bf_row1_col6\" class=\"data row1 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_4d8bf_row1_col7\" class=\"data row1 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_4d8bf_row1_col8\" class=\"data row1 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_4d8bf_row2_col0\" class=\"data row2 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", + " <td id=\"T_4d8bf_row2_col1\" class=\"data row2 col1\" >Precision Recall Curve</td>\n", + " <td id=\"T_4d8bf_row2_col2\" class=\"data row2 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", + " <td id=\"T_4d8bf_row2_col3\" class=\"data row2 col3\" >True</td>\n", + " <td id=\"T_4d8bf_row2_col4\" class=\"data row2 col4\" >False</td>\n", + " <td id=\"T_4d8bf_row2_col5\" class=\"data row2 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_4d8bf_row2_col6\" class=\"data row2 col6\" >{}</td>\n", + " <td id=\"T_4d8bf_row2_col7\" class=\"data row2 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_4d8bf_row2_col8\" class=\"data row2 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_4d8bf_row3_col0\" class=\"data row3 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", + " <td id=\"T_4d8bf_row3_col1\" class=\"data row3 col1\" >ROC Curve</td>\n", + " <td id=\"T_4d8bf_row3_col2\" class=\"data row3 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", + " <td id=\"T_4d8bf_row3_col3\" class=\"data row3 col3\" >True</td>\n", + " 
<td id=\"T_4d8bf_row3_col4\" class=\"data row3 col4\" >False</td>\n", + " <td id=\"T_4d8bf_row3_col5\" class=\"data row3 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_4d8bf_row3_col6\" class=\"data row3 col6\" >{}</td>\n", + " <td id=\"T_4d8bf_row3_col7\" class=\"data row3 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_4d8bf_row3_col8\" class=\"data row3 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_4d8bf_row4_col0\" class=\"data row4 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", + " <td id=\"T_4d8bf_row4_col1\" class=\"data row4 col1\" >Training Test Degradation</td>\n", + " <td id=\"T_4d8bf_row4_col2\" class=\"data row4 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", + " <td id=\"T_4d8bf_row4_col3\" class=\"data row4 col3\" >False</td>\n", + " <td id=\"T_4d8bf_row4_col4\" class=\"data row4 col4\" >True</td>\n", + " <td id=\"T_4d8bf_row4_col5\" class=\"data row4 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_4d8bf_row4_col6\" class=\"data row4 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", + " <td id=\"T_4d8bf_row4_col7\" class=\"data row4 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_4d8bf_row4_col8\" class=\"data row4 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_4d8bf_row5_col0\" class=\"data row5 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", + " <td id=\"T_4d8bf_row5_col1\" class=\"data row5 col1\" >Calibration Curve Drift</td>\n", + " <td id=\"T_4d8bf_row5_col2\" class=\"data row5 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", + " <td 
id=\"T_4d8bf_row5_col3\" class=\"data row5 col3\" >True</td>\n", + " <td id=\"T_4d8bf_row5_col4\" class=\"data row5 col4\" >True</td>\n", + " <td id=\"T_4d8bf_row5_col5\" class=\"data row5 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_4d8bf_row5_col6\" class=\"data row5 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", + " <td id=\"T_4d8bf_row5_col7\" class=\"data row5 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_4d8bf_row5_col8\" class=\"data row5 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_4d8bf_row6_col0\" class=\"data row6 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", + " <td id=\"T_4d8bf_row6_col1\" class=\"data row6 col1\" >ROC Curve Drift</td>\n", + " <td id=\"T_4d8bf_row6_col2\" class=\"data row6 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", + " <td id=\"T_4d8bf_row6_col3\" class=\"data row6 col3\" >True</td>\n", + " <td id=\"T_4d8bf_row6_col4\" class=\"data row6 col4\" >False</td>\n", + " <td id=\"T_4d8bf_row6_col5\" class=\"data row6 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_4d8bf_row6_col6\" class=\"data row6 col6\" >{}</td>\n", + " <td id=\"T_4d8bf_row6_col7\" class=\"data row6 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_4d8bf_row6_col8\" class=\"data row6 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n" + ], + "text/plain": [ + "<pandas.io.formats.style.Styler at 0x36a280f40>" + ] + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" + } ], - "text/plain": [ - "<pandas.io.formats.style.Styler at 0x380009c40>" + "source": [ + "list_tests(tags=[\"model_performance\", \"visualization\"])" ] - }, - "execution_count": 9, - "metadata": {}, - "output_type": 
"execute_result" - } - ], - "source": [ - "list_tests(filter=\"sklearn\",\n", - " tags=[\"model_performance\", \"visualization\"], task=\"classification\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Store test sets for use\n", - "\n", - "Once you've identified specific sets of tests you'd like to run, you can store the tests in variables, enabling you to easily reuse those tests in later steps.\n", - "\n", - "For example, if you're validating a summarization model, use [`list_tests()`](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to retrieve all tests tagged for text summarization and save them to `text_summarization_tests` for later use:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Use `filter`, `task`, and `tags` together to create more specific queries.\n", + "\n", + "For example, apply all three to find tests compatible with `sklearn` models, designed for `classification` tasks:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "<style type=\"text/css\">\n", + "#T_36394 th {\n", + " text-align: left;\n", + "}\n", + "#T_36394_row0_col0, #T_36394_row0_col1, #T_36394_row0_col2, #T_36394_row0_col3, #T_36394_row0_col4, #T_36394_row0_col5, #T_36394_row0_col6, #T_36394_row0_col7, #T_36394_row0_col8, #T_36394_row1_col0, #T_36394_row1_col1, #T_36394_row1_col2, #T_36394_row1_col3, #T_36394_row1_col4, #T_36394_row1_col5, #T_36394_row1_col6, #T_36394_row1_col7, #T_36394_row1_col8, #T_36394_row2_col0, #T_36394_row2_col1, #T_36394_row2_col2, #T_36394_row2_col3, #T_36394_row2_col4, #T_36394_row2_col5, #T_36394_row2_col6, #T_36394_row2_col7, #T_36394_row2_col8, #T_36394_row3_col0, #T_36394_row3_col1, #T_36394_row3_col2, #T_36394_row3_col3, #T_36394_row3_col4, 
#T_36394_row3_col5, #T_36394_row3_col6, #T_36394_row3_col7, #T_36394_row3_col8, #T_36394_row4_col0, #T_36394_row4_col1, #T_36394_row4_col2, #T_36394_row4_col3, #T_36394_row4_col4, #T_36394_row4_col5, #T_36394_row4_col6, #T_36394_row4_col7, #T_36394_row4_col8, #T_36394_row5_col0, #T_36394_row5_col1, #T_36394_row5_col2, #T_36394_row5_col3, #T_36394_row5_col4, #T_36394_row5_col5, #T_36394_row5_col6, #T_36394_row5_col7, #T_36394_row5_col8 {\n", + " text-align: left;\n", + "}\n", + "</style>\n", + "<table id=\"T_36394\">\n", + " <thead>\n", + " <tr>\n", + " <th id=\"T_36394_level0_col0\" class=\"col_heading level0 col0\" >ID</th>\n", + " <th id=\"T_36394_level0_col1\" class=\"col_heading level0 col1\" >Name</th>\n", + " <th id=\"T_36394_level0_col2\" class=\"col_heading level0 col2\" >Description</th>\n", + " <th id=\"T_36394_level0_col3\" class=\"col_heading level0 col3\" >Has Figure</th>\n", + " <th id=\"T_36394_level0_col4\" class=\"col_heading level0 col4\" >Has Table</th>\n", + " <th id=\"T_36394_level0_col5\" class=\"col_heading level0 col5\" >Required Inputs</th>\n", + " <th id=\"T_36394_level0_col6\" class=\"col_heading level0 col6\" >Params</th>\n", + " <th id=\"T_36394_level0_col7\" class=\"col_heading level0 col7\" >Tags</th>\n", + " <th id=\"T_36394_level0_col8\" class=\"col_heading level0 col8\" >Tasks</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <td id=\"T_36394_row0_col0\" class=\"data row0 col0\" >validmind.model_validation.sklearn.ConfusionMatrix</td>\n", + " <td id=\"T_36394_row0_col1\" class=\"data row0 col1\" >Confusion Matrix</td>\n", + " <td id=\"T_36394_row0_col2\" class=\"data row0 col2\" >Evaluates and visually represents the classification ML model's predictive performance using a Confusion Matrix...</td>\n", + " <td id=\"T_36394_row0_col3\" class=\"data row0 col3\" >True</td>\n", + " <td id=\"T_36394_row0_col4\" class=\"data row0 col4\" >False</td>\n", + " <td id=\"T_36394_row0_col5\" class=\"data row0 col5\" 
>['dataset', 'model']</td>\n", + " <td id=\"T_36394_row0_col6\" class=\"data row0 col6\" >{'threshold': {'type': 'float', 'default': 0.5}}</td>\n", + " <td id=\"T_36394_row0_col7\" class=\"data row0 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_36394_row0_col8\" class=\"data row0 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_36394_row1_col0\" class=\"data row1 col0\" >validmind.model_validation.sklearn.PrecisionRecallCurve</td>\n", + " <td id=\"T_36394_row1_col1\" class=\"data row1 col1\" >Precision Recall Curve</td>\n", + " <td id=\"T_36394_row1_col2\" class=\"data row1 col2\" >Evaluates the precision-recall trade-off for binary classification models and visualizes the Precision-Recall curve....</td>\n", + " <td id=\"T_36394_row1_col3\" class=\"data row1 col3\" >True</td>\n", + " <td id=\"T_36394_row1_col4\" class=\"data row1 col4\" >False</td>\n", + " <td id=\"T_36394_row1_col5\" class=\"data row1 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_36394_row1_col6\" class=\"data row1 col6\" >{}</td>\n", + " <td id=\"T_36394_row1_col7\" class=\"data row1 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_36394_row1_col8\" class=\"data row1 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_36394_row2_col0\" class=\"data row2 col0\" >validmind.model_validation.sklearn.ROCCurve</td>\n", + " <td id=\"T_36394_row2_col1\" class=\"data row2 col1\" >ROC Curve</td>\n", + " <td id=\"T_36394_row2_col2\" class=\"data row2 col2\" >Evaluates binary classification model performance by generating and plotting the Receiver Operating Characteristic...</td>\n", + " <td id=\"T_36394_row2_col3\" class=\"data row2 col3\" >True</td>\n", + " <td id=\"T_36394_row2_col4\" class=\"data row2 col4\" >False</td>\n", + " <td 
id=\"T_36394_row2_col5\" class=\"data row2 col5\" >['model', 'dataset']</td>\n", + " <td id=\"T_36394_row2_col6\" class=\"data row2 col6\" >{}</td>\n", + " <td id=\"T_36394_row2_col7\" class=\"data row2 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_36394_row2_col8\" class=\"data row2 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_36394_row3_col0\" class=\"data row3 col0\" >validmind.model_validation.sklearn.TrainingTestDegradation</td>\n", + " <td id=\"T_36394_row3_col1\" class=\"data row3 col1\" >Training Test Degradation</td>\n", + " <td id=\"T_36394_row3_col2\" class=\"data row3 col2\" >Tests if model performance degradation between training and test datasets exceeds a predefined threshold....</td>\n", + " <td id=\"T_36394_row3_col3\" class=\"data row3 col3\" >False</td>\n", + " <td id=\"T_36394_row3_col4\" class=\"data row3 col4\" >True</td>\n", + " <td id=\"T_36394_row3_col5\" class=\"data row3 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_36394_row3_col6\" class=\"data row3 col6\" >{'max_threshold': {'type': 'float', 'default': 0.1}}</td>\n", + " <td id=\"T_36394_row3_col7\" class=\"data row3 col7\" >['sklearn', 'binary_classification', 'multiclass_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_36394_row3_col8\" class=\"data row3 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_36394_row4_col0\" class=\"data row4 col0\" >validmind.ongoing_monitoring.CalibrationCurveDrift</td>\n", + " <td id=\"T_36394_row4_col1\" class=\"data row4 col1\" >Calibration Curve Drift</td>\n", + " <td id=\"T_36394_row4_col2\" class=\"data row4 col2\" >Evaluates changes in probability calibration between reference and monitoring datasets....</td>\n", + " <td id=\"T_36394_row4_col3\" class=\"data row4 col3\" >True</td>\n", + " <td 
id=\"T_36394_row4_col4\" class=\"data row4 col4\" >True</td>\n", + " <td id=\"T_36394_row4_col5\" class=\"data row4 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_36394_row4_col6\" class=\"data row4 col6\" >{'n_bins': {'type': 'int', 'default': 10}, 'drift_pct_threshold': {'type': 'float', 'default': 20}}</td>\n", + " <td id=\"T_36394_row4_col7\" class=\"data row4 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_36394_row4_col8\" class=\"data row4 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " <tr>\n", + " <td id=\"T_36394_row5_col0\" class=\"data row5 col0\" >validmind.ongoing_monitoring.ROCCurveDrift</td>\n", + " <td id=\"T_36394_row5_col1\" class=\"data row5 col1\" >ROC Curve Drift</td>\n", + " <td id=\"T_36394_row5_col2\" class=\"data row5 col2\" >Compares ROC curves between reference and monitoring datasets....</td>\n", + " <td id=\"T_36394_row5_col3\" class=\"data row5 col3\" >True</td>\n", + " <td id=\"T_36394_row5_col4\" class=\"data row5 col4\" >False</td>\n", + " <td id=\"T_36394_row5_col5\" class=\"data row5 col5\" >['datasets', 'model']</td>\n", + " <td id=\"T_36394_row5_col6\" class=\"data row5 col6\" >{}</td>\n", + " <td id=\"T_36394_row5_col7\" class=\"data row5 col7\" >['sklearn', 'binary_classification', 'model_performance', 'visualization']</td>\n", + " <td id=\"T_36394_row5_col8\" class=\"data row5 col8\" >['classification', 'text_classification']</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n" + ], + "text/plain": [ + "<pandas.io.formats.style.Styler at 0x380009c40>" + ] + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "list_tests(filter=\"sklearn\",\n", + " tags=[\"model_performance\", \"visualization\"], task=\"classification\"\n", + ")" + ] + }, { - "data": { - "text/plain": [ - "['validmind.data_validation.DatasetDescription',\n", - " 'validmind.data_validation.DatasetSplit',\n", - 
" 'validmind.data_validation.nlp.CommonWords',\n", - " 'validmind.data_validation.nlp.Hashtags',\n", - " 'validmind.data_validation.nlp.LanguageDetection',\n", - " 'validmind.data_validation.nlp.Mentions',\n", - " 'validmind.data_validation.nlp.Punctuations',\n", - " 'validmind.data_validation.nlp.StopWords',\n", - " 'validmind.data_validation.nlp.TextDescription',\n", - " 'validmind.model_validation.BertScore',\n", - " 'validmind.model_validation.BleuScore',\n", - " 'validmind.model_validation.ContextualRecall',\n", - " 'validmind.model_validation.MeteorScore',\n", - " 'validmind.model_validation.RegardScore',\n", - " 'validmind.model_validation.RougeScore',\n", - " 'validmind.model_validation.TokenDisparity',\n", - " 'validmind.model_validation.ToxicityScore',\n", - " 'validmind.model_validation.embeddings.CosineSimilarityComparison',\n", - " 'validmind.model_validation.embeddings.CosineSimilarityHeatmap',\n", - " 'validmind.model_validation.embeddings.EuclideanDistanceComparison',\n", - " 'validmind.model_validation.embeddings.EuclideanDistanceHeatmap',\n", - " 'validmind.model_validation.embeddings.PCAComponentsPairwisePlots',\n", - " 'validmind.model_validation.embeddings.TSNEComponentsPairwisePlots',\n", - " 'validmind.model_validation.ragas.AnswerCorrectness',\n", - " 'validmind.model_validation.ragas.AspectCritic',\n", - " 'validmind.model_validation.ragas.ContextEntityRecall',\n", - " 'validmind.model_validation.ragas.ContextPrecision',\n", - " 'validmind.model_validation.ragas.ContextPrecisionWithoutReference',\n", - " 'validmind.model_validation.ragas.ContextRecall',\n", - " 'validmind.model_validation.ragas.Faithfulness',\n", - " 'validmind.model_validation.ragas.NoiseSensitivity',\n", - " 'validmind.model_validation.ragas.ResponseRelevancy',\n", - " 'validmind.model_validation.ragas.SemanticSimilarity',\n", - " 'validmind.prompt_validation.Bias',\n", - " 'validmind.prompt_validation.Clarity',\n", - " 'validmind.prompt_validation.Conciseness',\n", - " 
'validmind.prompt_validation.Delimitation',\n", - " 'validmind.prompt_validation.NegativeInstruction',\n", - " 'validmind.prompt_validation.Robustness',\n", - " 'validmind.prompt_validation.Specificity']" + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Store test sets for use\n", + "\n", + "Once you've identified specific sets of tests you'd like to run, you can store the tests in variables, enabling you to easily reuse those tests in later steps.\n", + "\n", + "For example, if you're validating a summarization model, use [`list_tests()`](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to retrieve all tests tagged for text summarization and save them to `text_summarization_tests` for later use:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['validmind.data_validation.DatasetDescription',\n", + " 'validmind.data_validation.DatasetSplit',\n", + " 'validmind.data_validation.nlp.CommonWords',\n", + " 'validmind.data_validation.nlp.Hashtags',\n", + " 'validmind.data_validation.nlp.LanguageDetection',\n", + " 'validmind.data_validation.nlp.Mentions',\n", + " 'validmind.data_validation.nlp.Punctuations',\n", + " 'validmind.data_validation.nlp.StopWords',\n", + " 'validmind.data_validation.nlp.TextDescription',\n", + " 'validmind.model_validation.BertScore',\n", + " 'validmind.model_validation.BleuScore',\n", + " 'validmind.model_validation.ContextualRecall',\n", + " 'validmind.model_validation.MeteorScore',\n", + " 'validmind.model_validation.RegardScore',\n", + " 'validmind.model_validation.RougeScore',\n", + " 'validmind.model_validation.TokenDisparity',\n", + " 'validmind.model_validation.ToxicityScore',\n", + " 'validmind.model_validation.embeddings.CosineSimilarityComparison',\n", + " 'validmind.model_validation.embeddings.CosineSimilarityHeatmap',\n", + " 
'validmind.model_validation.embeddings.EuclideanDistanceComparison',\n", + " 'validmind.model_validation.embeddings.EuclideanDistanceHeatmap',\n", + " 'validmind.model_validation.embeddings.PCAComponentsPairwisePlots',\n", + " 'validmind.model_validation.embeddings.TSNEComponentsPairwisePlots',\n", + " 'validmind.model_validation.ragas.AnswerCorrectness',\n", + " 'validmind.model_validation.ragas.AspectCritic',\n", + " 'validmind.model_validation.ragas.ContextEntityRecall',\n", + " 'validmind.model_validation.ragas.ContextPrecision',\n", + " 'validmind.model_validation.ragas.ContextPrecisionWithoutReference',\n", + " 'validmind.model_validation.ragas.ContextRecall',\n", + " 'validmind.model_validation.ragas.Faithfulness',\n", + " 'validmind.model_validation.ragas.NoiseSensitivity',\n", + " 'validmind.model_validation.ragas.ResponseRelevancy',\n", + " 'validmind.model_validation.ragas.SemanticSimilarity',\n", + " 'validmind.prompt_validation.Bias',\n", + " 'validmind.prompt_validation.Clarity',\n", + " 'validmind.prompt_validation.Conciseness',\n", + " 'validmind.prompt_validation.Delimitation',\n", + " 'validmind.prompt_validation.NegativeInstruction',\n", + " 'validmind.prompt_validation.Robustness',\n", + " 'validmind.prompt_validation.Specificity']" + ] + }, + "execution_count": null, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "text_summarization_tests = list_tests(task=\"text_summarization\", pretty=False)\n", + "text_summarization_tests" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "Now that you know how to browse and filter tests in the ValidMind Library, you’re ready to take the next step. 
Use the test IDs you’ve selected to either run individual tests or batch run them with custom test suites.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn about the test suites available in the ValidMind Library.</b></span>\n", + "<br></br>\n", + "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_test_suites.html\" style=\"color: #DE257E;\"><b>Explore test suites</b></a> notebook for more code examples and usage of key functions.</div>\n", + "\n", + "<a id='toc7_1__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you'll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "%pip show validmind" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "id": "copyright-fb6994d364c54669b356f7a2278d6480", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" ] - }, - "execution_count": 10, - "metadata": {}, - "output_type": "execute_result" } - ], - "source": [ - "text_summarization_tests = list_tests(task=\"text_summarization\", pretty=False)\n", - "text_summarization_tests" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "Now that you know how to browse and filter tests in the ValidMind Library, you’re ready to take the next step. Use the test IDs you’ve selected to either run individual tests or batch run them with custom test suites.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn about the test suites available in the ValidMind Library.</b></span>\n", - "<br></br>\n", - "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_test_suites.html\" style=\"color: #DE257E;\"><b>Explore test suites</b></a> notebook for more code examples and usage of key functions.</div>\n", - "\n", - "<a id='toc7_1__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn 
more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you'll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-fb6994d364c54669b356f7a2278d6480", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 4 + "nbformat": 4, + "nbformat_minor": 4 } diff --git a/site/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb b/site/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb index d4f7812c01..ae4b200aa6 100644 --- a/site/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb +++ b/site/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.ipynb @@ -1,781 +1,785 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "976bb3d9", - "metadata": {}, - "source": [ - "# Run dataset-based tests\n", - "\n", - "Learn how to use the ValidMind Library to run tests that take any dataset or record (model) as input. Identify specific tests to run, initialize ValidMind dataset objects in preparation for passing them to your tests, and then run the chosen tests — generating outputs that can be automatically logged to your documentation in the ValidMind Platform." 
- ] - }, - { - "cell_type": "markdown", - "id": "8c4d9b9c", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the documentation template](#toc2_3__) \n", - "- [Explore a ValidMind test](#toc3__) \n", - "- [Working with ValidMind datasets](#toc4__) \n", - " - [Create a sample dataset](#toc4_1__) \n", - " - [Initialize the ValidMind dataset](#toc4_2__) \n", - "- [Running ValidMind tests](#toc5__) \n", - " - [Run test using ValidMind dataset](#toc5_1__) \n", - " - [Run and log test requiring parameters](#toc5_2__) \n", - " - [Log ClassImbalance test with default parameters](#toc5_2_1__) \n", - " - [Log ClassImbalance test with custom paramaters](#toc5_2_2__) \n", - "- [Work with test results](#toc6__) \n", - "- [Next steps](#toc7__) \n", - " - [Discover more learning resources](#toc7_1__) \n", - "- [Upgrade ValidMind](#toc8__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "f49237b3", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. 
\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ] - }, - { - "cell_type": "markdown", - "id": "907737bd", - "metadata": {}, - "source": [ - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "115cdfa7", - "metadata": {}, - "source": [ - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "c3051ca8", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "id": "656db165", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "30fa24d7", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "524602cc", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "b38fc5f6", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "451c5a1b", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook.\n", - "\n", - "1. 
In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "0e55ac40", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "id": "3545620d", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0ed9e84d", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "8fea9380", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e44a2345", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "43ee2f43", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Explore a ValidMind test\n", - "\n", - "Before we run a test, use [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to return information on out-of-the-box tests available in the ValidMind Library.\n", - "\n", - "Let's assume you want to generate the *pearson correlation matrix* for a dataset. A Pearson correlation matrix is a table that shows the [Pearson correlation coefficients](https://en.wikipedia.org/wiki/Pearson_correlation_coefficient) between several variables.\n", - "\n", - "We'll pass in a `filter` to the `list_tests` function to find the test ID for the pearson correlation matrix:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a63e7a43", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(filter=\"PearsonCorrelationMatrix\")" - ] - }, - { - "cell_type": "markdown", - "id": "011de751", - "metadata": {}, - "source": [ - "We've identified from the output that the test ID for the pearson correlation matrix test is `validmind.data_validation.PearsonCorrelationMatrix`.\n", - "\n", - "Use this ID combined with [the `describe_test()` function](https://docs.validmind.ai/validmind/validmind/tests.html#describe_test) to retrieve more information about the test, including its **Required Inputs**:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9886cd27", - "metadata": {}, - "outputs": [], - "source": [ - 
"test_id = \"validmind.data_validation.PearsonCorrelationMatrix\"\n", - "vm.tests.describe_test(test_id)" - ] - }, - { - "cell_type": "markdown", - "id": "f1f7a84a", - "metadata": {}, - "source": [ - "Since this test requires a dataset, you can expect it to throw an error when we run it without passing in a `dataset` as input:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ee38704a", - "metadata": {}, - "outputs": [], - "source": [ - "try:\n", - " vm.tests.run_test(test_id)\n", - "except Exception as e:\n", - " print(e)" - ] - }, - { - "cell_type": "markdown", - "id": "60ede8e0", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn more about the individual tests available in the ValidMind Library</b></span>\n", - "<br></br>\n", - "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" style=\"color: #DE257E;\"><b>Explore tests</b></a> notebook for more code examples and usage of key functions.</div>" - ] - }, - { - "cell_type": "markdown", - "id": "6bcd01d2", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Working with ValidMind datasets" - ] - }, - { - "cell_type": "markdown", - "id": "35331764", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Create a sample dataset\n", - "\n", - "Since we need a dataset to run tests, let's use the [sklearn `make_classification` function](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_classification.html) to generate a random sample dataset for testing.\n", - "\n", - "In the code example below, note that:\n", - "\n", - "- The `make_classification` function generates a synthetic binary classification dataset with `10,000` samples and `10` 
features, where the `weights=[0.1]` parameter creates a class imbalance (roughly 10% of samples are assigned to class `0`, with the remaining ~90% in class `1`).\n", - "- The `random_state=42` parameter ensures reproducibility so you get the same dataset each time you run the code.\n", - "- The generated feature matrix `X` and target array `y` are combined into a single [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) with columns named `feature_0` through `feature_9`, plus a `target` column that has a value of `1` for the positive class and `0` otherwise." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "25774f44", - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "from sklearn.datasets import make_classification\n", - "\n", - "X, y = make_classification(\n", - " n_samples=10000,\n", - " n_features=10,\n", - " weights=[0.1],\n", - " random_state=42,\n", - ")\n", - "X.shape\n", - "y.shape\n", - "\n", - "df = pd.DataFrame(X, columns=[f\"feature_{i}\" for i in range(X.shape[1])])\n", - "df[\"target\"] = y\n", - "df.head()" - ] - }, - { - "cell_type": "markdown", - "id": "3b3032fc", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Initialize the ValidMind dataset\n", - "\n", - "The next step is to connect your data with a ValidMind `Dataset` object. **This step is necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", - "\n", - "ValidMind dataset objects provide a wrapper to any type of dataset (NumPy, Pandas, Polars, etc.) so that tests can run transparently regardless of the underlying library.\n", - "\n", - "Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. 
For this example, we'll pass in the following arguments:\n", - "\n", - "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", - "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "70c52c03", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the ValidMind dataset for the previously created sample `df`\n", - "vm_dataset = vm.init_dataset(\n", - " df,\n", - " input_id=\"my_demo_dataset\",\n", - " target_column=\"target\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "ec65df1b", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Running ValidMind tests\n", - "\n", - "Now that we know how to initialize a ValidMind `dataset` object, we're ready to run some tests!\n", - "\n", - "You run individual tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module. For the examples below, we'll pass in the following arguments:\n", - "\n", - "- **`test_id`** — The ID of the test to run, as seen in the `ID` column when you run `list_tests`.\n", - "- **`inputs`** — A dictionary of test inputs, such as `dataset`, `model`, `datasets`, or `models`. These are ValidMind objects initialized with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) or [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model)." 
- ] - }, - { - "cell_type": "markdown", - "id": "c46789a4", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Run test using ValidMind dataset\n", - "\n", - "Given that our `test_id` is currently set to `validmind.data_validation.PearsonCorrelationMatrix`, we'll get the results of the Pearson Correlation Matrix test as output when we call `run_test()`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0c636915", - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.tests.run_test(\n", - " test_id,\n", - " inputs={\"dataset\": vm_dataset},\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "12694f87", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Run and log test requiring parameters\n", - "\n", - "Our `vm_dataset` can also be used for any other test that requires a dataset input, including tests that take additional parameters.\n", - "\n", - "To demonstrate, let's find a *class imbalance* test to understand the distribution of the target column in the dataset. Class imbalance is a common problem in machine learning, particularly in classification tasks, where the number of instances (or data points) in each class isn't evenly distributed across the available categories.\n", - "\n", - "`Tags` describe what a test applies to and help you filter tests for your use case. 
Use [`list_tags()`](https://docs.validmind.ai/validmind/validmind/tests.html#list_tags) to view all unique tags used to describe tests in the ValidMind Library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "99eaf2da", - "metadata": {}, - "outputs": [], - "source": [ - "# Sort the tags in alphabetical order\n", - "sorted(vm.tests.list_tags())" - ] - }, - { - "cell_type": "markdown", - "id": "561b225a", - "metadata": {}, - "source": [ - "Use `list_tests()`, this time filtering tests by tags for `binary_classification` relating to `tabular_data`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "97a45b6b", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(tags=[\"binary_classification\", \"tabular_data\"])" - ] - }, - { - "cell_type": "markdown", - "id": "4ba2ec07", - "metadata": {}, - "source": [ - "Let's use `describe_test()` again to retrieve more information about the class imbalance test, including confirmation that it accepts some additional parameters, such as `min_percent_threshold`, which allows you to configure the threshold for an acceptable class imbalance:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ec456cd2", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.describe_test(\"validmind.data_validation.ClassImbalance\")" - ] - }, - { - "cell_type": "markdown", - "id": "e419dd51", - "metadata": {}, - "source": [ - "<a id='toc5_2_1__'></a>\n", - "\n", - "#### Log ClassImbalance test with default parameters\n", - "\n", - "Every test result returned by the `run_test()` function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform.\n", - "\n", - "Let's first run the class imbalance test without any parameters to see its output using a default value for the threshold and log the results to the ValidMind Platform for later comparison:" - ] - }, - { - "cell_type": "code", 
- "execution_count": null, - "id": "1c137483", - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.tests.run_test(\n", - " \"validmind.data_validation.ClassImbalance\",\n", - " inputs={\"dataset\": vm_dataset},\n", - ")\n", - "\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "6cc499de", - "metadata": {}, - "source": [ - "<a id='toc5_2_2__'></a>\n", - "\n", - "#### Log ClassImbalance test with custom parameters\n", - "\n", - "From the output, we've confirmed that the class imbalance test meets the pass-fail criteria with the default threshold of 10%. Let's try to run the test with a threshold of 20% to see if it fails.\n", - "\n", - "When running individual tests, **you can use a custom `result_id` to tag the individual result with a unique identifier**, allowing you to submit multiple distinct results for the same test to the ValidMind Platform:\n", - "\n", - "- This `result_id` can be appended to `test_id` with a `:` separator.\n", - "- The `custom_threshold` identifier will correspond with the results of our adjusted `min_percent_threshold` parameter." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2c6f19ad", - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.tests.run_test(\n", - " \"validmind.data_validation.ClassImbalance:custom_threshold\",\n", - " inputs={\"dataset\": vm_dataset},\n", - " params={\"min_percent_threshold\": 20},\n", - ")\n", - "\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "30e82fc3", - "metadata": {}, - "source": [ - "When the threshold is set to 20%, the results show that the class imbalance test fails." - ] - }, - { - "cell_type": "markdown", - "id": "faa09935", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Work with test results\n", - "\n", - "You can look at the output of tests produced by the ValidMind Library right in this notebook where you ran the tests, as you would expect. 
But there is a better way: use the ValidMind Platform to attach the logged test results to your documentation (**Learn more:** [Work with test results](https://docs.validmind.ai/guide/documentation/work-with-test-results.html)):\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you connected to earlier.\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "3. Locate the Data Preparation section and click on **2.1. Data Description** to expand that section.\n", - "\n", - "4. Hover under the logged test block for the default Class Imbalance test until a horizontal dashed line with a **+** button appears, indicating that you can insert a new block.\n", - "\n", - "5. Click **+** and then select **Test-Driven Block** under FROM LIBRARY:\n", - "\n", - " - Click on **VM Library** under TEST-DRIVEN in the left sidebar.\n", - " - Select `ClassImbalance:custom_threshold` as the test.\n", - "\n", - "6. Finally, click **Insert 1 Test Result to Document** to add the test result to the documentation.\n", - "\n", - " Confirm that the individual results for the adjusted threshold class imbalance test have been correctly inserted into section **2.1. Data Description** of the documentation.\n", - "\n", - "You just worked with a draft of your model's documentation, in an easily consumable format matching the structure of the template you previewed at the beginning of this notebook. When you connect to a model with the ValidMind Library, logged test results automatically populate for easy insertion into your documentation.\n", - "\n", - "In the ValidMind Platform, you can make qualitative edits to model documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))" - ] - }, - { - "cell_type": "markdown", - "id": "cbe20d76", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "Now that you know the basics of how to run out-of-the-box tests in the ValidMind Library, you’re ready to take the next step. Use `run_test()` with any combination of datasets or records (models) as inputs to run comparison tests, and log your consolidated test results to the ValidMind Platform.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn how to run comparison tests with the ValidMind Library.</b></span>\n", - "<br></br>\n", - "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/run_tests/2-run_comparison_tests.html\" style=\"color: #DE257E;\"><b>Run comparison tests</b></a> notebook for code examples and usage of key functions.</div>" - ] - }, - { - "cell_type": "markdown", - "id": "ec08c9bc", - "metadata": {}, - "source": [ - "<a id='toc7_1__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "id": "bff625a1", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b5f64e27", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "da29fb9d", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "82837a85", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-38501808b29c456ab97562eebdd497d4", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.12.12" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Run dataset-based tests\n", + "\n", + "Learn how to use the ValidMind Library to run tests that take any dataset or record (model) as input. Identify specific tests to run, initialize ValidMind dataset objects in preparation for passing them to your tests, and then run the chosen tests — generating outputs that can be automatically logged to your documentation in the ValidMind Platform." 
+ ], + "id": "976bb3d9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the documentation template](#toc2_3__) \n", + "- [Explore a ValidMind test](#toc3__) \n", + "- [Working with ValidMind datasets](#toc4__) \n", + " - [Create a sample dataset](#toc4_1__) \n", + " - [Initialize the ValidMind dataset](#toc4_2__) \n", + "- [Running ValidMind tests](#toc5__) \n", + " - [Run test using ValidMind dataset](#toc5_1__) \n", + " - [Run and log test requiring parameters](#toc5_2__) \n", + " - [Log ClassImbalance test with default parameters](#toc5_2_1__) \n", + " - [Log ClassImbalance test with custom parameters](#toc5_2_2__) \n", + "- [Work with test results](#toc6__) \n", + "- [Next steps](#toc7__) \n", + " - [Discover more learning resources](#toc7_1__) \n", + "- [Upgrade ValidMind](#toc8__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "8c4d9b9c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. 
\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." + ], + "id": "f49237b3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "907737bd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "115cdfa7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ], + "id": "c3051ca8" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ], + "id": "656db165" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ], + "id": "30fa24d7" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "524602cc" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "b38fc5f6" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook.\n", + "\n", + "1. 
In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ], + "id": "451c5a1b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." 
+ ], + "id": "0e55ac40" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "3545620d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "0ed9e84d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ], + "id": "8fea9380" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "e44a2345" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Explore a ValidMind test\n", + "\n", + "Before we run a test, use [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to return information on out-of-the-box tests available in the ValidMind Library.\n", + "\n", + "Let's assume you want to generate the *Pearson correlation matrix* for a dataset. A Pearson correlation matrix is a table that shows the [Pearson correlation coefficients](https://en.wikipedia.org/wiki/Pearson_correlation_coefficient) between several variables.\n", + "\n", + "We'll pass in a `filter` to the `list_tests` function to find the test ID for the Pearson correlation matrix:" + ], + "id": "43ee2f43" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(filter=\"PearsonCorrelationMatrix\")" + ], + "execution_count": null, + "outputs": [], + "id": "a63e7a43" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We've identified from the output that the test ID for the Pearson correlation matrix test is `validmind.data_validation.PearsonCorrelationMatrix`.\n", + "\n", + "Use this ID combined with [the `describe_test()` function](https://docs.validmind.ai/validmind/validmind/tests.html#describe_test) to retrieve more information about the test, including its **Required Inputs**:" + ], + "id": "011de751" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test_id = 
\"validmind.data_validation.PearsonCorrelationMatrix\"\n", + "vm.tests.describe_test(test_id)" + ], + "execution_count": null, + "outputs": [], + "id": "9886cd27" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since this test requires a dataset, you can expect it to throw an error when we run it without passing in a `dataset` as input:" + ], + "id": "f1f7a84a" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "try:\n", + " vm.tests.run_test(test_id)\n", + "except Exception as e:\n", + " print(e)" + ], + "execution_count": null, + "outputs": [], + "id": "ee38704a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn more about the individual tests available in the ValidMind Library</b></span>\n", + "<br></br>\n", + "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" style=\"color: #DE257E;\"><b>Explore tests</b></a> notebook for more code examples and usage of key functions.</div>" + ], + "id": "60ede8e0" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Working with ValidMind datasets" + ], + "id": "6bcd01d2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Create a sample dataset\n", + "\n", + "Since we need a dataset to run tests, let's use the [sklearn `make_classification` function](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_classification.html) to generate a random sample dataset for testing.\n", + "\n", + "In the code example below, note that:\n", + "\n", + "- The `make_classification` function generates a synthetic binary classification dataset with 
`10,000` samples and `10` features, where the `weights=[0.1]` parameter creates a class imbalance (roughly 10% of samples in class `0`, with the rest in class `1`).\n", + "- The `random_state=42` parameter ensures reproducibility so you get the same dataset each time you run the code.\n", + "- The generated feature matrix `X` and target array `y` are combined into a single [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) with columns named `feature_0` through `feature_9`, plus a `target` column that has a value of `1` for the positive class and `0` otherwise." + ], + "id": "35331764" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "from sklearn.datasets import make_classification\n", + "\n", + "X, y = make_classification(\n", + " n_samples=10000,\n", + " n_features=10,\n", + " weights=[0.1],\n", + " random_state=42,\n", + ")\n", + "X.shape\n", + "y.shape\n", + "\n", + "df = pd.DataFrame(X, columns=[f\"feature_{i}\" for i in range(X.shape[1])])\n", + "df[\"target\"] = y\n", + "df.head()" + ], + "execution_count": null, + "outputs": [], + "id": "25774f44" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Initialize the ValidMind dataset\n", + "\n", + "The next step is to connect your data with a ValidMind `Dataset` object. **This step is necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", + "\n", + "ValidMind dataset objects provide a wrapper to any type of dataset (NumPy, Pandas, Polars, etc.) so that tests can run transparently regardless of the underlying library.\n", + "\n", + "Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. 
For this example, we'll pass in the following arguments:\n", + "\n", + "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", + "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset." + ], + "id": "3b3032fc" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the ValidMind dataset for the previously created sample `df`\n", + "vm_dataset = vm.init_dataset(\n", + " df,\n", + " input_id=\"my_demo_dataset\",\n", + " target_column=\"target\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "70c52c03" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Running ValidMind tests\n", + "\n", + "Now that we know how to initialize a ValidMind `dataset` object, we're ready to run some tests!\n", + "\n", + "You run individual tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module. For the examples below, we'll pass in the following arguments:\n", + "\n", + "- **`test_id`** — The ID of the test to run, as seen in the `ID` column when you run `list_tests`.\n", + "- **`inputs`** — A dictionary of test inputs, such as `dataset`, `model`, `datasets`, or `models`. These are ValidMind objects initialized with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) or [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model)." 
+ ], + "id": "ec65df1b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Run test using ValidMind dataset\n", + "\n", + "Given that our `test_id` is currently set to `validmind.data_validation.PearsonCorrelationMatrix`, we'll get the results of the Pearson Correlation Matrix test as output when we call `run_test()`:" + ], + "id": "c46789a4" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.tests.run_test(\n", + " test_id,\n", + " inputs={\"dataset\": vm_dataset},\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "0c636915" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Run and log test requiring parameters\n", + "\n", + "Our `vm_dataset` can also be used for any other test that requires a dataset input, including tests that take additional parameters.\n", + "\n", + "Let's find a *class imbalance* test to understand the distribution of the target column in the dataset to demonstrate. Class imbalance is a common problem in machine learning, particularly in classification tasks, where the number of instances (or data points) in each class isn't evenly distributed across the available categories.\n", + "\n", + "`Tags` describe what a test applies to and help you filter tests for your use case. 
Use [`list_tags()`](https://docs.validmind.ai/validmind/validmind/tests.html#list_tags) to view all unique tags used to describe tests in the ValidMind Library:" + ], + "id": "12694f87" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Sort the tags alphabetically\n", + "sorted(vm.tests.list_tags())" + ], + "execution_count": null, + "outputs": [], + "id": "99eaf2da" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Use `list_tests()`, this time filtering tests by tags for `binary_classification` relating to `tabular_data`:" + ], + "id": "561b225a" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(tags=[\"binary_classification\", \"tabular_data\"])" + ], + "execution_count": null, + "outputs": [], + "id": "97a45b6b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's use `describe_test()` again to retrieve more information about the `ClassImbalance` test, including confirmation that it accepts some additional parameters, such as `min_percent_threshold`, which allows you to configure the threshold for an acceptable class imbalance:" + ], + "id": "4ba2ec07" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.describe_test(\"validmind.data_validation.ClassImbalance\")" + ], + "execution_count": null, + "outputs": [], + "id": "ec456cd2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2_1__'></a>\n", + "\n", + "#### Log ClassImbalance test with default parameters\n", + "\n", + "Every test result returned by the `run_test()` function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform.\n", + "\n", + "Let's first run the class imbalance test without any parameters to see its output using a default value for the threshold and log the results to the ValidMind Platform for later comparison:" + ], + "id": "e419dd51" + }, + { + 
"cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.tests.run_test(\n", + " \"validmind.data_validation.ClassImbalance\",\n", + " inputs={\"dataset\": vm_dataset},\n", + ")\n", + "\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "1c137483" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2_2__'></a>\n", + "\n", + "#### Log ClassImbalance test with custom paramaters\n", + "\n", + "From the output, we've confirmed that the class imbalance test passes the pass-fail criteria with the default threshold of 10%. Let's try to run the test with a threshold of 20% to see if it fails.\n", + "\n", + "When running individual tests, **you can use a custom `result_id` to tag the individual result with a unique identifier**, allowing you to submit individual results for the same test to the ValidMind Platform:\n", + "\n", + "- This `result_id` can be appended to `test_id` with a `:` separator.\n", + "- The `custom_threshold` identifier will correspond with the results of our adjusted `min_percent_threshold` parameter." + ], + "id": "6cc499de" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.tests.run_test(\n", + " \"validmind.data_validation.ClassImbalance:custom_threshold\",\n", + " inputs={\"dataset\": vm_dataset},\n", + " params={\"min_percent_threshold\": 20},\n", + ")\n", + "\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "2c6f19ad" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "When the threshold is set to 20%, the results show that the class imbalance test fails." + ], + "id": "30e82fc3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Work with test results\n", + "\n", + "You can look at the output of tests produced by the ValidMind Library right in this notebook where you ran the tests, as you would expect. 
But there is a better way: use the ValidMind Platform to attach the logged test results to your documentation (**Learn more:** [Work with test results](https://docs.validmind.ai/guide/documentation/work-with-test-results.html)):\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you connected to earlier.\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "3. Locate the Data Preparation section and click on **2.1. Data Description** to expand that section.\n", + "\n", + "4. Hover under the logged test block for the default Class Imbalance test until a horizontal dashed line with a **+** button appears, indicating that you can insert a new block.\n", + "\n", + "5. Click **+** and then select **Test-Driven Block** under FROM LIBRARY:\n", + "\n", + " - Click on **VM Library** under TEST-DRIVEN in the left sidebar.\n", + " - Select `ClassImbalance:custom_threshold` as the test.\n", + "\n", + "6. Finally, click **Insert 1 Test Result to Document** to add the test result to the documentation.\n", + "\n", + " Confirm that the individual results for the adjusted threshold class imbalance test have been correctly inserted into section **2.1. Data Description** of the documentation.\n", + "\n", + "You just worked with a draft of your model's documentation, in an easily consumable format matching the structure of the template you previewed at the beginning of this notebook. When you connect to a model with the ValidMind Library, logged test results automatically populate for easy insertion into your documentation.\n", + "\n", + "In the ValidMind Platform, you can make qualitative edits to model documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))" + ], + "id": "faa09935" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "Now that you know the basics of how to run out-of-the-box tests in the ValidMind Library, you’re ready to take the next step. Use `run_test()` with any combination of datasets or records (models) as inputs to run comparison tests, and log your consolidated test results to the ValidMind Platform.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn how to run comparison tests with the ValidMind Library.</b></span>\n", + "<br></br>\n", + "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/run_tests/2-run_comparison_tests.html\" style=\"color: #DE257E;\"><b>Run comparison tests</b></a> notebook for code examples and usage of key functions.</div>" + ], + "id": "cbe20d76" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ], + "id": "ec08c9bc" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "bff625a1" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "b5f64e27" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "da29fb9d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ], + "id": "82837a85" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-38501808b29c456ab97562eebdd497d4" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} \ No newline at end of file diff --git a/site/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb b/site/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb index a027d52dce..1766a413fe 100644 --- a/site/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb +++ b/site/notebooks/how_to/tests/run_tests/2-run_comparison_tests.ipynb @@ -1,1115 +1,1119 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "ed8282aa", - "metadata": {}, - "source": [ - "# Run comparison tests\n", - "\n", - "Learn how to use the ValidMind Library to run comparison tests that take any datasets or records (models) as inputs. 
Identify comparison tests to run, initialize ValidMind dataset and model objects in preparation for passing them to tests, and then run tests — generating outputs automatically logged to your documentation in the ValidMind Platform.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>We recommend that you first complete our introductory notebook on running tests.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.html\" style=\"color: #DE257E;\"><b>Run dataset-based tests</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "90ab1b8a", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the documentation template](#toc2_3__) \n", - " - [Initialize the Python environment](#toc2_4__) \n", - "- [Explore a ValidMind test](#toc3__) \n", - "- [Working with ValidMind datasets](#toc4__) \n", - " - [Prepare the sample dataset](#toc4_1__) \n", - " - [Import the sample dataset](#toc4_1_1__) \n", - " - [Split the dataset](#toc4_1_2__) \n", - " - [Initialize the ValidMind dataset](#toc4_2__) \n", - "- [Working with ValidMind models](#toc5__) \n", - " - [Train a sample model](#toc5_1__) \n", - " - [Initialize the ValidMind model](#toc5_2__) \n", - " - [Assign 
predictions](#toc5_3__) \n", - "- [Running ValidMind tests](#toc6__) \n", - " - [Run classifier performance test with one model](#toc6_1__) \n", - " - [Run comparison tests](#toc6_2__) \n", - " - [Run classifier performance test with multiple models](#toc6_2_1__) \n", - " - [Run classifier performance test with multiple parameter values](#toc6_2_2__) \n", - " - [Run comparison test with multiple datasets](#toc6_2_3__) \n", - "- [Work with test results](#toc7__) \n", - "- [Next steps](#toc8__) \n", - " - [Discover more learning resources](#toc8_1__) \n", - "- [Upgrade ValidMind](#toc9__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "60aa37b6", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ] - }, - { - "cell_type": "markdown", - "id": "6dfa3d15", - "metadata": {}, - "source": [ - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. 
\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "8e87dd4d", - "metadata": {}, - "source": [ - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "64971d85", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." 
- ] - }, - { - "cell_type": "markdown", - "id": "69a40ac3", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "ec35c724", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "fc97888f", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "b3c0c2f5", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "d3e3302f", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook.\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." 
- ] - }, - { - "cell_type": "markdown", - "id": "679d46b2", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "id": "2b6e1fb1", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c51ae01c", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "52b68564", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "fd332a9d", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "184b8c97", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8e2127cd", - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "id": "c3098355", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Explore a ValidMind test\n", - "\n", - "Before we run a test, use [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to return information on out-of-the-box tests available in the ValidMind Library.\n", - "\n", - "Let's assume you want to evaluate *classifier performance* for a model. 
Classifier performance measures how well a classification model correctly predicts outcomes, using metrics like [precision, recall, and F1 score](https://en.wikipedia.org/wiki/Precision_and_recall).\n", - "\n", - "We'll pass in a `filter` to the `list_tests` function to find the test ID for classifier performance:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a6a6f715", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(filter=\"ClassifierPerformance\")" - ] - }, - { - "cell_type": "markdown", - "id": "d1f08b64", - "metadata": {}, - "source": [ - "We've identified from the output that the test ID for the classifier performance test is `validmind.model_validation.sklearn.ClassifierPerformance`.\n", - "\n", - "Use this ID combined with [the `describe_test()` function](https://docs.validmind.ai/validmind/validmind/tests.html#describe_test) to retrieve more information about the test, including its **Required Inputs**:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f8a46c7d", - "metadata": {}, - "outputs": [], - "source": [ - "test_id = \"validmind.model_validation.sklearn.ClassifierPerformance\"\n", - "vm.tests.describe_test(test_id)" - ] - }, - { - "cell_type": "markdown", - "id": "10a49439", - "metadata": {}, - "source": [ - "Since this test requires both a dataset object and a model object, you can expect it to throw an error when we run it without passing in either as input:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f853c272", - "metadata": {}, - "outputs": [], - "source": [ - "try:\n", - " vm.tests.run_test(test_id)\n", - "except Exception as e:\n", - " print(e)" - ] - }, - { - "cell_type": "markdown", - "id": "da36ba6b", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 
5px;\"><span style=\"color: #083E44;\"><b>Learn more about the individual tests available in the ValidMind Library</b></span>\n", - "<br></br>\n", - "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" style=\"color: #DE257E;\"><b>Explore tests</b></a> notebook for more code examples and usage of key functions.</div>" - ] - }, - { - "cell_type": "markdown", - "id": "40324c13", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Working with ValidMind datasets" - ] - }, - { - "cell_type": "markdown", - "id": "3f28ffe2", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Prepare the sample dataset" - ] - }, - { - "cell_type": "markdown", - "id": "4c45a55c", - "metadata": {}, - "source": [ - "<a id='toc4_1_1__'></a>\n", - "\n", - "#### Import the sample dataset\n", - "\n", - "Since we need a dataset to run tests, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle so that we have something to work with.\n", - "\n", - "In the example below, note that:\n", - "\n", - "- The target column, `Exited`, has a value of `1` when a customer has churned and `0` otherwise.\n", - "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. A Pandas DataFrame is a two-dimensional tabular data structure that makes use of rows and columns." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3ef2dfbb", - "metadata": {}, - "outputs": [], - "source": [ - "# Import the sample dataset from the library\n", - "\n", - "from validmind.datasets.classification import customer_churn\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", - ")\n", - "\n", - "raw_df = customer_churn.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "id": "2fc43d28", - "metadata": {}, - "source": [ - "<a id='toc4_1_2__'></a>\n", - "\n", - "#### Split the dataset\n", - "\n", - "Let's first split our dataset to help assess how well the model generalizes to unseen data.\n", - "\n", - "Use [`preprocess()`](https://docs.validmind.ai/validmind/validmind/datasets/classification/customer_churn.html#preprocess) to split our dataset into three subsets:\n", - "\n", - "1. **train_df** — Used to train the model.\n", - "2. **validation_df** — Used to evaluate the model's performance during training.\n", - "3. **test_df** — Used later on to assess the model's performance on new, unseen data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "88c87d4a", - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)" - ] - }, - { - "cell_type": "markdown", - "id": "a5d77885", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Initialize the ValidMind dataset\n", - "\n", - "The next step is to connect your data with a ValidMind `Dataset` object. **This step is necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", - "\n", - "ValidMind dataset objects provide a wrapper to any type of dataset (NumPy, Pandas, Polars, etc.) 
so that tests can run transparently regardless of the underlying library.\n", - "\n", - "Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. For this example, we'll pass in the following arguments:\n", - "\n", - "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", - "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bf0ec747", - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"test_dataset\",\n", - " target_column=customer_churn.target_column,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "cbb1a68f", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Working with ValidMind models" - ] - }, - { - "cell_type": "markdown", - "id": "68089f0a", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Train a sample model\n", - "\n", - "To train the model, we need to provide it with:\n", - "\n", - "1. **Inputs** — Features such as customer age, usage, etc.\n", - "2. 
**Outputs (Expected answers/labels)** — in our case, we would like to know whether the customer churned or not.\n", - "\n", - "Here, we'll use `x_train` and `x_val` to hold the input data (features), and `y_train` and `y_val` to hold the answers (the target we want to predict):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "39e8c7ea", - "metadata": {}, - "outputs": [], - "source": [ - "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", - "y_train = train_df[customer_churn.target_column]\n", - "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", - "y_val = validation_df[customer_churn.target_column]" - ] - }, - { - "cell_type": "markdown", - "id": "6d93642b", - "metadata": {}, - "source": [ - "Next, let's create an *XGBoost classifier model* that will automatically stop training if it doesn't improve after 10 tries. XGBoost is a gradient-boosted tree ensemble that builds trees sequentially, with each tree correcting the errors of the previous ones — typically known for strong predictive performance and built-in regularization to reduce overfitting.\n", - "\n", - "Setting an explicit threshold avoids wasting time and helps prevent further overfitting by stopping training when further improvement isn't happening. We'll also set three evaluation metrics to get a more complete picture of model performance:\n", - "\n", - "1. **error** — Measures how often the model makes incorrect predictions.\n", - "2. **logloss** — Indicates how confident the predictions are.\n", - "3. **auc** — Evaluates how well the model distinguishes between churn and not churn." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "255e3583", - "metadata": {}, - "outputs": [], - "source": [ - "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", - "model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "a021582a", - "metadata": {}, - "source": [ - "Finally, our actual training step — where the model learns patterns from the data, so it can make predictions later:\n", - "\n", - "- The model is trained on `x_train` and `y_train`, and evaluates its performance using `x_val` and `y_val` to check if it’s learning well.\n", - "- To turn off printed output while training, we'll set `verbose` to `False`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e3aa3657", - "metadata": {}, - "outputs": [], - "source": [ - "model.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "ed11ea0b", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Initialize the ValidMind model\n", - "\n", - "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": 
null, - "id": "4b2be11f", - "metadata": {}, - "outputs": [], - "source": [ - "vm_model_xgb = vm.init_model(\n", - " model,\n", - " input_id=\"xgboost\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "53f12da6", - "metadata": {}, - "source": [ - "<a id='toc5_3__'></a>\n", - "\n", - "### Assign predictions\n", - "\n", - "Once the model has been registered, you can assign model predictions to the training and testing datasets.\n", - "\n", - "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", - "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets.\n", - "\n", - "If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "229185fd", - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(model=vm_model_xgb)\n", - "vm_test_ds.assign_predictions(model=vm_model_xgb)" - ] - }, - { - "cell_type": "markdown", - "id": "18c1cb2e", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Running ValidMind tests\n", - "\n", - "Now that we know how to initialize ValidMind `dataset` and `model` objects, we're ready to run some tests!\n", - "\n", - "You run individual tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module. For the examples below, we'll pass in the following arguments:\n", - "\n", - "- **`test_id`** — The ID of the test to run, as seen in the `ID` column when you run `list_tests`.\n", - "- **`inputs`** — A dictionary of test inputs, such as `dataset`, `model`, `datasets`, or `models`. 
These are ValidMind objects initialized with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) or [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model)." - ] - }, - { - "cell_type": "markdown", - "id": "6f7e7779", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Run classifier performance test with one model\n", - "\n", - "Run the `validmind.model_validation.sklearn.ClassifierPerformance` test with the testing dataset (`vm_test_ds`) and model (`vm_model_xgb`) as inputs:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "85189af9", - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance\",\n", - " inputs={\n", - " \"dataset\": vm_test_ds,\n", - " \"model\": vm_model_xgb,\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "5e8be8d5", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Run comparison tests\n", - "\n", - "To evaluate which models might be a better fit for a use case based on their performance on selected criteria, we can run the same test with multiple models. 
We'll train three additional models and run the classifier performance test for all four models using a single `run_test()` call.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>ValidMind helps streamline your documentation and testing.</b></span>\n", - "<br></br>\n", - "You could call <code>run_test()</code> multiple times passing in different inputs, but you can also pass an <code>input_grid</code> object — a dictionary of test input keys and values that allows you to run a single test for a combination of models and datasets.\n", - "<br></br>\n", - "With <code>input_grid</code>, run comparison tests for multiple datasets, or even multiple datasets and models simultaneously — <code>input_grid</code> can be used with <code>run_test()</code> for all possible combinations of inputs, generating a cohesive and comprehensive single output.\n", - "</div>" - ] - }, - { - "cell_type": "markdown", - "id": "e33c7a82", - "metadata": {}, - "source": [ - "*Random forest classifier* models use an ensemble method that builds multiple decision trees and averages their predictions. 
Random forest is robust to overfitting and handles non-linear relations well, but is typically less interpretable than simpler models:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1976b7e8", - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.ensemble import RandomForestClassifier\n", - "\n", - "# Train the random forest classifier model\n", - "model_rf = RandomForestClassifier()\n", - "model_rf.fit(x_train, y_train)\n", - "\n", - "# Initialize the ValidMind model object for the random forest classifier model\n", - "vm_model_rf = vm.init_model(\n", - " model_rf,\n", - " input_id=\"random_forest\",\n", - ")\n", - "\n", - "# Assign predictions to the test dataset for the random forest classifier model\n", - "vm_test_ds.assign_predictions(model=vm_model_rf)" - ] - }, - { - "cell_type": "markdown", - "id": "f8e167cf", - "metadata": {}, - "source": [ - "*Logistic regression* models are linear models that estimate class probabilities via a logistic (sigmoid) function. 
Logistic regression is highly interpretable and fast to train, which makes it a strong baseline; however, it struggles when relationships are non-linear, as real-world relationships often are:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "90bbf148", - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.linear_model import LogisticRegression\n", - "from sklearn.preprocessing import StandardScaler\n", - "from sklearn.pipeline import Pipeline\n", - "\n", - "# Scaling features ensures the lbfgs solver converges reliably\n", - "model_lr = Pipeline([\n", - " (\"scaler\", StandardScaler()),\n", - " (\"lr\", LogisticRegression()),\n", - "])\n", - "model_lr.fit(x_train, y_train)\n", - "\n", - "# Initialize the ValidMind model object for the logistic regression model\n", - "vm_model_lr = vm.init_model(\n", - " model_lr,\n", - " input_id=\"logistic_regression\",\n", - ")\n", - "\n", - "# Assign predictions to the test dataset for the logistic regression model\n", - "vm_test_ds.assign_predictions(model=vm_model_lr)" - ] - }, - { - "cell_type": "markdown", - "id": "d3478f86", - "metadata": {}, - "source": [ - "*Decision tree classifier* models are a single tree with data split on feature thresholds. 
Useful as an explainability benchmark, decision trees are easy to visualize and interpret — but are prone to overfitting without pruning or ensemble techniques:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bfa1e17d", - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.tree import DecisionTreeClassifier\n", - "\n", - "# Train the decision tree classifier model\n", - "model_dt = DecisionTreeClassifier()\n", - "model_dt.fit(x_train, y_train)\n", - "\n", - "# Initialize the ValidMind model object for the decision tree classifier model\n", - "vm_model_dt = vm.init_model(\n", - " model_dt,\n", - " input_id=\"decision_tree\",\n", - ")\n", - "\n", - "# Assign predictions to the test dataset for the decision tree classifier model\n", - "vm_test_ds.assign_predictions(model=vm_model_dt)" - ] - }, - { - "cell_type": "markdown", - "id": "59428da9", - "metadata": {}, - "source": [ - "<a id='toc6_2_1__'></a>\n", - "\n", - "#### Run classifier performance test with multiple models\n", - "\n", - "Now, we'll use the `input_grid` to run the `ClassifierPerformance` test on all four models using the testing dataset (`vm_test_ds`).\n", - "\n", - "When running individual tests, you can use a custom `result_id` to tag the individual result with a unique identifier by appending this `result_id` to the `test_id` with a `:` separator. 
We'll append an identifier to signify that this test was run on `all_models` to differentiate this test run from other runs:\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2e48ce1e", - "metadata": {}, - "outputs": [], - "source": [ - "perf_comparison_result = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:all_models\",\n", - " input_grid={\n", - " \"dataset\": [vm_test_ds],\n", - " \"model\": [vm_model_xgb, vm_model_rf, vm_model_lr, vm_model_dt],\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "1b76eae0", - "metadata": {}, - "source": [ - "Our output indicates that the XGBoost and random forest classification models provide the strongest overall classification performance, so we'll continue our testing with those two models as input only." - ] - }, - { - "cell_type": "markdown", - "id": "9fcc67b9", - "metadata": {}, - "source": [ - "<a id='toc6_2_2__'></a>\n", - "\n", - "#### Run classifier performance test with multiple parameter values\n", - "\n", - "Next, let's run the classifier performance test with the `param_grid` object, which runs the same test multiple times with different parameter values. 
We'll append an identifier to signify that this test was run with our `parameter_grid` configuration:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d0ad94c9", - "metadata": {}, - "outputs": [], - "source": [ - "parameter_comparison_result = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:parameter_grid\",\n", - " input_grid={\n", - " \"dataset\": [vm_test_ds],\n", - " \"model\": [vm_model_xgb,vm_model_rf]\n", - " },\n", - " param_grid={\n", - " \"average\": [\"macro\", \"micro\"]\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "19e8251b", - "metadata": {}, - "source": [ - "<a id='toc6_2_3__'></a>\n", - "\n", - "#### Run comparison test with multiple datasets\n", - "\n", - "Let's also run the ROCCurve test using `input_grid` to iterate through multiple datasets, which plots the ROC curves for the training (`vm_train_ds`) and test (`vm_test_ds`) datasets side by side — a common scenario when you want to compare the performance of a model on the training and test datasets and visually assess how much performance is lost in the test dataset.\n", - "\n", - "We'll also need to assign predictions to the training dataset for the random forest classifier model, since we didn't do that in our earlier setup:\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "96c3b426", - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(model=vm_model_rf)" - ] - }, - { - "cell_type": "markdown", - "id": "7e07db9d", - "metadata": {}, - "source": [ - "We'll append an identifier to signify that this test was run with our `train_vs_test` dataset comparison configuration:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "4056aa1e", - "metadata": {}, - "outputs": [], - "source": [ - "roc_curve_result = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.ROCCurve:train_vs_test\",\n", - " input_grid={\n", - " \"dataset\": 
[vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_model_xgb,vm_model_rf],\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "a899fb84", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Work with test results\n", - "\n", - "Every test result returned by the `run_test()` function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform. When logging individual test results to the platform, you'll need to manually add those results to the desired section of the documentation.\n", - "\n", - "You can do this through the ValidMind Platform interface after logging your test results (**Learn more:** [Work with test results](https://docs.validmind.ai/guide/documentation/work-with-test-results.html)), or directly via the ValidMind Library when calling `.log()` by providing an optional `section_id`. The `section_id` should be a string that matches the title of a section in the documentation template in `snake_case`.\n", - "\n", - "Let's log the results of the classifier performance test (`perf_comparison_result`) and the ROCCurve (`roc_curve_result`) test in the `model_evaluation` section of the documentation — present in the template we previewed in the beginning of this notebook:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e119bf1e", - "metadata": {}, - "outputs": [], - "source": [ - "perf_comparison_result.log(section_id=\"model_evaluation\")\n", - "roc_curve_result.log(section_id=\"model_evaluation\")" - ] - }, - { - "cell_type": "markdown", - "id": "098dba6c", - "metadata": {}, - "source": [ - "Finally, let's head to the model we connected to at the beginning of this notebook and view our inserted test results in the updated documentation (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html)):\n", - "\n", - "1. 
From the **Inventory** in the ValidMind Platform, go to the model you connected to earlier.\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "3. Expand the **3.2. Model Evaluation** section.\n", - "\n", - "4. Confirm that `perf_comparison_result` and `roc_curve_result` display in this section as expected." - ] - }, - { - "cell_type": "markdown", - "id": "a658f908", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "Now that you know how to run comparison tests with the ValidMind Library, you’re ready to take the next step. Extend the functionality of `run_test()` with your own custom test functions that can be incorporated into documentation templates just like any default out-of-the-box ValidMind test.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn how to implement custom tests with the ValidMind Library.</b></span>\n", - "<br></br>\n", - "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/custom_tests/implement_custom_tests.html\" style=\"color: #DE257E;\"><b>Implement custom tests</b></a> notebook for code examples and usage of key functions.</div>" - ] - }, - { - "cell_type": "markdown", - "id": "407b6c2b", - "metadata": {}, - "source": [ - "<a id='toc8_1__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use 
case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "id": "82b51b49", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0d35972c", - "metadata": { - "vscode": { - "languageId": "plaintext" + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Run comparison tests\n", + "\n", + "Learn how to use the ValidMind Library to run comparison tests that take any datasets or records (models) as inputs. 
Identify comparison tests to run, initialize ValidMind dataset and model objects in preparation for passing them to tests, and then run tests — generating outputs automatically logged to your documentation in the ValidMind Platform.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>We recommend that you first complete our introductory notebook on running tests.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/notebooks/how_to/tests/run_tests/1-run_dataset-based_tests.html\" style=\"color: #DE257E;\"><b>Run dataset-based tests</b></a></div>" + ], + "id": "ed8282aa" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the documentation template](#toc2_3__) \n", + " - [Initialize the Python environment](#toc2_4__) \n", + "- [Explore a ValidMind test](#toc3__) \n", + "- [Working with ValidMind datasets](#toc4__) \n", + " - [Prepare the sample dataset](#toc4_1__) \n", + " - [Import the sample dataset](#toc4_1_1__) \n", + " - [Split the dataset](#toc4_1_2__) \n", + " - [Initialize the ValidMind dataset](#toc4_2__) \n", + "- [Working with ValidMind models](#toc5__) \n", + " - [Train a sample model](#toc5_1__) \n", + " - [Initialize the ValidMind model](#toc5_2__) \n", + " - [Assign 
predictions](#toc5_3__) \n", + "- [Running ValidMind tests](#toc6__) \n", + " - [Run classifier performance test with one model](#toc6_1__) \n", + " - [Run comparison tests](#toc6_2__) \n", + " - [Run classifier performance test with multiple models](#toc6_2_1__) \n", + " - [Run classifier performance test with multiple parameter values](#toc6_2_2__) \n", + " - [Run comparison test with multiple datasets](#toc6_2_3__) \n", + "- [Work with test results](#toc7__) \n", + "- [Next steps](#toc8__) \n", + " - [Discover more learning resources](#toc8_1__) \n", + "- [Upgrade ValidMind](#toc9__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "90ab1b8a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." + ], + "id": "60aa37b6" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. 
\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "6dfa3d15" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "8e87dd4d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ], + "id": "64971d85" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ], + "id": "69a40ac3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ], + "id": "ec35c724" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "fc97888f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "b3c0c2f5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook.\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." 
+ ], + "id": "d3e3302f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "679d46b2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "2b6e1fb1" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "c51ae01c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ], + "id": "52b68564" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "fd332a9d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ], + "id": "184b8c97" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [], + "id": "8e2127cd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Explore a ValidMind test\n", + "\n", + "Before we run a test, use [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to return information on out-of-the-box tests available in the ValidMind Library.\n", + "\n", + "Let's assume you want to evaluate *classifier performance* for a model. 
Classifier performance measures how well a classification model correctly predicts outcomes, using metrics like [precision, recall, and F1 score](https://en.wikipedia.org/wiki/Precision_and_recall).\n", + "\n", + "We'll pass in a `filter` to the `list_tests` function to find the test ID for classifier performance:" + ], + "id": "c3098355" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(filter=\"ClassifierPerformance\")" + ], + "execution_count": null, + "outputs": [], + "id": "a6a6f715" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We've identified from the output that the test ID for the classifier performance test is `validmind.model_validation.sklearn.ClassifierPerformance`.\n", + "\n", + "Use this ID combined with [the `describe_test()` function](https://docs.validmind.ai/validmind/validmind/tests.html#describe_test) to retrieve more information about the test, including its **Required Inputs**:" + ], + "id": "d1f08b64" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test_id = \"validmind.model_validation.sklearn.ClassifierPerformance\"\n", + "vm.tests.describe_test(test_id)" + ], + "execution_count": null, + "outputs": [], + "id": "f8a46c7d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Since this test requires both a dataset object and a model object, you can expect it to throw an error when we run it without passing in either as input:" + ], + "id": "10a49439" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "try:\n", + " vm.tests.run_test(test_id)\n", + "except Exception as e:\n", + " print(e)" + ], + "execution_count": null, + "outputs": [], + "id": "f853c272" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 
5px;\"><span style=\"color: #083E44;\"><b>Learn more about the individual tests available in the ValidMind Library</b></span>\n", + "<br></br>\n", + "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" style=\"color: #DE257E;\"><b>Explore tests</b></a> notebook for more code examples and usage of key functions.</div>" + ], + "id": "da36ba6b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Working with ValidMind datasets" + ], + "id": "40324c13" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Prepare the sample dataset" + ], + "id": "3f28ffe2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1_1__'></a>\n", + "\n", + "#### Import the sample dataset\n", + "\n", + "Since we need a dataset to run tests, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle so that we have something to work with.\n", + "\n", + "In the example below, note that:\n", + "\n", + "- The target column, `Exited`, has a value of `1` when a customer has churned and `0` otherwise.\n", + "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. A Pandas DataFrame is a two-dimensional tabular data structure that makes use of rows and columns." 
+ ], + "id": "4c45a55c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the sample dataset from the library\n", + "\n", + "from validmind.datasets.classification import customer_churn\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", + ")\n", + "\n", + "raw_df = customer_churn.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [], + "id": "3ef2dfbb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1_2__'></a>\n", + "\n", + "#### Split the dataset\n", + "\n", + "Let's first split our dataset to help assess how well the model generalizes to unseen data.\n", + "\n", + "Use [`preprocess()`](https://docs.validmind.ai/validmind/validmind/datasets/classification/customer_churn.html#preprocess) to split our dataset into three subsets:\n", + "\n", + "1. **train_df** — Used to train the model.\n", + "2. **validation_df** — Used to evaluate the model's performance during training.\n", + "3. **test_df** — Used later on to assess the model's performance on new, unseen data." + ], + "id": "2fc43d28" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)" + ], + "execution_count": null, + "outputs": [], + "id": "88c87d4a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Initialize the ValidMind dataset\n", + "\n", + "The next step is to connect your data with a ValidMind `Dataset` object. **This step is necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", + "\n", + "ValidMind dataset objects provide a wrapper to any type of dataset (NumPy, Pandas, Polars, etc.) 
so that tests can run transparently regardless of the underlying library.\n", + "\n", + "Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. For this example, we'll pass in the following arguments:\n", + "\n", + "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", + "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset." + ], + "id": "a5d77885" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " target_column=customer_churn.target_column,\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "bf0ec747" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Working with ValidMind models" + ], + "id": "cbb1a68f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Train a sample model\n", + "\n", + "To train the model, we need to provide it with:\n", + "\n", + "1. **Inputs** — Features such as customer age, usage, etc.\n", + "2. 
**Outputs (Expected answers/labels)** — in our case, we would like to know whether the customer churned or not.\n", + "\n", + "Here, we'll use `x_train` and `x_val` to hold the input data (features), and `y_train` and `y_val` to hold the answers (the target we want to predict):" + ], + "id": "68089f0a" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", + "y_train = train_df[customer_churn.target_column]\n", + "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", + "y_val = validation_df[customer_churn.target_column]" + ], + "execution_count": null, + "outputs": [], + "id": "39e8c7ea" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, let's create an *XGBoost classifier model* that will automatically stop training if it doesn't improve after 10 tries. XGBoost is a gradient-boosted tree ensemble that builds trees sequentially, with each tree correcting the errors of the previous ones — typically known for strong predictive performance and built-in regularization to reduce overfitting.\n", + "\n", + "Setting an explicit threshold avoids wasting time and helps prevent further overfitting by stopping training when further improvement isn't happening. We'll also set three evaluation metrics to get a more complete picture of model performance:\n", + "\n", + "1. **error** — Measures how often the model makes incorrect predictions.\n", + "2. **logloss** — Indicates how confident the predictions are.\n", + "3. **auc** — Evaluates how well the model distinguishes between churn and not churn." 
+ ], + "id": "6d93642b" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", + "model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "255e3583" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, our actual training step — where the model learns patterns from the data, so it can make predictions later:\n", + "\n", + "- The model is trained on `x_train` and `y_train`, and evaluates its performance using `x_val` and `y_val` to check if it’s learning well.\n", + "- To turn off printed output while training, we'll set `verbose` to `False`." + ], + "id": "a021582a" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "e3aa3657" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Initialize the ValidMind model\n", + "\n", + "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ], + "id": "ed11ea0b" + }, + { + "cell_type": "code", + 
"metadata": {}, + "source": [ + "vm_model_xgb = vm.init_model(\n", + " model,\n", + " input_id=\"xgboost\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "4b2be11f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_3__'></a>\n", + "\n", + "### Assign predictions\n", + "\n", + "Once the model has been registered, you can assign model predictions to the training and testing datasets.\n", + "\n", + "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", + "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets.\n", + "\n", + "If no prediction values are passed, the method will compute predictions automatically:" + ], + "id": "53f12da6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(model=vm_model_xgb)\n", + "vm_test_ds.assign_predictions(model=vm_model_xgb)" + ], + "execution_count": null, + "outputs": [], + "id": "229185fd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Running ValidMind tests\n", + "\n", + "Now that we know how to initialize ValidMind `dataset` and `model` objects, we're ready to run some tests!\n", + "\n", + "You run individual tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module. For the examples below, we'll pass in the following arguments:\n", + "\n", + "- **`test_id`** — The ID of the test to run, as seen in the `ID` column when you run `list_tests`.\n", + "- **`inputs`** — A dictionary of test inputs, such as `dataset`, `model`, `datasets`, or `models`. 
These are ValidMind objects initialized with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) or [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model)." + ], + "id": "18c1cb2e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### Run classifier performance test with one model\n", + "\n", + "Run the `validmind.model_validation.sklearn.ClassifierPerformance` test with the testing dataset (`vm_test_ds`) and model (`vm_model_xgb`) as inputs:" + ], + "id": "6f7e7779" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance\",\n", + " inputs={\n", + " \"dataset\": vm_test_ds,\n", + " \"model\": vm_model_xgb,\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "85189af9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Run comparison tests\n", + "\n", + "To evaluate which models might be a better fit for a use case based on their performance on selected criteria, we can run the same test with multiple models. 
We'll train three additional models and run the classifier performance test for all four models using a single `run_test()` call.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>ValidMind helps streamline your documentation and testing.</b></span>\n", + "<br></br>\n", + "You could call <code>run_test()</code> multiple times passing in different inputs, but you can also pass an <code>input_grid</code> object — a dictionary of test input keys and values that allow you to run a single test for a combination of models and datasets.\n", + "<br></br>\n", + "With <code>input_grid</code>, run comparison tests for multiple datasets, or even multiple datasets and models simultaneously — <code>input_grid</code> can be used with <code>run_test()</code> for all possible combinations of inputs, generating a cohesive and comprehensive single output.\n", + "</div>" + ], + "id": "5e8be8d5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Random forest classifier* models use an ensemble method that builds multiple decision trees and averages their predictions. 
Random forest is robust to overfitting and handles non-linear relations well, but is typically less interpretable than simpler models:" + ], + "id": "e33c7a82" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from sklearn.ensemble import RandomForestClassifier\n", + "\n", + "# Train the random forest classifier model\n", + "model_rf = RandomForestClassifier()\n", + "model_rf.fit(x_train, y_train)\n", + "\n", + "# Initialize the ValidMind model object for the random forest classifier model\n", + "vm_model_rf = vm.init_model(\n", + " model_rf,\n", + " input_id=\"random_forest\",\n", + ")\n", + "\n", + "# Assign predictions to the test dataset for the random forest classifier model\n", + "vm_test_ds.assign_predictions(model=vm_model_rf)" + ], + "execution_count": null, + "outputs": [], + "id": "1976b7e8" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Logistic regression* models are linear models that estimate class probabilities via a logistic (sigmoid) function. 
Logistic regression is highly interpretable and fast to train, establishing a strong baseline. However, it struggles when relationships are non-linear, as real-world relationships often are:" + ], + "id": "f8e167cf" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from sklearn.linear_model import LogisticRegression\n", + "from sklearn.preprocessing import StandardScaler\n", + "from sklearn.pipeline import Pipeline\n", + "\n", + "# Scaling features ensures the lbfgs solver converges reliably\n", + "model_lr = Pipeline([\n", + " (\"scaler\", StandardScaler()),\n", + " (\"lr\", LogisticRegression()),\n", + "])\n", + "model_lr.fit(x_train, y_train)\n", + "\n", + "# Initialize the ValidMind model object for the logistic regression model\n", + "vm_model_lr = vm.init_model(\n", + " model_lr,\n", + " input_id=\"logistic_regression\",\n", + ")\n", + "\n", + "# Assign predictions to the test dataset for the logistic regression model\n", + "vm_test_ds.assign_predictions(model=vm_model_lr)" + ], + "execution_count": null, + "outputs": [], + "id": "90bbf148" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "*Decision tree classifier* models use a single tree that splits data on feature thresholds. 
Useful as an explainability benchmark, decision trees are easy to visualize and interpret — but are prone to overfitting without pruning or ensemble techniques:" + ], + "id": "d3478f86" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from sklearn.tree import DecisionTreeClassifier\n", + "\n", + "# Train the decision tree classifier model\n", + "model_dt = DecisionTreeClassifier()\n", + "model_dt.fit(x_train, y_train)\n", + "\n", + "# Initialize the ValidMind model object for the decision tree classifier model\n", + "vm_model_dt = vm.init_model(\n", + " model_dt,\n", + " input_id=\"decision_tree\",\n", + ")\n", + "\n", + "# Assign predictions to the test dataset for the decision tree classifier model\n", + "vm_test_ds.assign_predictions(model=vm_model_dt)" + ], + "execution_count": null, + "outputs": [], + "id": "bfa1e17d" + }, + { + "cell_type": "markdown", + "id": "59428da9", + "metadata": {}, + "source": [ + "<a id='toc6_2_1__'></a>\n", + "\n", + "#### Run classifier performance test with multiple models\n", + "\n", + "Now, we'll use the `input_grid` to run the `model_validation.sklearn.ClassifierPerformance` test on all four models using the testing dataset (`vm_test_ds`).\n", + "\n", + "When running individual tests, you can use a custom `result_id` to tag the individual result with a unique identifier by appending this `result_id` to the `test_id` with a `:` separator. 
We'll append an identifier to signify that this test was run on `all_models` to differentiate this test run from other runs:\n" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "perf_comparison_result = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:all_models\",\n", + " input_grid={\n", + " \"dataset\": [vm_test_ds],\n", + " \"model\": [vm_model_xgb, vm_model_rf, vm_model_lr, vm_model_dt],\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "2e48ce1e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Our output indicates that the XGBoost and random forest classification models provide the strongest overall classification performance, so we'll continue our testing with those two models as input only." + ], + "id": "1b76eae0" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2_2__'></a>\n", + "\n", + "#### Run classifier performance test with multiple parameter values\n", + "\n", + "Next, let's run the classifier performance test with the `param_grid` object, which runs the same test multiple times with different parameter values. 
We'll append an identifier to signify that this test was run with our `parameter_grid` configuration:" + ], + "id": "9fcc67b9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "parameter_comparison_result = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:parameter_grid\",\n", + " input_grid={\n", + " \"dataset\": [vm_test_ds],\n", + " \"model\": [vm_model_xgb,vm_model_rf]\n", + " },\n", + " param_grid={\n", + " \"average\": [\"macro\", \"micro\"]\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "d0ad94c9" + }, + { + "cell_type": "markdown", + "id": "19e8251b", + "metadata": {}, + "source": [ + "<a id='toc6_2_3__'></a>\n", + "\n", + "#### Run comparison test with multiple datasets\n", + "\n", + "Let's also run the `model_validation.sklearn.ROCCurve` test using `input_grid` to iterate through multiple datasets, which plots the ROC curves for the training (`vm_train_ds`) and test (`vm_test_ds`) datasets side by side — a common scenario when you want to compare the performance of a model on the training and test datasets and visually assess how much performance is lost in the test dataset.\n", + "\n", + "We'll also need to assign predictions to the training dataset for the random forest classifier model, since we didn't do that in our earlier setup:\n" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(model=vm_model_rf)" + ], + "execution_count": null, + "outputs": [], + "id": "96c3b426" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'll append an identifier to signify that this test was run with our `train_vs_test` dataset comparison configuration:" + ], + "id": "7e07db9d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "roc_curve_result = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.ROCCurve:train_vs_test\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, 
vm_test_ds],\n", + " \"model\": [vm_model_xgb,vm_model_rf],\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "4056aa1e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Work with test results\n", + "\n", + "Every test result returned by the `run_test()` function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform. When logging individual test results to the platform, you'll need to manually add those results to the desired section of the documentation.\n", + "\n", + "You can do this through the ValidMind Platform interface after logging your test results (**Learn more:** [Work with test results](https://docs.validmind.ai/guide/documentation/work-with-test-results.html)), or directly via the ValidMind Library when calling `.log()` by providing an optional `section_id`. The `section_id` should be a string that matches the title of a section in the documentation template in `snake_case`.\n", + "\n", + "Let's log the results of the classifier performance test (`perf_comparison_result`) and the ROCCurve (`roc_curve_result`) test in the `model_evaluation` section of the documentation — present in the template we previewed in the beginning of this notebook:" + ], + "id": "a899fb84" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "perf_comparison_result.log(section_id=\"model_evaluation\")\n", + "roc_curve_result.log(section_id=\"model_evaluation\")" + ], + "execution_count": null, + "outputs": [], + "id": "e119bf1e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, let's head to the model we connected to at the beginning of this notebook and view our inserted test results in the updated documentation (**Learn more:** [Working with 
documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html)):\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you connected to earlier.\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "3. Expand the **3.2. Model Evaluation** section.\n", + "\n", + "4. Confirm that `perf_comparison_result` and `roc_curve_result` display in this section as expected." + ], + "id": "098dba6c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "Now that you know how to run comparison tests with the ValidMind Library, you’re ready to take the next step. Extend the functionality of `run_test()` with your own custom test functions that can be incorporated into documentation templates just like any default out-of-the-box ValidMind test.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn how to implement custom tests with the ValidMind Library.</b></span>\n", + "<br></br>\n", + "Check out our <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/custom_tests/implement_custom_tests.html\" style=\"color: #DE257E;\"><b>Implement custom tests</b></a> notebook for code examples and usage of key functions.</div>" + ], + "id": "a658f908" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_1__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library
features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ], + "id": "407b6c2b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "82b51b49" + }, + { + "cell_type": "code", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "0d35972c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "86478a30" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ], + "id": "10073159" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-5fe1b67f8fdc4d26bb090f5e655857bf" + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "name": "python", + "version": "3.10" } - }, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "86478a30", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "10073159", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-5fe1b67f8fdc4d26bb090f5e655857bf", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" }, - "language_info": { - "name": "python", - "version": "3.10" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/site/notebooks/how_to/tests/run_tests/configure_tests/configure_judge_llms.ipynb b/site/notebooks/how_to/tests/run_tests/configure_tests/configure_judge_llms.ipynb new file mode 100644 index 0000000000..3e8d27bcd1 --- /dev/null +++ b/site/notebooks/how_to/tests/run_tests/configure_tests/configure_judge_llms.ipynb @@ -0,0 +1,827 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "0935afb5", + "metadata": {}, + "source": [ + "# Configure judge LLM and judge embeddings\n", + "\n", + "This notebook shows how to configure and validate the default judge LLM and judge embeddings used by the ValidMind Library for LLM-focused tests.\n", + "\n", + "It exercises three important paths:\n", + "1. Prompt-validation tests, which depend on the default judge LLM.\n", + "2. RAGAS-based tests, which depend on both the default judge LLM and the default judge embeddings model.\n", + "3. DeepEval scorers, which depend on the default local scorer model path.\n", + "\n", + "The notebook automatically selects the available provider from your environment, with OpenAI taking precedence when both OpenAI and Gemini keys are set, to match the library's default-provider logic." 
+ ] + }, + { + "cell_type": "markdown", + "id": "1f2befa6", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents\n", + "- [Introduction](#toc1__)\n", + "- [About ValidMind](#toc2__)\n", + " - [Before you begin](#toc2_1__)\n", + " - [New to ValidMind?](#toc2_2__)\n", + " - [Key concepts](#toc2_3__)\n", + "- [Setting up](#toc3__)\n", + " - [Install the ValidMind Library](#toc3_1__)\n", + " - [Connect to the ValidMind Platform](#toc3_2__)\n", + " - [Register or select a model](#toc3_2_1__)\n", + " - [Choose a documentation template](#toc3_2_2__)\n", + " - [Get your code snippet](#toc3_2_3__)\n", + " - [Initialize the notebook environment](#toc3_3__)\n", + "- [Getting to know ValidMind](#toc4__)\n", + " - [Preview the documentation template](#toc4_1__)\n", + " - [View model documentation in the ValidMind Platform](#toc4_2__)\n", + "- [Configure the judge provider](#toc5__)\n", + "- [Prompt-validation tests](#toc6__)\n", + "- [RAGAS tests](#toc7__)\n", + "- [DeepEval scorers](#toc8__)\n", + "- [In summary](#toc9__)\n", + "- [Next steps](#toc10__)\n", + " - [Discover more learning resources](#toc10_1__)\n", + "- [Upgrade ValidMind](#toc11__)\n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "id": "b77005b8", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## Introduction\n", + "\n", + "This notebook shows how to configure and validate the default judge LLM and judge embeddings used by the ValidMind Library for LLM-focused tests.\n", + "\n", + "It walks through the provider configuration used by three important evaluation paths:\n", + "- prompt-validation tests\n", + "- RAGAS-based tests\n", + "- DeepEval scorers\n", + "\n", + "Along the way, you will initialize ValidMind model and dataset objects, inspect the resolved judge configuration, run representative tests, and optionally log the results to the ValidMind Platform. By the end of the notebook, you will have a practical reference for configuring judge models and understanding how those settings affect different LLM evaluation workflows." + ] + }, + { + "cell_type": "markdown", + "id": "56ecff8d", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing model risk, including risk associated with AI and statistical models. \n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on model documentation. Together, these products simplify model risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and model validators." 
+ ] + }, + { + "cell_type": "markdown", + "id": "e8743d30", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "Before running this notebook, make sure you have:\n", + "- a Python environment with the ValidMind Library and its LLM dependencies installed\n", + "- access to a ValidMind account if you want to log results to the ValidMind Platform\n", + "- credentials for one supported judge provider in your environment\n", + "\n", + "This notebook supports:\n", + "- OpenAI via `OPENAI_API_KEY`, with optional `OPENAI_MODEL` and `OPENAI_EMBEDDINGS_MODEL` overrides. The current default judge model is `gpt-4.1` and the default embeddings model is `text-embedding-3-small`.\n", + "- Gemini via `GOOGLE_API_KEY` or `GEMINI_API_KEY`, with optional `GEMINI_MODEL` and `GEMINI_EMBEDDINGS_MODEL` overrides. The current defaults are `gemini-2.5-pro` and `models/text-embedding-004`.\n", + "- Azure OpenAI via `AZURE_OPENAI_KEY`, `AZURE_OPENAI_ENDPOINT`, and `AZURE_OPENAI_MODEL`. The current default embeddings model is `text-embedding-3-small`.\n", + "\n", + "You can still run the notebook locally without connecting to the ValidMind Platform, but connecting a model document makes it easier to review and share results after the tests complete." + ] + }, + { + "cell_type": "markdown", + "id": "ee479eb5", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you are new to the ValidMind Library, start with the [ValidMind Library overview](https://docs.validmind.ai/developer/validmind-library.html). 
It introduces the core workflow for initializing models and datasets, running tests, and logging outputs back to the ValidMind Platform.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>You only need a ValidMind account if you want to log results to the ValidMind Platform.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/configuration/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "id": "689e55db", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**Judge LLM**: The language model used by ValidMind to evaluate prompts, answers, contexts, and other LLM outputs.\n", + "\n", + "**Judge embeddings**: The embeddings model used when a test requires semantic similarity or retrieval-based comparison.\n", + "\n", + "**Provider credentials**: Environment variables that tell ValidMind which provider to use for judge evaluation. In this notebook, the provider is resolved automatically from the credentials available in your environment.\n", + "\n", + "**ValidMind dataset**: A dataset initialized with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset). Wrapping a pandas DataFrame this way lets you pass the dataset into ValidMind tests with the metadata those tests expect.\n", + "\n", + "**ValidMind model**: A model initialized with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
In this notebook, we use a lightweight model object to run prompt-validation tests against a prompt template.\n", + "\n", + "**Prompt-validation tests**: Tests that evaluate prompt quality and instructions, such as clarity or bias, using a judge LLM.\n", + "\n", + "**RAGAS tests**: Retrieval-augmented generation tests that can rely on both a judge LLM and judge embeddings.\n", + "\n", + "**DeepEval scorers**: LLM-based scorers used for tasks such as answer relevancy and hallucination detection. These use the evaluation model path but do not require judge embeddings." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "id": "8d6a8300", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "Install the ValidMind Library with the optional LLM dependencies so the notebook can run prompt-validation tests, RAGAS tests, and DeepEval scorers:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "da644666", + "metadata": {}, + "outputs": [], + "source": [ + "%pip install -q \"validmind[llm]\"" + ] + }, + { + "cell_type": "markdown", + "id": "17c1c7d2", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Connect to the ValidMind Platform\n", + "\n", + "If you want to log notebook outputs to the ValidMind Platform, start by selecting an existing model in your inventory or registering a new one. 
This notebook can run without platform connectivity, but linking it to a model document gives you a place to review the results after the examples finish.\n", + "\n", + "<a id='toc3_2_1__'></a>\n", + "\n", + "#### Register or select a model\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/configuration/log-in-to-validmind.html).\n", + "2. Open **Inventory** and either select an existing model or click **+ Register Model**.\n", + "3. Complete the model details and stakeholder assignments if you are registering a new model.\n", + "4. Open the document where you want notebook results to be logged.\n", + "\n", + "Using a real model document is especially helpful in this notebook because it lets you compare the locally executed tests with the sections available in your template." + ] + }, + { + "cell_type": "markdown", + "id": "dc628c6c", + "metadata": {}, + "source": [ + "<a id='toc3_2_2__'></a>\n", + "\n", + "#### Choose a documentation template\n", + "\n", + "If you plan to log results from this notebook, make sure your model document uses a template that includes sections for the LLM evaluation results you want to capture.\n", + "\n", + "This is important because tests that are not included in the selected template will not appear automatically in the Platform document, even if you run and log them successfully from the notebook. If you want to document those results as well, you can add the relevant sections or tests manually in the Platform.\n", + "\n", + "Before running the notebook, preview the template structure and confirm that the document has the sections you expect for your workflow." 
+ ] + }, + { + "cell_type": "markdown", + "id": "702f5196", + "metadata": {}, + "source": [ + "<a id='toc3_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the code snippet associated with your model document so that test results are uploaded to the correct destination in the ValidMind Platform.\n", + "\n", + "1. In the model sidebar, open **Getting Started**.\n", + "2. Select the document you want to update.\n", + "3. Copy the generated code snippet.\n", + "4. Load the values from an `.env` file or replace the placeholders in the example below with your own values.\n", + "\n", + "Using environment variables is usually the easiest way to keep the notebook portable across environments and avoid hard-coding connection details in the notebook itself." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c52a42d0", + "metadata": {}, + "outputs": [], + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " api_host=\"http://localhost:5000/api/v1/tracking\",\n", + " api_key=\"..\",\n", + " api_secret=\"..\",\n", + " document=\"documentation\", # requires library >=2.12.0\n", + " model=\"..\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "3657a7a4", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Initialize the notebook environment\n", + "\n", + "Load environment variables and prepare the notebook session. 
In the execution cells that follow, you will import the libraries needed for this walkthrough, inspect the configured judge provider, and create the ValidMind objects used by the example tests.\n", + "\n", + "This section is also where the notebook becomes reproducible: once your credentials and dependencies are in place, the remaining sections can be run top to bottom." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "979988a9", + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "\n", + "import pandas as pd\n", + "\n", + "from validmind.ai import utils as ai_utils\n", + "from validmind.models import Prompt\n", + "from validmind.tests import run_test" + ] + }, + { + "cell_type": "markdown", + "id": "3db58d74", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Getting to know ValidMind" + ] + }, + { + "cell_type": "markdown", + "id": "45450d55", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "If you have already connected this notebook to a model document, you can preview the active template structure directly from the library.\n", + "\n", + "This is useful for confirming where logged results will appear before you run the prompt-validation, RAGAS, and DeepEval examples below. 
It also helps you spot gaps early if a test you plan to run is not represented in the current template:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "98f0b602", + "metadata": {}, + "outputs": [], + "source": [ + "vm.preview_template()" + ] + }, + { + "cell_type": "markdown", + "id": "58e1d75f", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### View model documentation in the ValidMind Platform\n", + "\n", + "After you run the notebook and log results, open your model document in the ValidMind Platform to review how the test outputs were added.\n", + "\n", + "Comparing the template preview with the rendered document is a good way to confirm that your notebook is writing results to the expected sections. If a result does not appear automatically, check whether the corresponding test is part of the selected template before troubleshooting the notebook run itself." + ] + }, + { + "cell_type": "markdown", + "id": "86038351", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Configure the judge provider\n", + "\n", + "The next cells load your environment variables, resolve the judge provider from the credentials available in your session, and initialize the ValidMind Library for result logging.\n", + "\n", + "This notebook uses the same provider resolution logic as the library itself:\n", + "- OpenAI is selected when `OPENAI_API_KEY` is available, with `OPENAI_MODEL` as an optional override. The current default judge model is `gpt-4.1`.\n", + "- Azure OpenAI is selected when Azure credentials are available, using `AZURE_OPENAI_MODEL` for the judge model.\n", + "- Gemini is selected when `GOOGLE_API_KEY` or `GEMINI_API_KEY` is available, with optional `GEMINI_MODEL` and `GEMINI_EMBEDDINGS_MODEL` overrides. 
The current defaults are `gemini-2.5-pro` and `models/text-embedding-004`.\n", + "\n", + "If more than one provider is configured, OpenAI takes precedence to match the library default.\n", + "\n", + "This matters because the same default judge configuration is reused across multiple evaluation paths, so checking it once here makes the later test results easier to interpret." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a3efda1f", + "metadata": {}, + "outputs": [], + "source": [ + "# Optional: override the default judge models for this notebook session.\n", + "# os.environ[\"OPENAI_MODEL\"] = \"gpt-4.1\"\n", + "# os.environ[\"GEMINI_MODEL\"] = \"gemini-2.5-pro\"\n", + "# os.environ[\"GEMINI_EMBEDDINGS_MODEL\"] = \"models/text-embedding-004\"" + ] + }, + { + "cell_type": "markdown", + "id": "f0438cf0", + "metadata": {}, + "source": [ + "The next cells import the required libraries, inspect the resolved provider configuration, and connect the notebook to the ValidMind Platform. Reading the printed provider and class names is a quick sanity check that your environment is using the judge setup you expect before any tests are executed." + ] + }, + { + "cell_type": "markdown", + "id": "089f10fa", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Load credentials and resolve the provider\n", + "\n", + "Run the next cells to:\n", + "- import the libraries used in this notebook\n", + "- inspect the provider selected from your environment\n", + "- inspect the resolved judge LLM and judge embeddings classes\n", + "- initialize the ValidMind Library with your platform credentials\n", + "\n", + "If both OpenAI and Gemini credentials are available, OpenAI will be selected to match the default provider precedence used by the library.\n", + "\n", + "This section gives you a concrete view of the effective configuration that the later prompt-validation, RAGAS, and DeepEval examples will use." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f1479922", + "metadata": {}, + "outputs": [], + "source": [ + "from validmind.ai.utils import get_client_and_model, get_judge_config\n", + "\n", + "client, model = get_client_and_model()\n", + "judge_llm, judge_embeddings = get_judge_config()\n", + "\n", + "print(\"resolved_model:\", model)\n", + "print(\"judge_llm_type:\", type(judge_llm).__name__)\n", + "print(\"judge_embeddings_type:\", type(judge_embeddings).__name__)\n", + "\n", + "# Useful for Gemini/OpenAI/Azure debugging\n", + "print(\"judge_llm:\", judge_llm)\n", + "print(\"judge_embeddings:\", judge_embeddings)" + ] + }, + { + "cell_type": "markdown", + "id": "e7868c71", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Prompt-validation tests\n", + "\n", + "This section validates the default judge LLM path with two representative prompt-validation tests. For this smoke test, we use a simple prompt-only model because these tests evaluate the prompt template itself and do not require model predictions.\n", + "\n", + "The example below creates a ValidMind model with `vm.init_model()` and attaches a prompt template to it. That gives the tests a standard object to inspect, even though there is no real predictive model behind the example.\n", + "\n", + "- `Clarity` checks whether the prompt instructions are clear and well-scoped.\n", + "- `Bias` checks whether the prompt structure or examples could induce biased behavior." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7cc07ba8", + "metadata": {}, + "outputs": [], + "source": [ + "system_prompt = \"\"\"\n", + "You are an AI assistant specialized in sentiment analysis for financial news.\n", + "You will classify each sentence as positive, negative, or neutral.\n", + "Respond only with the sentiment label.\n", + "\"\"\".strip()\n", + "\n", + "\n", + "def noop_predict(_):\n", + " return \"dummy\"\n", + "\n", + "\n", + "vm_prompt_model = vm.init_model(\n", + " input_id=\"judge_prompt_model\",\n", + " predict_fn=noop_predict,\n", + " prompt=Prompt(template=system_prompt, variables=[]),\n", + ")\n", + "\n", + "vm_prompt_model.prompt.template" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "40298f6b", + "metadata": {}, + "outputs": [], + "source": [ + "run_test(\n", + " test_id=\"validmind.prompt_validation.Clarity\",\n", + " inputs={\"model\": vm_prompt_model},\n", + ").log()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4180b0f1", + "metadata": {}, + "outputs": [], + "source": [ + "run_test(\n", + " test_id=\"validmind.prompt_validation.Bias\",\n", + " inputs={\"model\": vm_prompt_model},\n", + ").log()" + ] + }, + { + "cell_type": "markdown", + "id": "9935d075", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## RAGAS tests\n", + "\n", + "This section validates the default judge LLM plus default judge embeddings path. 
The selected tests are useful because they exercise the RAGAS integration that historically depended on the default OpenAI setup.\n", + "\n", + "The example data is wrapped with `vm.init_dataset()`, which turns the pandas DataFrame into a ValidMind dataset object that can be passed directly into these tests.\n", + "\n", + "- `ResponseRelevancy` exercises the judge LLM and embeddings path.\n", + "- `AnswerCorrectness` exercises semantic and factual comparison with judge embeddings.\n", + "- `Faithfulness` is a companion smoke test for the judge LLM path on RAG data.\n", + "\n", + "These tests produce Plotly figures, so this notebook focuses on running and logging the results rather than comparing visual output in detail." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "17cbf0e3", + "metadata": {}, + "outputs": [], + "source": [ + "rag_df = pd.DataFrame(\n", + " {\n", + " \"user_input\": [\n", + " \"What happened to the company's revenue guidance?\",\n", + " \"Why did the bank's stock decline?\",\n", + " \"What was the announced dividend decision?\",\n", + " ],\n", + " \"retrieved_contexts\": [\n", + " [\n", + " \"The company raised its full-year revenue guidance after reporting strong demand in the enterprise segment.\",\n", + " \"Management said the improved forecast was driven by larger-than-expected renewals.\",\n", + " ],\n", + " [\n", + " \"The bank's stock declined after it reported higher-than-expected credit losses in its consumer portfolio.\",\n", + " \"Executives also warned that provisions may remain elevated next quarter.\",\n", + " ],\n", + " [\n", + " \"The board announced that it would keep the quarterly dividend unchanged.\",\n", + " \"Management said capital return policy remains the same for now.\",\n", + " ],\n", + " ],\n", + " \"response\": [\n", + " \"The company increased its full-year revenue guidance after stronger enterprise demand.\",\n", + " \"The bank's stock fell because it disclosed higher-than-expected credit 
losses.\",\n", + " \"The company kept its dividend unchanged.\",\n", + " ],\n", + " \"reference\": [\n", + " \"The company raised its full-year revenue guidance because demand in the enterprise segment was strong.\",\n", + " \"The bank's shares dropped after it reported higher-than-expected credit losses.\",\n", + " \"The board decided to leave the quarterly dividend unchanged.\",\n", + " ],\n", + " }\n", + ")\n", + "\n", + "vm_rag_ds = vm.init_dataset(\n", + " dataset=rag_df,\n", + " input_id=\"judge_rag_dataset\",\n", + " text_column=\"user_input\",\n", + " target_column=\"reference\",\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fcdb6232", + "metadata": {}, + "outputs": [], + "source": [ + "run_test(\n", + " test_id=\"validmind.model_validation.ragas.ResponseRelevancy\",\n", + " inputs={\"dataset\": vm_rag_ds},\n", + ").log()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "25124a2f", + "metadata": {}, + "outputs": [], + "source": [ + "run_test(\n", + " test_id=\"validmind.model_validation.ragas.AnswerCorrectness\",\n", + " inputs={\"dataset\": vm_rag_ds},\n", + ").log()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3a58bd42", + "metadata": {}, + "outputs": [], + "source": [ + "run_test(\n", + " test_id=\"validmind.model_validation.ragas.Faithfulness\",\n", + " inputs={\"dataset\": vm_rag_ds},\n", + ").log()" + ] + }, + { + "cell_type": "markdown", + "id": "8b65420f", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## DeepEval scorers\n", + "\n", + "This section validates the default local scorer model path used by DeepEval-based scorers in `validmind.scorers.llm.deepeval`.\n", + "\n", + "As in the RAGAS example, we create a ValidMind dataset with `vm.init_dataset()` so the scorer workflow runs against the same kind of object customers would use in their own notebooks.\n", + "\n", + "These scorers do not use the judge embeddings object. 
For this notebook, we use two representative examples:\n", + "- `AnswerRelevancy`\n", + "- `Hallucination`\n", + "\n", + "They are included here so the notebook covers all three LLM evaluation surfaces:\n", + "- prompt-validation\n", + "- RAGAS\n", + "- DeepEval scorers" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c34f2484", + "metadata": {}, + "outputs": [], + "source": [ + "deepeval_df = pd.DataFrame(\n", + " {\n", + " \"input\": [\n", + " \"What is the capital of France?\",\n", + " \"Why did the company raise its full-year guidance?\",\n", + " \"What did the board decide about the quarterly dividend?\",\n", + " ],\n", + " \"actual_output\": [\n", + " \"The capital of France is Paris.\",\n", + " \"The company raised guidance because enterprise demand was stronger than expected.\",\n", + " \"The board kept the quarterly dividend unchanged.\",\n", + " ],\n", + " \"context\": [\n", + " [\"France's capital city is Paris.\"],\n", + " [\n", + " \"Management raised its full-year guidance after reporting stronger-than-expected demand in the enterprise segment.\"\n", + " ],\n", + " [\n", + " \"The board announced that the quarterly dividend would remain unchanged.\"\n", + " ],\n", + " ],\n", + " }\n", + ")\n", + "\n", + "vm_deepeval_ds = vm.init_dataset(\n", + " dataset=deepeval_df,\n", + " input_id=\"judge_deepeval_dataset\",\n", + " text_column=\"input\",\n", + " target_column=\"actual_output\",\n", + ")\n", + "\n", + "deepeval_df" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9a3cdae0", + "metadata": {}, + "outputs": [], + "source": [ + "vm_deepeval_ds.assign_scores(metrics=[\n", + " \"validmind.scorers.llm.deepeval.Hallucination\",\n", + " \"validmind.scorers.llm.deepeval.AnswerRelevancy\"\n", + "])" + ] + }, + { + "cell_type": "markdown", + "id": "d86a90ab", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## In summary\n", + "\n", + "In this notebook, you learned how to:\n", + "- [x] 
configure the judge provider from environment credentials\n", + "- [x] override the default judge LLM and judge embeddings models\n", + "- [x] initialize ValidMind model and dataset objects for LLM evaluation workflows\n", + "- [x] run prompt-validation tests that use the judge LLM\n", + "- [x] run RAGAS tests that use the judge LLM and judge embeddings\n", + "- [x] run DeepEval scorers that use the local scorer model path" + ] + }, + { + "cell_type": "markdown", + "id": "c7b72b3e", + "metadata": {}, + "source": [ + "<a id='toc10__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can use this notebook as a starting point for your own LLM evaluation workflows. A few practical follow-ups are:\n", + "- replace the sample prompt and datasets with your own evaluation inputs\n", + "- set `OPENAI_MODEL` / `OPENAI_EMBEDDINGS_MODEL` when you want to override the OpenAI judge pair, or `GEMINI_MODEL` / `GEMINI_EMBEDDINGS_MODEL` when you want to standardize the Gemini judge pair used across notebooks or environments\n", + "- expand the set of tests and scorers based on your use case" + ] + }, + { + "cell_type": "markdown", + "id": "e5eb12d8", + "metadata": {}, + "source": [ + "<a id='toc10_1__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "To continue learning about testing and evaluation with the ValidMind Library, explore:\n", + "\n", + "- [Run tests and test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [ValidMind Library overview](https://docs.validmind.ai/developer/validmind-library.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/guide/samples-jupyter-notebooks.html)\n", + "\n", + "You can also visit the [ValidMind documentation](https://docs.validmind.ai/) for broader guidance on configuration, testing workflows, and model documentation." 
+ ] + }, + { + "cell_type": "markdown", + "id": "99a11a0e", + "metadata": {}, + "source": [ + "<a id='toc11__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, periodically check that you are using a recent version so you can access the latest provider integrations, tests, and product improvements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cfed92f5", + "metadata": {}, + "outputs": [], + "source": [ + "%pip show validmind" + ] + }, + { + "cell_type": "markdown", + "id": "58cc2437", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "aa5c4672", + "metadata": {}, + "source": [ + "You may need to restart your kernel after upgrading the package for the changes to take effect." + ] + }, + { + "cell_type": "markdown", + "id": "copyright-fe0b013da3464949b043e9dbdd34b608", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": ".venv-py31111", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.11" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/site/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb b/site/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb index 6d1a026436..bc07a3cffe 100644 --- a/site/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb +++ b/site/notebooks/how_to/tests/run_tests/configure_tests/enable_pii_detection.ipynb @@ -1,665 +1,669 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "adbd775e", - "metadata": {}, - "source": [ - "# Enable PII detection in tests" - ] - }, - { - "cell_type": "markdown", - "id": "6014f87e", - "metadata": {}, - "source": [ - "Learn how to enable and configure Personally Identifiable Information (PII) detection when running tests with the ValidMind Library. Choose whether or not to include PII in test descriptions generated, or whether or not to include PII in test results logged to the ValidMind Platform." 
- ] - }, - { - "cell_type": "markdown", - "id": "b92af62b", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library with PII detection](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Get your code snippet](#toc2_2_1__) \n", - "- [Create a custom test that outputs PII](#toc3__) \n", - "- [Running tests under different PII detection modes](#toc4__) \n", - " - [disabled](#toc4_1__) \n", - " - [test_results](#toc4_2__) \n", - " - [test_descriptions](#toc4_3__) \n", - " - [all](#toc4_4__) \n", - "- [Overriding detection](#toc5__) \n", - " - [Override test result logging](#toc5_1__) \n", - " - [Override test descriptions and test result logging](#toc5_2__) \n", - "- [Review logged test results](#toc6__) \n", - "- [Troubleshooting](#toc7__) \n", - "- [Learn more](#toc8__) \n", - "- [Upgrade ValidMind](#toc9__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "570a178e", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ] - }, - { - "cell_type": "markdown", - "id": "df929220", - "metadata": {}, - "source": [ - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "f626d8bd", - "metadata": {}, - "source": [ - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "deb8fd73", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "id": "32293a17", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "6e23f9b2", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library with PII detection\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To use PII detection powered by [Microsoft Presidio](https://microsoft.github.io/presidio/), install the library with the explicit `[pii-detection]` extra specifier:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b830ae91", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q \"validmind[pii-detection]\"" - ] - }, - { - "cell_type": "markdown", - "id": "fa8a1a7d", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library\n", - "\n", - "ValidMind generates a unique _code snippet_ for each registered model to connect with your 
developer environment. You initialize the ValidMind Library with this code snippet, which ensures that your documentation and tests are uploaded to the correct model when you run the notebook." - ] - }, - { - "cell_type": "markdown", - "id": "3a467dc2", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "eeda4c8c", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "82638dab", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Create a custom test that outputs PII\n", - "\n", - "To demonstrate the feature, we'll need a test that outputs PII. 
First we'll create a custom test that returns:\n", - "\n", - "- A description string containing PII (name, email, phone)\n", - "- A small table containing PII in columns\n", - "\n", - "This output mirrors the structure used in other custom test notebooks and will exercise both table and description PII detection paths. However, if structured detection is unavailable, the library falls back to token-level text scans when possible." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "04d8c802", - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "\n", - "from validmind import test\n", - "\n", - "@test(\"pii_demo.PIIDetection\")\n", - "def pii_custom_test():\n", - " \"\"\"A custom test that returns demo PII.\n", - " This default test description will display when PII is not sent to the LLM to generate test descriptions based on test result data.\"\"\"\n", - " return pd.DataFrame(\n", - " {\n", - " \"name\": [\"Jane Smith\", \"John Doe\", \"Alice Johnson\"],\n", - " \"email\": [\n", - " \"jane.smith@bank.example\",\n", - " \"john.doe@company.example\",\n", - " \"alice.johnson@service.example\",\n", - " ],\n", - " \"phone\": [\"(212) 555-9876\", \"(415) 555-1234\", \"(646) 555-5678\"],\n", - " }\n", - " )" - ] - }, - { - "cell_type": "markdown", - "id": "96878fab", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about custom tests?</b></span>\n", - "<br></br>\n", - "Check out our extended introduction to custom tests — <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/custom_tests/implement_custom_tests.html\" style=\"color: #DE257E;\"><b>Implement custom tests</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "0faaceb5", - "metadata": {}, - "source": [ - 
"<a id='toc4__'></a>\n", - "\n", - "## Running tests under different PII detection modes\n", - "\n", - "Next, let's import [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module to run our custom test via a function called `run_pii_test()` that catches exceptions to observe blocking behavior when PII is present:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b42288e5", - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "from validmind.tests import run_test\n", - "\n", - "# Run test and tag result with unique `result_id`\n", - "def run_pii_test(result_id=\"\"):\n", - " try:\n", - " test_name = f\"pii_demo.PIIDetection:{result_id}\"\n", - " result = run_test(test_name)\n", - "\n", - " # Check if the test description was generated by LLM\n", - " if not result._was_description_generated:\n", - " print(\"PII detected: LLM-generated test description skipped\")\n", - " else:\n", - " print(\"No PII detected or detection disabled: Test description generated by LLM\")\n", - "\n", - " # Try logging test results to the ValidMind Platform\n", - " result.log()\n", - " print(\"No PII detected or detection disabled: Test results logged to the ValidMind Platform\")\n", - " except Exception as e:\n", - " print(\"PII detected: Test results not logged to the ValidMind Platform\")" - ] - }, - { - "cell_type": "markdown", - "id": "9a6e3398", - "metadata": {}, - "source": [ - "We'll then switch the `VALIDMIND_PII_DETECTION` environment variable across modes in the below examples.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Note that since we are running a custom test that does not exist in your model's default documentation template, we'll receive output 
indicating that a test-driven block doesn't currently exist in your model's documentation for that particular test ID.</b></span>\n", - "<br></br>\n", - "That's expected, as when we run custom tests the results logged need to be manually added to your documentation within the ValidMind Platform or added to your documentation template.</div>" - ] - }, - { - "cell_type": "markdown", - "id": "9801463d", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### disabled\n", - "\n", - "When detection is set to `disabled`, tests run and generate test descriptions. Logging tests with [`.log()`](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) will also send test descriptions and test results to the ValidMind Platform as usual:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3078af64", - "metadata": {}, - "outputs": [], - "source": [ - "print(\"\\n=== Mode: disabled ===\")\n", - "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"disabled\"\n", - "\n", - "# Run test and tag result with unique ID `disabled`\n", - "run_pii_test(\"disabled\")" - ] - }, - { - "cell_type": "markdown", - "id": "89de78cc", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### test_results\n", - "\n", - "When detection is set for `test_results`, tests run and generate test descriptions for review in your environment, but logging tests will not send descriptions or test results to the ValidMind Platform:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "12e61a80", - "metadata": {}, - "outputs": [], - "source": [ - "print(\"\\n=== Mode: test_results ===\")\n", - "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"test_results\"\n", - "\n", - "# Run test and tag result with unique ID `results_blocked`\n", - "run_pii_test(\"results_blocked\")" - ] - }, - { - "cell_type": "markdown", - "id": "8fbe427e", - "metadata": {}, - "source": [ - "<a id='toc4_3__'></a>\n", - "\n", - "### test_descriptions\n", - 
"\n", - "When detection is set for `test_descriptions`, tests run but will not generate test descriptions, and logging tests will not send descriptions but will send test results to the ValidMind Platform:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "feba6207", - "metadata": {}, - "outputs": [], - "source": [ - "print(\"\\n=== Mode: test_descriptions ===\")\n", - "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"test_descriptions\"\n", - "\n", - "# Run test and tag result with unique ID `desc_blocked`\n", - "run_pii_test(\"desc_blocked\")" - ] - }, - { - "cell_type": "markdown", - "id": "0e8950d1", - "metadata": {}, - "source": [ - "<a id='toc4_4__'></a>\n", - "\n", - "### all\n", - "\n", - "When detection is set to `all`, tests run will not generate test descriptions or log test results to the ValidMind Platform." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "af5040b5", - "metadata": {}, - "outputs": [], - "source": [ - "print(\"\\n=== Mode: all ===\")\n", - "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"all\"\n", - "\n", - "# Run test and tag result with unique ID `all_blocked`\n", - "run_pii_test(\"all_blocked\")" - ] - }, - { - "cell_type": "markdown", - "id": "67240344", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Overriding detection\n", - "\n", - "You can override blocking by passing `unsafe=True` to `result.log(unsafe=True)`, but this is not recommended outside controlled workflows.\n", - "\n", - "To demonstrate, let's rerun our custom test with some override scenarios." 
- ] - }, - { - "cell_type": "markdown", - "id": "be0510b9", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Override test result logging\n", - "\n", - "First, let's rerun our custom test with detection set to `all`, which will send the test results but not the test descriptions to the ValidMind Platform:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0387be21", - "metadata": {}, - "outputs": [], - "source": [ - "print(\"\\n=== Mode: all & unsafe=True ===\")\n", - "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"all\"\n", - "\n", - "# Run test and tag result with unique ID `override_results`\n", - "try:\n", - " result = run_test(\"pii_demo.PIIDetection:override_results\")\n", - "\n", - " # Check if the test description was generated by LLM\n", - " if not result._was_description_generated:\n", - " print(\"PII detected: LLM-generated test description skipped\")\n", - " else:\n", - " print(\"No PII detected or detection disabled: Test description generated by LLM\")\n", - "\n", - " # Try logging test results to the ValidMind Platform\n", - " result.log(unsafe=True)\n", - " print(\"No PII detected, detection disabled, or override set: Test results logged to the ValidMind Platform\")\n", - "except Exception as e:\n", - " print(\"PII detected: Test results not logged to the ValidMind Platform\")" - ] - }, - { - "cell_type": "markdown", - "id": "4e65af32", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Override test descriptions and test result logging\n", - "\n", - "To send both the test descriptions and test results via override, set the `VALIDMIND_PII_DETECTION` environment variable to `test_results` while including the override flag:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b40a2670", - "metadata": {}, - "outputs": [], - "source": [ - "print(\"\\n=== Mode: test_results & unsafe=True ===\")\n", - "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"test_results\"\n", - 
"\n", - "# Run test and tag result with unique ID `override_both`\n", - "try:\n", - " result = run_test(\"pii_demo.PIIDetection:override_both\")\n", - "\n", - " # Check if the test description was generated by LLM\n", - " if not result._was_description_generated:\n", - " print(\"PII detected: LLM-generated test description skipped\")\n", - " else:\n", - " print(\"No PII detected, detection disabled, or override set: Test description generated by LLM\")\n", - "\n", - " # Try logging test results to the ValidMind Platform\n", - " result.log(unsafe=True)\n", - " print(\"No PII detected, detection disabled, or override set: Test results logged to the ValidMind Platform\")\n", - "except Exception as e:\n", - " print(\"PII detected: Test results not logged to the ValidMind Platform\")" - ] - }, - { - "cell_type": "markdown", - "id": "84d6ed78", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Review logged test results\n", - "\n", - "Now let's take a look at the results that were logged to the ValidMind Platform:\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier.\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "3. Click on any section heading to expand that section to add a new test-driven block. (**Learn more:** [Work with test results](https://docs.validmind.ai/guide/documentation/work-with-test-results.html))\n", - "\n", - "4. Under TEST-DRIVEN in the sidebar, click **Custom**.\n", - "\n", - "5. 
Confirm that you're able to insert the following logged results:\n", - "\n", - " - `pii_demo.PIIDetection:disabled`\n", - " - `pii_demo.PIIDetection:desc_blocked`\n", - " - `pii_demo.PIIDetection:override_results`\n", - " - `pii_demo.PIIDetection:override_both`" - ] - }, - { - "cell_type": "markdown", - "id": "faaa950f", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Troubleshooting\n", - "\n", - "- [x] If you see warnings that Presidio or Presidio analyzer is unavailable, ensure you installed extras: `validmind[pii-detection]`.\n", - "- [x] Ensure your environment is restarted after installing new packages if imports fail." - ] - }, - { - "cell_type": "markdown", - "id": "59c93159", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Learn more\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "id": "8eba96a6", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you'll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dffb39a5", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "dbce28c3", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "6225eab3", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-0bc871eca4814e78b16e692e1f2b3209", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "name": "python", - "version": "3.10" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Enable PII detection in tests" + ], + "id": "adbd775e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Learn how to enable and configure Personally Identifiable Information (PII) detection when running tests with the ValidMind Library. Choose whether or not to include PII in test descriptions generated, or whether or not to include PII in test results logged to the ValidMind Platform." + ], + "id": "6014f87e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library with PII detection](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Get your code snippet](#toc2_2_1__) \n", + "- [Create a custom test that outputs PII](#toc3__) \n", + "- [Running tests under different PII detection modes](#toc4__) \n", + " - [disabled](#toc4_1__) \n", + " - [test_results](#toc4_2__) \n", + " - [test_descriptions](#toc4_3__) \n", + " - [all](#toc4_4__) \n", + "- [Overriding detection](#toc5__) \n", + " - [Override test result logging](#toc5_1__) \n", + " - [Override test descriptions and test result logging](#toc5_2__) \n", + "- [Review logged test results](#toc6__) \n", + "- [Troubleshooting](#toc7__) \n", + "- 
[Learn more](#toc8__) \n", + "- [Upgrade ValidMind](#toc9__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "b92af62b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." + ], + "id": "570a178e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." 
+ ], + "id": "df929220" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "f626d8bd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ], + "id": "deb8fd73" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ], + "id": "32293a17" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library with PII detection\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To use PII detection powered by [Microsoft Presidio](https://microsoft.github.io/presidio/), install the library with the explicit `[pii-detection]` extra specifier:" + ], + "id": "6e23f9b2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q \"validmind[pii-detection]\"" + ], + "execution_count": null, + "outputs": [], + "id": "b830ae91" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library\n", + "\n", + "ValidMind generates a unique _code snippet_ for each registered model to connect with your developer environment. You initialize the ValidMind Library with this code snippet, which ensures that your documentation and tests are uploaded to the correct model when you run the notebook." + ], + "id": "fa8a1a7d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. 
On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "3a467dc2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "eeda4c8c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Create a custom test that outputs PII\n", + "\n", + "To demonstrate the feature, we'll need a test that outputs PII. First we'll create a custom test that returns:\n", + "\n", + "- A description string containing PII (name, email, phone)\n", + "- A small table containing PII in columns\n", + "\n", + "This output mirrors the structure used in other custom test notebooks and will exercise both table and description PII detection paths. However, if structured detection is unavailable, the library falls back to token-level text scans when possible." 
+ ], + "id": "82638dab" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "\n", + "from validmind import test\n", + "\n", + "@test(\"pii_demo.PIIDetection\")\n", + "def pii_custom_test():\n", + " \"\"\"A custom test that returns demo PII.\n", + " This default test description will display when PII is not sent to the LLM to generate test descriptions based on test result data.\"\"\"\n", + " return pd.DataFrame(\n", + " {\n", + " \"name\": [\"Jane Smith\", \"John Doe\", \"Alice Johnson\"],\n", + " \"email\": [\n", + " \"jane.smith@bank.example\",\n", + " \"john.doe@company.example\",\n", + " \"alice.johnson@service.example\",\n", + " ],\n", + " \"phone\": [\"(212) 555-9876\", \"(415) 555-1234\", \"(646) 555-5678\"],\n", + " }\n", + " )" + ], + "execution_count": null, + "outputs": [], + "id": "04d8c802" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about custom tests?</b></span>\n", + "<br></br>\n", + "Check out our extended introduction to custom tests — <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/custom_tests/implement_custom_tests.html\" style=\"color: #DE257E;\"><b>Implement custom tests</b></a></div>" + ], + "id": "96878fab" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Running tests under different PII detection modes\n", + "\n", + "Next, let's import [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module to run our custom test via a function called `run_pii_test()` that catches exceptions to observe blocking behavior when PII is present:" + ], + "id": "0faaceb5" + }, + { + 
"cell_type": "code", + "metadata": {}, + "source": [ + "import os\n", + "from validmind.tests import run_test\n", + "\n", + "# Run test and tag result with unique `result_id`\n", + "def run_pii_test(result_id=\"\"):\n", + " try:\n", + " test_name = f\"pii_demo.PIIDetection:{result_id}\"\n", + " result = run_test(test_name)\n", + "\n", + " # Check if the test description was generated by LLM\n", + " if not result._was_description_generated:\n", + " print(\"PII detected: LLM-generated test description skipped\")\n", + " else:\n", + " print(\"No PII detected or detection disabled: Test description generated by LLM\")\n", + "\n", + " # Try logging test results to the ValidMind Platform\n", + " result.log()\n", + " print(\"No PII detected or detection disabled: Test results logged to the ValidMind Platform\")\n", + " except Exception as e:\n", + " print(\"PII detected: Test results not logged to the ValidMind Platform\")" + ], + "execution_count": null, + "outputs": [], + "id": "b42288e5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'll then switch the `VALIDMIND_PII_DETECTION` environment variable across modes in the below examples.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Note that since we are running a custom test that does not exist in your model's default documentation template, we'll receive output indicating that a test-driven block doesn't currently exist in your model's documentation for that particular test ID.</b></span>\n", + "<br></br>\n", + "That's expected, as when we run custom tests the results logged need to be manually added to your documentation within the ValidMind Platform or added to your documentation template.</div>" + ], + "id": "9a6e3398" + }, + { + "cell_type": "markdown", + "metadata": {}, + 
"source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### disabled\n", + "\n", + "When detection is set to `disabled`, tests run and generate test descriptions. Logging tests with [`.log()`](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) will also send test descriptions and test results to the ValidMind Platform as usual:" + ], + "id": "9801463d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(\"\\n=== Mode: disabled ===\")\n", + "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"disabled\"\n", + "\n", + "# Run test and tag result with unique ID `disabled`\n", + "run_pii_test(\"disabled\")" + ], + "execution_count": null, + "outputs": [], + "id": "3078af64" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### test_results\n", + "\n", + "When detection is set for `test_results`, tests run and generate test descriptions for review in your environment, but logging tests will not send descriptions or test results to the ValidMind Platform:" + ], + "id": "89de78cc" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(\"\\n=== Mode: test_results ===\")\n", + "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"test_results\"\n", + "\n", + "# Run test and tag result with unique ID `results_blocked`\n", + "run_pii_test(\"results_blocked\")" + ], + "execution_count": null, + "outputs": [], + "id": "12e61a80" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3__'></a>\n", + "\n", + "### test_descriptions\n", + "\n", + "When detection is set for `test_descriptions`, tests run but will not generate test descriptions, and logging tests will not send descriptions but will send test results to the ValidMind Platform:" + ], + "id": "8fbe427e" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(\"\\n=== Mode: test_descriptions ===\")\n", + "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"test_descriptions\"\n", + 
"\n", + "# Run test and tag result with unique ID `desc_blocked`\n", + "run_pii_test(\"desc_blocked\")" + ], + "execution_count": null, + "outputs": [], + "id": "feba6207" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_4__'></a>\n", + "\n", + "### all\n", + "\n", + "When detection is set to `all`, tests run will not generate test descriptions or log test results to the ValidMind Platform." + ], + "id": "0e8950d1" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(\"\\n=== Mode: all ===\")\n", + "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"all\"\n", + "\n", + "# Run test and tag result with unique ID `all_blocked`\n", + "run_pii_test(\"all_blocked\")" + ], + "execution_count": null, + "outputs": [], + "id": "af5040b5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Overriding detection\n", + "\n", + "You can override blocking by passing `unsafe=True` to `result.log(unsafe=True)`, but this is not recommended outside controlled workflows.\n", + "\n", + "To demonstrate, let's rerun our custom test with some override scenarios." 
+ ], + "id": "67240344" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Override test result logging\n", + "\n", + "First, let's rerun our custom test with detection set to `all` and the `unsafe=True` override, which will send the test results but not the test descriptions to the ValidMind Platform:" + ], + "id": "be0510b9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(\"\\n=== Mode: all & unsafe=True ===\")\n", + "os.environ[\"VALIDMIND_PII_DETECTION\"] = \"all\"\n", + "\n", + "# Run test and tag result with unique ID `override_results`\n", + "try:\n", + " result = run_test(\"pii_demo.PIIDetection:override_results\")\n", + "\n", + " # Check if the test description was generated by LLM\n", + " if not result._was_description_generated:\n", + " print(\"PII detected: LLM-generated test description skipped\")\n", + " else:\n", + " print(\"No PII detected or detection disabled: Test description generated by LLM\")\n", + "\n", + " # Try logging test results to the ValidMind Platform\n", + " result.log(unsafe=True)\n", + " print(\"No PII detected, detection disabled, or override set: Test results logged to the ValidMind Platform\")\n", + "except Exception as e:\n", + " print(\"PII detected: Test results not logged to the ValidMind Platform\")" + ], + "execution_count": null, + "outputs": [], + "id": "0387be21" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Override test descriptions and test result logging\n", + "\n", + "To send both the test descriptions and test results via override, set the `VALIDMIND_PII_DETECTION` environment variable to `test_results` while including the override flag:" 
unique ID `override_both`\n", + "try:\n", + " result = run_test(\"pii_demo.PIIDetection:override_both\")\n", + "\n", + " # Check if the test description was generated by LLM\n", + " if not result._was_description_generated:\n", + " print(\"PII detected: LLM-generated test description skipped\")\n", + " else:\n", + " print(\"No PII detected, detection disabled, or override set: Test description generated by LLM\")\n", + "\n", + " # Try logging test results to the ValidMind Platform\n", + " result.log(unsafe=True)\n", + " print(\"No PII detected, detection disabled, or override set: Test results logged to the ValidMind Platform\")\n", + "except Exception as e:\n", + " print(\"PII detected: Test results not logged to the ValidMind Platform\")" + ], + "execution_count": null, + "outputs": [], + "id": "b40a2670" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Review logged test results\n", + "\n", + "Now let's take a look at the results that were logged to the ValidMind Platform:\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier.\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "3. Click on any section heading to expand that section to add a new test-driven block. (**Learn more:** [Work with test results](https://docs.validmind.ai/guide/documentation/work-with-test-results.html))\n", + "\n", + "4. Under TEST-DRIVEN in the sidebar, click **Custom**.\n", + "\n", + "5. 
Confirm that you're able to insert the following logged results:\n", + "\n", + " - `pii_demo.PIIDetection:disabled`\n", + " - `pii_demo.PIIDetection:desc_blocked`\n", + " - `pii_demo.PIIDetection:override_results`\n", + " - `pii_demo.PIIDetection:override_both`" + ], + "id": "84d6ed78" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Troubleshooting\n", + "\n", + "- [x] If you see warnings that Presidio or Presidio analyzer is unavailable, ensure you installed extras: `validmind[pii-detection]`.\n", + "- [x] Ensure your environment is restarted after installing new packages if imports fail." + ], + "id": "faaa950f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Learn more\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ], + "id": "59c93159" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you'll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "8eba96a6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "dffb39a5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "dbce28c3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ], + "id": "6225eab3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-0bc871eca4814e78b16e692e1f2b3209" + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "name": "python", + "version": "3.10" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} \ No newline at end of file diff --git a/site/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb b/site/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb index 9d4c0d72ea..6a4b81ba2d 100644 --- a/site/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb +++ b/site/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.ipynb @@ -1,582 +1,586 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Run tests with multiple datasets\n", - "\n", - "To support running tests that require more than one dataset, ValidMind provides a mechanim that allows you to pass multiple datasets as inputs.\n", - "\n", - "<!--- TO DO Check that this explanation is accurate --->\n", - "To ensure a model generalizes well to new, unseen data, it's common to use separate datasets for training, validation, and testing, with each set serving to check the model's performance at different stages of development. 
Additionally, since models often encounter data from various sources that might differ in distribution, quality, or type, using multiple datasets in testing can simulate this diversity and better prepare the model for deployment.\n", - "\n", - "This interactive notebook includes the code required to load the demo dataset, preprocess the raw dataset and train a model for testing, initialize ValidMind objects, and run a test that requires multiple datasets." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the documentation template](#toc2_3__) \n", - "- [Load the sample dataset](#toc3__) \n", - "- [Prepocess the raw dataset](#toc4__) \n", - "- [Train models for testing](#toc5__) \n", - "- [Initialize ValidMind objects](#toc6__) \n", - " - [Initialize the ValidMind model](#toc6_1__) \n", - " - [Initialize the ValidMind datasets](#toc6_2__) \n", - "- [Run a test that requires multiple datasets](#toc7__) \n", - " - [Run predictions and link with the model](#toc7_1__) \n", - " - [Run test](#toc7_2__) \n", - "- [Next steps](#toc8__) \n", - " - [Work with your model documentation](#toc8_1__) \n", - " - [Discover more learning resources](#toc8_2__) \n", - "- [Upgrade ValidMind](#toc9__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON 
TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. 
To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Import the sample dataset from the library\n", - "\n", - "from validmind.datasets.classification import customer_churn as demo_dataset\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{demo_dataset.target_column}' \\n\\t• Class labels: {demo_dataset.class_labels}\"\n", - ")\n", - "\n", - "raw_df = demo_dataset.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Prepocess the raw dataset\n", - "\n", - "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", - "\n", - "- Preprocess the data: Splits the DataFrame (`df`) into multiple datasets (`train_df`, `validation_df`, and `test_df`) using `demo_dataset.preprocess` to simplify preprocessing.\n", - "- Separate features and targets: Drops the target column to create feature sets (`x_train`, `x_val`) and target sets (`y_train`, `y_val`)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)\n", - "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", - "y_train = train_df[demo_dataset.target_column]\n", - "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", - "y_val = validation_df[demo_dataset.target_column]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Train models for testing\n", - "\n", - "Initialize XGBoost and Logistic Regression Classifiers" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.linear_model import LogisticRegression\n", - "import xgboost\n", - "\n", - "%matplotlib inline\n", - "\n", - "xgb = xgboost.XGBClassifier(early_stopping_rounds=10)\n", - "xgb.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "xgb.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Initialize ValidMind objects\n", - "\n", - "<a id='toc6_1__'></a>\n", - "\n", - "### Initialize the ValidMind model" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_model_xgb = vm.init_model(\n", - " xgb,\n", - " input_id=\"xgb\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of 
arguments:\n", - "\n", - "- `dataset` — the raw dataset that you want to provide as input to tests\n", - "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "- `target_column` — a required argument if tests require access to true values. This is the name of the target column in the dataset\n", - "- `class_labels` — an optional value to map predicted classes to class labels\n", - "\n", - "With all datasets ready, you can now initialize the raw, training and test datasets (`raw_df`, `train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds = vm.init_dataset(\n", - " input_id=\"train_dataset\",\n", - " dataset=train_df,\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "vm_test_ds = vm.init_dataset(\n", - " input_id=\"test_dataset\", dataset=test_df, target_column=demo_dataset.target_column\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Run a test that requires multiple datasets\n", - "\n", - "We are going to show the following in next two blocks:\n", - "\n", - "- Assign predictions for `vm_train_ds` and `vm_test_ds`\n", - "- Run `RobustnessDiagnosis` which is one example test that takes two input datasets" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7_1__'></a>\n", - "\n", - "### Run predictions and link with the model" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(model=vm_model_xgb)\n", - "vm_test_ds.assign_predictions(model=vm_model_xgb)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7_2__'></a>\n", - 
"\n", - "### Run test" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.RobustnessDiagnosis\",\n", - " inputs={\"datasets\": (vm_train_ds, vm_test_ds), \"model\": vm_model_xgb},\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc8_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc8_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade 
package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-72af338f140e4a4bad5cb3954201d23e", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "colab": { - "provenance": [] - }, - "gpuClass": "standard", - "kernelspec": { - "display_name": ".venv", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.13" - } - }, - "nbformat": 4, - "nbformat_minor": 0 -} + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Run tests with multiple datasets\n", + "\n", + "To support running tests that require more than one dataset, ValidMind provides a mechanism that allows you to pass multiple datasets as inputs.\n", + "\n", + "<!--- TO DO Check that this explanation is accurate --->\n", + "To ensure a model generalizes well to new, unseen data, it's common to use separate datasets for training, validation, and testing, with each set serving to check the model's performance at different stages of development. 
Additionally, since models often encounter data from various sources that might differ in distribution, quality, or type, using multiple datasets in testing can simulate this diversity and better prepare the model for deployment.\n", + "\n", + "This interactive notebook includes the code required to load the demo dataset, preprocess the raw dataset and train a model for testing, initialize ValidMind objects, and run a test that requires multiple datasets." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the documentation template](#toc2_3__) \n", + "- [Load the sample dataset](#toc3__) \n", + "- [Preprocess the raw dataset](#toc4__) \n", + "- [Train models for testing](#toc5__) \n", + "- [Initialize ValidMind objects](#toc6__) \n", + " - [Initialize the ValidMind model](#toc6_1__) \n", + " - [Initialize the ValidMind datasets](#toc6_2__) \n", + "- [Run a test that requires multiple datasets](#toc7__) \n", + " - [Run predictions and link with the model](#toc7_1__) \n", + " - [Run test](#toc7_2__) \n", + "- [Next steps](#toc8__) \n", + " - [Work with your model documentation](#toc8_1__) \n", + " - [Discover more learning resources](#toc8_2__) \n", + "- [Upgrade ValidMind](#toc9__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON 
TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind library. 
To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the sample dataset from the library\n", + "\n", + "from validmind.datasets.classification import customer_churn as demo_dataset\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{demo_dataset.target_column}' \\n\\t• Class labels: {demo_dataset.class_labels}\"\n", + ")\n", + "\n", + "raw_df = demo_dataset.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Preprocess the raw dataset\n", + "\n", + "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", + "\n", + "- Preprocess the data: Splits the DataFrame (`raw_df`) into multiple datasets (`train_df`, `validation_df`, and `test_df`) using `demo_dataset.preprocess` to simplify preprocessing.\n", + "- Separate features and targets: Drops the target column to create feature sets (`x_train`, `x_val`) and target sets (`y_train`, `y_val`)." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)\n", + "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", + "y_train = train_df[demo_dataset.target_column]\n", + "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", + "y_val = validation_df[demo_dataset.target_column]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Train models for testing\n", + "\n", + "Initialize and train an XGBoost classifier, using the validation set for early stopping:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from sklearn.linear_model import LogisticRegression\n", + "import xgboost\n", + "\n", + "%matplotlib inline\n", + "\n", + "xgb = xgboost.XGBClassifier(early_stopping_rounds=10)\n", + "xgb.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "xgb.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Initialize ValidMind objects\n", + "\n", + "<a id='toc6_1__'></a>\n", + "\n", + "### Initialize the ValidMind model" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_model_xgb = vm.init_model(\n", + " xgb,\n", + " input_id=\"xgb\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This 
function takes a number of 
arguments:\n", + "\n", + "- `dataset` — the raw dataset that you want to provide as input to tests\n", + "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", + "- `target_column` — a required argument if tests require access to true values. This is the name of the target column in the dataset\n", + "- `class_labels` — an optional value to map predicted classes to class labels\n", + "\n", + "With all datasets ready, you can now initialize the training and test datasets (`train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds = vm.init_dataset(\n", + " input_id=\"train_dataset\",\n", + " dataset=train_df,\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "vm_test_ds = vm.init_dataset(\n", + " input_id=\"test_dataset\", dataset=test_df, target_column=demo_dataset.target_column\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Run a test that requires multiple datasets\n", + "\n", + "The next two blocks show how to:\n", + "\n", + "- Assign predictions for `vm_train_ds` and `vm_test_ds`\n", + "- Run `RobustnessDiagnosis`, which is one example of a test that takes two input datasets" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1__'></a>\n", + "\n", + "### Run predictions and link with the model" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(model=vm_model_xgb)\n", + "vm_test_ds.assign_predictions(model=vm_model_xgb)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_2__'></a>\n", + 
"\n", + "### Run test" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.RobustnessDiagnosis\",\n", + " inputs={\"datasets\": (vm_train_ds, vm_test_ds), \"model\": vm_model_xgb},\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc8_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc8_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade 
package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-72af338f140e4a4bad5cb3954201d23e" + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "gpuClass": "standard", + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.13" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/site/notebooks/how_to/tests/run_tests/configure_tests/understand_utilize_rawdata.ipynb b/site/notebooks/how_to/tests/run_tests/configure_tests/understand_utilize_rawdata.ipynb index 936ef70554..e5fb3ead8e 100644 --- a/site/notebooks/how_to/tests/run_tests/configure_tests/understand_utilize_rawdata.ipynb +++ b/site/notebooks/how_to/tests/run_tests/configure_tests/understand_utilize_rawdata.ipynb @@ -1,742 +1,742 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "c18ba8a2", - "metadata": {}, - "source": [ - "# Understand and utilize `RawData` in ValidMind tests\n", - "\n", - "Test functions in ValidMind can return a special object called *`RawData`*, which holds intermediate or unprocessed data produced somewhere in the test logic but not returned as part of the test's visible output, such as in tables or figures.\n", - "\n", - "- The `RawData` feature allows you to customize the output of tests, making it a powerful tool for creating custom tests and 
post-processing functions.\n", - "- `RawData` is useful when running post-processing functions with tests to recompute tabular outputs, redraw figures, or even create new outputs entirely.\n", - "\n", - "In this notebook, you'll learn how to access, inspect, and utilize `RawData` from ValidMind tests." - ] - }, - { - "cell_type": "markdown", - "id": "5b5b248c", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [Setup](#toc1_) \n", - " - [Installation and intialization](#toc1_1_) \n", - " - [Load the sample dataset](#toc1_2_) \n", - " - [Initialize the ValidMind objects](#toc1_3_) \n", - "- [`RawData` usage examples](#toc2_) \n", - " - [Using `RawData` from the ROC Curve Test](#toc2_1_) \n", - " - [Pearson Correlation Matrix](#toc2_2_) \n", - " - [Precision-Recall Curve](#toc2_3_) \n", - " - [Using `RawData` in custom tests](#toc2_4_) \n", - " - [Using `RawData` in comparison tests](#toc2_5_) \n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "6dd79a98", - "metadata": {}, - "source": [ - "<a id='toc1_'></a>\n", - "\n", - "## Setup\n", - "\n", - "Before we can run our examples, we'll need to set the stage to enable running tests with the ValidMind Library. Since the focus of this notebook is on the `RawData` object, this section will merely summarize the steps instead of going into greater detail. 
\n", - "\n", - "\n", - "**To learn more about running tests with ValidMind:** [Run tests and test suites](https://docs.validmind.ai/developer/model-testing/testing-overview.html)" - ] - }, - { - "cell_type": "markdown", - "id": "5b6d8d15", - "metadata": {}, - "source": [ - "<a id='toc1_1_'></a>\n", - "\n", - "### Installation and intialization\n", - "\n", - "First, let's make sure that the ValidMind Library is installed and ready to go, and our Python environment set up for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "04eb084e", - "metadata": {}, - "outputs": [], - "source": [ - "# Install the ValidMind Library\n", - "%pip install -q validmind\n", - "\n", - "# Initialize the ValidMind Library\n", - "import validmind as vm\n", - "\n", - "# Import the `xgboost` library with an alias\n", - "import xgboost as xgb\n" - ] - }, - { - "cell_type": "markdown", - "id": "5e6aa2cb", - "metadata": {}, - "source": [ - "<a id='toc1_2_'></a>\n", - "\n", - "### Load the sample dataset\n", - "\n", - "Then, we'll import a sample ValidMind dataset and preprocess it:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "50d72eba", - "metadata": {}, - "outputs": [], - "source": [ - "# Import the `customer_churn` sample dataset\n", - "from validmind.datasets.classification import customer_churn\n", - "raw_df = customer_churn.load_data()\n", - "\n", - "# Preprocess the raw dataset\n", - "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)\n", - "\n", - "# Separate features and targets\n", - "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", - "y_train = train_df[customer_churn.target_column]\n", - "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", - "y_val = validation_df[customer_churn.target_column]\n", - "\n", - "# Create an `XGBClassifier` object\n", - "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", - "model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", 
\"auc\"],\n", - ")\n", - "\n", - "# Train the model using the validation set\n", - "model.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "e3895d35", - "metadata": {}, - "source": [ - "<a id='toc1_3_'></a>\n", - "\n", - "### Initialize the ValidMind objects" - ] - }, - { - "cell_type": "markdown", - "id": "c0e441f4", - "metadata": {}, - "source": [ - "Before you can run tests, you'll need to initialize a ValidMind dataset object, as well as a ValidMind model object that can be passed to other functions for analysis and tests on the data:\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b2310bc4", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the dataset object\n", - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=raw_df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=customer_churn.target_column,\n", - " class_labels=customer_churn.class_labels,\n", - " __log=False,\n", - ")\n", - "\n", - "# Initialize the datasets into their own ValidMind dataset objects\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=customer_churn.target_column,\n", - " __log=False,\n", - ")\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"test_dataset\",\n", - " target_column=customer_churn.target_column,\n", - " __log=False,\n", - ")\n", - "\n", - "# Initialize the ValidMind model object wrapper so that it can be passed as input to tests or test suites\n", - "# ValidMind model objects can be any type of record you want to test, document, validate, or monitor\n", - "vm_model = vm.init_model(\n", - " model,\n", - " input_id=\"model\",\n", - " __log=False,\n", - ")\n", - "\n", - "# Assign predictions to the datasets\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_model,\n", - ")\n", - "\n", - 
"vm_test_ds.assign_predictions(\n", - " model=vm_model,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "25ec99fc", - "metadata": {}, - "source": [ - "<a id='toc2_'></a>\n", - "\n", - "## `RawData` usage examples\n", - "\n", - "Once you're set up to run tests, you can then try out the following examples:\n", - "\n", - " - [Using `RawData` from the ROC Curve Test](#toc2_1_) \n", - " - [Pearson Correlation Matrix](#toc2_2_) \n", - " - [Precision-Recall Curve](#toc2_3_) \n", - " - [Using `RawData` in custom tests](#toc2_4_) \n", - " - [Using `RawData` in comparison tests](#toc2_5_) " - ] - }, - { - "cell_type": "markdown", - "id": "33d79841", - "metadata": {}, - "source": [ - "<a id='toc2_1_'></a>\n", - "\n", - "### Using `RawData` from the ROC Curve Test\n", - "\n", - "In this introductory example, we run the ROC Curve test, inspect its `RawData` output, and then create a custom ROC curve using the raw data values.\n", - "\n", - "First, let's run the default ROC Curve test for comparsion with later iterations:\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "58a3a779", - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.tests import run_test\n", - "\n", - "# Run the ROC Curve test normally\n", - "result_roc = run_test(\n", - " \"validmind.model_validation.sklearn.ROCCurve\",\n", - " inputs={\"dataset\": vm_test_ds, \"model\": vm_model},\n", - " generate_description=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "66c44fe0", - "metadata": {}, - "source": [ - "Now let's assume we want to create a custom version of the above figure. 
First, let's inspect the raw data that this test produces so we can see what we have to work with.\n", - "\n", - "`RawData` objects have a `inspect()` method that will pretty print the attributes of the object to be able to quickly see the data and its types:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "513ce01e", - "metadata": {}, - "outputs": [], - "source": [ - "# Inspect the RawData output from the ROC test\n", - "print(\"RawData from ROC Curve Test:\")\n", - "result_roc.raw_data.inspect()" - ] - }, - { - "cell_type": "markdown", - "id": "586f3a12", - "metadata": {}, - "source": [ - "As we can see, the ROC Curve returns a `RawData` object with the following attributes:\n", - "- **`fpr`:** A list of false positive rates\n", - "- **`tpr`:** A list of true positive rates\n", - "- **`auc`:** The area under the curve\n", - "\n", - "This should be enough to create our own custom ROC curve via a post-processing function without having to create a whole new test from scratch and without having to recompute any of the data:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "613778d2", - "metadata": {}, - "outputs": [], - "source": [ - "import matplotlib.pyplot as plt\n", - "\n", - "from validmind.vm_models.result import TestResult\n", - "\n", - "\n", - "def custom_roc_curve(result: TestResult):\n", - " # Extract raw data from the test result\n", - " fpr = result.raw_data.fpr\n", - " tpr = result.raw_data.tpr\n", - " auc = result.raw_data.auc\n", - "\n", - " # Create a custom ROC curve plot\n", - " fig = plt.figure()\n", - " plt.plot(fpr, tpr, label=f\"Custom ROC (AUC = {auc:.2f})\", color=\"blue\")\n", - " plt.plot([0, 1], [0, 1], linestyle=\"--\", color=\"gray\", label=\"Random Guess\")\n", - " plt.xlabel(\"False Positive Rate\")\n", - " plt.ylabel(\"True Positive Rate\")\n", - " plt.title(\"Custom ROC Curve from RawData\")\n", - " plt.legend()\n", - "\n", - " # close the plot to avoid it automatically being shown in 
the notebook\n", - " plt.close()\n", - "\n", - " # remove existing figure\n", - " result.remove_figure(0)\n", - "\n", - " # add new figure\n", - " result.add_figure(fig)\n", - "\n", - " return result\n", - "\n", - "# test it on the existing result\n", - "modified_result = custom_roc_curve(result_roc)\n", - "\n", - "# show the modified result\n", - "modified_result.show()" - ] - }, - { - "cell_type": "markdown", - "id": "794d026c", - "metadata": {}, - "source": [ - "Now that we have created a post-processing function and verified that it works on our existing test result, we can use it directly in `run_test()` from now on:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7c7566f3", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.model_validation.sklearn.ROCCurve\",\n", - " inputs={\"dataset\": vm_test_ds, \"model\": vm_model},\n", - " post_process_fn=custom_roc_curve,\n", - " generate_description=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "1d0b94aa", - "metadata": {}, - "source": [ - "<a id='toc2_2_'></a>\n", - "\n", - "### Pearson Correlation Matrix\n", - "\n", - "In this next example, try commenting out the `post_process_fn` argument in the following cell and see what happens between different runs:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c57fb01b", - "metadata": {}, - "outputs": [], - "source": [ - "import plotly.graph_objects as go\n", - "\n", - "\n", - "def custom_heatmap(result: TestResult):\n", - " corr_matrix = result.raw_data.correlation_matrix\n", - "\n", - " heatmap = go.Heatmap(\n", - " z=corr_matrix.values,\n", - " x=list(corr_matrix.columns),\n", - " y=list(corr_matrix.index),\n", - " colorscale=\"Viridis\",\n", - " )\n", - " fig = go.Figure(data=[heatmap])\n", - " fig.update_layout(title=\"Custom Heatmap from RawData\")\n", - "\n", - " plt.close()\n", - "\n", - " result.remove_figure(0)\n", - " result.add_figure(fig)\n", - "\n", - 
" return result\n", - "\n", - "\n", - "result_corr = run_test(\n", - " \"validmind.data_validation.PearsonCorrelationMatrix\",\n", - " inputs={\"dataset\": vm_test_ds},\n", - " generate_description=False,\n", - " # COMMENT OUT `post_process_fn`\n", - " post_process_fn=custom_heatmap,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "0a7cbbc6", - "metadata": {}, - "source": [ - "<a id='toc2_3_'></a>\n", - "\n", - "### Precision-Recall Curve\n", - "\n", - "Then, let's try the same thing with the Precision-Recall Curve test:\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d16c5209", - "metadata": {}, - "outputs": [], - "source": [ - "def custom_pr_curve(result: TestResult):\n", - " precision = result.raw_data.precision\n", - " recall = result.raw_data.recall\n", - "\n", - " fig = plt.figure()\n", - " plt.plot(recall, precision, label=\"Precision-Recall Curve\")\n", - " plt.xlabel(\"Recall\")\n", - " plt.ylabel(\"Precision\")\n", - " plt.title(\"Custom Precision-Recall Curve from RawData\")\n", - " plt.legend()\n", - "\n", - " plt.close()\n", - " result.remove_figure(0)\n", - " result.add_figure(fig)\n", - "\n", - " return result\n", - "\n", - "result_pr = run_test(\n", - " \"validmind.model_validation.sklearn.PrecisionRecallCurve\",\n", - " inputs={\"dataset\": vm_test_ds, \"model\": vm_model},\n", - " generate_description=False,\n", - " # COMMENT OUT `post_process_fn`\n", - " post_process_fn=custom_pr_curve,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "e25391a4", - "metadata": {}, - "source": [ - "<a id='toc2_4_'></a>\n", - "\n", - "### Using `RawData` in custom tests\n", - "\n", - "These examples demonstrate some very simple ways to use the `RawData` feature of ValidMind tests. 
The majority of ValidMind-developed tests return some form of raw data that can be used to customize the output of the test, but you can also create your own tests that return `RawData` objects and use them in the same way.\n", - "\n", - "Let's take a look at how this can be done in custom tests. To start, define and run your custom test:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dc6a389f", - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "\n", - "from validmind import test, RawData\n", - "from validmind.vm_models import VMDataset, VMModel\n", - "\n", - "\n", - "@test(\"custom.MyCustomTest\")\n", - "def MyCustomTest(dataset: VMDataset, model: VMModel) -> tuple[go.Figure, RawData]:\n", - " \"\"\"Custom test that produces a figure and a RawData object\"\"\"\n", - " # pretend we are using the dataset and model to compute some data\n", - " # ...\n", - "\n", - " # create some fake data that will be used to generate a figure\n", - " data = pd.DataFrame({\"x\": [10, 20, 30, 40, 50], \"y\": [10, 20, 30, 40, 50]})\n", - "\n", - " # create the figure (scatter plot)\n", - " fig = go.Figure(data=go.Scatter(x=data[\"x\"], y=data[\"y\"]))\n", - "\n", - " # now let's create a RawData object that holds the \"computed\" data\n", - " raw_data = RawData(scatter_data_df=data)\n", - "\n", - " # finally, return both the figure and the raw data\n", - " return fig, raw_data\n", - "\n", - "\n", - "my_result = run_test(\n", - " \"custom.MyCustomTest\",\n", - " inputs={\"dataset\": vm_test_ds, \"model\": vm_model},\n", - " generate_description=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "854c219c", - "metadata": {}, - "source": [ - "We can see that the test result shows the figure. 
But since we returned a `RawData` object, we can also inspect the contents and see how we could use it to customize or regenerate the figure in the post-processing function:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1cb661d1", - "metadata": {}, - "outputs": [], - "source": [ - "my_result.raw_data.inspect()" - ] - }, - { - "cell_type": "markdown", - "id": "55ad4acd", - "metadata": {}, - "source": [ - "We can see that we get a nicely-formatted preview of the dataframe we stored in the raw data object. Let's go ahead and use it to re-plot our data:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c1242083", - "metadata": {}, - "outputs": [], - "source": [ - "def custom_plot(result: TestResult):\n", - " data = result.raw_data.scatter_data_df\n", - "\n", - " # use something other than a scatter plot\n", - " fig = go.Figure(data=go.Bar(x=data[\"x\"], y=data[\"y\"]))\n", - " fig.update_layout(title=\"Custom Bar Chart from RawData\")\n", - " fig.update_xaxes(title=\"X Axis\")\n", - " fig.update_yaxes(title=\"Y Axis\")\n", - "\n", - " result.remove_figure(0)\n", - " result.add_figure(fig)\n", - "\n", - " return result\n", - "\n", - "result = run_test(\n", - " \"custom.MyCustomTest\",\n", - " inputs={\"dataset\": vm_test_ds, \"model\": vm_model},\n", - " post_process_fn=custom_plot,\n", - " generate_description=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "53084493", - "metadata": {}, - "source": [ - "<a id='toc2_5_'></a>\n", - "\n", - "### Using `RawData` in comparison tests\n", - "\n", - "When running comparison tests, the `RawData` object will contain the raw data for each individual test result as well as the comparison results between the test results. To support this, the RawData object contains the model and dataset input_ids for each of the datasets and models in the test, so that the post-processing function can use them to customize the output. 
The example below shows how to use the `RawData` object to customize the output of a comparison test and add a table to the test result that shows the confusion matrix for each individual test result as well as the comparison results between the test results.\n", - "\n", - "When designing post-processing functions that need to handle both individual and comparison test results, you can check the structure of the raw data to determine which case you're dealing with. In the example below, we check if `confusion_matrix` is a list (comparison test with multiple matrices) or a single matrix (individual test). For comparison tests, the function creates two tables: one showing the confusion matrices for each test case, and another showing the percentage drift between them. For individual tests, it creates a single table with the confusion matrix values. This pattern of checking the raw data structure can be applied to other tests to create versatile post-processing functions that work in both scenarios.\n" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "id": "bcbbe9f4", - "metadata": {}, - "outputs": [], - "source": [ - "def cm_table(result: TestResult):\n", - " # For individual results\n", - " if not isinstance(result.raw_data.confusion_matrix, list):\n", - " # Extract values from single confusion matrix\n", - " cm = result.raw_data.confusion_matrix\n", - " tn, fp = cm[0, 0], cm[0, 1]\n", - " fn, tp = cm[1, 0], cm[1, 1]\n", - " \n", - " # Create DataFrame for individual matrix\n", - " cm_df = pd.DataFrame({\n", - " 'TN': [tn],\n", - " 'FP': [fp],\n", - " 'FN': [fn],\n", - " 'TP': [tp]\n", - " })\n", - " \n", - " # Add individual table\n", - " result.add_table(cm_df, title=\"Confusion Matrix\")\n", - " \n", - " # For comparison results\n", - " else:\n", - " cms = result.raw_data.confusion_matrix\n", - " cm1, cm2 = cms[0], cms[1]\n", - " \n", - " # Create individual results table\n", - " rows = []\n", - " for i, cm in enumerate(cms):\n", - " 
rows.append({\n", - " 'dataset': result.raw_data.dataset[i],\n", - " 'model': result.raw_data.model[i],\n", - " 'TN': cm[0, 0],\n", - " 'FP': cm[0, 1],\n", - " 'FN': cm[1, 0],\n", - " 'TP': cm[1, 1]\n", - " })\n", - " individual_df = pd.DataFrame(rows)\n", - " \n", - " # Calculate percentage differences\n", - " diff_df = pd.DataFrame({\n", - " 'TN_drift (%)': [(cm2[0, 0] - cm1[0, 0]) / cm1[0, 0] * 100],\n", - " 'FP_drift (%)': [(cm2[0, 1] - cm1[0, 1]) / cm1[0, 1] * 100],\n", - " 'FN_drift (%)': [(cm2[1, 0] - cm1[1, 0]) / cm1[1, 0] * 100],\n", - " 'TP_drift (%)': [(cm2[1, 1] - cm1[1, 1]) / cm1[1, 1] * 100]\n", - " }).round(2)\n", - " \n", - " # Add both tables\n", - " result.add_table(individual_df, title=\"Individual Confusion Matrices\")\n", - " result.add_table(diff_df, title=\"Confusion Matrix Drift\")\n", - " \n", - " return result" - ] - }, - { - "cell_type": "markdown", - "id": "41edd959", - "metadata": {}, - "source": [ - "Let's first run the confusion matrix test on a single dataset-model pair to see how our post-processing function handles individual results:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "cf3c47fe", - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.tests import run_test\n", - "\n", - "result_cm = run_test(\n", - " \"validmind.model_validation.sklearn.ConfusionMatrix\",\n", - " inputs={\n", - " \"dataset\": vm_test_ds,\n", - " \"model\": vm_model,\n", - " },\n", - " post_process_fn=cm_table,\n", - " generate_description=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "a2482c54", - "metadata": {}, - "source": [ - "Now let's run a comparison test between test and train datasets to see how the function handles multiple results:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6a1b4388", - "metadata": {}, - "outputs": [], - "source": [ - "result_cm = run_test(\n", - " \"validmind.model_validation.sklearn.ConfusionMatrix\",\n", - " input_grid={\n", - " 
\"dataset\": [vm_test_ds, vm_train_ds],\n", - " \"model\": [vm_model]\n", - " },\n", - " post_process_fn=cm_table,\n", - " generate_description=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "9f7d361a", - "metadata": {}, - "source": [ - "Let's inspect the raw data to see how comparison tests structure their data - notice how the `RawData` object contains not just the confusion matrices for both datasets, but also tracks which dataset and model each result came from:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "012ec495", - "metadata": {}, - "outputs": [], - "source": [ - "result_cm.raw_data.inspect()" - ] - }, - { - "cell_type": "markdown", - "id": "copyright-d9a502e868ba4fc1a70056873609b472", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.15" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "cells": [ + { + "cell_type": "markdown", + "id": "c18ba8a2", + "metadata": {}, + "source": [ + "# Understand and utilize `RawData` in ValidMind tests\n", + "\n", + "Test functions in ValidMind can return a special object called *`RawData`*, which holds intermediate or unprocessed data produced somewhere in the test logic but not returned as part of the test's visible output, such as in tables or figures.\n", + "\n", + "- The `RawData` 
feature allows you to customize the output of tests, making it a powerful tool for creating custom tests and post-processing functions.\n", + "- `RawData` is useful when running post-processing functions with tests to recompute tabular outputs, redraw figures, or even create new outputs entirely.\n", + "\n", + "In this notebook, you'll learn how to access, inspect, and utilize `RawData` from ValidMind tests." + ] + }, + { + "cell_type": "markdown", + "id": "5b5b248c", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [Setup](#toc1_) \n", + " - [Installation and initialization](#toc1_1_) \n", + " - [Load the sample dataset](#toc1_2_) \n", + " - [Initialize the ValidMind objects](#toc1_3_) \n", + "- [`RawData` usage examples](#toc2_) \n", + " - [Using `RawData` from the ROC Curve Test](#toc2_1_) \n", + " - [Pearson Correlation Matrix](#toc2_2_) \n", + " - [Precision-Recall Curve](#toc2_3_) \n", + " - [Using `RawData` in custom tests](#toc2_4_) \n", + " - [Using `RawData` in comparison tests](#toc2_5_) \n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "id": "6dd79a98", + "metadata": {}, + "source": [ + "<a id='toc1_'></a>\n", + "\n", + "## Setup\n", + "\n", + "Before we can run our examples, we'll need to set the stage to enable running tests with the ValidMind Library. Since the focus of this notebook is on the `RawData` object, this section will merely summarize the steps instead of going into greater detail. 
\n", + "\n", + "\n", + "**To learn more about running tests with ValidMind:** [Run tests and test suites](https://docs.validmind.ai/developer/model-testing/testing-overview.html)" + ] + }, + { + "cell_type": "markdown", + "id": "5b6d8d15", + "metadata": {}, + "source": [ + "<a id='toc1_1_'></a>\n", + "\n", + "### Installation and initialization\n", + "\n", + "First, let's make sure that the ValidMind Library is installed and ready to go, and that our Python environment is set up for data analysis:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "04eb084e", + "metadata": {}, + "outputs": [], + "source": [ + "# Install the ValidMind Library\n", + "%pip install -q validmind\n", + "\n", + "# Import the ValidMind Library\n", + "import validmind as vm\n", + "\n", + "# Import the `xgboost` library with an alias\n", + "import xgboost as xgb\n" + ] + }, + { + "cell_type": "markdown", + "id": "5e6aa2cb", + "metadata": {}, + "source": [ + "<a id='toc1_2_'></a>\n", + "\n", + "### Load the sample dataset\n", + "\n", + "Then, we'll import a sample ValidMind dataset and preprocess it:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "50d72eba", + "metadata": {}, + "outputs": [], + "source": [ + "# Import the `customer_churn` sample dataset\n", + "from validmind.datasets.classification import customer_churn\n", + "raw_df = customer_churn.load_data()\n", + "\n", + "# Preprocess the raw dataset\n", + "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)\n", + "\n", + "# Separate features and targets\n", + "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", + "y_train = train_df[customer_churn.target_column]\n", + "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", + "y_val = validation_df[customer_churn.target_column]\n", + "\n", + "# Create an `XGBClassifier` object\n", + "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", + "model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", 
\"auc\"],\n", + ")\n", + "\n", + "# Train the model using the validation set\n", + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "e3895d35", + "metadata": {}, + "source": [ + "<a id='toc1_3_'></a>\n", + "\n", + "### Initialize the ValidMind objects" + ] + }, + { + "cell_type": "markdown", + "id": "c0e441f4", + "metadata": {}, + "source": [ + "Before you can run tests, you'll need to initialize a ValidMind dataset object, as well as a ValidMind model object that can be passed to other functions for analysis and tests on the data:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b2310bc4", + "metadata": {}, + "outputs": [], + "source": [ + "# Initialize the dataset object\n", + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=raw_df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=customer_churn.target_column,\n", + " class_labels=customer_churn.class_labels,\n", + " __log=False,\n", + ")\n", + "\n", + "# Initialize the datasets into their own ValidMind dataset objects\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=customer_churn.target_column,\n", + " __log=False,\n", + ")\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " target_column=customer_churn.target_column,\n", + " __log=False,\n", + ")\n", + "\n", + "# Initialize the ValidMind model object wrapper so that it can be passed as input to tests or test suites\n", + "# ValidMind model objects can be any type of record you want to test, document, validate, or monitor\n", + "vm_model = vm.init_model(\n", + " model,\n", + " input_id=\"model\",\n", + " __log=False,\n", + ")\n", + "\n", + "# Assign predictions to the datasets\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_model,\n", + ")\n", + "\n", + 
"vm_test_ds.assign_predictions(\n", + " model=vm_model,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "25ec99fc", + "metadata": {}, + "source": [ + "<a id='toc2_'></a>\n", + "\n", + "## `RawData` usage examples\n", + "\n", + "Once you're set up to run tests, you can then try out the following examples:\n", + "\n", + " - [Using `RawData` from the ROC Curve Test](#toc2_1_) \n", + " - [Pearson Correlation Matrix](#toc2_2_) \n", + " - [Precision-Recall Curve](#toc2_3_) \n", + " - [Using `RawData` in custom tests](#toc2_4_) \n", + " - [Using `RawData` in comparison tests](#toc2_5_) " + ] + }, + { + "cell_type": "markdown", + "id": "33d79841", + "metadata": {}, + "source": [ + "<a id='toc2_1_'></a>\n", + "\n", + "### Using `RawData` from the ROC Curve Test\n", + "\n", + "In this introductory example, we run the `model_validation.sklearn.ROCCurve` test, inspect its `RawData` output, and then create a custom ROC curve using the raw data values.\n", + "\n", + "First, let's run the default ROC Curve test for comparison with later iterations:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "58a3a779", + "metadata": {}, + "outputs": [], + "source": [ + "from validmind.tests import run_test\n", + "\n", + "# Run the ROC Curve test normally\n", + "result_roc = run_test(\n", + " \"validmind.model_validation.sklearn.ROCCurve\",\n", + " inputs={\"dataset\": vm_test_ds, \"model\": vm_model},\n", + " generate_description=False,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "66c44fe0", + "metadata": {}, + "source": [ + "Now let's assume we want to create a custom version of the above figure. 
First, let's inspect the raw data that this test produces so we can see what we have to work with.\n", + "\n", + "`RawData` objects have an `inspect()` method that pretty-prints the object's attributes so you can quickly see the data and its types:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "513ce01e", + "metadata": {}, + "outputs": [], + "source": [ + "# Inspect the RawData output from the ROC test\n", + "print(\"RawData from ROC Curve Test:\")\n", + "result_roc.raw_data.inspect()" + ] + }, + { + "cell_type": "markdown", + "id": "586f3a12", + "metadata": {}, + "source": [ + "As we can see, the ROC Curve test returns a `RawData` object with the following attributes:\n", + "- **`fpr`:** A list of false positive rates\n", + "- **`tpr`:** A list of true positive rates\n", + "- **`auc`:** The area under the curve\n", + "\n", + "This should be enough to create our own custom ROC curve via a post-processing function without having to create a whole new test from scratch and without having to recompute any of the data:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "613778d2", + "metadata": {}, + "outputs": [], + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "from validmind.vm_models.result import TestResult\n", + "\n", + "\n", + "def custom_roc_curve(result: TestResult):\n", + " # Extract raw data from the test result\n", + " fpr = result.raw_data.fpr\n", + " tpr = result.raw_data.tpr\n", + " auc = result.raw_data.auc\n", + "\n", + " # Create a custom ROC curve plot\n", + " fig = plt.figure()\n", + " plt.plot(fpr, tpr, label=f\"Custom ROC (AUC = {auc:.2f})\", color=\"blue\")\n", + " plt.plot([0, 1], [0, 1], linestyle=\"--\", color=\"gray\", label=\"Random Guess\")\n", + " plt.xlabel(\"False Positive Rate\")\n", + " plt.ylabel(\"True Positive Rate\")\n", + " plt.title(\"Custom ROC Curve from RawData\")\n", + " plt.legend()\n", + "\n", + " # close the plot to avoid it automatically being shown 
in the notebook\n", + " plt.close()\n", + "\n", + " # remove existing figure\n", + " result.remove_figure(0)\n", + "\n", + " # add new figure\n", + " result.add_figure(fig)\n", + "\n", + " return result\n", + "\n", + "# test it on the existing result\n", + "modified_result = custom_roc_curve(result_roc)\n", + "\n", + "# show the modified result\n", + "modified_result.show()" + ] + }, + { + "cell_type": "markdown", + "id": "794d026c", + "metadata": {}, + "source": [ + "Now that we have created a post-processing function and verified that it works on our existing test result, we can use it directly in `run_test()` from now on:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7c7566f3", + "metadata": {}, + "outputs": [], + "source": [ + "result = run_test(\n", + " \"validmind.model_validation.sklearn.ROCCurve\",\n", + " inputs={\"dataset\": vm_test_ds, \"model\": vm_model},\n", + " post_process_fn=custom_roc_curve,\n", + " generate_description=False,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "1d0b94aa", + "metadata": {}, + "source": [ + "<a id='toc2_2_'></a>\n", + "\n", + "### Pearson Correlation Matrix\n", + "\n", + "In this next example, try commenting out the `post_process_fn` argument in the following cell and see what happens between different runs:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c57fb01b", + "metadata": {}, + "outputs": [], + "source": [ + "import plotly.graph_objects as go\n", + "\n", + "\n", + "def custom_heatmap(result: TestResult):\n", + " corr_matrix = result.raw_data.correlation_matrix\n", + "\n", + " heatmap = go.Heatmap(\n", + " z=corr_matrix.values,\n", + " x=list(corr_matrix.columns),\n", + " y=list(corr_matrix.index),\n", + " colorscale=\"Viridis\",\n", + " )\n", + " fig = go.Figure(data=[heatmap])\n", + " fig.update_layout(title=\"Custom Heatmap from RawData\")\n", + "\n", + " plt.close()\n", + "\n", + " result.remove_figure(0)\n", + " result.add_figure(fig)\n", + "\n", 
+ " return result\n", + "\n", + "\n", + "result_corr = run_test(\n", + " \"validmind.data_validation.PearsonCorrelationMatrix\",\n", + " inputs={\"dataset\": vm_test_ds},\n", + " generate_description=False,\n", + " # COMMENT OUT `post_process_fn`\n", + " post_process_fn=custom_heatmap,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "0a7cbbc6", + "metadata": {}, + "source": [ + "<a id='toc2_3_'></a>\n", + "\n", + "### Precision-Recall Curve\n", + "\n", + "Then, let's try the same thing with the `model_validation.sklearn.PrecisionRecallCurve` test:\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d16c5209", + "metadata": {}, + "outputs": [], + "source": [ + "def custom_pr_curve(result: TestResult):\n", + " precision = result.raw_data.precision\n", + " recall = result.raw_data.recall\n", + "\n", + " fig = plt.figure()\n", + " plt.plot(recall, precision, label=\"Precision-Recall Curve\")\n", + " plt.xlabel(\"Recall\")\n", + " plt.ylabel(\"Precision\")\n", + " plt.title(\"Custom Precision-Recall Curve from RawData\")\n", + " plt.legend()\n", + "\n", + " plt.close()\n", + " result.remove_figure(0)\n", + " result.add_figure(fig)\n", + "\n", + " return result\n", + "\n", + "result_pr = run_test(\n", + " \"validmind.model_validation.sklearn.PrecisionRecallCurve\",\n", + " inputs={\"dataset\": vm_test_ds, \"model\": vm_model},\n", + " generate_description=False,\n", + " # COMMENT OUT `post_process_fn`\n", + " post_process_fn=custom_pr_curve,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "e25391a4", + "metadata": {}, + "source": [ + "<a id='toc2_4_'></a>\n", + "\n", + "### Using `RawData` in custom tests\n", + "\n", + "These examples demonstrate some very simple ways to use the `RawData` feature of ValidMind tests. 
The majority of ValidMind-developed tests return some form of raw data that can be used to customize the output of the test, but you can also create your own tests that return `RawData` objects and use them in the same way.\n", + "\n", + "Let's take a look at how this can be done in custom tests. To start, define and run your custom test:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dc6a389f", + "metadata": {}, + "outputs": [], + "source": [ + "import pandas as pd\n", + "\n", + "from validmind import test, RawData\n", + "from validmind.vm_models import VMDataset, VMModel\n", + "\n", + "\n", + "@test(\"custom.MyCustomTest\")\n", + "def MyCustomTest(dataset: VMDataset, model: VMModel) -> tuple[go.Figure, RawData]:\n", + " \"\"\"Custom test that produces a figure and a RawData object\"\"\"\n", + " # pretend we are using the dataset and model to compute some data\n", + " # ...\n", + "\n", + " # create some fake data that will be used to generate a figure\n", + " data = pd.DataFrame({\"x\": [10, 20, 30, 40, 50], \"y\": [10, 20, 30, 40, 50]})\n", + "\n", + " # create the figure (scatter plot)\n", + " fig = go.Figure(data=go.Scatter(x=data[\"x\"], y=data[\"y\"]))\n", + "\n", + " # now let's create a RawData object that holds the \"computed\" data\n", + " raw_data = RawData(scatter_data_df=data)\n", + "\n", + " # finally, return both the figure and the raw data\n", + " return fig, raw_data\n", + "\n", + "\n", + "my_result = run_test(\n", + " \"custom.MyCustomTest\",\n", + " inputs={\"dataset\": vm_test_ds, \"model\": vm_model},\n", + " generate_description=False,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "854c219c", + "metadata": {}, + "source": [ + "We can see that the test result shows the figure. 
But since we returned a `RawData` object, we can also inspect the contents and see how we could use it to customize or regenerate the figure in the post-processing function:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1cb661d1", + "metadata": {}, + "outputs": [], + "source": [ + "my_result.raw_data.inspect()" + ] + }, + { + "cell_type": "markdown", + "id": "55ad4acd", + "metadata": {}, + "source": [ + "We can see that we get a nicely-formatted preview of the dataframe we stored in the raw data object. Let's go ahead and use it to re-plot our data:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c1242083", + "metadata": {}, + "outputs": [], + "source": [ + "def custom_plot(result: TestResult):\n", + " data = result.raw_data.scatter_data_df\n", + "\n", + " # use something other than a scatter plot\n", + " fig = go.Figure(data=go.Bar(x=data[\"x\"], y=data[\"y\"]))\n", + " fig.update_layout(title=\"Custom Bar Chart from RawData\")\n", + " fig.update_xaxes(title=\"X Axis\")\n", + " fig.update_yaxes(title=\"Y Axis\")\n", + "\n", + " result.remove_figure(0)\n", + " result.add_figure(fig)\n", + "\n", + " return result\n", + "\n", + "result = run_test(\n", + " \"custom.MyCustomTest\",\n", + " inputs={\"dataset\": vm_test_ds, \"model\": vm_model},\n", + " post_process_fn=custom_plot,\n", + " generate_description=False,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "53084493", + "metadata": {}, + "source": [ + "<a id='toc2_5_'></a>\n", + "\n", + "### Using `RawData` in comparison tests\n", + "\n", + "When running comparison tests, the `RawData` object will contain the raw data for each individual test result as well as the comparison results between the test results. To support this, the RawData object contains the model and dataset input_ids for each of the datasets and models in the test, so that the post-processing function can use them to customize the output. 
The example below shows how to use the `RawData` object to customize the output of a comparison test and add a table to the test result that shows the confusion matrix for each individual test result as well as the comparison results between the test results.\n", + "\n", + "When designing post-processing functions that need to handle both individual and comparison test results, you can check the structure of the raw data to determine which case you're dealing with. In the example below, we check if `confusion_matrix` is a list (comparison test with multiple matrices) or a single matrix (individual test). For comparison tests, the function creates two tables: one showing the confusion matrices for each test case, and another showing the percentage drift between them. For individual tests, it creates a single table with the confusion matrix values. This pattern of checking the raw data structure can be applied to other tests to create versatile post-processing functions that work in both scenarios.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "bcbbe9f4", + "metadata": {}, + "outputs": [], + "source": [ + "def cm_table(result: TestResult):\n", + " # For individual results\n", + " if not isinstance(result.raw_data.confusion_matrix, list):\n", + " # Extract values from single confusion matrix\n", + " cm = result.raw_data.confusion_matrix\n", + " tn, fp = cm[0, 0], cm[0, 1]\n", + " fn, tp = cm[1, 0], cm[1, 1]\n", + " \n", + " # Create DataFrame for individual matrix\n", + " cm_df = pd.DataFrame({\n", + " 'TN': [tn],\n", + " 'FP': [fp],\n", + " 'FN': [fn],\n", + " 'TP': [tp]\n", + " })\n", + " \n", + " # Add individual table\n", + " result.add_table(cm_df, title=\"Confusion Matrix\")\n", + " \n", + " # For comparison results\n", + " else:\n", + " cms = result.raw_data.confusion_matrix\n", + " cm1, cm2 = cms[0], cms[1]\n", + " \n", + " # Create individual results table\n", + " rows = []\n", + " for i, cm in enumerate(cms):\n", + " 
rows.append({\n", + " 'dataset': result.raw_data.dataset[i],\n", + " 'model': result.raw_data.model[i],\n", + " 'TN': cm[0, 0],\n", + " 'FP': cm[0, 1],\n", + " 'FN': cm[1, 0],\n", + " 'TP': cm[1, 1]\n", + " })\n", + " individual_df = pd.DataFrame(rows)\n", + " \n", + " # Calculate percentage differences\n", + " diff_df = pd.DataFrame({\n", + " 'TN_drift (%)': [(cm2[0, 0] - cm1[0, 0]) / cm1[0, 0] * 100],\n", + " 'FP_drift (%)': [(cm2[0, 1] - cm1[0, 1]) / cm1[0, 1] * 100],\n", + " 'FN_drift (%)': [(cm2[1, 0] - cm1[1, 0]) / cm1[1, 0] * 100],\n", + " 'TP_drift (%)': [(cm2[1, 1] - cm1[1, 1]) / cm1[1, 1] * 100]\n", + " }).round(2)\n", + " \n", + " # Add both tables\n", + " result.add_table(individual_df, title=\"Individual Confusion Matrices\")\n", + " result.add_table(diff_df, title=\"Confusion Matrix Drift\")\n", + " \n", + " return result" + ] + }, + { + "cell_type": "markdown", + "id": "41edd959", + "metadata": {}, + "source": [ + "Let's first run the confusion matrix test on a single dataset-model pair to see how our post-processing function handles individual results:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cf3c47fe", + "metadata": {}, + "outputs": [], + "source": [ + "from validmind.tests import run_test\n", + "\n", + "result_cm = run_test(\n", + " \"validmind.model_validation.sklearn.ConfusionMatrix\",\n", + " inputs={\n", + " \"dataset\": vm_test_ds,\n", + " \"model\": vm_model,\n", + " },\n", + " post_process_fn=cm_table,\n", + " generate_description=False,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "a2482c54", + "metadata": {}, + "source": [ + "Now let's run a comparison test between test and train datasets to see how the function handles multiple results:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6a1b4388", + "metadata": {}, + "outputs": [], + "source": [ + "result_cm = run_test(\n", + " \"validmind.model_validation.sklearn.ConfusionMatrix\",\n", + " input_grid={\n", + " 
\"dataset\": [vm_test_ds, vm_train_ds],\n", + " \"model\": [vm_model]\n", + " },\n", + " post_process_fn=cm_table,\n", + " generate_description=False,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "9f7d361a", + "metadata": {}, + "source": [ + "Let's inspect the raw data to see how comparison tests structure their data - notice how the `RawData` object contains not just the confusion matrices for both datasets, but also tracks which dataset and model each result came from:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "012ec495", + "metadata": {}, + "outputs": [], + "source": [ + "result_cm.raw_data.inspect()" + ] + }, + { + "cell_type": "markdown", + "id": "copyright-d9a502e868ba4fc1a70056873609b472", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.15" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/site/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb b/site/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb index 93092a7a18..fc7446d036 100644 --- a/site/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb +++ 
b/site/notebooks/how_to/tests/run_tests/documentation_tests/document_multiple_results_for_the_same_test.ipynb @@ -1,633 +1,639 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Document multiple results for the same test\n", - "\n", - "Documentation templates facilitate the presentation of multiple unique test results for a single test. \n", - "\n", - "Consider various scenarios where you may intend to showcase results of the same test with diverse inputs:\n", - "\n", - "- **Comparing test results with varied parameter values:** Illustrate model performance by contrasting test results achieved with different parameter values to identify optimal settings.\n", - "- **Displaying test results with distinct datasets:** Showcase test versatility by presenting results on diverse datasets, such as providing confusion matrices for both training and test data.\n", - "- **Model comparison:** Conduct a comprehensive model evaluation by comparing tests like `ROC curve` and `Accuracy` to discern and select the superior-performing model.\n", - "\n", - "This interactive notebook guides you through the process of documenting a model with the ValidMind Library. It uses the [Bank Customer Churn Prediction](https://www.kaggle.com/code/kmalit/bank-customer-churn-prediction/data) sample dataset from Kaggle to train a simple classification model. 
As part of the notebook, you will learn how to render more than one unique test result for the same test while exploring how the documentation process works:\n", - "\n", - "- Initializing the ValidMind Library\n", - "- Loading a sample dataset provided by the library to train a simple classification model\n", - "- Running a ValidMind test suite to quickly generate documentation about the data and model" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - "- [Update the customer churn demo template](#toc3__) \n", - "- [Initialize the Python environment](#toc4__) \n", - " - [Preview the documentation template](#toc4_1__) \n", - "- [Load the sample dataset](#toc5__) \n", - " - [Initialize a ValidMind dataset object](#toc5_1__) \n", - "- [Document the model](#toc6__) \n", - " - [Prepare datasets](#toc6_1__) \n", - " - [Initialize the training and test datasets](#toc6_2__) \n", - " - [Run documentation tests](#toc6_3__) \n", - " - [Run the individual tests using the `run_test`](#toc6_4__) \n", - "- [Next steps](#toc7__) \n", - " - [Work with your model documentation](#toc7_1__) \n", - " - [Discover more learning resources](#toc7_2__) \n", - "- [Upgrade ValidMind](#toc8__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. 
These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Update the customer churn demo template\n", - "\n", - "Before you initialize the ValidMind Library by running the notebook, edit the **Binary classification** template to make a copy of a test of interest and update it with different `result_id` fields for each entry:\n", - "\n", - "- Go to **Settings > Templates** and click on the **Binary classification** template. Let's say we want to show `Skewness` results for `training` and `test` datasets.\n", - "\n", - "To do this we replace\n", - "\n", - "```yaml\n", - "- content_type: test\n", - " content_id: validmind.data_validation.Skewness\n", - "```\n", - "\n", - "with\n", - "\n", - "```yaml\n", - "- content_type: test\n", - " content_id: validmind.data_validation.Skewness:training_data\n", - "- content_type: test\n", - " content_id: validmind.data_validation.Skewness:test_data\n", - "```\n", - "\n", - "This way, we can show two results of the same test in the model document. Here, the `training_data` and `test_data` could be any string. 
However, they should be unique for the same test.\n", - "\n", - "- Click on **Prepare new version**, provide some version notes and click on **Save new version** to save a new version of this template.\n", - "- Next, we need to swap our model documentation to use this new version of the template. Follow the steps on [Manage document templates](https://docs.validmind.ai/guide/templates/manage-document-templates.html) to swap the template of our customer churn model.\n", - "\n", - "In the following sections we provide more context on how these `content_id` fields mentioned earlier get mapped to the actual tests." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "import xgboost as xgb\n", - "\n", - "from sklearn.metrics import accuracy_score\n", - "from sklearn.model_selection import train_test_split\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library, along with a second, different dataset (`taiwan_credit`) you can try as well.\n", - "\n", - "To be able to use either sample dataset, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Import the sample dataset from the library\n", - "\n", - "from validmind.datasets.classification import customer_churn as demo_dataset\n", - "\n", - "df = demo_dataset.load_data()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Initialize a ValidMind dataset object\n", - "\n", - "Before you can run a test suite, which are a collection of tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset` — the raw dataset that you want to analyze\n", - "- `target_column` — the name of the target column in the dataset\n", - "- `class_labels` — the list of class labels used for classification model training" - ] - }, - { - "cell_type": 
"code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_dataset = vm.init_dataset(\n", - " input_id=\"raw_dataset\",\n", - " dataset=df,\n", - " target_column=demo_dataset.target_column,\n", - " class_labels=demo_dataset.class_labels,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Document the model\n", - "\n", - "As part of documenting the model with the ValidMind Library, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Prepare datasets\n", - "\n", - "DataFrame (df) preprocessing is simplified by employing `demo_dataset.preprocess` to partition it into distinct datasets (`train_df`, `validation_df`, and `test_df`)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = demo_dataset.preprocess(df)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Initialize the training and test datasets\n", - "\n", - "With the datasets ready, you can now initialize the training and test datasets (`train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds = vm.init_dataset(\n", - " input_id=\"train_dataset\", dataset=train_df, target_column=demo_dataset.target_column\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " input_id=\"test_dataset\", dataset=test_df, target_column=demo_dataset.target_column\n", - ")" - ] - }, - { - 
"cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_3__'></a>\n", - "\n", - "### Run documentation tests\n", - "\n", - "Now specify `inputs` and `params` for individual tests using `config` parameter. The results for the both the datasets will be visible in the documentation. The `inputs` in the config get priority over global `inputs` in the `run_documentation_tests`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "config = {\n", - " \"validmind.data_validation.Skewness:training_data\": {\n", - " \"params\": {\"max_threshold\": 1},\n", - " \"inputs\": {\"dataset\": vm_train_ds},\n", - " },\n", - " \"validmind.data_validation.Skewness:test_data\": {\n", - " \"params\": {\"max_threshold\": 1.5},\n", - " \"inputs\": {\"dataset\": vm_test_ds},\n", - " },\n", - "}\n", - "\n", - "tests_suite = vm.run_documentation_tests(\n", - " inputs={\n", - " \"dataset\": vm_dataset,\n", - " },\n", - " config=config,\n", - " section=[\"data_preparation\"],\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4__'></a>\n", - "\n", - "### Run the individual tests using the `run_test`\n", - "\n", - "Now run the `Skewness` tests for training and test datasets. The results for the both the datasets will be visible in the documentation." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " test_id=\"validmind.data_validation.Skewness:training_data\",\n", - " params={\"max_threshold\": 1},\n", - " inputs={\"dataset\": vm_train_ds},\n", - ")\n", - "test.log()\n", - "\n", - "test = vm.tests.run_test(\n", - " test_id=\"validmind.data_validation.Skewness:test_data\",\n", - " params={\"max_threshold\": 1.5},\n", - " inputs={\n", - " \"dataset\": vm_test_ds,\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc7_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "3. Expand the **2. Data Preparation** section and take a look around.\n", - "\n", - " You can now see the skewness tests results of training and test datasets in the `Data Preparation` section.\n", - "\n", - "From here, you can also make qualitative edits to model documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc7_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade 
package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-6ce412276b6244aab16b2e3443c6a861", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "colab": { - "provenance": [] - }, - "gpuClass": "standard", - "kernelspec": { - "display_name": "validmind-1QuffXMV-py3.9", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.13" - } - }, - "nbformat": 4, - "nbformat_minor": 0 -} + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document multiple results for the same test\n", + "\n", + "Documentation templates facilitate the presentation of multiple unique test results for a single test. 
\n", + "\n", + "Consider various scenarios where you may intend to showcase results of the same test with diverse inputs:\n", + "\n", + "- **Comparing test results with varied parameter values:** Illustrate model performance by contrasting test results achieved with different parameter values to identify optimal settings.\n", + "- **Displaying test results with distinct datasets:** Showcase test versatility by presenting results on diverse datasets, such as providing confusion matrices for both training and test data.\n", + "- **Model comparison:** Conduct a comprehensive model evaluation by comparing tests like `ROC curve` and `Accuracy` to discern and select the superior-performing model.\n", + "\n", + "This interactive notebook guides you through the process of documenting a model with the ValidMind Library. It uses the [Bank Customer Churn Prediction](https://www.kaggle.com/code/kmalit/bank-customer-churn-prediction/data) sample dataset from Kaggle to train a simple classification model. 
As part of the notebook, you will learn how to render more than one unique test result for the same test while exploring how the documentation process works:\n", + "\n", + "- Initializing the ValidMind Library\n", + "- Loading a sample dataset provided by the library to train a simple classification model\n", + "- Running a ValidMind test suite to quickly generate documentation about the data and model" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + "- [Update the customer churn demo template](#toc3__) \n", + "- [Initialize the Python environment](#toc4__) \n", + " - [Preview the documentation template](#toc4_1__) \n", + "- [Load the sample dataset](#toc5__) \n", + " - [Initialize a ValidMind dataset object](#toc5_1__) \n", + "- [Document the model](#toc6__) \n", + " - [Prepare datasets](#toc6_1__) \n", + " - [Initialize the training and test datasets](#toc6_2__) \n", + " - [Run documentation tests](#toc6_3__) \n", + " - [Run the individual tests using the `run_test`](#toc6_4__) \n", + "- [Next steps](#toc7__) \n", + " - [Work with your model documentation](#toc7_1__) \n", + " - [Discover more learning resources](#toc7_2__) \n", + "- [Upgrade ValidMind](#toc8__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Update the customer churn demo template\n", + "\n", + "Before you initialize the ValidMind Library by running the notebook, edit the **Binary classification** template to make a copy of a test of interest and update it with different `result_id` fields for each entry:\n", + "\n", + "- Go to **Settings > Templates** and click on the **Binary classification** template. Let's say we want to show `Skewness` results for `training` and `test` datasets.\n", + "\n", + "To do this we replace\n", + "\n", + "```yaml\n", + "- content_type: test\n", + " content_id: validmind.data_validation.Skewness\n", + "```\n", + "\n", + "with\n", + "\n", + "```yaml\n", + "- content_type: test\n", + " content_id: validmind.data_validation.Skewness:training_data\n", + "- content_type: test\n", + " content_id: validmind.data_validation.Skewness:test_data\n", + "```\n", + "\n", + "This way, we can show two results of the same test in the model document. Here, the `training_data` and `test_data` could be any string. 
However, they should be unique for the same test.\n", + "\n", + "- Click on **Prepare new version**, provide some version notes and click on **Save new version** to save a new version of this template.\n", + "- Next, we need to swap our model documentation to use this new version of the template. Follow the steps on [Manage document templates](https://docs.validmind.ai/guide/templates/manage-document-templates.html) to swap the template of our customer churn model.\n", + "\n", + "In the following sections we provide more context on how these `content_id` fields mentioned earlier get mapped to the actual tests." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "import xgboost as xgb\n", + "\n", + "from sklearn.metrics import accuracy_score\n", + "from sklearn.model_selection import train_test_split\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind Library, along with a second, different dataset (`taiwan_credit`) you can try as well.\n", + "\n", + "To be able to use either sample dataset, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the sample dataset from the library\n", + "\n", + "from validmind.datasets.classification import customer_churn as demo_dataset\n", + "\n", + "df = demo_dataset.load_data()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Initialize a ValidMind dataset object\n", + "\n", + "Before you can run a test suite, which is a collection of tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset` — the raw dataset that you want to analyze\n", + "- `target_column` — the name of the target column in the dataset\n", + "- `class_labels` — the list of class labels used for classification model training" + ] + }, + { + "cell_type": 
"code", + "metadata": {}, + "source": [ + "vm_dataset = vm.init_dataset(\n", + " input_id=\"raw_dataset\",\n", + " dataset=df,\n", + " target_column=demo_dataset.target_column,\n", + " class_labels=demo_dataset.class_labels,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Document the model\n", + "\n", + "As part of documenting the model with the ValidMind Library, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### Prepare datasets\n", + "\n", + "Use `demo_dataset.preprocess` to preprocess the DataFrame (`df`) and split it into distinct datasets (`train_df`, `validation_df`, and `test_df`):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = demo_dataset.preprocess(df)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Initialize the training and test datasets\n", + "\n", + "With the datasets ready, you can now initialize the training and test datasets (`train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds = vm.init_dataset(\n", + " input_id=\"train_dataset\", dataset=train_df, target_column=demo_dataset.target_column\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " input_id=\"test_dataset\", dataset=test_df, target_column=demo_dataset.target_column\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + 
"cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3__'></a>\n", + "\n", + "### Run documentation tests\n", + "\n", + "Now specify `inputs` and `params` for individual tests using the `config` parameter. The results for both datasets will be visible in the documentation. The `inputs` in the config take priority over the global `inputs` passed to `run_documentation_tests()`." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "config = {\n", + " \"validmind.data_validation.Skewness:training_data\": {\n", + " \"params\": {\"max_threshold\": 1},\n", + " \"inputs\": {\"dataset\": vm_train_ds},\n", + " },\n", + " \"validmind.data_validation.Skewness:test_data\": {\n", + " \"params\": {\"max_threshold\": 1.5},\n", + " \"inputs\": {\"dataset\": vm_test_ds},\n", + " },\n", + "}\n", + "\n", + "tests_suite = vm.run_documentation_tests(\n", + " inputs={\n", + " \"dataset\": vm_dataset,\n", + " },\n", + " config=config,\n", + " section=[\"data_preparation\"],\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4__'></a>\n", + "\n", + "### Run the individual tests using the `run_test`\n", + "\n", + "Now run the `Skewness` tests for the training and test datasets. The results for both datasets will be visible in the documentation."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " test_id=\"validmind.data_validation.Skewness:training_data\",\n", + " params={\"max_threshold\": 1},\n", + " inputs={\"dataset\": vm_train_ds},\n", + ")\n", + "test.log()\n", + "\n", + "test = vm.tests.run_test(\n", + " test_id=\"validmind.data_validation.Skewness:test_data\",\n", + " params={\"max_threshold\": 1.5},\n", + " inputs={\n", + " \"dataset\": vm_test_ds,\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc7_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "3. Expand the **2. Data Preparation** section and take a look around.\n", + "\n", + " You can now see the skewness test results for the training and test datasets in the `Data Preparation` section.\n", + "\n", + "From here, you can also make qualitative edits to model documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc7_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade 
package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-6ce412276b6244aab16b2e3443c6a861" + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "gpuClass": "standard", + "kernelspec": { + "display_name": "validmind-1QuffXMV-py3.9", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.13" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/site/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb b/site/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb index 56eedd897e..42ef742f8e 100644 --- a/site/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb +++ b/site/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_sections.ipynb @@ -1,601 +1,605 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Run individual documentation sections\n", - "\n", - "For targeted testing, you can run tests on individual sections or specific groups of sections in your model documentation.\n", - "\n", - "As a model developer, running individual documentation sections is useful in various development scenarios. For instance, when updates are made to a model, often only certain parts of the documentation require revision. 
The `run_documentation_tests()` function allows you to directly test only these affected sections, thus saving you time and ensuring that the documentation accurately reflects the latest changes.\n", - "\n", - "This interactive notebook includes the code required to load the demo dataset, preprocess the raw dataset, train a model for testing, initialize ValidMind objects, and run the data preparation, model development, and multiple documentation sections." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the documentation template](#toc2_3__) \n", - "- [Load the Demo Dataset](#toc3__) \n", - " - [Prepocess the raw dataset](#toc3_1__) \n", - "- [Train a model for testing](#toc4__) \n", - "- [Initialize ValidMind objects](#toc5__) \n", - " - [Assign predictions to the datasets](#toc5_1__) \n", - "- [Run the data preparation section](#toc6__) \n", - "- [Run the model development section](#toc7__) \n", - "- [Run multiple model documentation sections](#toc8__) \n", - "- [Next steps](#toc9__) \n", - " - [Work with your model documentation](#toc9_1__) \n", - " - [Discover more learning resources](#toc9_2__) \n", - "- [Upgrade ValidMind](#toc10__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%matplotlib inline\n", - "\n", - "import xgboost as xgb" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the Demo Dataset" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# You can also import taiwan_credit like this:\n", - "# from validmind.datasets.classification import taiwan_credit as demo_dataset\n", - "from validmind.datasets.classification import customer_churn as demo_dataset\n", - "\n", - "df = demo_dataset.load_data()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Prepocess the raw dataset" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = demo_dataset.preprocess(df)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Train a model for testing\n", - "\n", - "We train a simple customer churn model for our test." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", - "y_train = train_df[demo_dataset.target_column]\n", - "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", - "y_val = validation_df[demo_dataset.target_column]\n", - "\n", - "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", - "model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "model.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Initialize ValidMind objects\n", - "\n", - "We initize the objects required to run test suites using the ValidMind Library." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_dataset = vm.init_dataset(\n", - " input_id=\"raw_dataset\",\n", - " dataset=df,\n", - " target_column=demo_dataset.target_column,\n", - " class_labels=demo_dataset.class_labels,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " input_id=\"train_dataset\",\n", - " dataset=train_df,\n", - " type=\"generic\",\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " input_id=\"test_dataset\",\n", - " dataset=test_df,\n", - " type=\"generic\",\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "\n", - "vm_model = vm.init_model(model, input_id=\"model\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Assign predictions to the datasets\n", - "\n", - "We can now use the `assign_predictions()` method from the `Dataset` object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model,\n", - ")\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Run the data preparation section\n", - "\n", - "In this section, we focus on running the tests within the data preparation section of the model documentation. After running this function, only the tests associated with this section will be executed, and the corresponding section in the model documentation will be updated." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "results = vm.run_documentation_tests(\n", - " section=\"data_preparation\",\n", - " inputs={\n", - " \"dataset\": vm_dataset,\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Run the model development section\n", - "\n", - "In this section, we focus on running the tests within the model development section of the model documentation. After running this function, only the tests associated with this section will be executed, and the corresponding section in the model documentation will be updated." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "results = vm.run_documentation_tests(\n", - " section=\"model_development\",\n", - " inputs={\n", - " \"dataset\": vm_train_ds,\n", - " \"model\": vm_model,\n", - " \"datasets\": (vm_train_ds, vm_test_ds),\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Run multiple model documentation sections\n", - "\n", - "This section demonstrates how you can execute both the data preparation and model development sections using `run_documentation_tests()`. After running this function, the tests associated with both sections will be executed, and their corresponding model documentation sections updated." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "results = vm.run_documentation_tests(\n", - " section=[\"model_development\", \"model_diagnosis\"],\n", - " inputs={\n", - " \"dataset\": vm_test_ds,\n", - " \"model\": vm_model,\n", - " \"datasets\": (vm_train_ds, vm_test_ds),\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc9_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. 
In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc9_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc10__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-f4756a1f66ab49598b696ed86685fcc6", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": ".venv", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Run individual documentation sections\n", + "\n", + "For targeted testing, you can run tests on individual sections or specific groups of sections in your model documentation.\n", + "\n", + "As a model developer, running individual documentation sections is useful in various development scenarios. For instance, when updates are made to a model, often only certain parts of the documentation require revision. The `run_documentation_tests()` function allows you to directly test only these affected sections, thus saving you time and ensuring that the documentation accurately reflects the latest changes.\n", + "\n", + "This interactive notebook includes the code required to load the demo dataset, preprocess the raw dataset, train a model for testing, initialize ValidMind objects, and run the data preparation, model development, and multiple documentation sections." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the documentation template](#toc2_3__) \n", + "- [Load the Demo Dataset](#toc3__) \n", + " - [Preprocess the raw dataset](#toc3_1__) \n", + "- [Train a model for testing](#toc4__) \n", + "- [Initialize ValidMind objects](#toc5__) \n", + " - [Assign predictions to the datasets](#toc5_1__) \n", + "- [Run the data preparation section](#toc6__) \n", + "- [Run the model development section](#toc7__) \n", + "- [Run multiple model documentation sections](#toc8__) \n", + "- [Next steps](#toc9__) \n", + " - [Work with your documentation](#toc9_1__) \n", + " - [Discover more learning resources](#toc9_2__) \n", + "- [Upgrade ValidMind](#toc10__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%matplotlib inline\n", + "\n", + "import xgboost as xgb" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the Demo Dataset" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# You can also import taiwan_credit like this:\n", + "# from validmind.datasets.classification import taiwan_credit as demo_dataset\n", + "from validmind.datasets.classification import customer_churn as demo_dataset\n", + "\n", + "df = demo_dataset.load_data()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Preprocess the raw dataset" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = demo_dataset.preprocess(df)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Train a model for testing\n", + "\n", + "We train a simple customer churn model for our test." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", + "y_train = train_df[demo_dataset.target_column]\n", + "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", + "y_val = validation_df[demo_dataset.target_column]\n", + "\n", + "model = xgb.XGBClassifier(early_stopping_rounds=10)\n", + "model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Initialize ValidMind objects\n", + "\n", + "We initialize the objects required to run test suites using the ValidMind Library." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_dataset = vm.init_dataset(\n", + " input_id=\"raw_dataset\",\n", + " dataset=df,\n", + " target_column=demo_dataset.target_column,\n", + " class_labels=demo_dataset.class_labels,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " input_id=\"train_dataset\",\n", + " dataset=train_df,\n", + " type=\"generic\",\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " input_id=\"test_dataset\",\n", + " dataset=test_df,\n", + " type=\"generic\",\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "\n", + "vm_model = vm.init_model(model, input_id=\"model\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Assign predictions to the datasets\n", + "\n", + "We can now use the `assign_predictions()` method from the `Dataset` object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model,\n", + ")\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Run the data preparation section\n", + "\n", + "In this section, we focus on running the tests within the data preparation section of the model documentation. After running this function, only the tests associated with this section will be executed, and the corresponding section in the model documentation will be updated." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "results = vm.run_documentation_tests(\n", + " section=\"data_preparation\",\n", + " inputs={\n", + " \"dataset\": vm_dataset,\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Run the model development section\n", + "\n", + "In this section, we focus on running the tests within the model development section of the model documentation. After running this function, only the tests associated with this section will be executed, and the corresponding section in the model documentation will be updated." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "results = vm.run_documentation_tests(\n", + " section=\"model_development\",\n", + " inputs={\n", + " \"dataset\": vm_train_ds,\n", + " \"model\": vm_model,\n", + " \"datasets\": (vm_train_ds, vm_test_ds),\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Run multiple model documentation sections\n", + "\n", + "This section demonstrates how you can execute both the model development and model diagnosis sections using `run_documentation_tests()`. After running this function, the tests associated with both sections will be executed, and their corresponding model documentation sections updated." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "results = vm.run_documentation_tests(\n", + " section=[\"model_development\", \"model_diagnosis\"],\n", + " inputs={\n", + " \"dataset\": vm_test_ds,\n", + " \"model\": vm_model,\n", + " \"datasets\": (vm_train_ds, vm_test_ds),\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way: use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc9_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. 
In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc9_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc10__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-f4756a1f66ab49598b696ed86685fcc6" + } + ], + "metadata": { + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} \ No newline at end of file diff --git a/site/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb b/site/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb index 839df75427..48a3e439d8 100644 --- a/site/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb +++ b/site/notebooks/how_to/tests/run_tests/documentation_tests/run_documentation_tests_with_config.ipynb @@ -1,734 +1,738 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Run documentation tests with custom configurations\n", - "\n", - "When running documentation tests, you can configure inputs and parameters for individual tests by passing a config as a parameter.\n", - "\n", - "As a model developer, configuring individual tests is useful in various models development scenarios. For instance, based on a use case, a model might require changing inputs and/or parameters for certain tests. 
The `run_documentation_tests()` function allows you to directly configure tests through `config`, thus giving you flexibility to run tests according to your use case.\n", - "\n", - "This interactive notebook includes the code required to load the demo dataset, preprocess the raw dataset, train a model for testing, initialize ValidMind objects, and run documentation tests with custom configurations." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the documentation template](#toc2_3__) \n", - "- [Load the sample dataset](#toc3__) \n", - "- [Document the model](#toc4__) \n", - "- [Prepocess the raw dataset](#toc5__) \n", - "- [Train a model for testing](#toc6__) \n", - "- [Initialize ValidMind objects](#toc7__) \n", - " - [Initialize the ValidMind model](#toc7_1__) \n", - " - [Initialize the ValidMind datasets](#toc7_2__) \n", - " - [Run predictions through `assign_predictions` interface](#toc7_3__) \n", - "- [Run documentation tests](#toc8__) \n", - " - [Preview config](#toc8_1__) \n", - " - [Updating config](#toc8_2__) \n", - " - [Run documentation tests](#toc8_3__) \n", - "- [Next steps](#toc9__) \n", - " - [Work with your documentation](#toc9_1__) \n", - " - [Discover more learning resources](#toc9_2__) \n", - "- [Upgrade ValidMind](#toc10__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - 
"\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. 
To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Import the sample dataset from the library\n", - "\n", - "from validmind.datasets.classification import customer_churn as demo_dataset\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{demo_dataset.target_column}' \\n\\t• Class labels: {demo_dataset.class_labels}\"\n", - ")\n", - "\n", - "raw_df = demo_dataset.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Document the model\n", - "\n", - "As part of documenting the model with the ValidMind Library, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Prepocess the raw dataset\n", - "\n", - "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", - "\n", - "- Preprocess the data: Splits the DataFrame (`df`) into multiple datasets (`train_df`, `validation_df`, and `test_df`) using `demo_dataset.preprocess` to simplify preprocessing.\n", - "- Separate features and targets: Drops the target column to create feature sets (`x_train`, `x_val`) and target sets (`y_train`, `y_val`).\n", - "- Initialize XGBoost classifier: Creates an `XGBClassifier` object with early stopping rounds set to 10.\n", - "- Set evaluation metrics: Specifies metrics for model evaluation as \"error,\" \"logloss,\" and \"auc.\"\n", - "- Fit the model: Trains the model on `x_train` and `y_train` using the validation set `(x_val, y_val)`. Verbose output is disabled." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Train a model for testing\n", - "\n", - "We train a simple customer churn model for our test." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost\n", - "%matplotlib inline\n", - "\n", - "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", - "y_train = train_df[demo_dataset.target_column]\n", - "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", - "y_val = validation_df[demo_dataset.target_column]\n", - "\n", - "xgb = xgboost.XGBClassifier(early_stopping_rounds=10)\n", - "xgb.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "xgb.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Initialize ValidMind objects" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7_1__'></a>\n", - "\n", - "### Initialize the ValidMind model\n", - "\n", - "Before you run tests, you'll need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# FUNCTION ARGUMENTS:\n", - "# model - the model that you want to provide as input to 
tests\n", - "# input_id - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "\n", - "vm_model_xgb = vm.init_model(\n", - " xgb,\n", - " input_id=\"xgb\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7_2__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Similarly, initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset` — the raw dataset that you want to provide as input to tests\n", - "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "- `target_column` — a required argument if tests require access to true values. This is the name of the target column in the dataset\n", - "- `class_labels` — an optional value to map predicted classes to class labels\n", - "\n", - "With all datasets ready, you can now initialize the raw, training and test datasets (`raw_df`, `train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_ds = vm.init_dataset(\n", - " input_id=\"raw_dataset\",\n", - " dataset=raw_df,\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "\n", - "feature_columns = [\n", - " \"CreditScore\",\n", - " \"Gender\",\n", - " \"Age\",\n", - " \"Tenure\",\n", - " \"Balance\",\n", - " \"NumOfProducts\",\n", - " \"HasCrCard\",\n", - " \"IsActiveMember\",\n", - " \"EstimatedSalary\",\n", - " \"Geography_France\",\n", - " \"Geography_Germany\",\n", - " \"Geography_Spain\",\n", - "]\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " 
input_id=\"train_dataset\",\n", - " dataset=train_df,\n", - " target_column=demo_dataset.target_column,\n", - " feature_columns=feature_columns,\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " input_id=\"test_dataset\",\n", - " dataset=test_df,\n", - " target_column=demo_dataset.target_column,\n", - " feature_columns=feature_columns,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7_3__'></a>\n", - "\n", - "### Run predictions through `assign_predictions` interface\n", - "\n", - "We can use `assign_predictions()` to run and assign model predictions to our training and test datasets:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(model=vm_model_xgb)\n", - "vm_test_ds.assign_predictions(model=vm_model_xgb)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Run documentation tests" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8_1__'></a>\n", - "\n", - "### Preview config\n", - "\n", - "You can preview the default config for the documentation template using the `vm.get_test_suite().get_default_config()` interface." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import json\n", - "\n", - "model_test_suite = vm.get_test_suite()\n", - "config = model_test_suite.get_default_config()\n", - "print(\"Suite Config: \\n\", json.dumps(config, indent=2))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8_2__'></a>\n", - "\n", - "### Updating config\n", - "\n", - "The test configuration can be updated to fit with your use case and requirements" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "config = {\n", - " \"validmind.data_validation.DatasetSplit\": {\n", - " \"inputs\": {\"datasets\": (vm_train_ds, vm_test_ds)},\n", - " },\n", - " \"validmind.model_validation.sklearn.PopulationStabilityIndex\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", - " },\n", - " \"validmind.model_validation.sklearn.ConfusionMatrix\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:in_sample\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_train_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:out_of_sample\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.PrecisionRecallCurve\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.ROCCurve\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.TrainingTestDegradation\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", - " },\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy\": {\n", - " \"inputs\": 
{\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.MinimumF1Score\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.MinimumROCAUCScore\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.PermutationFeatureImportance\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.SHAPGlobalImportance\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", - " },\n", - " \"validmind.model_validation.sklearn.WeakspotsDiagnosis\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", - " },\n", - " \"validmind.model_validation.sklearn.OverfitDiagnosis\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", - " },\n", - " \"validmind.model_validation.sklearn.RobustnessDiagnosis\": {\n", - " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", - " },\n", - "}" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8_3__'></a>\n", - "\n", - "### Run documentation tests\n", - "\n", - "You can now run all documentation tests and pass an extra `config` parameter that overrides input and parameter configuration for the tests specified in the object." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.run_documentation_tests(\n", - " inputs={\n", - " \"dataset\": vm_raw_ds,\n", - " \"model\": vm_model_xgb,\n", - " },\n", - " config=config,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc9_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc9_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc10__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade 
package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-d0990f47a72e4eaab065be1540234792", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "colab": { - "provenance": [] - }, - "gpuClass": "standard", - "kernelspec": { - "display_name": ".venv", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.13" - } - }, - "nbformat": 4, - "nbformat_minor": 0 -} + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Run documentation tests with custom configurations\n", + "\n", + "When running documentation tests, you can configure inputs and parameters for individual tests by passing a config as a parameter.\n", + "\n", + "As a model developer, configuring individual tests is useful in various model development scenarios. For instance, based on a use case, a model might require changing inputs and/or parameters for certain tests. The `run_documentation_tests()` function allows you to directly configure tests through `config`, thus giving you flexibility to run tests according to your use case.\n", + "\n", + "This interactive notebook includes the code required to load the demo dataset, preprocess the raw dataset, train a model for testing, initialize ValidMind objects, and run documentation tests with custom configurations."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the documentation template](#toc2_3__) \n", + "- [Load the sample dataset](#toc3__) \n", + "- [Document the model](#toc4__) \n", + "- [Preprocess the raw dataset](#toc5__) \n", + "- [Train a model for testing](#toc6__) \n", + "- [Initialize ValidMind objects](#toc7__) \n", + " - [Initialize the ValidMind model](#toc7_1__) \n", + " - [Initialize the ValidMind datasets](#toc7_2__) \n", + " - [Run predictions through `assign_predictions` interface](#toc7_3__) \n", + "- [Run documentation tests](#toc8__) \n", + " - [Preview config](#toc8_1__) \n", + " - [Updating config](#toc8_2__) \n", + " - [Run documentation tests](#toc8_3__) \n", + "- [Next steps](#toc9__) \n", + " - [Work with your documentation](#toc9_1__) \n", + " - [Discover more learning resources](#toc9_2__) \n", + "- [Upgrade ValidMind](#toc10__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind library. 
To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the sample dataset from the library\n", + "\n", + "from validmind.datasets.classification import customer_churn as demo_dataset\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{demo_dataset.target_column}' \\n\\t• Class labels: {demo_dataset.class_labels}\"\n", + ")\n", + "\n", + "raw_df = demo_dataset.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Document the model\n", + "\n", + "As part of documenting the model with the ValidMind Library, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Preprocess the raw dataset\n", + "\n", + "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", + "\n", + "- Preprocess the data: Splits the raw DataFrame (`raw_df`) into multiple datasets (`train_df`, `validation_df`, and `test_df`) using `demo_dataset.preprocess` to simplify preprocessing.\n", + "- Separate features and targets: Drops the target column to create feature sets (`x_train`, `x_val`) and target sets (`y_train`, `y_val`).\n", + "- Initialize XGBoost classifier: Creates an `XGBClassifier` object with early stopping rounds set to 10.\n", + "- Set evaluation metrics: Specifies metrics for model evaluation as \"error,\" \"logloss,\" and \"auc.\"\n", + "- Fit the model: Trains the model on `x_train` and `y_train` using the validation set `(x_val, y_val)`. Verbose output is disabled." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Train a model for testing\n", + "\n", + "We train a simple customer churn model for our test." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost\n", + "%matplotlib inline\n", + "\n", + "x_train = train_df.drop(demo_dataset.target_column, axis=1)\n", + "y_train = train_df[demo_dataset.target_column]\n", + "x_val = validation_df.drop(demo_dataset.target_column, axis=1)\n", + "y_val = validation_df[demo_dataset.target_column]\n", + "\n", + "xgb = xgboost.XGBClassifier(early_stopping_rounds=10)\n", + "xgb.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "xgb.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Initialize ValidMind objects" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1__'></a>\n", + "\n", + "### Initialize the ValidMind model\n", + "\n", + "Before you run tests, you'll need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# FUNCTION ARGUMENTS:\n", + "# model - the model that you want to provide as input to tests\n", + "# input_id - a unique identifier that 
allows tracking what inputs are used when running each individual test\n", + "\n", + "vm_model_xgb = vm.init_model(\n", + " xgb,\n", + " input_id=\"xgb\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_2__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Similarly, initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset` — the raw dataset that you want to provide as input to tests\n", + "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", + "- `target_column` — a required argument if tests require access to true values. This is the name of the target column in the dataset\n", + "- `class_labels` — an optional value to map predicted classes to class labels\n", + "\n", + "With all datasets ready, you can now initialize the raw, training and test datasets (`raw_df`, `train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_ds = vm.init_dataset(\n", + " input_id=\"raw_dataset\",\n", + " dataset=raw_df,\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "\n", + "feature_columns = [\n", + " \"CreditScore\",\n", + " \"Gender\",\n", + " \"Age\",\n", + " \"Tenure\",\n", + " \"Balance\",\n", + " \"NumOfProducts\",\n", + " \"HasCrCard\",\n", + " \"IsActiveMember\",\n", + " \"EstimatedSalary\",\n", + " \"Geography_France\",\n", + " \"Geography_Germany\",\n", + " \"Geography_Spain\",\n", + "]\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " input_id=\"train_dataset\",\n", + " dataset=train_df,\n", + 
" target_column=demo_dataset.target_column,\n", + " feature_columns=feature_columns,\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " input_id=\"test_dataset\",\n", + " dataset=test_df,\n", + " target_column=demo_dataset.target_column,\n", + " feature_columns=feature_columns,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_3__'></a>\n", + "\n", + "### Run predictions through `assign_predictions` interface\n", + "\n", + "We can use `assign_predictions()` to run and assign model predictions to our training and test datasets:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(model=vm_model_xgb)\n", + "vm_test_ds.assign_predictions(model=vm_model_xgb)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Run documentation tests" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_1__'></a>\n", + "\n", + "### Preview config\n", + "\n", + "You can preview the default config for the documentation template using the `vm.get_test_suite().get_default_config()` interface." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import json\n", + "\n", + "model_test_suite = vm.get_test_suite()\n", + "config = model_test_suite.get_default_config()\n", + "print(\"Suite Config: \\n\", json.dumps(config, indent=2))" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_2__'></a>\n", + "\n", + "### Updating config\n", + "\n", + "The test configuration can be updated to fit your use case and requirements." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "config = {\n", + " \"validmind.data_validation.DatasetSplit\": {\n", + " \"inputs\": {\"datasets\": (vm_train_ds, vm_test_ds)},\n", + " },\n", + " \"validmind.model_validation.sklearn.PopulationStabilityIndex\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", + " },\n", + " \"validmind.model_validation.sklearn.ConfusionMatrix\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:in_sample\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_train_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:out_of_sample\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.PrecisionRecallCurve\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.ROCCurve\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.TrainingTestDegradation\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", + " },\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": 
vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.MinimumF1Score\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.MinimumROCAUCScore\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.PermutationFeatureImportance\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.SHAPGlobalImportance\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"dataset\": vm_test_ds},\n", + " },\n", + " \"validmind.model_validation.sklearn.WeakspotsDiagnosis\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", + " },\n", + " \"validmind.model_validation.sklearn.OverfitDiagnosis\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", + " },\n", + " \"validmind.model_validation.sklearn.RobustnessDiagnosis\": {\n", + " \"inputs\": {\"model\": vm_model_xgb, \"datasets\": (vm_train_ds, vm_test_ds)},\n", + " },\n", + "}" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_3__'></a>\n", + "\n", + "### Run documentation tests\n", + "\n", + "You can now run all documentation tests and pass an extra `config` parameter that overrides input and parameter configuration for the tests specified in the object." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.run_documentation_tests(\n", + " inputs={\n", + " \"dataset\": vm_raw_ds,\n", + " \"model\": vm_model_xgb,\n", + " },\n", + " config=config,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc9_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc9_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc10__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade 
package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-d0990f47a72e4eaab065be1540234792" + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "gpuClass": "standard", + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.13" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file diff --git a/site/notebooks/quickstart/quickstart_documentation.ipynb b/site/notebooks/quickstart/quickstart_documentation.ipynb index b2d5c5d281..033e023454 100644 --- a/site/notebooks/quickstart/quickstart_documentation.ipynb +++ b/site/notebooks/quickstart/quickstart_documentation.ipynb @@ -1,926 +1,930 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "7b021b0d", - "metadata": {}, - "source": [ - "# Quickstart for documentation\n", - "\n", - "Learn the basics of using ValidMind to document records as part of a development workflow. Set up the ValidMind Library in your environment, and generate a draft of documentation using ValidMind tests for a binary classification model.\n", - "\n", - "To document our model with the ValidMind Library, we'll:\n", - "\n", - "1. Import a sample dataset and preprocess it\n", - "2. Split the datasets and initialize them for use with ValidMind\n", - "3. Initialize a ValidMind model object for use with testing\n", - "4. 
Run a full suite of tests as defined by our documentation template, which will send the results of those tests to the ValidMind Platform" - ] - }, - { - "cell_type": "markdown", - "id": "167aef58", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [Introduction](#toc1__) \n", - "- [About ValidMind](#toc2__) \n", - " - [Before you begin](#toc2_1__) \n", - " - [New to ValidMind?](#toc2_2__) \n", - " - [Key concepts](#toc2_3__) \n", - "- [Setting up](#toc3__) \n", - " - [Install the ValidMind Library](#toc3_1__) \n", - " - [Initialize the ValidMind Library](#toc3_2__) \n", - " - [Register sample model](#toc3_2_1__) \n", - " - [Apply documentation template](#toc3_2_2__) \n", - " - [Get your code snippet](#toc3_2_3__) \n", - " - [Initialize the Python environment](#toc3_3__) \n", - "- [Getting to know ValidMind](#toc4__) \n", - " - [Preview the documentation template](#toc4_1__) \n", - " - [View documentation in the ValidMind Platform](#toc4_2__) \n", - "- [Working with ValidMind datasets](#toc5__) \n", - " - [Prepare the sample dataset](#toc5_1__) \n", - " - [Import the sample dataset](#toc5_1_1__) \n", - " - [Preprocess the raw dataset](#toc5_1_2__) \n", - " - [Split the dataset](#toc5_1_3__) \n", - " - [Separate features and targets](#toc5_1_4__) \n", - " - [Initialize the ValidMind datasets](#toc5_2__) \n", - "- [Working with ValidMind models](#toc6__) \n", - " - [Train an XGBoost classifier model](#toc6_1__) \n", - " - [Set evaluation metrics](#toc6_1_1__) \n", - " - [Fit the model](#toc6_1_2__) \n", - " - [Initialize the ValidMind model](#toc6_2__) \n", - " - [Assign predictions](#toc6_3__) \n", - "- [Run a ValidMind test suite](#toc7__) \n", - "- [In summary](#toc8__) \n", - "- [Next steps](#toc9__) \n", - " - [Work with your documentation](#toc9_1__) \n", - " - [Discover more learning resources](#toc9_2__) \n", - "- [Upgrade ValidMind](#toc10__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - 
"\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "1cce526f", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## Introduction\n", - "\n", - "Development aims to produce a fit-for-purpose *champion* by conducting thorough testing and analysis, supporting the capabilities of the champion with evidence in the form of documentation and test results. Documentation should be clear and comprehensive, ideally following a structure or template covering all aspects of compliance with risk regulation.\n", - "\n", - "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", - "\n", - "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", - "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." - ] - }, - { - "cell_type": "markdown", - "id": "f9b5eac2", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ] - }, - { - "cell_type": "markdown", - "id": "650236de", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "b9d9d4cf", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "59b308f7", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "id": "61b5cbeb", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "0f08166e", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d1f6dbed", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "1bf4e4cb", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "cb6e369b", - "metadata": {}, - "source": [ - "<a id='toc3_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. 
In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "7167d002", - "metadata": {}, - "source": [ - "<a id='toc3_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "id": "43037f46", - "metadata": {}, - "source": [ - "<a id='toc3_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e2c1dd22", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "1a6933d3", - "metadata": {}, - "source": [ - "<a id='toc3_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Then, let's import the necessary libraries and set up your Python environment for data analysis:\n", - "\n", - "- Import **Extreme Gradient Boosting** (XGBoost) with an alias so that we can reference its functions in later calls. 
XGBoost is a powerful machine learning library designed for speed and performance, especially in handling structured or tabular data.\n", - "- Enable **`matplotlib`**, a plotting library used for visualizing data. Ensures that any plots you generate will render inline in our notebook output rather than opening in a separate window." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "62d7c2c1", - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "id": "fafe8fc2", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Getting to know ValidMind" - ] - }, - { - "cell_type": "markdown", - "id": "d7ee565f", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b2bce375", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "fa0e43cb", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### View documentation in the ValidMind Platform\n", - "\n", - "Next, let's head to the ValidMind Platform to see the template in action:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. 
In the left sidebar, navigate to **Inventory** and select the model you registered for this notebook.\n", - "\n", - "3. Click **Development** under Documents for your model and note how the structure of the documentation matches our preview above." - ] - }, - { - "cell_type": "markdown", - "id": "9d0d1005", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Working with ValidMind datasets" - ] - }, - { - "cell_type": "markdown", - "id": "1b94e39f", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Prepare the sample dataset" - ] - }, - { - "cell_type": "markdown", - "id": "6fc79fc1", - "metadata": {}, - "source": [ - "<a id='toc5_1_1__'></a>\n", - "\n", - "#### Import the sample dataset\n", - "\n", - "First, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle so that we have something to work with.\n", - "\n", - "In our below example, note that: \n", - "\n", - "- The target column, `Exited` has a value of `1` when a customer has churned and `0` otherwise.\n", - "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. A Pandas Dataframe is a two-dimensional tabular data structure that makes use of rows and columns." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "58d1c94b", - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.classification import customer_churn\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", - ")\n", - "\n", - "raw_df = customer_churn.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "id": "4fe0f216", - "metadata": {}, - "source": [ - "<a id='toc5_1_2__'></a>\n", - "\n", - "#### Preprocess the raw dataset\n", - "\n", - "Before running tests with ValidMind, we'll need to preprocess our imported dataset. This involves splitting the data and separating the features (inputs) from the targets (outputs)." - ] - }, - { - "cell_type": "markdown", - "id": "9f690a04", - "metadata": {}, - "source": [ - "<a id='toc5_1_3__'></a>\n", - "\n", - "#### Split the dataset\n", - "\n", - "Splitting our dataset helps assess how well the model generalizes to unseen data.\n", - "\n", - "Use [`preprocess()`](https://docs.validmind.ai/validmind/validmind/datasets/classification/customer_churn.html#preprocess) to split our dataset into three subsets:\n", - "\n", - "1. **train_df** — Used to train the model.\n", - "2. **validation_df** — Used to evaluate the model's performance during training.\n", - "3. **test_df** — Used later on to asses the model's performance on new, unseen data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "418cb5aa", - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)" - ] - }, - { - "cell_type": "markdown", - "id": "a9ad2104", - "metadata": {}, - "source": [ - "<a id='toc5_1_4__'></a>\n", - "\n", - "#### Separate features and targets\n", - "\n", - "To train the model, we need to provide it with:\n", - "\n", - "1. 
**Inputs** — Features such as customer age, usage, etc.\n", - "2. **Outputs (Expected answers/labels)** — in our case, we would like to know whether the customer churned or not.\n", - "\n", - "Here, we'll use `x_train` and `x_val` to hold the input data (features), and `y_train` and `y_val` to hold the answers (the target we want to predict):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6fd365fd", - "metadata": {}, - "outputs": [], - "source": [ - "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", - "y_train = train_df[customer_churn.target_column]\n", - "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", - "y_val = validation_df[customer_churn.target_column]" - ] - }, - { - "cell_type": "markdown", - "id": "73d767d7", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests with your preprocessed datasets, you must first initialize a ValidMind `Dataset` object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module. **This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", - "\n", - "For this example, we'll pass in the following arguments:\n", - "\n", - "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", - "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", - "- **`class_labels`** — An optional value to map predicted classes to class labels." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bb6ad06a", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the raw dataset\n", - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=raw_df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=customer_churn.target_column,\n", - " class_labels=customer_churn.class_labels,\n", - ")\n", - "\n", - "# Initialize the training dataset\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "\n", - "# Initialize the testing dataset\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"test_dataset\",\n", - " target_column=customer_churn.target_column\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "0b33afca", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Working with ValidMind models" - ] - }, - { - "cell_type": "markdown", - "id": "5962362c", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Train an XGBoost classifier model\n", - "\n", - "Next, let's create an XGBoost classifier model that will automatically stop training if it doesn’t improve after 10 tries.\n", - "\n", - "Setting a threshold avoids wasting time and helps prevent overfitting by stopping training when further improvement isn’t happening." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3296cac6", - "metadata": {}, - "outputs": [], - "source": [ - "model = xgb.XGBClassifier(early_stopping_rounds=10)" - ] - }, - { - "cell_type": "markdown", - "id": "33cafbcf", - "metadata": {}, - "source": [ - "<a id='toc6_1_1__'></a>\n", - "\n", - "#### Set evaluation metrics\n", - "\n", - "Then, we'll set the evaluation metrics, which tells the model to use three different ways to measure its performance:\n", - "\n", - "1. **error** — Measures how often the model makes incorrect predictions.\n", - "2. 
**logloss** — Indicates how confident the predictions are.\n", - "3. **auc** — Evaluates how well the model distinguishes between churn and not churn.\n", - "\n", - "Using multiple metrics gives a more complete picture of how good (or bad) the model is." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "32d3c3f4", - "metadata": {}, - "outputs": [], - "source": [ - "model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "47d84a80", - "metadata": {}, - "source": [ - "<a id='toc6_1_2__'></a>\n", - "\n", - "#### Fit the model\n", - "\n", - "Finally, our actual training step — where the model learns patterns from the data, so it can make predictions later:\n", - "\n", - "- The model is trained on `x_train` and `y_train`, and evaluates its performance using `x_val` and `y_val` to check if it’s learning well.\n", - "- To turn off printed output while training, we'll set `verbose` to `False`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3fb95ce4", - "metadata": {}, - "outputs": [], - "source": [ - "model.fit(\n", - " x_train,\n", - " y_train,\n", - " eval_set=[(x_val, y_val)],\n", - " verbose=False,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "23bccb27", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Initialize the ValidMind model\n", - "\n", - "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any 
ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0e44eebd", - "metadata": {}, - "outputs": [], - "source": [ - "vm_model = vm.init_model(\n", - " model,\n", - " input_id=\"model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "20c008bf", - "metadata": {}, - "source": [ - "<a id='toc6_3__'></a>\n", - "\n", - "### Assign predictions\n", - "\n", - "Once the model has been registered, you can assign model predictions to the training and testing datasets.\n", - "\n", - "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", - "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets.\n", - "\n", - "If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "62bd94fc", - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "0e66a7cd", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Run a ValidMind test suite\n", - "\n", - "This is where it all comes together — you are now ready to **run the documentation tests for the model as defined by the documentation template** you looked at earlier.\n", - "\n", - "The [`vm.run_documentation_tests`](https://docs.validmind.ai/validmind/validmind.html#run_documentation_tests) function finds and runs every test specified in the template and then uploads all the 
documentation and test artifacts that get generated to the ValidMind Platform:\n", - "\n", - "- The function requires information about the inputs to use on every test. These inputs can be passed as an `inputs` argument if we want to use the same inputs for all tests. \n", - "- It's also possible to pass a `config` argument that has information about the `params` and `inputs` that each test requires. The `config` parameter is a dictionary with the following structure:\n", - "\n", - " ```python\n", - " config = {\n", - " \"<test-id>\": {\n", - " \"params\": {\n", - " \"param1\": \"value1\",\n", - " \"param2\": \"value2\",\n", - " ...\n", - " },\n", - " \"inputs\": {\n", - " \"input1\": \"value1\",\n", - " \"input2\": \"value2\",\n", - " ...\n", - " }\n", - " },\n", - " ...\n", - " }\n", - " ```\n", - "\n", - " Each `<test-id>` above corresponds to the test driven block identifiers shown by `vm.preview_template()`. For this model, we will use the default parameters for all tests, but we'll need to specify the input configuration for each one. The method `get_demo_test_config()` below constructs the default input configuration for our demo." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b3d6741b", - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.utils import preview_test_config\n", - "\n", - "test_config = customer_churn.get_demo_test_config()\n", - "preview_test_config(test_config)" - ] - }, - { - "cell_type": "markdown", - "id": "7eebd40f", - "metadata": {}, - "source": [ - "Now we can pass the input configuration to `vm.run_documentation_tests()` and run the full suite of tests.\n", - "\n", - "The variable `full_suite` then holds the result of these tests:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ae3accf7", - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.run_documentation_tests(config=test_config)" - ] - }, - { - "cell_type": "markdown", - "id": "ed61fa23", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## In summary\n", - "\n", - "In this notebook, you learned how to:\n", - "\n", - "- [x] Register a record (model) within the ValidMind Platform\n", - "- [x] Install and initialize the ValidMind Library\n", - "- [x] Preview the documentation template for your model\n", - "- [x] Import a sample dataset\n", - "- [x] Initialize ValidMind datasets and model objects\n", - "- [x] Assign model predictions to your ValidMind model objects\n", - "- [x] Run a full suite of documentation tests" - ] - }, - { - "cell_type": "markdown", - "id": "68803cd9", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation." - ] - }, - { - "cell_type": "markdown", - "id": "ba38b729", - "metadata": {}, - "source": [ - "<a id='toc9_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. 
From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))" - ] - }, - { - "cell_type": "markdown", - "id": "ae046dc4", - "metadata": {}, - "source": [ - "<a id='toc9_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "For a more in-depth introduction to using the ValidMind Library for development, check out our introductory development series and the accompanying interactive training:\n", - "\n", - "- **[ValidMind for development](https://docs.validmind.ai/developer/validmind-library.html#development)**\n", - "- **[Developer Fundamentals](https://docs.validmind.ai/training/developer-fundamentals/developer-fundamentals-register.html)**\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "id": "4ce38015", - "metadata": {}, - "source": [ - "<a id='toc10__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "35955b6b", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "f865e64e", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "65b36aa7", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-bd87da591b88473997979690dbffcfa5", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "name": "python", - "version": "3.12.12" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "cells": [ + { + "cell_type": "markdown", + "id": "7b021b0d", + "metadata": {}, + "source": [ + "# Quickstart for documentation\n", + "\n", + "Learn the basics of using ValidMind to document records as part of a development workflow. Set up the ValidMind Library in your environment, and generate a draft of documentation using ValidMind tests for a binary classification model.\n", + "\n", + "To document our model with the ValidMind Library, we'll:\n", + "\n", + "1. Import a sample dataset and preprocess it\n", + "2. Split the datasets and initialize them for use with ValidMind\n", + "3. Initialize a ValidMind model object for use with testing\n", + "4. 
Run a full suite of tests as defined by our documentation template, which will send the results of those tests to the ValidMind Platform" + ] + }, + { + "cell_type": "markdown", + "id": "167aef58", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [Introduction](#toc1__) \n", + "- [About ValidMind](#toc2__) \n", + " - [Before you begin](#toc2_1__) \n", + " - [New to ValidMind?](#toc2_2__) \n", + " - [Key concepts](#toc2_3__) \n", + "- [Setting up](#toc3__) \n", + " - [Install the ValidMind Library](#toc3_1__) \n", + " - [Initialize the ValidMind Library](#toc3_2__) \n", + " - [Register sample model](#toc3_2_1__) \n", + " - [Apply documentation template](#toc3_2_2__) \n", + " - [Get your code snippet](#toc3_2_3__) \n", + " - [Initialize the Python environment](#toc3_3__) \n", + "- [Getting to know ValidMind](#toc4__) \n", + " - [Preview the documentation template](#toc4_1__) \n", + " - [View documentation in the ValidMind Platform](#toc4_2__) \n", + "- [Working with ValidMind datasets](#toc5__) \n", + " - [Prepare the sample dataset](#toc5_1__) \n", + " - [Import the sample dataset](#toc5_1_1__) \n", + " - [Preprocess the raw dataset](#toc5_1_2__) \n", + " - [Split the dataset](#toc5_1_3__) \n", + " - [Separate features and targets](#toc5_1_4__) \n", + " - [Initialize the ValidMind datasets](#toc5_2__) \n", + "- [Working with ValidMind models](#toc6__) \n", + " - [Train an XGBoost classifier model](#toc6_1__) \n", + " - [Set evaluation metrics](#toc6_1_1__) \n", + " - [Fit the model](#toc6_1_2__) \n", + " - [Initialize the ValidMind model](#toc6_2__) \n", + " - [Assign predictions](#toc6_3__) \n", + "- [Run a ValidMind test suite](#toc7__) \n", + "- [In summary](#toc8__) \n", + "- [Next steps](#toc9__) \n", + " - [Work with your documentation](#toc9_1__) \n", + " - [Discover more learning resources](#toc9_2__) \n", + "- [Upgrade ValidMind](#toc10__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + 
"\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "id": "1cce526f", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## Introduction\n", + "\n", + "Development aims to produce a fit-for-purpose *champion* by conducting thorough testing and analysis, supporting the capabilities of the champion with evidence in the form of documentation and test results. Documentation should be clear and comprehensive, ideally following a structure or template covering all aspects of compliance with risk regulation.\n", + "\n", + "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", + "\n", + "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", + "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." + ] + }, + { + "cell_type": "markdown", + "id": "f9b5eac2", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." + ] + }, + { + "cell_type": "markdown", + "id": "650236de", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ] + }, + { + "cell_type": "markdown", + "id": "b9d9d4cf", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "id": "59b308f7", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ] + }, + { + "cell_type": "markdown", + "id": "61b5cbeb", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "id": "0f08166e", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d1f6dbed", + "metadata": {}, + "outputs": [], + "source": [ + "%pip install -q validmind" + ] + }, + { + "cell_type": "markdown", + "id": "1bf4e4cb", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "id": "cb6e369b", + "metadata": {}, + "source": [ + "<a id='toc3_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. 
In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "id": "7167d002", + "metadata": {}, + "source": [ + "<a id='toc3_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." 
+ ] + }, + { + "cell_type": "markdown", + "id": "43037f46", + "metadata": {}, + "source": [ + "<a id='toc3_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e2c1dd22", + "metadata": {}, + "outputs": [], + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "1a6933d3", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Then, let's import the necessary libraries and set up your Python environment for data analysis:\n", + "\n", + "- Import **Extreme Gradient Boosting** (XGBoost) with an alias so that we can reference its functions in later calls. 
XGBoost is a powerful machine learning library designed for speed and performance, especially in handling structured or tabular data.\n", + "- Enable **`matplotlib`**, a plotting library used for visualizing data. This ensures that any plots you generate render inline in the notebook output rather than opening in a separate window." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "62d7c2c1", + "metadata": {}, + "outputs": [], + "source": [ + "import xgboost as xgb\n", + "\n", + "%matplotlib inline" + ] + }, + { + "cell_type": "markdown", + "id": "fafe8fc2", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Getting to know ValidMind" + ] + }, + { + "cell_type": "markdown", + "id": "d7ee565f", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b2bce375", + "metadata": {}, + "outputs": [], + "source": [ + "vm.preview_template()" + ] + }, + { + "cell_type": "markdown", + "id": "fa0e43cb", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### View documentation in the ValidMind Platform\n", + "\n", + "Next, let's head to the ValidMind Platform to see the template in action:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. 
In the left sidebar, navigate to **Inventory** and select the model you registered for this notebook.\n", + "\n", + "3. Click **Development** under Documents for your model and note how the structure of the documentation matches our preview above." + ] + }, + { + "cell_type": "markdown", + "id": "9d0d1005", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Working with ValidMind datasets" + ] + }, + { + "cell_type": "markdown", + "id": "1b94e39f", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Prepare the sample dataset" + ] + }, + { + "cell_type": "markdown", + "id": "6fc79fc1", + "metadata": {}, + "source": [ + "<a id='toc5_1_1__'></a>\n", + "\n", + "#### Import the sample dataset\n", + "\n", + "First, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle so that we have something to work with.\n", + "\n", + "In the example below, note that: \n", + "\n", + "- The target column, `Exited`, has a value of `1` when a customer has churned and `0` otherwise.\n", + "- The ValidMind Library provides a wrapper to automatically load the dataset as a [pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. A pandas DataFrame is a two-dimensional tabular data structure organized into rows and columns." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "58d1c94b", + "metadata": {}, + "outputs": [], + "source": [ + "from validmind.datasets.classification import customer_churn\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", + ")\n", + "\n", + "raw_df = customer_churn.load_data()\n", + "raw_df.head()" + ] + }, + { + "cell_type": "markdown", + "id": "4fe0f216", + "metadata": {}, + "source": [ + "<a id='toc5_1_2__'></a>\n", + "\n", + "#### Preprocess the raw dataset\n", + "\n", + "Before running tests with ValidMind, we'll need to preprocess our imported dataset. This involves splitting the data and separating the features (inputs) from the targets (outputs)." + ] + }, + { + "cell_type": "markdown", + "id": "9f690a04", + "metadata": {}, + "source": [ + "<a id='toc5_1_3__'></a>\n", + "\n", + "#### Split the dataset\n", + "\n", + "Splitting our dataset helps assess how well the model generalizes to unseen data.\n", + "\n", + "Use [`preprocess()`](https://docs.validmind.ai/validmind/validmind/datasets/classification/customer_churn.html#preprocess) to split our dataset into three subsets:\n", + "\n", + "1. **train_df** — Used to train the model.\n", + "2. **validation_df** — Used to evaluate the model's performance during training.\n", + "3. **test_df** — Used later on to assess the model's performance on new, unseen data." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "418cb5aa", + "metadata": {}, + "outputs": [], + "source": [ + "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)" + ] + }, + { + "cell_type": "markdown", + "id": "a9ad2104", + "metadata": {}, + "source": [ + "<a id='toc5_1_4__'></a>\n", + "\n", + "#### Separate features and targets\n", + "\n", + "To train the model, we need to provide it with:\n", + "\n", + "1. 
**Inputs** — Features such as customer age, usage, etc.\n", + "2. **Outputs (Expected answers/labels)** — in our case, we would like to know whether the customer churned or not.\n", + "\n", + "Here, we'll use `x_train` and `x_val` to hold the input data (features), and `y_train` and `y_val` to hold the answers (the target we want to predict):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "6fd365fd", + "metadata": {}, + "outputs": [], + "source": [ + "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", + "y_train = train_df[customer_churn.target_column]\n", + "x_val = validation_df.drop(customer_churn.target_column, axis=1)\n", + "y_val = validation_df[customer_churn.target_column]" + ] + }, + { + "cell_type": "markdown", + "id": "73d767d7", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests with your preprocessed datasets, you must first initialize a ValidMind `Dataset` object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module. **This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", + "\n", + "For this example, we'll pass in the following arguments:\n", + "\n", + "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", + "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", + "- **`class_labels`** — An optional value to map predicted classes to class labels." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bb6ad06a", + "metadata": {}, + "outputs": [], + "source": [ + "# Initialize the raw dataset\n", + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=raw_df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=customer_churn.target_column,\n", + " class_labels=customer_churn.class_labels,\n", + ")\n", + "\n", + "# Initialize the training dataset\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "\n", + "# Initialize the testing dataset\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " target_column=customer_churn.target_column\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "0b33afca", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Working with ValidMind models" + ] + }, + { + "cell_type": "markdown", + "id": "5962362c", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### Train an XGBoost classifier model\n", + "\n", + "Next, let's create an XGBoost classifier model that will automatically stop training if it doesn’t improve after 10 tries.\n", + "\n", + "Setting a threshold avoids wasting time and helps prevent overfitting by stopping training when further improvement isn’t happening." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3296cac6", + "metadata": {}, + "outputs": [], + "source": [ + "model = xgb.XGBClassifier(early_stopping_rounds=10)" + ] + }, + { + "cell_type": "markdown", + "id": "33cafbcf", + "metadata": {}, + "source": [ + "<a id='toc6_1_1__'></a>\n", + "\n", + "#### Set evaluation metrics\n", + "\n", + "Then, we'll set the evaluation metrics, which tells the model to use three different ways to measure its performance:\n", + "\n", + "1. **error** — Measures how often the model makes incorrect predictions.\n", + "2. 
**logloss** — Indicates how confident the predictions are.\n", + "3. **auc** — Evaluates how well the model distinguishes between churn and not churn.\n", + "\n", + "Using multiple metrics gives a more complete picture of how good (or bad) the model is." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "32d3c3f4", + "metadata": {}, + "outputs": [], + "source": [ + "model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "47d84a80", + "metadata": {}, + "source": [ + "<a id='toc6_1_2__'></a>\n", + "\n", + "#### Fit the model\n", + "\n", + "Finally, our actual training step — where the model learns patterns from the data, so it can make predictions later:\n", + "\n", + "- The model is trained on `x_train` and `y_train`, and evaluates its performance using `x_val` and `y_val` to check if it’s learning well.\n", + "- To turn off printed output while training, we'll set `verbose` to `False`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3fb95ce4", + "metadata": {}, + "outputs": [], + "source": [ + "model.fit(\n", + " x_train,\n", + " y_train,\n", + " eval_set=[(x_val, y_val)],\n", + " verbose=False,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "23bccb27", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Initialize the ValidMind model\n", + "\n", + "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any 
ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0e44eebd", + "metadata": {}, + "outputs": [], + "source": [ + "vm_model = vm.init_model(\n", + " model,\n", + " input_id=\"model\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "20c008bf", + "metadata": {}, + "source": [ + "<a id='toc6_3__'></a>\n", + "\n", + "### Assign predictions\n", + "\n", + "Once the model has been registered, you can assign model predictions to the training and testing datasets.\n", + "\n", + "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", + "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets.\n", + "\n", + "If no prediction values are passed, the method will compute predictions automatically:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "62bd94fc", + "metadata": {}, + "outputs": [], + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "0e66a7cd", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Run a ValidMind test suite\n", + "\n", + "This is where it all comes together — you are now ready to **run the documentation tests for the model as defined by the documentation template** you looked at earlier.\n", + "\n", + "The [`vm.run_documentation_tests`](https://docs.validmind.ai/validmind/validmind.html#run_documentation_tests) function finds and runs every test specified in the template and then uploads all the 
documentation and test artifacts that get generated to the ValidMind Platform:\n", + "\n", + "- The function requires information about the inputs to use on every test. These inputs can be passed as an `inputs` argument if we want to use the same inputs for all tests. \n", + "- It's also possible to pass a `config` argument that has information about the `params` and `inputs` that each test requires. The `config` parameter is a dictionary with the following structure:\n", + "\n", + " ```python\n", + " config = {\n", + " \"<test-id>\": {\n", + " \"params\": {\n", + " \"param1\": \"value1\",\n", + " \"param2\": \"value2\",\n", + " ...\n", + " },\n", + " \"inputs\": {\n", + " \"input1\": \"value1\",\n", + " \"input2\": \"value2\",\n", + " ...\n", + " }\n", + " },\n", + " ...\n", + " }\n", + " ```\n", + "\n", + " Each `<test-id>` above corresponds to one of the test-driven block identifiers shown by `vm.preview_template()`. For this model, we will use the default parameters for all tests, but we'll need to specify the input configuration for each one. The `get_demo_test_config()` function below constructs the default input configuration for our demo." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b3d6741b", + "metadata": {}, + "outputs": [], + "source": [ + "from validmind.utils import preview_test_config\n", + "\n", + "test_config = customer_churn.get_demo_test_config()\n", + "preview_test_config(test_config)" + ] + }, + { + "cell_type": "markdown", + "id": "7eebd40f", + "metadata": {}, + "source": [ + "Now we can pass the input configuration to `vm.run_documentation_tests()` and run the full suite of tests.\n", + "\n", + "The variable `full_suite` then holds the result of these tests:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ae3accf7", + "metadata": {}, + "outputs": [], + "source": [ + "full_suite = vm.run_documentation_tests(config=test_config)" + ] + }, + { + "cell_type": "markdown", + "id": "ed61fa23", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## In summary\n", + "\n", + "In this notebook, you learned how to:\n", + "\n", + "- [x] Register a record (model) within the ValidMind Platform\n", + "- [x] Install and initialize the ValidMind Library\n", + "- [x] Preview the documentation template for your model\n", + "- [x] Import a sample dataset\n", + "- [x] Initialize ValidMind datasets and model objects\n", + "- [x] Assign model predictions to your ValidMind datasets\n", + "- [x] Run a full suite of documentation tests" + ] + }, + { + "cell_type": "markdown", + "id": "68803cd9", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation." + ] + }, + { + "cell_type": "markdown", + "id": "ba38b729", + "metadata": {}, + "source": [ + "<a id='toc9_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. 
From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))" + ] + }, + { + "cell_type": "markdown", + "id": "ae046dc4", + "metadata": {}, + "source": [ + "<a id='toc9_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "For a more in-depth introduction to using the ValidMind Library for development, check out our introductory development series and the accompanying interactive training:\n", + "\n", + "- **[ValidMind for development](https://docs.validmind.ai/developer/validmind-library.html#development)**\n", + "- **[Developer Fundamentals](https://docs.validmind.ai/training/developer-fundamentals/developer-fundamentals-register.html)**\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "id": "4ce38015", + "metadata": {}, + "source": [ + "<a id='toc10__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "35955b6b", + "metadata": {}, + "outputs": [], + "source": [ + "%pip show validmind" + ] + }, + { + "cell_type": "markdown", + "id": "f865e64e", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "65b36aa7", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "id": "copyright-bd87da591b88473997979690dbffcfa5", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "name": "python", + "version": "3.12.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/site/notebooks/quickstart/quickstart_validation.ipynb b/site/notebooks/quickstart/quickstart_validation.ipynb index bf94d3f927..4d871e1226 100644 --- a/site/notebooks/quickstart/quickstart_validation.ipynb +++ b/site/notebooks/quickstart/quickstart_validation.ipynb @@ -1,1238 +1,1248 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "1a88a895", - "metadata": {}, - "source": [ - "# Quickstart for validation\n", - "\n", - "Learn the basics of using ValidMind to validate records as part of a validation workflow. Set up the ValidMind Library in your environment, and generate a draft of a validation report using ValidMind tests for a binary classification model.\n", - "\n", - "To validate our model with the ValidMind Library, we'll:\n", - "\n", - "1. Import a sample dataset and preprocess it, then split the datasets and initialize them for use with ValidMind\n", - "2. Independently verify data quality tests performed on datasets by model development\n", - "3. Import a champion model for evaluation\n", - "4. 
Run model evaluation tests with the ValidMind Library, which will send the results of those tests to the ValidMind Platform" - ] - }, - { - "cell_type": "markdown", - "id": "0493b0cb", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [Introduction](#toc1__) \n", - "- [About ValidMind](#toc2__) \n", - " - [Before you begin](#toc2_1__) \n", - " - [New to ValidMind?](#toc2_2__) \n", - " - [Key concepts](#toc2_3__) \n", - "- [Setting up](#toc3__) \n", - " - [Register a sample model](#toc3_1__) \n", - " - [Assign validator credentials](#toc3_1_1__) \n", - " - [Apply validation report template](#toc3_1_2__) \n", - " - [Install the ValidMind Library](#toc3_2__) \n", - " - [Initialize the ValidMind Library](#toc3_3__) \n", - " - [Get your code snippet](#toc3_3_1__) \n", - " - [Initialize the Python environment](#toc3_4__) \n", - "- [Getting to know ValidMind](#toc4__) \n", - " - [Preview the validation report template](#toc4_1__) \n", - " - [View validation report in the ValidMind Platform](#toc4_2__) \n", - "- [Working with ValidMind datasets](#toc5__) \n", - " - [Prepare the sample dataset](#toc5_1__) \n", - " - [Load the sample dataset](#toc5_1_1__) \n", - " - [Preprocess the raw dataset](#toc5_1_2__) \n", - " - [Split the dataset](#toc5_1_3__) \n", - " - [Separate features and targets](#toc5_1_4__) \n", - " - [Initialize the ValidMind datasets](#toc5_2__) \n", - "- [Running data quality tests](#toc6__) \n", - " - [Identify qualitative tests](#toc6_1__) \n", - " - [Run an individual data quality test](#toc6_2__) \n", - " - [Run data comparison tests](#toc6_3__) \n", - "- [Working with ValidMind models](#toc7__) \n", - " - [Import the champion model](#toc7_1__) \n", - " - [Initialize the ValidMind model](#toc7_2__) \n", - " - [Assign predictions](#toc7_3__) \n", - "- [Running model evaluation tests](#toc8__) \n", - " - [Run model performance tests](#toc8_1__) \n", - " - [Run diagnostic tests](#toc8_2__) \n", - " - 
[Run feature importance tests](#toc8_3__) \n", - "- [In summary](#toc9__) \n", - "- [Next steps](#toc10__) \n", - " - [Work with your validation report](#toc10_1__) \n", - " - [Discover more learning resources](#toc10_2__) \n", - "- [Upgrade ValidMind](#toc11__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "717d2a16", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## Introduction\n", - "\n", - "Validation aims to independently assess the compliance of *champions* created by developers with regulatory guidance by conducting thorough testing and analysis, potentially including the use of challengers to benchmark performance. Assessments, presented in the form of a validation report, typically include *artifacts (findings)* and recommendations to address those issues.\n", - "\n", - "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", - "\n", - "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", - "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." 
- ] - }, - { - "cell_type": "markdown", - "id": "369d00db", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate comparison and other validation tests, and then use the ValidMind Platform to submit compliance assessments of champions via comprehensive validation reports. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and developers." - ] - }, - { - "cell_type": "markdown", - "id": "72800fc2", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "e2beb1bb", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about validating records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "78c8388c", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Validation report**: A comprehensive and structured assessment of a model’s development and performance, focusing on verifying its integrity, appropriateness, and alignment with its intended use. It includes analyses of model assumptions, data quality, performance metrics, outcomes of testing procedures, and risk considerations. The validation report supports transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", - "\n", - "**Validation report template**: Serves as a standardized framework for conducting and documenting model validation activities. It outlines the required sections, recommended analyses, and expected validation tests, ensuring consistency and completeness across validation reports. The template helps guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. 
Tests are the building blocks of ValidMind, used to evaluate and document models and datasets.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
- ] - }, - { - "cell_type": "markdown", - "id": "ec7b4755", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "97d44f44", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Register a sample model\n", - "\n", - "In a usual lifecycle, a champion will have been independently registered in your inventory and submitted to you for validation by your development team as part of the effective challenge process. (**Learn more:** [Submit documents](https://docs.validmind.ai/guide/documentation/submit-documents.html))\n", - "\n", - "For this notebook, we'll have you register a dummy record (model) in the ValidMind Platform inventory and assign yourself as the validator to familiarize you with the ValidMind interface and circumvent the need for an existing model:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down — don’t worry, we’ll adjust these permissions next for validation.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "fc3e48e1", - "metadata": {}, - "source": [ - "<a id='toc3_1_1__'></a>\n", - "\n", - "#### Assign validator credentials\n", - "\n", - "In order to log tests as a validator instead of as a developer, on the details page that appears after you've successfully registered your sample model:\n", - "\n", - "1. 
Remove yourself as an owner:\n", - "\n", - " - Click on the **OWNERS** tile.\n", - " - Click the **x** next to your name to remove yourself from that model's role.\n", - " - Click **Save** to apply your changes to that role.\n", - "\n", - "2. Remove yourself as a developer:\n", - "\n", - " - Click on the **DEVELOPERS** tile.\n", - " - Click the **x** next to your name to remove yourself from that model's role.\n", - " - Click **Save** to apply your changes to that role.\n", - "\n", - "3. Add yourself as a validator:\n", - "\n", - " - Click on the **VALIDATORS** tile.\n", - " - Select your name from the drop-down menu.\n", - " - Click **Save** to apply your changes to that role." - ] - }, - { - "cell_type": "markdown", - "id": "428260e0", - "metadata": {}, - "source": [ - "<a id='toc3_1_2__'></a>\n", - "\n", - "#### Apply validation report template\n", - "\n", - "Next, let's select a validation report template. A template predefines sections for your report and provides a general outline to follow, making the validation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Validation**.\n", - "\n", - " If you cannot locate your Validation document, make sure Validation type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Generic Validation Report`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "id": "7b16c381", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "64eb485c", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "bf77550e", - "metadata": {}, - "source": [ - "<a id='toc3_3__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "ae918c6c", - "metadata": {}, - "source": [ - "<a id='toc3_3_1__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Validation` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9c6ce354", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"validation-report\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "f9bc73e9", - "metadata": {}, - "source": [ - "<a id='toc3_4__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Then, let's import the necessary libraries and set up your Python environment for data analysis by enabling **`matplotlib`**, a plotting library used for visualizing data.\n", - "\n", - "This ensures that any plots you generate will render inline in our notebook output rather than opening in a separate window:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1e53065d", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "id": "e0e942dd", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Getting to know ValidMind" - ] - }, - { - "cell_type": "markdown", - "id": "0361d8bf", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Preview the validation report template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for validation. 
A template predefines sections for your validation report and provides a general outline to follow, making the validation process much easier.\n", - "\n", - "You will attach evidence to this template in the form of risk assessment notes, artifacts, and test results later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "be445598", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "4124c3d7", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### View validation report in the ValidMind Platform\n", - "\n", - "Next, let's head to the ValidMind Platform to see the template in action:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this notebook.\n", - "\n", - "3. 
Click **Validation** under Documents for your model and note:\n", - "\n", - " - [x] The risk assessment compliance summary at the top of the report (screenshot below)\n", - " - [x] How the structure of the validation report reflects the previewed template\n", - "\n", - " <img src= \"../tutorials/validation/compliance-summary.png\" alt=\"Screenshot showing the risk assessment compliance summary\" style=\"border: 2px solid #083E44; border-radius: 8px; border-right-width: 2px; border-bottom-width: 3px;\">\n", - " <br><br>" - ] - }, - { - "cell_type": "markdown", - "id": "767ea445", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Working with ValidMind datasets" - ] - }, - { - "cell_type": "markdown", - "id": "ae3f832d", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Prepare the sample dataset" - ] - }, - { - "cell_type": "markdown", - "id": "f91775e8", - "metadata": {}, - "source": [ - "<a id='toc5_1_1__'></a>\n", - "\n", - "#### Load the sample dataset\n", - "\n", - "First, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle, which was used to develop the dummy champion.\n", - "\n", - "We'll use this dataset to review steps that should have been conducted during the initial development and documentation of the champion to ensure that the model was built correctly. By independently performing steps taken by the development team, we can confirm whether the model was built using appropriate and properly processed data.\n", - "\n", - "In our below example, note that:\n", - "\n", - "- The target column, `Exited` has a value of `1` when a customer has churned and `0` otherwise.\n", - "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. 
A pandas DataFrame is a two-dimensional tabular data structure that makes use of rows and columns." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "73076ee3", - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.classification import customer_churn\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", - ")\n", - "\n", - "raw_df = customer_churn.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "id": "6ab7fd19", - "metadata": {}, - "source": [ - "<a id='toc5_1_2__'></a>\n", - "\n", - "#### Preprocess the raw dataset\n", - "\n", - "Let's say that thanks to the documentation submitted by the development team (**Learn more:** [Quickstart for documentation](quickstart_documentation.ipynb)), we know that the sample dataset was first preprocessed before being used to train the champion.\n", - "\n", - "During validation, we use the same data processing logic and training procedure to confirm that the champion's results can be reproduced independently, so let's also start by preprocessing our imported dataset to verify that preprocessing was done correctly. This involves splitting the data and separating the features (inputs) from the targets (outputs)." - ] - }, - { - "cell_type": "markdown", - "id": "af660bf4", - "metadata": {}, - "source": [ - "<a id='toc5_1_3__'></a>\n", - "\n", - "#### Split the dataset\n", - "\n", - "Splitting our dataset helps assess how well the model generalizes to unseen data.\n", - "\n", - "Use [`preprocess()`](https://docs.validmind.ai/validmind/validmind/datasets/classification/customer_churn.html#preprocess) to split our dataset into three subsets:\n", - "\n", - "1. **train_df** — Used to train the model.\n", - "2. **validation_df** — Used to evaluate the model's performance during training.\n", - "3. 
**test_df** — Used later on to assess the model's performance on new, unseen data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ee8cfaaf", - "metadata": {}, - "outputs": [], - "source": [ - "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)" - ] - }, - { - "cell_type": "markdown", - "id": "125a39e6", - "metadata": {}, - "source": [ - "<a id='toc5_1_4__'></a>\n", - "\n", - "#### Separate features and targets\n", - "\n", - "To train the model, we need to provide it with:\n", - "\n", - "1. **Inputs** — Features such as customer age, usage, etc.\n", - "2. **Outputs (Expected answers/labels)** — in our case, we would like to know whether the customer churned or not.\n", - "\n", - "Here, we'll use `x_train` to hold the input features, and `y_train` to hold the target variable — the values we want the model to predict:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6fe65be5", - "metadata": {}, - "outputs": [], - "source": [ - "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", - "y_train = train_df[customer_churn.target_column]" - ] - }, - { - "cell_type": "markdown", - "id": "b6674505", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests with your preprocessed datasets, you must first initialize a ValidMind `Dataset` object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module. 
**This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", - "\n", - "For this example, we'll pass in the following arguments:\n", - "\n", - "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", - "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", - "- **`class_labels`** — An optional value to map predicted classes to class labels." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ba677dd7", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the raw dataset\n", - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=raw_df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=customer_churn.target_column,\n", - " class_labels=customer_churn.class_labels,\n", - ")\n", - "\n", - "# Initialize the training dataset\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "\n", - "# Initialize the validation dataset\n", - "vm_validation_ds = vm.init_dataset(\n", - " dataset=validation_df,\n", - " input_id=\"validation_dataset\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "\n", - "# Initialize the testing dataset\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"test_dataset\",\n", - " target_column=customer_churn.target_column\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "c53c6d35", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Running data quality tests\n", - "\n", - "With everything ready to go, let's explore some of ValidMind's available tests to help us assess the 
quality of our datasets. Using ValidMind’s repository of tests streamlines your validation testing, and helps you ensure that your records are being validated appropriately." - ] - }, - { - "cell_type": "markdown", - "id": "b6acd486", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Identify qualitative tests\n", - "\n", - "We want to narrow down the tests we want to run from the selection provided by ValidMind, so we'll use the [`vm.tests.list_tasks_and_tags()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks_and_tags) to list which `tags` are associated with each `task` type:\n", - "\n", - "- **`tasks`** represent the kind of modeling task associated with a test. Here we'll focus on `classification` tasks.\n", - "- **`tags`** are free-form descriptions providing more details about the test, for example, what category the test falls into. Here we'll focus on the `data_quality` tag." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "85bc2f85", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tasks_and_tags()" - ] - }, - { - "cell_type": "markdown", - "id": "9881e58a", - "metadata": {}, - "source": [ - "Then we'll call [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to list all the data quality tests for classification:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "31b31a51", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(\n", - " tags=[\"data_quality\"], task=\"classification\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "d3e27375", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Run an individual data quality test\n", - "\n", - "Next, we'll use our previously initialized raw dataset (`vm_raw_dataset`) as input to run an individual test, then log the result to the ValidMind Platform.\n", - "\n", - "- You run validation 
tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module.\n", - "- Every test result returned by the `run_test()` function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform.\n", - "\n", - "Here, we'll use the `ClassImbalance` test as an example:\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dcb9b017", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " test_id=\"validmind.data_validation.ClassImbalance\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "f6b7567b", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Note the output returned indicating that a test-driven block doesn't currently exist in your documentation for some test IDs. </b></span>\n", - "<br></br>\n", - "That's expected: when we run validation tests, the logged results need to be manually added to your report as part of your compliance assessment process within the ValidMind Platform. You'll continue to see this message throughout this notebook as we run and log more tests.</div>" - ] - }, - { - "cell_type": "markdown", - "id": "97286c0e", - "metadata": {}, - "source": [ - "<a id='toc6_3__'></a>\n", - "\n", - "### Run data comparison tests\n", - "<span id=\"data-comparison\">\n", - "\n", - "We can also use ValidMind to perform comparison tests between our datasets, again logging the results to the ValidMind Platform. 
Below, we'll perform two sets of comparison tests with a mix of our datasets and the same class imbalance test:\n", - "\n", - "- When running individual tests, you can use a custom **`result_id`** to tag the individual result with a unique identifier, appended to the `test_id` with a `:` separator.\n", - "- We can specify all the tests we'd like to run in a dictionary called `test_config`, and we'll pass in an **`input_grid`** of individual test inputs to compare. In this case, we'll input our two datasets for comparison. Note here that the `input_grid` expects the `input_id` of the dataset as the value rather than the variable name we specified." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d53edde7", - "metadata": {}, - "outputs": [], - "source": [ - "# Individual test config with inputs specified\n", - "test_config = {\n", - " # Comparison between training and validation datasets to check if class balance is the same in both sets\n", - " \"validmind.data_validation.ClassImbalance:train_vs_validation\": {\n", - " \"input_grid\": {\"dataset\": [\"train_dataset\", \"validation_dataset\"]}\n", - " },\n", - " # Comparison between training and testing datasets to confirm that both sets have similar class distributions\n", - " \"validmind.data_validation.ClassImbalance:train_vs_test\": {\n", - " \"input_grid\": {\"dataset\": [\"train_dataset\", \"test_dataset\"]},\n", - " },\n", - "}" - ] - }, - { - "cell_type": "markdown", - "id": "1f1b796b", - "metadata": {}, - "source": [ - "Then batch run and log our tests in `test_config`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1b97e404", - "metadata": {}, - "outputs": [], - "source": [ - "for t in test_config:\n", - " print(t)\n", - " try:\n", - " # Check if test has input_grid\n", - " if 'input_grid' in test_config[t]:\n", - " # For tests with input_grid, pass the input_grid configuration\n", - " if 'params' in test_config[t]:\n", - " vm.tests.run_test(t, 
input_grid=test_config[t]['input_grid'], params=test_config[t]['params']).log()\n", - " else:\n", - " vm.tests.run_test(t, input_grid=test_config[t]['input_grid']).log()\n", - " else:\n", - " # Original logic for regular inputs\n", - " if 'params' in test_config[t]:\n", - " vm.tests.run_test(t, inputs=test_config[t]['inputs'], params=test_config[t]['params']).log()\n", - " else:\n", - " vm.tests.run_test(t, inputs=test_config[t]['inputs']).log()\n", - " except Exception as e:\n", - " print(f\"Error running test {t}: {str(e)}\")" - ] - }, - { - "cell_type": "markdown", - "id": "1ca8c343", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Working with ValidMind models" - ] - }, - { - "cell_type": "markdown", - "id": "1fd05953", - "metadata": {}, - "source": [ - "<a id='toc7_1__'></a>\n", - "\n", - "### Import the champion model\n", - "\n", - "With our raw dataset preprocessed, let's go ahead and import the champion submitted by the development team in the format of a `.pkl` file: **[xgboost_model_champion.pkl](xgboost_model_champion.pkl)**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7f18188e", - "metadata": {}, - "outputs": [], - "source": [ - "# Import the champion model\n", - "import joblib\n", - "\n", - "xgboost = joblib.load(\"xgboost_model_champion.pkl\")" - ] - }, - { - "cell_type": "markdown", - "id": "ee26b0b6", - "metadata": {}, - "source": [ - "<a id='toc7_2__'></a>\n", - "\n", - "### Initialize the ValidMind model\n", - "\n", - "In addition to the initialized datasets, you'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our champion.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and 
more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0a799cf2", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the champion XGBoost model\n", - "vm_xgboost = vm.init_model(\n", - " xgboost,\n", - " input_id=\"xgboost_champion\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "823e49c5", - "metadata": {}, - "source": [ - "<a id='toc7_3__'></a>\n", - "\n", - "### Assign predictions\n", - "\n", - "Once the model has been registered, you can assign model predictions to the training and testing datasets.\n", - "\n", - "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", - "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets.\n", - "\n", - "If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "71dd8e7b", - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_xgboost,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_xgboost,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "2e29df90", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Running model evaluation tests\n", - "\n", - "With our setup complete, let's run the rest of our validation tests. 
Since we have already verified the data quality of the dataset used to train our champion, we will now focus on evaluating the model's performance." - ] - }, - { - "cell_type": "markdown", - "id": "fc6af0e0", - "metadata": {}, - "source": [ - "<a id='toc8_1__'></a>\n", - "\n", - "### Run model performance tests\n", - "\n", - "First, let's run some performance tests. Use [`vm.tests.list_tests()`](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to identify all the model performance tests for classification:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "202792e8", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(tags=[\"model_performance\"], task=\"classification\")" - ] - }, - { - "cell_type": "markdown", - "id": "011b7c09", - "metadata": {}, - "source": [ - "We'll isolate the specific tests we want to run in `mpt`, and append an identifier for our champion model here to the `result_id` with a `:` separator like we did above in another test:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9fc18843", - "metadata": {}, - "outputs": [], - "source": [ - "mpt = [\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:xgboost_champion\",\n", - " \"validmind.model_validation.sklearn.ConfusionMatrix:xgboost_champion\",\n", - " \"validmind.model_validation.sklearn.ROCCurve:xgboost_champion\"\n", - "]" - ] - }, - { - "cell_type": "markdown", - "id": "52096118", - "metadata": {}, - "source": [ - "Now, let's run and log our batch of model performance tests using our testing dataset (`vm_test_ds`) for our champion model:\n", - "\n", - "- The test set serves as a proxy for real-world data, providing an unbiased estimate of model performance since it was not used during training or tuning.\n", - "- The test set also acts as protection against selection bias and model tweaking, giving a final, more unbiased checkpoint." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6866b21c", - "metadata": {}, - "outputs": [], - "source": [ - "for test in mpt:\n", - " vm.tests.run_test(\n", - " test,\n", - " inputs={\n", - " \"dataset\": vm_test_ds, \"model\" : vm_xgboost,\n", - " },\n", - " ).log()" - ] - }, - { - "cell_type": "markdown", - "id": "842707f9", - "metadata": {}, - "source": [ - "<a id='toc8_2__'></a>\n", - "\n", - "### Run diagnostic tests\n", - "\n", - "Next, we want to inspect the robustness and stability of our champion. Use `list_tests()` to list all available diagnosis tests applicable to classification tasks:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c9b3caa4", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(tags=[\"model_diagnosis\"], task=\"classification\")" - ] - }, - { - "cell_type": "markdown", - "id": "5295d37b", - "metadata": {}, - "source": [ - "Let’s now assess the model for potential signs of *overfitting* and identify any sub-segments where performance may be inconsistent.\n", - "\n", - "Overfitting occurs when a model learns the training data too well, capturing not only the true pattern but also noise and random fluctuations, resulting in excellent performance on the training dataset but poor generalization to new, unseen data:\n", - "\n", - "- Since the training dataset (`vm_train_ds`) was used to fit the model, we use this set to establish a baseline performance for how well the model performs on data it has already seen.\n", - "- The testing dataset (`vm_test_ds`) was never seen during training, and here simulates real-world generalization, or how well the model performs on new, unseen data. 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "82f824f2", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " test_id=\"validmind.model_validation.sklearn.OverfitDiagnosis:xgboost_champion\",\n", - " input_grid={\n", - " \"datasets\": [[vm_train_ds,vm_test_ds]],\n", - " \"model\" : [vm_xgboost]\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "88db22ed", - "metadata": {}, - "source": [ - "Let's also conduct *robustness* and *stability* tests.\n", - "\n", - "- Robustness evaluates the model’s ability to maintain consistent performance under varying input conditions.\n", - "- Stability assesses whether the model produces consistent outputs across different data subsets or over time.\n", - "\n", - "Again, we'll use both the training and testing datasets to establish baseline performance and to simulate real-world generalization:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b2676197", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " test_id=\"validmind.model_validation.sklearn.RobustnessDiagnosis:xgboost_champion\",\n", - " input_grid={\n", - " \"datasets\": [[vm_train_ds,vm_test_ds]],\n", - " \"model\" : [vm_xgboost]\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "9226c6ea", - "metadata": {}, - "source": [ - "<a id='toc8_3__'></a>\n", - "\n", - "### Run feature importance tests\n", - "\n", - "We also want to verify the relative influence of different input features on our model's predictions. 
Use `list_tests()` to identify all the feature importance tests for classification and store them in `FI`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9c8c26e6", - "metadata": {}, - "outputs": [], - "source": [ - "# Store the feature importance tests\n", - "FI = vm.tests.list_tests(tags=[\"feature_importance\"], task=\"classification\",pretty=False)\n", - "FI" - ] - }, - { - "cell_type": "markdown", - "id": "d36a3544", - "metadata": {}, - "source": [ - "We'll only use our testing dataset (`vm_test_ds`) here, to provide a realistic, unseen sample that mimic future or production data, as the training dataset has already influenced our model during learning:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "5a49f550", - "metadata": {}, - "outputs": [], - "source": [ - "# Run and log our feature importance tests with the testing dataset\n", - "for test in FI:\n", - " vm.tests.run_test(\n", - " \"\".join((test,':xgboost_champion')),\n", - " inputs={\n", - " \"dataset\": vm_test_ds, \"model\": vm_xgboost\n", - " },\n", - " ).log()" - ] - }, - { - "cell_type": "markdown", - "id": "293bf4ca", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## In summary\n", - "\n", - "In this notebook, you learned how to:\n", - "\n", - "- [x] Register a record (model) within the ValidMind Platform\n", - "- [x] Install and initialize the ValidMind Library\n", - "- [x] Preview the validation report template for your model\n", - "- [x] Import a sample dataset and champion model\n", - "- [x] Initialize ValidMind datasets and model objects\n", - "- [x] Assign model predictions to your ValidMind model objects\n", - "- [x] Identify and run various validation tests\n", - "\n", - "In a usual validation workflow, you would wrap up your validation testing by verifying that all the tests provided by the development team were run and reported accurately, and perhaps even propose a challenger, comparing the performance of the 
challenger with the running champion.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>With ValidMind, you can easily:</b></span>\n", - "<ul>\n", - " <li>Specify all the tests you'd like to independently rerun, just like you did in the step <a href=\"#run-data-comparison-tests\" style=\"color: #DE257E;\">Run data comparision tests</a></li>\n", - " <li>Evaluate the performance of a challenger against the champion, just like you did in the steps under <a href=\"#running-model-evaluation-tests\" style=\"color: #DE257E;\">Running model evaluation tests</a></li>\n", - "</ul>\n", - "</div>" - ] - }, - { - "cell_type": "markdown", - "id": "b7fe1ed3", - "metadata": {}, - "source": [ - "<a id='toc10__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your validation report." - ] - }, - { - "cell_type": "markdown", - "id": "1e30826e", - "metadata": {}, - "source": [ - "<a id='toc10_1__'></a>\n", - "\n", - "### Work with your validation report\n", - "\n", - "Now that you've logged all your test results and verified the work done by the development team, head to the ValidMind Platform to wrap up your validation report:\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you connected to earlier.\n", - "\n", - "2. In the left sidebar that appears for your model, click **Validation** under Documents.\n", - "\n", - "Include your logged test results as evidence, create risk assessment notes, add artifacts, and assess compliance, then submit your report for review when it's ready. 
(**Learn more:** [Preparing validation reports](https://docs.validmind.ai/guide/validation/preparing-validation-reports.html))" - ] - }, - { - "cell_type": "markdown", - "id": "8511e2f8", - "metadata": {}, - "source": [ - "<a id='toc10_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "For a more in-depth introduction to using the ValidMind Library for validation, check out our introductory validation series and the accompanying interactive training:\n", - "\n", - "- **[ValidMind for validation](https://docs.validmind.ai/developer/validmind-library.html#validation)**\n", - "- **[Validator Fundamentals](https://docs.validmind.ai/training/validator-fundamentals/validator-fundamentals-register.html)**\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:q\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "id": "58d2d5da", - "metadata": {}, - "source": [ - "<a id='toc11__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "upgrade-show-c0a446ff-f26f-4ad0-839a-e92927711798", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "7e76ca12", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "6d3e2933", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-2427447e4fe348908b3423e86473bfeb", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "name": "python", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quickstart for validation\n", + "\n", + "Learn the basics of using ValidMind to validate records as part of a validation workflow. Set up the ValidMind Library in your environment, and generate a draft of a validation report using ValidMind tests for a binary classification model.\n", + "\n", + "To validate our model with the ValidMind Library, we'll:\n", + "\n", + "1. Import a sample dataset and preprocess it, then split the datasets and initialize them for use with ValidMind\n", + "2. Independently verify data quality tests performed on datasets by model development\n", + "3. Import a champion model for evaluation\n", + "4. 
Run model evaluation tests with the ValidMind Library, which will send the results of those tests to the ValidMind Platform" + ], + "id": "1a88a895" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [Introduction](#toc1__) \n", + "- [About ValidMind](#toc2__) \n", + " - [Before you begin](#toc2_1__) \n", + " - [New to ValidMind?](#toc2_2__) \n", + " - [Key concepts](#toc2_3__) \n", + "- [Setting up](#toc3__) \n", + " - [Register a sample model](#toc3_1__) \n", + " - [Assign validator credentials](#toc3_1_1__) \n", + " - [Apply validation report template](#toc3_1_2__) \n", + " - [Install the ValidMind Library](#toc3_2__) \n", + " - [Initialize the ValidMind Library](#toc3_3__) \n", + " - [Get your code snippet](#toc3_3_1__) \n", + " - [Initialize the Python environment](#toc3_4__) \n", + "- [Getting to know ValidMind](#toc4__) \n", + " - [Preview the validation report template](#toc4_1__) \n", + " - [View validation report in the ValidMind Platform](#toc4_2__) \n", + "- [Working with ValidMind datasets](#toc5__) \n", + " - [Prepare the sample dataset](#toc5_1__) \n", + " - [Load the sample dataset](#toc5_1_1__) \n", + " - [Preprocess the raw dataset](#toc5_1_2__) \n", + " - [Split the dataset](#toc5_1_3__) \n", + " - [Separate features and targets](#toc5_1_4__) \n", + " - [Initialize the ValidMind datasets](#toc5_2__) \n", + "- [Running data quality tests](#toc6__) \n", + " - [Identify qualitative tests](#toc6_1__) \n", + " - [Run an individual data quality test](#toc6_2__) \n", + " - [Run data comparison tests](#toc6_3__) \n", + "- [Working with ValidMind models](#toc7__) \n", + " - [Import the champion model](#toc7_1__) \n", + " - [Initialize the ValidMind model](#toc7_2__) \n", + " - [Assign predictions](#toc7_3__) \n", + "- [Running model evaluation tests](#toc8__) \n", + " - [Run model performance tests](#toc8_1__) \n", + " - [Run diagnostic tests](#toc8_2__) \n", + " - 
[Run feature importance tests](#toc8_3__) \n", + "- [In summary](#toc9__) \n", + "- [Next steps](#toc10__) \n", + " - [Work with your validation report](#toc10_1__) \n", + " - [Discover more learning resources](#toc10_2__) \n", + "- [Upgrade ValidMind](#toc11__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "0493b0cb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## Introduction\n", + "\n", + "Validation aims to independently assess the compliance of *champions* created by developers with regulatory guidance by conducting thorough testing and analysis, potentially including the use of challengers to benchmark performance. Assessments, presented in the form of a validation report, typically include *artifacts (findings)* and recommendations to address those issues.\n", + "\n", + "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", + "\n", + "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", + "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." 
+ ], + "id": "717d2a16" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate comparison and other validation tests, and then use the ValidMind Platform to submit compliance assessments of champions via comprehensive validation reports. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and developers." + ], + "id": "369d00db" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "72800fc2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about validating records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "e2beb1bb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. 
A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", + "\n", + "**artifacts (findings)**: Observations or issues identified during validation, including any deviations from expected performance or standards. Artifacts are organized by type — default types provided by ValidMind include Validation Issue, Policy Exception, and Limitation. Custom artifact types can be created to track other categories relevant to your organization.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. 
Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ], + "id": "78c8388c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Setting up" + ], + "id": "ec7b4755" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Register a sample model\n", + "\n", + "In a usual lifecycle, a champion will have been independently registered in your inventory and submitted to you for validation by your development team as part of the effective challenge process. (**Learn more:** [Submit documents](https://docs.validmind.ai/guide/documentation/submit-documents.html))\n", + "\n", + "For this notebook, we'll have you register a dummy record (model) in the ValidMind Platform inventory and assign yourself as the validator to familiarize you with the ValidMind interface and circumvent the need for an existing model:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down — don’t worry, we’ll adjust these permissions next for validation.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ], + "id": "97d44f44" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1_1__'></a>\n", + "\n", + "#### Assign validator credentials\n", + "\n", + "In order to log tests as a validator instead of as a developer, on the details page that appears after you've successfully registered your sample model:\n", + "\n", + "1. Remove yourself as an owner:\n", + "\n", + " - Click on the **OWNERS** tile.\n", + " - Click the **x** next to your name to remove yourself from that model's role.\n", + " - Click **Save** to apply your changes to that role.\n", + "\n", + "2. Remove yourself as a developer:\n", + "\n", + " - Click on the **DEVELOPERS** tile.\n", + " - Click the **x** next to your name to remove yourself from that model's role.\n", + " - Click **Save** to apply your changes to that role.\n", + "\n", + "3. Add yourself as a validator:\n", + "\n", + " - Click on the **VALIDATORS** tile.\n", + " - Select your name from the drop-down menu.\n", + " - Click **Save** to apply your changes to that role." + ], + "id": "fc3e48e1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1_2__'></a>\n", + "\n", + "#### Apply validation report template\n", + "\n", + "Next, let's select a validation report template. A template predefines sections for your report and provides a general outline to follow, making the validation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Validation**.\n", + "\n", + " If you cannot locate your Validation document, make sure Validation type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Generic Validation Report`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "428260e0" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ], + "id": "7b16c381" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "64eb485c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "bf77550e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3_1__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the library.\n", + "\n", + "1. 
On the left sidebar that appears for your model, select **Getting Started** and select `Validation` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "ae918c6c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"validation-report\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "9c6ce354" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_4__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Then, let's import the necessary libraries and set up your Python environment for data analysis by enabling **`matplotlib`**, a plotting library used for visualizing data.\n", + "\n", + "This ensures that any plots you generate will render inline in our notebook output rather than opening in a separate window:" + ], + "id": "f9bc73e9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [], + "id": "1e53065d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Getting to know ValidMind" + ], + "id": "e0e942dd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Preview the validation report template\n", + "\n", + "Let's verify that you have connected the ValidMind 
Library to the ValidMind Platform and that the appropriate *template* is selected for validation. A template predefines sections for your validation report and provides a general outline to follow, making the validation process much easier.\n", + "\n", + "You will attach evidence to this template in the form of risk assessment notes, artifacts, and test results later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library:" + ], + "id": "0361d8bf" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "be445598" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### View validation report in the ValidMind Platform\n", + "\n", + "Next, let's head to the ValidMind Platform to see the template in action:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this notebook.\n", + "\n", + "3. 
Click **Validation** under Documents for your model and note:\n", + "\n", + " - [x] The risk assessment compliance summary at the top of the report (screenshot below)\n", + " - [x] How the structure of the validation report reflects the previewed template\n", + "\n", + " <img src= \"../tutorials/validation/compliance-summary.png\" alt=\"Screenshot showing the risk assessment compliance summary\" style=\"border: 2px solid #083E44; border-radius: 8px; border-right-width: 2px; border-bottom-width: 3px;\">\n", + " <br><br>" + ], + "id": "4124c3d7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Working with ValidMind datasets" + ], + "id": "767ea445" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Prepare the sample dataset" + ], + "id": "ae3f832d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1_1__'></a>\n", + "\n", + "#### Load the sample dataset\n", + "\n", + "First, let's import the public [Bank Customer Churn Prediction](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction) dataset from Kaggle, which was used to develop the dummy champion.\n", + "\n", + "We'll use this dataset to review steps that should have been conducted during the initial development and documentation of the champion to ensure that the model was built correctly. By independently performing steps taken by the development team, we can confirm whether the model was built using appropriate and properly processed data.\n", + "\n", + "In our below example, note that:\n", + "\n", + "- The target column, `Exited` has a value of `1` when a customer has churned and `0` otherwise.\n", + "- The ValidMind Library provides a wrapper to automatically load the dataset as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object. 
A Pandas DataFrame is a two-dimensional tabular data structure that makes use of rows and columns." + ], + "id": "f91775e8" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.classification import customer_churn\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{customer_churn.target_column}' \\n\\t• Class labels: {customer_churn.class_labels}\"\n", + ")\n", + "\n", + "raw_df = customer_churn.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [], + "id": "73076ee3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1_2__'></a>\n", + "\n", + "#### Preprocess the raw dataset\n", + "\n", + "Let's say that thanks to the documentation submitted by the development team (**Learn more:** [Quickstart for documentation](quickstart_documentation.ipynb)), we know that the sample dataset was first preprocessed before being used to train the champion.\n", + "\n", + "During validation, we use the same data processing logic and training procedure to confirm that the champion's results can be reproduced independently, so let's also start by preprocessing our imported dataset to verify that preprocessing was done correctly. This involves splitting the data and separating the features (inputs) from the targets (outputs)." + ], + "id": "6ab7fd19" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1_3__'></a>\n", + "\n", + "#### Split the dataset\n", + "\n", + "Splitting our dataset helps assess how well the model generalizes to unseen data.\n", + "\n", + "Use [`preprocess()`](https://docs.validmind.ai/validmind/validmind/datasets/classification/customer_churn.html#preprocess) to split our dataset into three subsets:\n", + "\n", + "1. **train_df** — Used to train the model.\n", + "2. **validation_df** — Used to evaluate the model's performance during training.\n", + "3. 
**test_df** — Used later on to assess the model's performance on new, unseen data." + ], + "id": "af660bf4" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df, validation_df, test_df = customer_churn.preprocess(raw_df)" + ], + "execution_count": null, + "outputs": [], + "id": "ee8cfaaf" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1_4__'></a>\n", + "\n", + "#### Separate features and targets\n", + "\n", + "To train the model, we need to provide it with:\n", + "\n", + "1. **Inputs** — Features such as customer age, usage, etc.\n", + "2. **Outputs (Expected answers/labels)** — In our case, we would like to know whether the customer churned or not.\n", + "\n", + "Here, we'll use `x_train` to hold the input features, and `y_train` to hold the target variable — the values we want the model to predict:" + ], + "id": "125a39e6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "x_train = train_df.drop(customer_churn.target_column, axis=1)\n", + "y_train = train_df[customer_churn.target_column]" + ], + "execution_count": null, + "outputs": [], + "id": "6fe65be5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests with your preprocessed datasets, you must first initialize a ValidMind `Dataset` object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module. 
**This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", + "\n", + "For this example, we'll pass in the following arguments:\n", + "\n", + "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", + "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", + "- **`class_labels`** — An optional value to map predicted classes to class labels." + ], + "id": "b6674505" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the raw dataset\n", + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=raw_df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=customer_churn.target_column,\n", + " class_labels=customer_churn.class_labels,\n", + ")\n", + "\n", + "# Initialize the training dataset\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "\n", + "# Initialize the validation dataset\n", + "vm_validation_ds = vm.init_dataset(\n", + " dataset=validation_df,\n", + " input_id=\"validation_dataset\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "\n", + "# Initialize the testing dataset\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " target_column=customer_churn.target_column\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "ba677dd7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Running data quality tests\n", + "\n", + "With everything ready to go, let's explore some of ValidMind's available tests to help us assess the 
quality of our datasets. Using ValidMind’s repository of tests streamlines your validation testing, and helps you ensure that your records are being validated appropriately." + ], + "id": "c53c6d35" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### Identify qualitative tests\n", + "\n", + "We want to narrow down the tests we want to run from the selection provided by ValidMind, so we'll use the [`vm.tests.list_tasks_and_tags()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks_and_tags) to list which `tags` are associated with each `task` type:\n", + "\n", + "- **`tasks`** represent the kind of modeling task associated with a test. Here we'll focus on `classification` tasks.\n", + "- **`tags`** are free-form descriptions providing more details about the test, for example, what category the test falls into. Here we'll focus on the `data_quality` tag." + ], + "id": "b6acd486" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tasks_and_tags()" + ], + "execution_count": null, + "outputs": [], + "id": "85bc2f85" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then we'll call [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to list all the data quality tests for classification:" + ], + "id": "9881e58a" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(\n", + " tags=[\"data_quality\"], task=\"classification\"\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "31b31a51" + }, + { + "cell_type": "markdown", + "id": "d3e27375", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Run an individual data quality test\n", + "\n", + "Next, we'll use our previously initialized raw dataset (`vm_raw_dataset`) as input to run an individual test, then log the result to the ValidMind Platform.\n", + "\n", + "- 
You run validation tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module.\n", + "- Every test result returned by the `run_test()` function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform.\n", + "\n", + "Here, we'll use the `data_validation.ClassImbalance` test as an example:\n" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " test_id=\"validmind.data_validation.ClassImbalance\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "dcb9b017" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Note the output returned indicating that a test-driven block doesn't currently exist in your documentation for some test IDs. </b></span>\n", + "<br></br>\n", + "That's expected: when we run validation tests, the logged results need to be manually added to your report as part of your compliance assessment process within the ValidMind Platform. You'll continue to see this message throughout this notebook as we run and log more tests.</div>" + ], + "id": "f6b7567b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3__'></a>\n", + "\n", + "### Run data comparison tests\n", + "<span id=\"data-comparison\">\n", + "\n", + "We can also use ValidMind to perform comparison tests between our datasets, again logging the results to the ValidMind Platform. 
Below, we'll perform two sets of comparison tests with a mix of our datasets and the same class imbalance test:\n", + "\n", + "- When running individual tests, you can use a custom **`result_id`** to tag the individual result with a unique identifier, appended to the `test_id` with a `:` separator.\n", + "- We can specify all the tests we'd like to run in a dictionary called `test_config`, and we'll pass in an **`input_grid`** of individual test inputs to compare. In this case, we'll input our two datasets for comparison. Note here that the `input_grid` expects the `input_id` of the dataset as the value rather than the variable name we specified." + ], + "id": "97286c0e" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Individual test config with inputs specified\n", + "test_config = {\n", + " # Comparison between training and validation datasets to check if class balance is the same in both sets\n", + " \"validmind.data_validation.ClassImbalance:train_vs_validation\": {\n", + " \"input_grid\": {\"dataset\": [\"train_dataset\", \"validation_dataset\"]}\n", + " },\n", + " # Comparison between training and testing datasets to confirm that both sets have similar class distributions\n", + " \"validmind.data_validation.ClassImbalance:train_vs_test\": {\n", + " \"input_grid\": {\"dataset\": [\"train_dataset\", \"test_dataset\"]},\n", + " },\n", + "}" + ], + "execution_count": null, + "outputs": [], + "id": "d53edde7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then batch run and log our tests in `test_config`:" + ], + "id": "1f1b796b" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for t in test_config:\n", + " print(t)\n", + " try:\n", + " # Check if test has input_grid\n", + " if 'input_grid' in test_config[t]:\n", + " # For tests with input_grid, pass the input_grid configuration\n", + " if 'params' in test_config[t]:\n", + " vm.tests.run_test(t, input_grid=test_config[t]['input_grid'], 
params=test_config[t]['params']).log()\n", + " else:\n", + " vm.tests.run_test(t, input_grid=test_config[t]['input_grid']).log()\n", + " else:\n", + " # Original logic for regular inputs\n", + " if 'params' in test_config[t]:\n", + " vm.tests.run_test(t, inputs=test_config[t]['inputs'], params=test_config[t]['params']).log()\n", + " else:\n", + " vm.tests.run_test(t, inputs=test_config[t]['inputs']).log()\n", + " except Exception as e:\n", + " print(f\"Error running test {t}: {str(e)}\")" + ], + "execution_count": null, + "outputs": [], + "id": "1b97e404" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Working with ValidMind models" + ], + "id": "1ca8c343" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1__'></a>\n", + "\n", + "### Import the champion model\n", + "\n", + "With our raw dataset preprocessed, let's go ahead and import the champion submitted by the development team in the format of a `.pkl` file: **[xgboost_model_champion.pkl](xgboost_model_champion.pkl)**" + ], + "id": "1fd05953" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the champion model\n", + "import joblib\n", + "\n", + "xgboost = joblib.load(\"xgboost_model_champion.pkl\")" + ], + "execution_count": null, + "outputs": [], + "id": "7f18188e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_2__'></a>\n", + "\n", + "### Initialize the ValidMind model\n", + "\n", + "In addition to the initialized datasets, you'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our champion.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and 
more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ], + "id": "ee26b0b6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the champion XGBoost model\n", + "vm_xgboost = vm.init_model(\n", + " xgboost,\n", + " input_id=\"xgboost_champion\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "0a799cf2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_3__'></a>\n", + "\n", + "### Assign predictions\n", + "\n", + "Once the model has been registered, you can assign model predictions to the training and testing datasets.\n", + "\n", + "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", + "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets.\n", + "\n", + "If no prediction values are passed, the method will compute predictions automatically:" + ], + "id": "823e49c5" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_xgboost,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_xgboost,\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "71dd8e7b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Running model evaluation tests\n", + "\n", + "With our setup complete, let's run the rest of our validation tests. 
Since we have already verified the data quality of the dataset used to train our champion, we will now focus on evaluating the model's performance." + ], + "id": "2e29df90" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_1__'></a>\n", + "\n", + "### Run model performance tests\n", + "\n", + "First, let's run some performance tests. Use [`vm.tests.list_tests()`](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to identify all the model performance tests for classification:" + ], + "id": "fc6af0e0" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(tags=[\"model_performance\"], task=\"classification\")" + ], + "execution_count": null, + "outputs": [], + "id": "202792e8" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'll isolate the specific tests we want to run in `mpt`, and append an identifier for our champion model here to the `result_id` with a `:` separator like we did above in another test:" + ], + "id": "011b7c09" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "mpt = [\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:xgboost_champion\",\n", + " \"validmind.model_validation.sklearn.ConfusionMatrix:xgboost_champion\",\n", + " \"validmind.model_validation.sklearn.ROCCurve:xgboost_champion\"\n", + "]" + ], + "execution_count": null, + "outputs": [], + "id": "9fc18843" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, let's run and log our batch of model performance tests using our testing dataset (`vm_test_ds`) for our champion model:\n", + "\n", + "- The test set serves as a proxy for real-world data, providing an unbiased estimate of model performance since it was not used during training or tuning.\n", + "- The test set also acts as protection against selection bias and model tweaking, giving a final, more unbiased checkpoint." 
+ ], + "id": "52096118" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for test in mpt:\n", + " vm.tests.run_test(\n", + " test,\n", + " inputs={\n", + " \"dataset\": vm_test_ds, \"model\" : vm_xgboost,\n", + " },\n", + " ).log()" + ], + "execution_count": null, + "outputs": [], + "id": "6866b21c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_2__'></a>\n", + "\n", + "### Run diagnostic tests\n", + "\n", + "Next, we want to inspect the robustness and stability of our champion. Use `list_tests()` to list all available diagnosis tests applicable to classification tasks:" + ], + "id": "842707f9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(tags=[\"model_diagnosis\"], task=\"classification\")" + ], + "execution_count": null, + "outputs": [], + "id": "c9b3caa4" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let’s now assess the model for potential signs of *overfitting* and identify any sub-segments where performance may be inconsistent.\n", + "\n", + "Overfitting occurs when a model learns the training data too well, capturing not only the true pattern but also noise and random fluctuations, resulting in excellent performance on the training dataset but poor generalization to new, unseen data:\n", + "\n", + "- Since the training dataset (`vm_train_ds`) was used to fit the model, we use this set to establish a baseline performance for how well the model performs on data it has already seen.\n", + "- The testing dataset (`vm_test_ds`) was never seen during training, and here simulates real-world generalization, or how well the model performs on new, unseen data. 
" + ], + "id": "5295d37b" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " test_id=\"validmind.model_validation.sklearn.OverfitDiagnosis:xgboost_champion\",\n", + " input_grid={\n", + " \"datasets\": [[vm_train_ds,vm_test_ds]],\n", + " \"model\" : [vm_xgboost]\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "82f824f2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's also conduct *robustness* and *stability* tests.\n", + "\n", + "- Robustness evaluates the model’s ability to maintain consistent performance under varying input conditions.\n", + "- Stability assesses whether the model produces consistent outputs across different data subsets or over time.\n", + "\n", + "Again, we'll use both the training and testing datasets to establish baseline performance and to simulate real-world generalization:" + ], + "id": "88db22ed" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " test_id=\"validmind.model_validation.sklearn.RobustnessDiagnosis:xgboost_champion\",\n", + " input_grid={\n", + " \"datasets\": [[vm_train_ds,vm_test_ds]],\n", + " \"model\" : [vm_xgboost]\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "b2676197" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_3__'></a>\n", + "\n", + "### Run feature importance tests\n", + "\n", + "We also want to verify the relative influence of different input features on our model's predictions. 
Use `list_tests()` to identify all the feature importance tests for classification and store them in `FI`:" + ], + "id": "9226c6ea" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Store the feature importance tests\n", + "FI = vm.tests.list_tests(tags=[\"feature_importance\"], task=\"classification\",pretty=False)\n", + "FI" + ], + "execution_count": null, + "outputs": [], + "id": "9c8c26e6" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We'll only use our testing dataset (`vm_test_ds`) here, to provide a realistic, unseen sample that mimics future or production data, as the training dataset has already influenced our model during learning:" + ], + "id": "d36a3544" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Run and log our feature importance tests with the testing dataset\n", + "for test in FI:\n", + " vm.tests.run_test(\n", + " \"\".join((test,':xgboost_champion')),\n", + " inputs={\n", + " \"dataset\": vm_test_ds, \"model\": vm_xgboost\n", + " },\n", + " ).log()" + ], + "execution_count": null, + "outputs": [], + "id": "5a49f550" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## In summary\n", + "\n", + "In this notebook, you learned how to:\n", + "\n", + "- [x] Register a record (model) within the ValidMind Platform\n", + "- [x] Install and initialize the ValidMind Library\n", + "- [x] Preview the validation report template for your model\n", + "- [x] Import a sample dataset and champion model\n", + "- [x] Initialize ValidMind datasets and model objects\n", + "- [x] Assign model predictions to your ValidMind datasets\n", + "- [x] Identify and run various validation tests\n", + "\n", + "In a usual validation workflow, you would wrap up your validation testing by verifying that all the tests provided by the development team were run and reported accurately, and perhaps even propose a challenger, comparing the performance of the 
challenger with the running champion.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>With ValidMind, you can easily:</b></span>\n", + "<ul>\n", + " <li>Specify all the tests you'd like to independently rerun, just like you did in the step <a href=\"#run-data-comparison-tests\" style=\"color: #DE257E;\">Run data comparison tests</a></li>\n", + " <li>Evaluate the performance of a challenger against the champion, just like you did in the steps under <a href=\"#running-model-evaluation-tests\" style=\"color: #DE257E;\">Running model evaluation tests</a></li>\n", + "</ul>\n", + "</div>" + ], + "id": "293bf4ca" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc10__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your validation report." + ], + "id": "b7fe1ed3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc10_1__'></a>\n", + "\n", + "### Work with your validation report\n", + "\n", + "Now that you've logged all your test results and verified the work done by the development team, head to the ValidMind Platform to wrap up your validation report:\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you connected to earlier.\n", + "\n", + "2. In the left sidebar that appears for your model, click **Validation** under Documents.\n", + "\n", + "Include your logged test results as evidence, create risk assessment notes, add artifacts, and assess compliance, then submit your report for review when it's ready. 
(**Learn more:** [Preparing validation reports](https://docs.validmind.ai/guide/validation/preparing-validation-reports.html))" + ], + "id": "1e30826e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc10_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "For a more in-depth introduction to using the ValidMind Library for validation, check out our introductory validation series and the accompanying interactive training:\n", + "\n", + "- **[ValidMind for validation](https://docs.validmind.ai/developer/validmind-library.html#validation)**\n", + "- **[Validator Fundamentals](https://docs.validmind.ai/training/validator-fundamentals/validator-fundamentals-register.html)**\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ], + "id": "8511e2f8" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc11__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "58d2d5da" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "upgrade-show-c0a446ff-f26f-4ad0-839a-e92927711798" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "7e76ca12" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after upgrading the package for the changes to be applied." + ], + "id": "6d3e2933" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-2427447e4fe348908b3423e86473bfeb" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "name": "python", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/site/notebooks/tutorials/development/1-set_up_validmind.ipynb b/site/notebooks/tutorials/development/1-set_up_validmind.ipynb index 0c8316c27d..9ba5431049 100644 --- a/site/notebooks/tutorials/development/1-set_up_validmind.ipynb +++ b/site/notebooks/tutorials/development/1-set_up_validmind.ipynb @@ -1,477 +1,481 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "3bd9bc41", - "metadata": {}, - "source": [ - "# ValidMind for development 1 — Set up the ValidMind Library\n", - "\n", - "Learn how to use ValidMind for your end-to-end documentation process based on common development scenarios with our series of four introductory notebooks. 
This first notebook walks you through the initial setup of the ValidMind Library.\n", - "\n", - "These notebooks use a binary classification model as an example, but the same principles shown here apply to other record (model) types.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn by doing</b></span>\n", - "<br></br>\n", - "Our course tailor-made for developers new to ValidMind combines this series of notebooks with more a more in-depth introduction to the ValidMind Platform — <a href=\"https://docs.validmind.ai/training/developer-fundamentals/developer-fundamentals-register.html\" style=\"color: #DE257E;\"><b>Developer Fundamentals</b></a></div>" - ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# ValidMind for development 1 — Set up the ValidMind Library\n", + "\n", + "Learn how to use ValidMind for your end-to-end documentation process based on common development scenarios with our series of four introductory notebooks. 
This first notebook walks you through the initial setup of the ValidMind Library.\n", + "\n", + "These notebooks use a binary classification model as an example, but the same principles shown here apply to other record (model) types.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn by doing</b></span>\n", + "<br></br>\n", + "Our course tailor-made for developers new to ValidMind combines this series of notebooks with a more in-depth introduction to the ValidMind Platform — <a href=\"https://docs.validmind.ai/training/developer-fundamentals/developer-fundamentals-register.html\" style=\"color: #DE257E;\"><b>Developer Fundamentals</b></a></div>" + ], + "id": "3bd9bc41" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [Introduction](#toc1__) \n", + "- [About ValidMind](#toc2__) \n", + " - [Before you begin](#toc2_1__) \n", + " - [New to ValidMind?](#toc2_2__) \n", + " - [Key concepts](#toc2_3__) \n", + "- [Setting up](#toc3__) \n", + " - [Install the ValidMind Library](#toc3_1__) \n", + " - [Initialize the ValidMind Library](#toc3_2__) \n", + " - [Register sample model](#toc3_2_1__) \n", + " - [Apply documentation template](#toc3_2_2__) \n", + " - [Get your code snippet](#toc3_2_3__) \n", + "- [Getting to know ValidMind](#toc4__) \n", + " - [Preview the documentation template](#toc4_1__) \n", + " - [View documentation in the ValidMind Platform](#toc4_1_1__) \n", + " - [Explore available tests](#toc4_2__) \n", + "- [Upgrade ValidMind](#toc5__) \n", + "- [In summary](#toc6__) \n", + "- [Next steps](#toc7__) \n", + " - [Start the model development process](#toc7_1__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + 
"\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "b4b7c002" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## Introduction\n", + "\n", + "Development aims to produce a fit-for-purpose *champion* by conducting thorough testing and analysis, supporting the capabilities of the champion with evidence in the form of documentation and test results. Documentation should be clear and comprehensive, ideally following a structure or template covering all aspects of compliance with risk regulation.\n", + "\n", + "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", + "\n", + "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", + "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." + ], + "id": "7b7de259" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." + ], + "id": "b68b9958" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "3b520a7e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "9b3108db" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ], + "id": "f97d4266" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Setting up" + ], + "id": "bf5cd6c2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ], + "id": "95bf9e4b" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "827eb6bd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library\n", + "\n", + "The ValidMind Library provides a rich collection of documentation tools and test suites, from documenting descriptions of datasets to validation and testing using a variety of open-source testing frameworks." 
+ ], + "id": "ad74254d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ], + "id": "a48cd34d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." 
+ ], + "id": "8ad7e39a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "3339f683" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "a58d951f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Getting to know ValidMind" + ], + "id": "61a021f3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ], + "id": "852db20d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "819a40bc" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1_1__'></a>\n", + "\n", + "#### View documentation in the ValidMind Platform\n", + "\n", + "Next, let's head to the ValidMind Platform to see the template in action:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this \"ValidMind for development\" series of notebooks.\n", + "\n", + "3. Click **Development** under Documents for your model and note how the structure of the documentation matches our preview above." + ], + "id": "65ed2873" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Explore available tests\n", + "\n", + "Next, let's explore the list of all available tests in the ValidMind Library with [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) — we'll learn how to run tests shortly. 
\n", + "\n", + "You can see that the documentation template for this model has references to some of the **test `ID`s used to run tests listed below:**" + ], + "id": "cdbb94d2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests()" + ], + "execution_count": null, + "outputs": [], + "id": "7ccc7776" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "786f0d9c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "f5d3216d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "d2010ad4" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." 
+ ], + "id": "b637c5c6" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## In summary\n", + "\n", + "In this first notebook, you learned how to:\n", + "\n", + "- [x] Register a record (model) within the ValidMind Platform\n", + "- [x] Install and initialize the ValidMind Library\n", + "- [x] Preview the documentation template for your model\n", + "- [x] Explore the available tests offered by the ValidMind Library" + ], + "id": "dfef8925" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Next steps" + ], + "id": "186bee4f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1__'></a>\n", + "\n", + "### Start the development process\n", + "\n", + "Now that the ValidMind Library is connected to your model in the ValidMind Platform with the correct template applied, we can go ahead and start the development process: **[2 — Start the development process](2-start_development_process.ipynb)**" + ], + "id": "7dbb07a1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-63fcb66be39b42d38ad874a72a66581b" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "name": "python", + "version": "3.10.13" + } }, - { - "cell_type": "markdown", - "id": "b4b7c002", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [Introduction](#toc1__) \n", - "- [About ValidMind](#toc2__) \n", - " - [Before you begin](#toc2_1__) \n", - " - [New to ValidMind?](#toc2_2__) \n", - " - [Key concepts](#toc2_3__) \n", - "- [Setting up](#toc3__) \n", - " - [Install the ValidMind Library](#toc3_1__) \n", - " - [Initialize the ValidMind Library](#toc3_2__) \n", - " - [Register sample model](#toc3_2_1__) \n", - " - [Apply documentation template](#toc3_2_2__) \n", - " - [Get your code snippet](#toc3_2_3__) \n", - "- [Getting to know ValidMind](#toc4__) \n", - " - [Preview the documentation template](#toc4_1__) \n", - " - [View documentation in the ValidMind Platform](#toc4_1_1__) \n", - " - [Explore available tests](#toc4_2__) \n", - "- [Upgrade ValidMind](#toc5__) \n", - "- [In summary](#toc6__) \n", - "- [Next steps](#toc7__) \n", - " - [Start the model development process](#toc7_1__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "7b7de259", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## Introduction\n", - "\n", - "Development aims to produce a fit-for-purpose *champion* by conducting thorough testing and analysis, supporting the capabilities of the champion with evidence in the form of documentation and test results. Documentation should be clear and comprehensive, ideally following a structure or template covering all aspects of compliance with risk regulation.\n", - "\n", - "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", - "\n", - "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", - "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." - ] - }, - { - "cell_type": "markdown", - "id": "b68b9958", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." 
- ] - }, - { - "cell_type": "markdown", - "id": "3b520a7e", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "9b3108db", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "f97d4266", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "id": "bf5cd6c2", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "95bf9e4b", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "827eb6bd", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "ad74254d", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library\n", - "\n", - "The ValidMind Library provides a rich collection of documentation tools and test suites, from documenting descriptions of datasets to validation and testing using a variety of open-source testing frameworks." 
- ] - }, - { - "cell_type": "markdown", - "id": "a48cd34d", - "metadata": {}, - "source": [ - "<a id='toc3_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "8ad7e39a", - "metadata": {}, - "source": [ - "<a id='toc3_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "id": "3339f683", - "metadata": {}, - "source": [ - "<a id='toc3_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a58d951f", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "61a021f3", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Getting to know ValidMind" - ] - }, - { - "cell_type": "markdown", - "id": "852db20d", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "819a40bc", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "65ed2873", - "metadata": {}, - "source": [ - "<a id='toc4_1_1__'></a>\n", - "\n", - "#### View documentation in the ValidMind Platform\n", - "\n", - "Next, let's head to the ValidMind Platform to see the template in action:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this \"ValidMind for development\" series of notebooks.\n", - "\n", - "3. Click **Development** under Documents for your model and note how the structure of the documentation matches our preview above." - ] - }, - { - "cell_type": "markdown", - "id": "cdbb94d2", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Explore available tests\n", - "\n", - "Next, let's explore the list of all available tests in the ValidMind Library with [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) — we'll learn how to run tests shortly. 
\n", - "\n", - "You can see that the documentation template for this model has references to some of the **test `ID`s used to run tests listed below:**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7ccc7776", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests()" - ] - }, - { - "cell_type": "markdown", - "id": "786f0d9c", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f5d3216d", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "d2010ad4", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "b637c5c6", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." 
- ] - }, - { - "cell_type": "markdown", - "id": "dfef8925", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## In summary\n", - "\n", - "In this first notebook, you learned how to:\n", - "\n", - "- [x] Register a record (model) within the ValidMind Platform\n", - "- [x] Install and initialize the ValidMind Library\n", - "- [x] Preview the documentation template for your model\n", - "- [x] Explore the available tests offered by the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "186bee4f", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Next steps" - ] - }, - { - "cell_type": "markdown", - "id": "7dbb07a1", - "metadata": {}, - "source": [ - "<a id='toc7_1__'></a>\n", - "\n", - "### Start the development process\n", - "\n", - "Now that the ValidMind Library is connected to your model in the ValidMind Library with the correct template applied, we can go ahead and start the development process: **[2 — Start the development process](2-start_development_process.ipynb)**" - ] - }, - { - "cell_type": "markdown", - "id": "copyright-63fcb66be39b42d38ad874a72a66581b", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "name": "python", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/site/notebooks/tutorials/development/2-start_development_process.ipynb b/site/notebooks/tutorials/development/2-start_development_process.ipynb index 51fd1724ab..4016e2a97a 100644 --- a/site/notebooks/tutorials/development/2-start_development_process.ipynb +++ b/site/notebooks/tutorials/development/2-start_development_process.ipynb @@ -10,7 +10,7 @@ "\n", "You'll become familiar with the individual tests available in ValidMind, as well as how to run them and change parameters as necessary. 
Using ValidMind's repository of individual tests as building blocks helps you ensure that a record (model) is being built appropriately.\n", "\n", - "**For a full list of out-of-the-box tests and descriptions,** use the interactive [Test sandbox](https://docs.validmind.ai/developer/how-to/test-sandbox.html).\n", + "**For a full list of out-of-the-box tests and descriptions,** use the interactive [ValidMind test sandbox](https://docs.validmind.ai/developer/how-to/test-sandbox.html).\n", "\n", "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn by doing</b></span>\n", "<br></br>\n", @@ -329,7 +329,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The output above shows that the class imbalance test did not pass according to the value we set for `min_percent_threshold`.\n", + "The output above shows that the `validmind.data_validation.ClassImbalance` test did not pass according to the value we set for `min_percent_threshold`.\n", "\n", "To address this issue, we'll re-run the test on some processed data. 
In this case let's apply a very simple rebalancing technique to the dataset:\n" ] diff --git a/site/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb b/site/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb index 6f8d378ceb..feda59a354 100644 --- a/site/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb +++ b/site/notebooks/tutorials/validation/1-set_up_validmind_for_validation.ipynb @@ -1,523 +1,533 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "821a881e", - "metadata": {}, - "source": [ - "# ValidMind for validation 1 — Set up the ValidMind Library for validation\n", - "\n", - "Learn how to use ValidMind for your end-to-end validation process based on common scenarios with our series of four introductory notebooks. In this first notebook, set up the ValidMind Library in preparation for validating a champion.\n", - "\n", - "These notebooks use a binary classification model as an example, but the same principles shown here apply to other record (model) types.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn by doing</b></span>\n", - "<br></br>\n", - "Our course tailor-made for validators new to ValidMind combines this series of notebooks with more a more in-depth introduction to the ValidMind Platform — <a href=\"https://docs.validmind.ai/training/validator-fundamentals/validator-fundamentals-register.html\" style=\"color: #DE257E;\"><b>Validator Fundamentals</b></a></div>" - ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# ValidMind for validation 1 — Set up the ValidMind Library for validation\n", + "\n", + "Learn how to use ValidMind for your end-to-end validation process based on common scenarios with our series of four introductory 
notebooks. In this first notebook, set up the ValidMind Library in preparation for validating a champion.\n", + "\n", + "These notebooks use a binary classification model as an example, but the same principles shown here apply to other record (model) types.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn by doing</b></span>\n", + "<br></br>\n", + "Our course tailor-made for validators new to ValidMind combines this series of notebooks with a more in-depth introduction to the ValidMind Platform — <a href=\"https://docs.validmind.ai/training/validator-fundamentals/validator-fundamentals-register.html\" style=\"color: #DE257E;\"><b>Validator Fundamentals</b></a></div>" + ], + "id": "821a881e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [Introduction](#toc1__) \n", + "- [About ValidMind](#toc2__) \n", + " - [Before you begin](#toc2_1__) \n", + " - [New to ValidMind?](#toc2_2__) \n", + " - [Key concepts](#toc2_3__) \n", + "- [Setting up](#toc3__) \n", + " - [Register a sample model](#toc3_1__) \n", + " - [Assign validator credentials](#toc3_1_1__) \n", + " - [Apply documentation template](#toc3_1_2__) \n", + " - [Apply validation report template](#toc3_1_3__) \n", + " - [Install the ValidMind Library](#toc3_2__) \n", + " - [Initialize the ValidMind Library](#toc3_3__) \n", + " - [Get your code snippet](#toc3_3_1__) \n", + "- [Getting to know ValidMind](#toc4__) \n", + " - [Preview the validation report template](#toc4_1__) \n", + " - [View validation report in the ValidMind Platform](#toc4_1_1__) \n", + " - [Explore available tests](#toc4_2__) \n", + "- [Upgrade ValidMind](#toc5__) \n", + "- [In summary](#toc6__) \n", + "- [Next steps](#toc7__) \n", + " 
- [Start the validation process](#toc7_1__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "19ea797c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## Introduction\n", + "\n", + "Validation aims to independently assess the compliance of *champions* created by developers with regulatory guidance by conducting thorough testing and analysis, potentially including the use of challengers to benchmark performance. Assessments, presented in the form of a validation report, typically include *artifacts (findings)* and recommendations to address those issues.\n", + "\n", + "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", + "\n", + "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", + "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." 
+ ], + "id": "d624f88d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate comparison and other validation tests, and then use the ValidMind Platform to submit compliance assessments of champions via comprehensive validation reports. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and developers." + ], + "id": "4fb1ef5a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." + ], + "id": "594f9fd4" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "262ed111" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. 
A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", + "\n", + "**artifacts (findings)**: Observations or issues identified during validation, including any deviations from expected performance or standards. Artifacts are organized by type — default types provided by ValidMind include Validation Issue, Policy Exception, and Limitation. Custom artifact types can be created to track other categories relevant to your organization.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. 
Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ], + "id": "0eb67fe9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Setting up" + ], + "id": "e0e1cf3d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Register a sample model\n", + "\n", + "In a usual lifecycle, a champion will have been independently registered in your inventory and submitted to you for validation by your development team as part of the effective challenge process. (**Learn more:** [Submit documents](https://docs.validmind.ai/guide/documentation/submit-documents.html))\n", + "\n", + "For this notebook, we'll have you register a dummy record (model) in the ValidMind Platform inventory and assign yourself as the validator to familiarize you with the ValidMind interface and circumvent the need for an existing model:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down — don’t worry, we’ll adjust these permissions next for validation.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ], + "id": "609fe59b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1_1__'></a>\n", + "\n", + "#### Assign validator credentials\n", + "\n", + "In order to log tests as a validator instead of as a developer, on the details page that appears after you've successfully registered your sample model:\n", + "\n", + "1. Remove yourself as an owner:\n", + "\n", + " - Click on the **OWNERS** tile.\n", + " - Click the **x** next to your name to remove yourself from that model's role.\n", + " - Click **Save** to apply your changes to that role.\n", + "\n", + "2. Remove yourself as a developer:\n", + "\n", + " - Click on the **DEVELOPERS** tile.\n", + " - Click the **x** next to your name to remove yourself from that model's role.\n", + " - Click **Save** to apply your changes to that role.\n", + "\n", + "3. Add yourself as a validator:\n", + "\n", + " - Click on the **VALIDATORS** tile.\n", + " - Select your name from the drop-down menu.\n", + " - Click **Save** to apply your changes to that role." + ], + "id": "58e552bb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier for developers.\n", + "\n", + "We'll need this documentation template later for reference as we draft our validation report:\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Documentation**.\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "84251589" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1_3__'></a>\n", + "\n", + "#### Apply validation report template\n", + "\n", + "Next, let's select a validation report template. A template predefines sections for your report and provides a general outline to follow, making the validation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Validation**.\n", + "\n", + " If you cannot locate your Validation document, make sure Validation type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Generic Validation Report`.\n", + "\n", + "3. Click **Use Template** to apply the template." 
+ ], + "id": "fdfb5dc5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ], + "id": "f656d0d6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "931d8f7f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "1435fd5b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3_1__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Validation` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "b375b341" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"validation-report\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "d5d87e2d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Getting to know ValidMind" + ], + "id": "331e1c07" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Preview the validation report template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will attach evidence to this template in the form of risk assessment notes, artifacts, and test results later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library:" + ], + "id": "f6331a98" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "13d34bbb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1_1__'></a>\n", + "\n", + "#### View validation report in the ValidMind Platform\n", + "\n", + "Next, let's head to the ValidMind Platform to see the template in action:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this \"ValidMind for validation\" series of notebooks.\n", + "\n", + "3. Click **Validation** under Documents for your model and note:\n", + "\n", + " - [x] The risk assessment compliance summary at the top of the report (screenshot below)\n", + " - [x] How the structure of the validation report reflects the previewed template\n", + "\n", + " <img src= \"compliance-summary.png\" alt=\"Screenshot showing the risk assessment compliance summary\" style=\"border: 2px solid #083E44; border-radius: 8px; border-right-width: 2px; border-bottom-width: 3px;\">\n", + " <br><br>" + ], + "id": "20717133" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Explore available tests\n", + "\n", + "Next, let's explore the list of all available tests in the ValidMind Library with [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) — we'll later narrow down the tests we want to run from this list when we learn to run tests." 
+ ], + "id": "f5d0aaab" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests()" + ], + "execution_count": null, + "outputs": [], + "id": "de6abc2a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "dce47e40" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "10272aa9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "7a0c3cc2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." 
+ ], + "id": "2dac11d5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## In summary\n", + "\n", + "In this first notebook, you learned how to:\n", + "\n", + "- [x] Register a record (model) within the ValidMind Platform and assign yourself as the validator\n", + "- [x] Install and initialize the ValidMind Library\n", + "- [x] Preview the validation report template for your model\n", + "- [x] Explore the available tests offered by the ValidMind Library" + ], + "id": "174d2c8d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "<a id='toc7_1__'></a>\n", + "\n", + "### Start the validation process\n", + "\n", + "Now that the ValidMind Library is connected to your model in the ValidMind Platform with the correct template applied, we can go ahead and start the validation process: **[2 — Start the validation process](2-start_validation_process.ipynb)**" + ], + "id": "d8ffdcf7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-5d7a1c159e4840fca79011d1c0380725" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "name": "python", + "version": "3.10.13" + } }, - { - "cell_type": "markdown", - "id": "19ea797c", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [Introduction](#toc1__) \n", - "- [About ValidMind](#toc2__) \n", - " - [Before you begin](#toc2_1__) \n", - " - [New to ValidMind?](#toc2_2__) \n", - " - [Key concepts](#toc2_3__) \n", - "- [Setting up](#toc3__) \n", - " - [Register a sample model](#toc3_1__) \n", - " - [Assign validator credentials](#toc3_1_1__) \n", - " - [Apply documentation template](#toc3_1_2__) \n", - " - [Apply validation report template](#toc3_1_3__) \n", - " - [Install the ValidMind Library](#toc3_2__) \n", - " - [Initialize the ValidMind Library](#toc3_3__) \n", - " - [Get your code snippet](#toc3_3_1__) \n", - "- [Getting to know ValidMind](#toc4__) \n", - " - [Preview the validation report template](#toc4_1__) \n", - " - [View validation report in the ValidMind Platform](#toc4_1_1__) \n", - " - [Explore available tests](#toc4_2__) \n", - "- [Upgrade ValidMind](#toc5__) \n", - "- [In summary](#toc6__) \n", - "- [Next steps](#toc7__) \n", - " - [Start the validation process](#toc7_1__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "d624f88d", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## Introduction\n", - "\n", - "Validation aims to independently assess the compliance of *champions* created by developers with regulatory guidance by conducting thorough testing and analysis, potentially including the use of challengers to benchmark performance. Assessments, presented in the form of a validation report, typically include *artifacts (findings)* and recommendations to address those issues.\n", - "\n", - "A *binary classification model* is a type of predictive model used in churn analysis to identify customers who are likely to leave a service or subscription by analyzing various behavioral, transactional, and demographic factors.\n", - "\n", - "- This model helps businesses take proactive measures to retain at-risk customers by offering personalized incentives, improving customer service, or adjusting pricing strategies.\n", - "- Effective validation of a churn prediction model ensures that businesses can accurately identify potential churners, optimize retention efforts, and enhance overall customer satisfaction while minimizing revenue loss." - ] - }, - { - "cell_type": "markdown", - "id": "4fb1ef5a", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate comparison and other validation tests, and then use the ValidMind Platform to submit compliance assessments of champions via comprehensive validation reports. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and developers." 
- ] - }, - { - "cell_type": "markdown", - "id": "594f9fd4", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." - ] - }, - { - "cell_type": "markdown", - "id": "262ed111", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "0eb67fe9", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Validation report**: A comprehensive and structured assessment of a model’s development and performance, focusing on verifying its integrity, appropriateness, and alignment with its intended use. It includes analyses of model assumptions, data quality, performance metrics, outcomes of testing procedures, and risk considerations. The validation report supports transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", - "\n", - "**Validation report template**: Serves as a standardized framework for conducting and documenting model validation activities. It outlines the required sections, recommended analyses, and expected validation tests, ensuring consistency and completeness across validation reports. The template helps guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. 
Tests are the building blocks of ValidMind, used to evaluate and document models and datasets.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
- ] - }, - { - "cell_type": "markdown", - "id": "e0e1cf3d", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "609fe59b", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Register a sample model\n", - "\n", - "In a usual lifecycle, a champion will have been independently registered in your inventory and submitted to you for validation by your development team as part of the effective challenge process. (**Learn more:** [Submit documents](https://docs.validmind.ai/guide/documentation/submit-documents.html))\n", - "\n", - "For this notebook, we'll have you register a dummy record (model) in the ValidMind Platform inventory and assign yourself as the validator to familiarize you with the ValidMind interface and circumvent the need for an existing model:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down — don’t worry, we’ll adjust these permissions next for validation.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "58e552bb", - "metadata": {}, - "source": [ - "<a id='toc3_1_1__'></a>\n", - "\n", - "#### Assign validator credentials\n", - "\n", - "In order to log tests as a validator instead of as a developer, on the details page that appears after you've successfully registered your sample model:\n", - "\n", - "1. 
Remove yourself as an owner:\n", - "\n", - " - Click on the **OWNERS** tile.\n", - " - Click the **x** next to your name to remove yourself from that model's role.\n", - " - Click **Save** to apply your changes to that role.\n", - "\n", - "2. Remove yourself as a developer:\n", - "\n", - " - Click on the **DEVELOPERS** tile.\n", - " - Click the **x** next to your name to remove yourself from that model's role.\n", - " - Click **Save** to apply your changes to that role.\n", - "\n", - "3. Add yourself as a validator:\n", - "\n", - " - Click on the **VALIDATORS** tile.\n", - " - Select your name from the drop-down menu.\n", - " - Click **Save** to apply your changes to that role." - ] - }, - { - "cell_type": "markdown", - "id": "84251589", - "metadata": {}, - "source": [ - "<a id='toc3_1_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier for developers.\n", - "\n", - "We'll need this documentation template later for reference as we draft our validation report:\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Documentation**.\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "id": "fdfb5dc5", - "metadata": {}, - "source": [ - "<a id='toc3_1_3__'></a>\n", - "\n", - "#### Apply validation report template\n", - "\n", - "Next, let's select a validation report template. A template predefines sections for your report and provides a general outline to follow, making the validation process much easier.\n", - "\n", - "1. 
In the left sidebar that appears for your model, click **Documents** and select **Validation**.\n", - "\n", - " If you cannot locate your Validation document, make sure Validation type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Generic Validation Report`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "id": "f656d0d6", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "931d8f7f", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "1435fd5b", - "metadata": {}, - "source": [ - "<a id='toc3_3__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "b375b341", - "metadata": {}, - "source": [ - "<a id='toc3_3_1__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. 
On the left sidebar that appears for your model, select **Getting Started** and select `Validation` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d5d87e2d", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"validation-report\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "331e1c07", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Getting to know ValidMind" - ] - }, - { - "cell_type": "markdown", - "id": "f6331a98", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Preview the validation report template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will attach evidence to this template in the form of risk assessment notes, artifacts, and test results later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "13d34bbb", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "20717133", - "metadata": {}, - "source": [ - "<a id='toc4_1_1__'></a>\n", - "\n", - "#### View validation report in the ValidMind Platform\n", - "\n", - "Next, let's head to the ValidMind Platform to see the template in action:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, navigate to **Inventory** and select the model you registered for this \"ValidMind for validation\" series of notebooks.\n", - "\n", - "3. Click **Validation** under Documents for your model and note:\n", - "\n", - " - [x] The risk assessment compliance summary at the top of the report (screenshot below)\n", - " - [x] How the structure of the validation report reflects the previewed template\n", - "\n", - " <img src= \"compliance-summary.png\" alt=\"Screenshot showing the risk assessment compliance summary\" style=\"border: 2px solid #083E44; border-radius: 8px; border-right-width: 2px; border-bottom-width: 3px;\">\n", - " <br><br>" - ] - }, - { - "cell_type": "markdown", - "id": "f5d0aaab", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Explore available tests\n", - "\n", - "Next, let's explore the list of all available tests in the ValidMind Library with [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) — we'll later narrow down the tests we want to run from this list when we learn to run tests." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "de6abc2a", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests()" - ] - }, - { - "cell_type": "markdown", - "id": "dce47e40", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "10272aa9", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "7a0c3cc2", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "2dac11d5", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." 
- ] - }, - { - "cell_type": "markdown", - "id": "174d2c8d", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## In summary\n", - "\n", - "In this first notebook, you learned how to:\n", - "\n", - "- [x] Register a record (model) within the ValidMind Platform and assign yourself as the validator\n", - "- [x] Install and initialize the ValidMind Library\n", - "- [x] Preview the validation report template for your model\n", - "- [x] Explore the available tests offered by the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "d8ffdcf7", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "<a id='toc7_1__'></a>\n", - "\n", - "### Start the validation process\n", - "\n", - "Now that the ValidMind Library is connected to your model in the ValidMind Library with the correct template applied, we can go ahead and start the validation process: **[2 — Start the validation process](2-start_validation_process.ipynb)**" - ] - }, - { - "cell_type": "markdown", - "id": "copyright-5d7a1c159e4840fca79011d1c0380725", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "name": "python", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/site/notebooks/tutorials/validation/2-start_validation_process.ipynb b/site/notebooks/tutorials/validation/2-start_validation_process.ipynb index 9547a1367e..a0d4440e6c 100644 --- a/site/notebooks/tutorials/validation/2-start_validation_process.ipynb +++ b/site/notebooks/tutorials/validation/2-start_validation_process.ipynb @@ -15,7 +15,7 @@ "- Ensuring that data used for training and testing the champion is of appropriate data quality\n", "- Ensuring that the raw data has been preprocessed appropriately and that the resulting final datasets reflects this\n", "\n", - "**For a full list of out-of-the-box tests and descriptions,** use the interactive [Test sandbox](https://docs.validmind.ai/developer/how-to/test-sandbox.html).\n", + "**For a full list of out-of-the-box tests and descriptions,** use the interactive [ValidMind test sandbox](https://docs.validmind.ai/developer/how-to/test-sandbox.html).\n", "\n", "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Learn by doing</b></span>\n", "<br></br>\n", @@ -295,7 +295,7 @@ "\n", "#### Run tabular data tests\n", "\n", - "The inputs expected by a test can also be found in the test definition — let's take 
[`validmind.data_validation.DescriptiveStatistics`](https://docs.validmind.ai/tests/data_validation/DescriptiveStatistics.html) as an example.\n", + "The inputs expected by a test can also be found in the test definition — let's take `validmind.data_validation.DescriptiveStatistics` as an example.\n", "\n", "Note that the output of the [`describe_test()` function](https://docs.validmind.ai/validmind/validmind/tests.html#describe_test) below shows that this test expects a `dataset` as input:" ] @@ -333,7 +333,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The output above shows that [the class imbalance test](https://docs.validmind.ai/tests/data_validation/ClassImbalance.html) did not pass according to the value we set for `min_percent_threshold` — great, this matches what was reported by the development team.\n", + "The output above shows that the `validmind.data_validation.ClassImbalance` test did not pass according to the value we set for `min_percent_threshold` — great, this matches what was reported by the development team.\n", "\n", "To address this issue, we'll re-run the test on some processed data. 
In this case let's apply a very simple rebalancing technique to the dataset:" ] @@ -405,7 +405,7 @@ "\n", "You can utilize the output from a ValidMind test for further use — in this below example, to retrieve the list of features with the highest correlation coefficients and use them to reduce the final list of features for modeling.\n", "\n", - "First, we'll run [`validmind.data_validation.HighPearsonCorrelation`](https://docs.validmind.ai/tests/data_validation/HighPearsonCorrelation.html) with the `balanced_raw_dataset` we initialized previously as input as is for comparison with later runs:" + "First, we'll run `validmind.data_validation.HighPearsonCorrelation` with the `balanced_raw_dataset` we initialized previously as input as is for comparison with later runs:\n" ] }, { diff --git a/site/notebooks/tutorials/validation/3-developing_potential_challenger.ipynb b/site/notebooks/tutorials/validation/3-developing_potential_challenger.ipynb index 3c5db2507d..2ed29a195f 100644 --- a/site/notebooks/tutorials/validation/3-developing_potential_challenger.ipynb +++ b/site/notebooks/tutorials/validation/3-developing_potential_challenger.ipynb @@ -544,11 +544,11 @@ "source": [ "We'll isolate the specific tests we want to run in `mpt`:\n", "\n", - "- [`ClassifierPerformance`](https://docs.validmind.ai/tests/model_validation/sklearn/ClassifierPerformance.html)\n", - "- [`ConfusionMatrix`](https://docs.validmind.ai/tests/model_validation/sklearn/ConfusionMatrix.html)\n", - "- [`MinimumAccuracy`](https://docs.validmind.ai/tests/model_validation/sklearn/MinimumAccuracy.html)\n", - "- [`MinimumF1Score`](https://docs.validmind.ai/tests/model_validation/sklearn/MinimumF1Score.html)\n", - "- [`ROCCurve`](https://docs.validmind.ai/tests/model_validation/sklearn/ROCCurve.html)\n", + "- `model_validation.sklearn.ClassifierPerformance`\n", + "- `model_validation.sklearn.ConfusionMatrix`\n", + "- `model_validation.sklearn.MinimumAccuracy`\n", + "- 
`model_validation.sklearn.MinimumF1Score`\n", + "- `model_validation.sklearn.ROCCurve`\n", "\n", "As we learned in the previous notebook [2 — Start the model validation process](2-start_validation_process.ipynb), you can use a custom `result_id` to tag the individual result with a unique identifier by appending this `result_id` to the `test_id` with a `:` separator. We'll append an identifier for our champion model here:" ] @@ -639,7 +639,7 @@ "\n", "9. Click **Update Linked Artifacts** to insert your validation issue.\n", "\n", - "10. Confirm that validation issue you inserted has been correctly inserted into section 2.2.2. Model Performance of the report.\n", + "10. Confirm that the validation issue you inserted has been correctly inserted into section 2.2.2. Model Performance of the report.\n", "\n", "11. Click on the validation issue to expand the issue, where you can adjust details such as severity, owner, due date, status, etc. as well as include proposed remediation plans or supporting documentation as attachments." 
] @@ -729,7 +729,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Let’s now assess the models for potential signs of *overfitting* and identify any sub-segments where performance may inconsistent with the [`OverfitDiagnosis` test](https://docs.validmind.ai/tests/model_validation/sklearn/OverfitDiagnosis.html).\n", + "Let’s now assess the models for potential signs of *overfitting* and identify any sub-segments where performance may be inconsistent with the `model_validation.sklearn.OverfitDiagnosis` test.\n", "\n", "Overfitting occurs when a model learns the training data too well, capturing not only the true pattern but noise and random fluctuations resulting in excellent performance on the training dataset but poor generalization to new, unseen data:\n", "\n", @@ -756,7 +756,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Let's also conduct *robustness* and *stability* testing of the two models with the [`RobustnessDiagnosis` test](https://docs.validmind.ai/tests/model_validation/sklearn/RobustnessDiagnosis.html). 
Robustness refers to a model's ability to maintain consistent performance, and stability refers to a model's ability to produce consistent outputs over time across different data subsets.\n", + "Let's also conduct *robustness* and *stability* testing of the two models with the `model_validation.sklearn.RobustnessDiagnosis` test.\n", + "\n", + "Robustness refers to a model's ability to maintain consistent performance, and stability refers to a model's ability to produce consistent outputs over time across different data subsets.\n", "\n", "Again, we'll use both the training and testing datasets to establish baseline performance and to simulate real-world generalization:" ] diff --git a/site/notebooks/tutorials/validation/4-finalize_validation_reporting.ipynb b/site/notebooks/tutorials/validation/4-finalize_validation_reporting.ipynb index 768c569b26..32d46c6e2d 100644 --- a/site/notebooks/tutorials/validation/4-finalize_validation_reporting.ipynb +++ b/site/notebooks/tutorials/validation/4-finalize_validation_reporting.ipynb @@ -121,7 +121,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Make sure the ValidMind Library is installed\n", "\n", @@ -143,9 +145,7 @@ " # model=\"...\",\n", " document=\"validation-report\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -160,7 +160,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Load the sample dataset\n", "from validmind.datasets.classification import customer_churn as demo_dataset\n", @@ -170,13 +172,13 @@ ")\n", "\n", "raw_df = demo_dataset.load_data()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Initialize the raw dataset for use in ValidMind tests\n", "vm_raw_dataset = vm.init_dataset(\n", @@ -184,13 +186,13 @@ " input_id=\"raw_dataset\",\n", " 
target_column=\"Exited\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "import pandas as pd\n", "\n", @@ -202,9 +204,7 @@ "\n", "balanced_raw_df = pd.concat([exited_df, not_exited_df])\n", "balanced_raw_df = balanced_raw_df.sample(frac=1, random_state=42)" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -215,7 +215,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Register new data and now 'balanced_raw_dataset' is the new dataset object of interest\n", "vm_balanced_raw_dataset = vm.init_dataset(\n", @@ -223,13 +225,13 @@ " input_id=\"balanced_raw_dataset\",\n", " target_column=\"Exited\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Run HighPearsonCorrelation test with our balanced dataset as input and return a result object\n", "corr_result = vm.tests.run_test(\n", @@ -237,46 +239,46 @@ " params={\"max_threshold\": 0.3},\n", " inputs={\"dataset\": vm_balanced_raw_dataset},\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# From result object, extract table from `corr_result.tables`\n", "features_df = corr_result.tables[0].data\n", "features_df" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Extract list of features that failed the test\n", "high_correlation_features = features_df[features_df[\"Pass/Fail\"] == \"Fail\"][\"Columns\"].tolist()\n", "high_correlation_features" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Extract 
feature names from the list of strings\n", "high_correlation_features = [feature.split(\",\")[0].strip(\"()\") for feature in high_correlation_features]\n", "high_correlation_features" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Remove the highly correlated features from the dataset\n", "balanced_raw_no_age_df = balanced_raw_df.drop(columns=high_correlation_features)\n", @@ -287,13 +289,13 @@ " input_id=\"raw_dataset_preprocessed\",\n", " target_column=\"Exited\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Re-run the test with the reduced feature set\n", "corr_result = vm.tests.run_test(\n", @@ -301,9 +303,7 @@ " params={\"max_threshold\": 0.3},\n", " inputs={\"dataset\": vm_raw_dataset_preprocessed},\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -318,20 +318,22 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Encode categorical features in the dataset\n", "balanced_raw_no_age_df = pd.get_dummies(\n", " balanced_raw_no_age_df, columns=[\"Geography\", \"Gender\"], drop_first=True\n", ")\n", "balanced_raw_no_age_df.head()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "from sklearn.model_selection import train_test_split\n", "\n", @@ -342,13 +344,13 @@ "y_train = train_df[\"Exited\"]\n", "X_test = test_df.drop(\"Exited\", axis=1)\n", "y_test = test_df[\"Exited\"]" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Initialize the split datasets\n", "vm_train_ds = vm.init_dataset(\n", @@ -362,9 +364,7 @@ " dataset=test_df,\n", " 
target_column=\"Exited\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -379,16 +379,16 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Import the champion model\n", "import pickle as pkl\n", "\n", "with open(\"lr_model_champion.pkl\", \"rb\") as f:\n", " log_reg = pkl.load(f)" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -403,7 +403,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Import the Random Forest Classification model\n", "from sklearn.ensemble import RandomForestClassifier\n", @@ -416,9 +418,7 @@ "\n", "# Train the model\n", "rf_model.fit(X_train, y_train)" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -433,7 +433,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Initialize the champion logistic regression model\n", "vm_log_model = vm.init_model(\n", @@ -446,13 +448,13 @@ " rf_model,\n", " input_id=\"rf_model\",\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Assign predictions to Champion — Logistic regression model\n", "vm_train_ds.assign_predictions(model=vm_log_model)\n", @@ -461,9 +463,7 @@ "# Assign predictions to Challenger — Random forest classification model\n", "vm_train_ds.assign_predictions(model=vm_rf_model)\n", "vm_test_ds.assign_predictions(model=vm_rf_model)" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -509,7 +509,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "from sklearn import metrics\n", @@ -523,9 +525,7 @@ " confusion_matrix=confusion_matrix, display_labels=[False, True]\n", ")\n", "cm_display.plot()" 
- ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -544,7 +544,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "@vm.test(\"my_custom_tests.ConfusionMatrix\")\n", "def confusion_matrix(dataset, model):\n", @@ -572,9 +574,7 @@ " plt.close() # close the plot to avoid displaying it\n", "\n", " return cm_display.figure_ # return the figure object itself" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -585,7 +585,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Champion train and test\n", "vm.tests.run_test(\n", @@ -595,13 +597,13 @@ " \"model\" : [vm_log_model]\n", " }\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Challenger train and test\n", "vm.tests.run_test(\n", @@ -611,9 +613,7 @@ " \"model\" : [vm_rf_model]\n", " }\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -637,7 +637,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "@vm.test(\"my_custom_tests.ConfusionMatrix\")\n", "def confusion_matrix(dataset, model, normalize=False):\n", @@ -668,9 +670,7 @@ " plt.close() # close the plot to avoid displaying it\n", "\n", " return cm_display.figure_ # return the figure object itself" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -690,7 +690,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Champion with test dataset and normalize=True\n", "vm.tests.run_test(\n", @@ -701,13 +703,13 @@ " },\n", " params={\"normalize\": True}\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], 
"source": [ "# Challenger with test dataset and normalize=True\n", "vm.tests.run_test(\n", @@ -718,9 +720,7 @@ " },\n", " params={\"normalize\": True}\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -756,7 +756,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "tests_folder = \"my_tests\"\n", "\n", @@ -770,9 +772,7 @@ " # remove files and pycache\n", " if f.endswith(\".py\") or f == \"__pycache__\":\n", " os.system(f\"rm -rf {tests_folder}/{f}\")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -809,16 +809,16 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "confusion_matrix.save(\n", " # Save it to the custom tests folder we created\n", " tests_folder,\n", " imports=[\"import matplotlib.pyplot as plt\", \"from sklearn import metrics\"],\n", ")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -873,7 +873,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "from validmind.tests import LocalTestProvider\n", "\n", @@ -886,9 +888,7 @@ ")\n", "# `my_test_provider.load_test()` will be called for any test ID that starts with `my_test_provider`\n", "# e.g. 
`my_test_provider.ConfusionMatrix` will look for a function named `ConfusionMatrix` in `my_tests/ConfusionMatrix.py` file" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -906,7 +906,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Champion with test dataset and test provider custom test\n", "vm.tests.run_test(\n", @@ -916,13 +918,13 @@ " \"model\" : [vm_log_model]\n", " }\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "# Challenger with test dataset and test provider custom test\n", "vm.tests.run_test(\n", @@ -932,9 +934,7 @@ " \"model\" : [vm_rf_model]\n", " }\n", ").log()" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -951,7 +951,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "test_config = {\n", " # Run with the raw dataset\n", @@ -1061,9 +1063,7 @@ " 'params': {'min_threshold': 0.5}\n", " }\n", "}" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -1074,7 +1074,9 @@ }, { "cell_type": "code", + "execution_count": null, "metadata": {}, + "outputs": [], "source": [ "for t in test_config:\n", " print(t)\n", @@ -1094,9 +1096,7 @@ " vm.tests.run_test(t, inputs=test_config[t]['inputs']).log()\n", " except Exception as e:\n", " print(f\"Error running test {t}: {str(e)}\")" - ], - "execution_count": null, - "outputs": [] + ] }, { "cell_type": "markdown", @@ -1141,7 +1141,7 @@ "\n", "Now that you've logged all your test results and verified the work done by the development team, head to the ValidMind Platform to wrap up your validation report. Continue to work on your validation report by:\n", "\n", - "- **Inserting additional test results:** Click **Link Evidence to Report** under any section of 2. 
Validation in your validation report. (Learn more: [Link evidence to reports](https://docs.validmind.ai/guide/validation/assess-compliance.html#link-evidence-to-reports))\n", + "- **Inserting additional test results:** Click **Link Evidence** under any Evidence panel of 2. Validation in your validation report. (Learn more: [Link evidence to reports](https://docs.validmind.ai/guide/validation/assess-compliance.html#link-evidence-to-reports))\n", "\n", "- **Making qualitative edits to your test descriptions:** Expand any linked evidence under Validator Evidence and click **See evidence details** to review and edit the ValidMind-generated test descriptions for quality and accuracy. (Learn more: [Preparing validation reports](https://docs.validmind.ai/guide/validation/preparing-validation-reports.html#validation-overview))\n", "\n", @@ -1149,7 +1149,7 @@ "\n", "- **Adding risk assessment notes:** Click under **Risk Assessment Notes** in any validation report section to access the text editor and content editing toolbar, including an option to generate a draft with AI. Once generated, edit your ValidMind-generated test descriptions to adhere to your organization's requirements. (Learn more: [Work with content blocks](https://docs.validmind.ai/guide/documentation/work-with-content-blocks.html#content-editing-toolbar))\n", "\n", - "- **Assessing compliance:** Under the Guideline for any validation report section, click **ASSESSMENT** and select the compliance status from the drop-down menu. (Learn more: [Provide compliance assessments](https://docs.validmind.ai/guide/validation/assess-compliance.html#provide-compliance-assessments))\n", + "- **Assessing compliance:** Under the Guideline for any validation report section, click **ASSESSMENT** and select the compliance status from the drop-down menu. 
(Learn more: [Assign compliance assessments](https://docs.validmind.ai/guide/validation/assess-compliance.html#assign-compliance-assessments))\n", "\n", "- **Collaborate with other stakeholders:** Use the ValidMind Platform's real-time collaborative features to work seamlessly together with the rest of your organization, including developers. Propose suggested changes in the documentation, work with versioned history, and use comments to discuss specific portions of the documentation. (Learn more: [Collaborate with others](https://docs.validmind.ai/guide/documentation/collaborate-with-others.html))\n", "\n", diff --git a/site/notebooks/use_cases/agents/document_agentic_ai.ipynb b/site/notebooks/use_cases/agents/document_agentic_ai.ipynb index 4b0d6cf915..621fe8b171 100644 --- a/site/notebooks/use_cases/agents/document_agentic_ai.ipynb +++ b/site/notebooks/use_cases/agents/document_agentic_ai.ipynb @@ -1,2190 +1,2194 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "eee6b64c", - "metadata": {}, - "source": [ - "# Document an agentic AI system\n", - "\n", - "Build and document an agentic AI system with the ValidMind Library. Construct a LangGraph-based banking agent, assign AI evaluation metric scores to your agent, and run accuracy, RAGAS, and safety tests, then log those test results to the ValidMind Platform.\n", - "\n", - "An _AI agent_ is an autonomous system that interprets inputs, selects from available tools or actions, and executes multi-step behaviors to achieve defined goals. 
In this notebook, the agent acts as a banking assistant that analyzes user requests and automatically selects and invokes the appropriate specialized banking tool to deliver accurate, compliant, and actionable responses.\n", - "\n", - "- This agent enables financial institutions to automate complex banking workflows where different customer requests require different specialized tools and knowledge bases.\n", - "- Effective validation of agentic AI systems reduces the risks of agents misinterpreting inputs, failing to extract required parameters, or producing incorrect assessments or actions — such as selecting the wrong tool.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For the LLM components in this notebook to function properly, you'll need access to OpenAI.</b></span>\n", - "<br></br>\n", - "Before you continue, ensure that a valid <code>OPENAI_API_KEY</code> is set in your <code>.env</code> file.</div>" - ] - }, - { - "cell_type": "markdown", - "id": "30927b2b", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the documentation template](#toc2_2_4__) \n", - " - [Verify OpenAI API access](#toc2_3__) \n", - " - [Initialize the Python environment](#toc2_4__) \n", - "- [Building the LangGraph agent](#toc3__) \n", - " - [Test available banking 
tools](#toc3_1__) \n", - " - [Create LangGraph banking agent](#toc3_2__) \n", - " - [Define system prompt](#toc3_2_1__) \n", - " - [Initialize the LLM](#toc3_2_2__) \n", - " - [Define agent state structure](#toc3_2_3__) \n", - " - [Create agent workflow function](#toc3_2_4__) \n", - " - [Instantiate the banking agent](#toc3_2_5__) \n", - " - [Integrate agent with ValidMind](#toc3_3__) \n", - " - [Import ValidMind components](#toc3_3_1__) \n", - " - [Create agent wrapper function](#toc3_3_2__) \n", - " - [Initialize the ValidMind model object](#toc3_3_3__) \n", - " - [Store the agent reference](#toc3_3_4__) \n", - " - [Verify integration](#toc3_3_5__) \n", - " - [Validate the system prompt](#toc3_4__) \n", - "- [Initializing the ValidMind dataset](#toc4__) \n", - " - [Assign predictions](#toc4_1__) \n", - "- [Running accuracy tests](#toc5__) \n", - " - [Response accuracy test](#toc5_1__) \n", - " - [Tool selection accuracy test](#toc5_2__) \n", - "- [Assigning AI evaluation metric scores](#toc6__) \n", - " - [Identify relevant DeepEval scorers](#toc6_1__) \n", - " - [Assign reasoning scores](#toc6_2__) \n", - " - [Plan quality score](#toc6_2_1__) \n", - " - [Plan adherence score](#toc6_2_2__) \n", - " - [Assign action scores](#toc6_3__) \n", - " - [Tool correctness score](#toc6_3_1__) \n", - " - [Argument correctness score](#toc6_3_2__) \n", - " - [Assign execution score](#toc6_4__) \n", - " - [Task completion score](#toc6_4_1__) \n", - "- [Running RAGAS tests](#toc7__) \n", - " - [Identify relevant RAGAS tests](#toc7_1__) \n", - " - [Faithfulness](#toc7_1_1__) \n", - " - [Response Relevancy](#toc7_1_2__) \n", - " - [Context Recall](#toc7_1_3__) \n", - "- [Running safety tests](#toc8__) \n", - " - [AspectCritic](#toc8_1_1__) \n", - " - [Bias](#toc8_1_2__) \n", - "- [Next steps](#toc9__) \n", - " - [Work with your model documentation](#toc9_1__) \n", - " - [Customize the banking agent for your use case](#toc9_2__) \n", - " - [Discover more learning 
resources](#toc9_3__) \n", - "- [Upgrade ValidMind](#toc10__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "b58139db", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." - ] - }, - { - "cell_type": "markdown", - "id": "7e30d36b", - "metadata": {}, - "source": [ - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." 
- ] - }, - { - "cell_type": "markdown", - "id": "1cba586e", - "metadata": {}, - "source": [ - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "id": "5c46f003", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. 
Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." 
- ] - }, - { - "cell_type": "markdown", - "id": "11a2d7a5", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "fbab0edf", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.9 <= x <= 3.14</div>\n", - "\n", - "Let's begin by installing the ValidMind Library with large language model (LLM) support:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1982a118", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q \"validmind[llm]\" \"langgraph==0.3.21\"" - ] - }, - { - "cell_type": "markdown", - "id": "14578e26", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "83d47d89", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook.\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. 
Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "bb2c5670", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Agentic AI`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "id": "98e475c1", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Can't select this template?</b></span>\n", - "<br></br>\n", - "Your organization administrators may need to add it to your template library:\n", - "<ul>\n", - "<li><a href=\"agentic_ai_template.yaml\" style=\"color: #DE257E;\"><b>Download Template YAML</b></a></li>\n", - "<li><a href=\"https://docs.validmind.ai/guide/templates/customize-document-templates.html\" style=\"color: #DE257E;\"><b>Customize Document Templates</b></a></li>\n", - "</ul>\n", - "</div>" - ] - }, - { - "cell_type": "markdown", - "id": "0d1a13ca", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d6ccbefc", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "3605df4f", - "metadata": {}, - "source": [ - "<a id='toc2_2_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dffdaa6f", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "d467c1d2", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Verify OpenAI API access\n", - "\n", - "Verify that a valid `OPENAI_API_KEY` is set in your `.env` file:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "22cc39cb", - "metadata": {}, - "outputs": [], - "source": [ - "# Load environment variables if using .env file\n", - "try:\n", - " from dotenv import load_dotenv\n", - " load_dotenv()\n", - "except ImportError:\n", - " print(\"dotenv not installed. Make sure OPENAI_API_KEY is set in your environment.\")" - ] - }, - { - "cell_type": "markdown", - "id": "b56c3f39", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Let's import all the necessary libraries to prepare for building our banking LangGraph agentic system:\n", - "\n", - "- **Standard libraries** for data handling and environment management.\n", - "- **pandas**, a Python library for data manipulation and analytics, as an alias. We'll also configure pandas to show all columns and all rows at full width for easier debugging and inspection.\n", - "- **LangChain** components for LLM integration and tool management.\n", - "- **LangGraph** for building stateful, multi-step agent workflows.\n", - "- **Banking tools** for specialized financial services as defined in [banking_tools.py](banking_tools.py)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2058d1ac", - "metadata": {}, - "outputs": [], - "source": [ - "from typing import TypedDict, Annotated, Sequence\n", - "\n", - "from langchain_core.messages import BaseMessage, HumanMessage, SystemMessage\n", - "from langchain_openai import ChatOpenAI\n", - "from langgraph.checkpoint.memory import MemorySaver\n", - "from langgraph.graph import StateGraph, END, START\n", - "from langgraph.graph.message import add_messages\n", - "from langgraph.prebuilt import ToolNode\n", - "\n", - "# LOCAL IMPORTS FROM banking_tools.py\n", - "from banking_tools import AVAILABLE_TOOLS\n", - "\n", - "import pandas as pd\n", - "# Configure pandas to show all columns and all rows at full width\n", - "pd.set_option('display.max_columns', None)\n", - "pd.set_option('display.max_colwidth', None)\n", - "pd.set_option('display.width', None)\n", - "pd.set_option('display.max_rows', None)" - ] - }, - { - "cell_type": "markdown", - "id": "cc1d3265", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Building the LangGraph agent" - ] - }, - { - "cell_type": "markdown", - "id": "a3c421c4", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Test available banking tools\n", - "\n", - "We'll use the demo banking tools defined in `banking_tools.py` that provide use cases of financial services:\n", - "\n", - "- **Credit Risk Analyzer** - Loan applications and credit decisions\n", - "- **Customer Account Manager** - Account services and customer support\n", - "- **Fraud Detection System** - Security and fraud prevention" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1e0a120c", - "metadata": {}, - "outputs": [], - "source": [ - "print(f\"Available tools: {len(AVAILABLE_TOOLS)}\")\n", - "print(\"\\nTool Details:\")\n", - "for i, tool in enumerate(AVAILABLE_TOOLS, 1):\n", - " print(f\" - {tool.name}\")" - ] - }, - { - "cell_type": "markdown", - "id": "53906630", 
- "metadata": {}, - "source": [ - "Let's test each banking tool individually to ensure they're working correctly before integrating them into our agent:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dc0caff2", - "metadata": {}, - "outputs": [], - "source": [ - "# Test 1: Credit Risk Analyzer\n", - "print(\"TEST 1: Credit Risk Analyzer\")\n", - "print(\"-\" * 40)\n", - "try:\n", - " # Access the underlying function using .func\n", - " credit_result = AVAILABLE_TOOLS[0].func(\n", - " customer_income=75000,\n", - " customer_debt=1200,\n", - " credit_score=720,\n", - " loan_amount=50000,\n", - " loan_type=\"personal\"\n", - " )\n", - " print(credit_result)\n", - " print(\"Credit Risk Analyzer test PASSED\")\n", - "except Exception as e:\n", - " print(f\"Credit Risk Analyzer test FAILED: {e}\")\n", - "\n", - "print(\"\" + \"=\" * 60)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b6b227db", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "# Test 2: Customer Account Manager\n", - "print(\"TEST 2: Customer Account Manager\")\n", - "print(\"-\" * 40)\n", - "try:\n", - " # Test checking balance\n", - " account_result = AVAILABLE_TOOLS[1].func(\n", - " account_type=\"checking\",\n", - " customer_id=\"12345\",\n", - " action=\"check_balance\"\n", - " )\n", - " print(account_result)\n", - "\n", - " # Test getting account info\n", - " info_result = AVAILABLE_TOOLS[1].func(\n", - " account_type=\"all\",\n", - " customer_id=\"12345\", \n", - " action=\"get_info\"\n", - " )\n", - " print(info_result)\n", - " print(\"Customer Account Manager test PASSED\")\n", - "except Exception as e:\n", - " print(f\"Customer Account Manager test FAILED: {e}\")\n", - "\n", - "print(\"\" + \"=\" * 60)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a983b30d", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "# Test 3: Fraud Detection System\n", - "print(\"TEST 3: Fraud Detection System\")\n", - 
"print(\"-\" * 40)\n", - "try:\n", - " fraud_result = AVAILABLE_TOOLS[2].func(\n", - " transaction_id=\"TX123\",\n", - " customer_id=\"12345\",\n", - " transaction_amount=500.00,\n", - " transaction_type=\"withdrawal\",\n", - " location=\"Miami, FL\",\n", - " device_id=\"DEVICE_001\"\n", - " )\n", - " print(fraud_result)\n", - " print(\"Fraud Detection System test PASSED\")\n", - "except Exception as e:\n", - " print(f\"Fraud Detection System test FAILED: {e}\")\n", - "\n", - "print(\"\" + \"=\" * 60)" - ] - }, - { - "cell_type": "markdown", - "id": "1424baed", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Create LangGraph banking agent\n", - "\n", - "With our tools ready to go, we'll create our intelligent banking agent with LangGraph that automatically selects and uses the appropriate banking tool based on a user request." - ] - }, - { - "cell_type": "markdown", - "id": "3469d656", - "metadata": {}, - "source": [ - "<a id='toc3_2_1__'></a>\n", - "\n", - "#### Define system prompt\n", - "\n", - "We'll begin by defining our system prompt, which provides the LLM with context about its role as a banking assistant and guidance on when to use each available tool:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7971c427", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "# Enhanced banking system prompt with tool selection guidance\n", - "system_context = \"\"\"You are a professional banking AI assistant with access to specialized banking tools.\n", - " Analyze the user's banking request and directly use the most appropriate tools to help them.\n", - " \n", - " AVAILABLE BANKING TOOLS:\n", - " \n", - " credit_risk_analyzer - Analyze credit risk for loan applications and credit decisions\n", - " - Use for: loan applications, credit assessments, risk analysis, mortgage eligibility\n", - " - Examples: \"Analyze credit risk for $50k personal loan\", \"Assess mortgage eligibility for $300k home purchase\"\n", - 
" - Parameters: customer_income, customer_debt, credit_score, loan_amount, loan_type\n", - "\n", - " customer_account_manager - Manage customer accounts and provide banking services\n", - " - Use for: account information, transaction processing, product recommendations, customer service\n", - " - Examples: \"Check balance for checking account 12345\", \"Recommend products for customer with high balance\"\n", - " - Parameters: account_type, customer_id, action, amount, account_details\n", - "\n", - " fraud_detection_system - Analyze transactions for potential fraud and security risks\n", - " - Use for: transaction monitoring, fraud prevention, risk assessment, security alerts\n", - " - Examples: \"Analyze fraud risk for $500 ATM withdrawal in Miami\", \"Check security for $2000 online purchase\"\n", - " - Parameters: transaction_id, customer_id, transaction_amount, transaction_type, location, device_id\n", - "\n", - " BANKING INSTRUCTIONS:\n", - " - Analyze the user's banking request carefully and identify the primary need\n", - " - If they need credit analysis → use credit_risk_analyzer\n", - " - If they need financial calculations → use financial_calculator\n", - " - If they need account services → use customer_account_manager\n", - " - If they need security analysis → use fraud_detection_system\n", - " - Extract relevant parameters from the user's request\n", - " - Provide helpful, accurate banking responses based on tool outputs\n", - " - Always consider banking regulations, risk management, and best practices\n", - " - Be professional and thorough in your analysis\n", - "\n", - " Choose and use tools wisely to provide the most helpful banking assistance.\n", - " Describe the response in user friendly manner with details describing the tool output. 
\n", - " Provide the response in at least 500 words.\n", - " Generate a concise execution plan for the banking request.\n", - " \"\"\"" - ] - }, - { - "cell_type": "markdown", - "id": "b66c1ac4", - "metadata": {}, - "source": [ - "<a id='toc3_2_2__'></a>\n", - "\n", - "#### Initialize the LLM\n", - "\n", - "Let's initialize the LLM that will power our banking agent:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "866066e7", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the main LLM for banking responses\n", - "main_llm = ChatOpenAI(\n", - " model=\"gpt-5-mini\",\n", - " reasoning={\n", - " \"effort\": \"low\",\n", - " \"summary\": \"auto\"\n", - " }\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "8220afd6", - "metadata": {}, - "source": [ - "Then bind the available banking tools to the LLM, enabling the model to automatically recognize and invoke each tool when appropriate based on request input and the system prompt we defined above:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "906d8132", - "metadata": {}, - "outputs": [], - "source": [ - "# Bind all banking tools to the main LLM\n", - "llm_with_tools = main_llm.bind_tools(AVAILABLE_TOOLS)" - ] - }, - { - "cell_type": "markdown", - "id": "43f56651", - "metadata": {}, - "source": [ - "<a id='toc3_2_3__'></a>\n", - "\n", - "#### Define agent state structure\n", - "\n", - "The agent state defines the data structure that flows through the LangGraph workflow. 
It includes:\n", - "\n", - "- **messages** — The conversation history between the user and agent\n", - "- **user_input** — The current user request\n", - "- **session_id** — A unique identifier for the conversation session\n", - "- **context** — Additional context that can be passed between nodes\n", - "\n", - "Defining this state structure keeps the data shape consistent throughout the agent's execution and allows for multi-turn conversations with memory:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6b926ddf", - "metadata": {}, - "outputs": [], - "source": [ - "# Banking Agent State Definition\n", - "class BankingAgentState(TypedDict):\n", - " messages: Annotated[Sequence[BaseMessage], add_messages]\n", - " user_input: str\n", - " session_id: str\n", - " context: dict" - ] - }, - { - "cell_type": "markdown", - "id": "387ba780", - "metadata": {}, - "source": [ - "<a id='toc3_2_4__'></a>\n", - "\n", - "#### Create agent workflow function\n", - "\n", - "We'll build the LangGraph agent workflow with two main components:\n", - "\n", - "1. **LLM node** — Processes user requests, applies the system prompt, and decides whether to use tools.\n", - "2. **Tools node** — Executes the selected banking tools when the LLM determines they're needed.\n", - "\n", - "The workflow begins with the LLM analyzing the request. If the LLM requests tools, the tools node runs them and returns control to the LLM to generate the final response; otherwise, the workflow ends." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2c9bf585", - "metadata": {}, - "outputs": [], - "source": [ - "def create_banking_langgraph_agent():\n", - " \"\"\"Create a comprehensive LangGraph banking agent with intelligent tool selection.\"\"\"\n", - " def llm_node(state: BankingAgentState) -> BankingAgentState:\n", - " \"\"\"Main LLM node that processes banking requests and selects appropriate tools.\"\"\"\n", - " messages = state[\"messages\"]\n", - " # Add system context to messages\n", - " enhanced_messages = [SystemMessage(content=system_context)] + list(messages)\n", - " # Get LLM response with tool selection\n", - " response = llm_with_tools.invoke(enhanced_messages)\n", - " return {\n", - " **state,\n", - " \"messages\": messages + [response]\n", - " }\n", - " \n", - " def should_continue(state: BankingAgentState) -> str:\n", - " \"\"\"Decide whether to use tools or end the conversation.\"\"\"\n", - " last_message = state[\"messages\"][-1]\n", - " # Check if the LLM wants to use tools\n", - " if hasattr(last_message, 'tool_calls') and last_message.tool_calls:\n", - " return \"tools\"\n", - " return END\n", - " \n", - " # Create the banking state graph\n", - " workflow = StateGraph(BankingAgentState)\n", - " # Add nodes\n", - " workflow.add_node(\"llm\", llm_node)\n", - " workflow.add_node(\"tools\", ToolNode(AVAILABLE_TOOLS))\n", - " # Simplified entry point - go directly to LLM\n", - " workflow.add_edge(START, \"llm\")\n", - " # From LLM, decide whether to use tools or end\n", - " workflow.add_conditional_edges(\n", - " \"llm\",\n", - " should_continue,\n", - " {\"tools\": \"tools\", END: END}\n", - " )\n", - " # Tool execution flows back to LLM for final response\n", - " workflow.add_edge(\"tools\", \"llm\")\n", - " # Set up memory\n", - " memory = MemorySaver()\n", - " # Compile the graph\n", - " agent = workflow.compile(checkpointer=memory)\n", - " return agent" - ] - }, - { - "cell_type": "markdown", - "id": "765242e9", - 
"metadata": {}, - "source": [ - "<a id='toc3_2_5__'></a>\n", - "\n", - "#### Instantiate the banking agent\n", - "\n", - "Now, we'll create an instance of the banking agent by calling the workflow creation function.\n", - "\n", - "This compiled agent is ready to process banking requests and will automatically select and use the appropriate tools based on user queries:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "455b8ee4", - "metadata": {}, - "outputs": [], - "source": [ - "# Create the banking intelligent agent\n", - "banking_agent = create_banking_langgraph_agent()\n", - "\n", - "print(\"Banking LangGraph Agent Created Successfully!\")\n", - "print(\"\\nFeatures:\")\n", - "print(\" - Intelligent banking tool selection\")\n", - "print(\" - Comprehensive banking system prompt\")\n", - "print(\" - Streamlined workflow: LLM → Tools → Response\")\n", - "print(\" - Automatic tool parameter extraction\")\n", - "print(\" - Professional banking assistance\")" - ] - }, - { - "cell_type": "markdown", - "id": "e00dac77", - "metadata": {}, - "source": [ - "<a id='toc3_3__'></a>\n", - "\n", - "### Integrate agent with ValidMind\n", - "\n", - "To integrate our LangGraph banking agent with ValidMind, we need to create a wrapper function that ValidMind can use to invoke the agent and extract the necessary information for testing and documentation, allowing ValidMind to run validation tests on the agent's behavior, tool usage, and responses." 
- ] - }, - { - "cell_type": "markdown", - "id": "a124857e", - "metadata": {}, - "source": [ - "<a id='toc3_3_1__'></a>\n", - "\n", - "#### Import ValidMind components\n", - "\n", - "We'll start with importing the necessary ValidMind components for integrating our agent:\n", - "\n", - "- `Prompt` from `validmind.models` for handling prompt-based model inputs\n", - "- `extract_tool_calls_from_agent_output` and `_convert_to_tool_call_list` from `validmind.scorers.llm.deepeval` for extracting and converting tool calls from agent outputs" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9aeb8969", - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.models import Prompt\n", - "from validmind.scorers.llm.deepeval import extract_tool_calls_from_agent_output, _convert_to_tool_call_list\n", - "from deepeval.tracing import observe, update_current_span\n", - "from deepeval.test_case import LLMTestCase" - ] - }, - { - "cell_type": "markdown", - "id": "ed72903f", - "metadata": {}, - "source": [ - "<a id='toc3_3_2__'></a>\n", - "\n", - "#### Create agent wrapper function\n", - "\n", - "We'll then create a wrapper function that:\n", - "\n", - "- Accepts input in ValidMind's expected format (with `input` and `session_id` fields)\n", - "- Invokes the banking agent with the proper state initialization\n", - "- Captures tool outputs and tool calls for evaluation\n", - "- Returns a standardized response format that includes the prediction, full output, tool messages, and tool call information\n", - "- Handles errors gracefully with fallback responses" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0e4d5a82", - "metadata": {}, - "outputs": [], - "source": [ - "@observe(type=\"agent\")\n", - "def banking_agent_fn(input):\n", - " \"\"\"\n", - " Invoke the banking agent with the given input.\n", - " \"\"\"\n", - " try:\n", - " # Initial state for banking agent\n", - " initial_state = {\n", - " \"user_input\": 
input[\"input\"],\n", - " \"messages\": [HumanMessage(content=input[\"input\"])],\n", - " \"session_id\": input[\"session_id\"],\n", - " \"context\": {}\n", - " }\n", - " session_config = {\"configurable\": {\"thread_id\": input[\"session_id\"]}}\n", - " result = banking_agent.invoke(initial_state, config=session_config)\n", - "\n", - " from utils import capture_tool_output_messages\n", - "\n", - " # Capture all tool outputs and metadata\n", - " captured_data = capture_tool_output_messages(result)\n", - " \n", - " # Access specific tool outputs, this will be used for RAGAS tests\n", - " tool_message = \"\"\n", - " for output in captured_data[\"tool_outputs\"]:\n", - " tool_message += output['content']\n", - " \n", - " tool_calls_found = []\n", - " messages = result['messages']\n", - " for message in messages:\n", - " if hasattr(message, 'tool_calls') and message.tool_calls:\n", - " for tool_call in message.tool_calls:\n", - " # Handle both dictionary and object formats\n", - " if isinstance(tool_call, dict):\n", - " tool_calls_found.append(tool_call['name'])\n", - " else:\n", - " # ToolCall object - use attribute access\n", - " tool_calls_found.append(tool_call.name)\n", - "\n", - " prediction_text = result['messages'][-1].content[0]['text']\n", - " tools_called_value = _convert_to_tool_call_list(extract_tool_calls_from_agent_output(result))\n", - " expected_tools_value = _convert_to_tool_call_list(input.get(\"expected_tools\", []))\n", - "\n", - " # Feed trace data for DeepEval metrics (e.g. 
PlanQuality) that require tracing\n", - " update_current_span(\n", - " test_case=LLMTestCase(\n", - " input=input[\"input\"],\n", - " actual_output=prediction_text,\n", - " tools_called=tools_called_value,\n", - " expected_tools=expected_tools_value\n", - " )\n", - " )\n", - "\n", - " return {\n", - " \"prediction\": prediction_text,\n", - " \"output\": result,\n", - " \"tool_messages\": [tool_message],\n", - " # \"tool_calls\": tool_calls_found,\n", - " \"tool_called\": tools_called_value\n", - " }\n", - " except Exception as e:\n", - " # Return a fallback response if the agent fails\n", - " error_message = f\"\"\"I apologize, but I encountered an error while processing your banking request: {str(e)}.\n", - " Please try rephrasing your question or contact support if the issue persists.\"\"\"\n", - " return {\n", - " \"prediction\": error_message, \n", - " \"output\": {\n", - " \"messages\": [HumanMessage(content=input[\"input\"]), SystemMessage(content=error_message)],\n", - " \"error\": str(e)\n", - " }\n", - " }" - ] - }, - { - "cell_type": "markdown", - "id": "fda87401", - "metadata": {}, - "source": [ - "<a id='toc3_3_3__'></a>\n", - "\n", - "#### Initialize the ValidMind model\n", - "\n", - "We'll also need to register the banking agent as a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with 
[`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model) which:\n", - "\n", - "- Associates the wrapper function with the model for prediction\n", - "- Stores the system prompt template for documentation\n", - "- Provides a unique `input_id` for tracking and identification\n", - "- Enables the agent to be used with ValidMind's testing and documentation features" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "60a2ce7a", - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the agent as a model\n", - "vm_banking_model = vm.init_model(\n", - " input_id=\"banking_agent_model\",\n", - " predict_fn=banking_agent_fn,\n", - " prompt=Prompt(template=system_context)\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "949bcf53", - "metadata": {}, - "source": [ - "<a id='toc3_3_4__'></a>\n", - "\n", - "#### Store the agent reference\n", - "\n", - "We'll also store a reference to the original banking agent object in the ValidMind model. This allows us to access the full agent functionality directly if needed, while still maintaining the wrapper function interface for ValidMind's testing framework." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2c653471", - "metadata": {}, - "outputs": [], - "source": [ - "# Add the banking agent to the vm model\n", - "vm_banking_model.model = banking_agent" - ] - }, - { - "cell_type": "markdown", - "id": "d8d0c1c1", - "metadata": {}, - "source": [ - "<a id='toc3_3_5__'></a>\n", - "\n", - "#### Verify integration\n", - "\n", - "Let's confirm that the banking agent has been successfully integrated with ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8e101b0f", - "metadata": {}, - "outputs": [], - "source": [ - "print(\"Banking Agent Successfully Integrated with ValidMind!\")\n", - "print(f\"Model ID: {vm_banking_model.input_id}\")" - ] - }, - { - "cell_type": "markdown", - "id": "2a5f874e", - "metadata": {}, - "source": [ - "<a id='toc3_4__'></a>\n", - "\n", - "### Validate the system prompt\n", - "\n", - "Let's get an initial sense of how well our defined system prompt meets a few best practices for prompt engineering by running a few tests — we'll run evaluation tests later on our agent's performance.\n", - "\n", - "You run individual tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module. 
Passing in our agentic model as an input, the tests below rate the prompt on a scale of 1-10 against the following criteria:\n", - "\n", - "- **Clarity** — How clearly the prompt states the task.\n", - "- **Conciseness** — How succinctly the prompt states the task.\n", - "- **Delimitation** — Whether complex prompts containing examples, contextual information, or other elements are formatted so that each element is clearly separated.\n", - "- **NegativeInstruction** — Whether the prompt contains negative instructions.\n", - "- **Specificity** — How specifically the prompt defines the task.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f52dceb1", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.prompt_validation.Clarity\",\n", - " inputs={\n", - " \"model\": vm_banking_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "70d52333", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.prompt_validation.Conciseness\",\n", - " inputs={\n", - " \"model\": vm_banking_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "5aa89976", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.prompt_validation.Delimitation\",\n", - " inputs={\n", - " \"model\": vm_banking_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8630197e", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.prompt_validation.NegativeInstruction\",\n", - " inputs={\n", - " \"model\": vm_banking_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bba99915", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.prompt_validation.Specificity\",\n", - " 
inputs={\n", - " \"model\": vm_banking_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "51d61141", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Initializing the ValidMind dataset\n", - "\n", - "After validating our system prompt, let's import our sample dataset ([banking_test_dataset.py](banking_test_dataset.py)), which we'll use in the next section to evaluate our agent's performance across different banking scenarios:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0c70ca2c", - "metadata": {}, - "outputs": [], - "source": [ - "from banking_test_dataset import banking_test_dataset" - ] - }, - { - "cell_type": "markdown", - "id": "442ab66d", - "metadata": {}, - "source": [ - "The next step is to connect your data with a ValidMind `Dataset` object. **This step is necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", - "\n", - "Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. For this example, we'll pass in the following arguments:\n", - "\n", - "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", - "- **`text_column`** — The name of the column containing the text input data.\n", - "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a7e9d158", - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_dataset = vm.init_dataset(\n", - " input_id=\"banking_test_dataset\",\n", - " dataset=banking_test_dataset,\n", - " text_column=\"input\",\n", - " target_column=\"possible_outputs\",\n", - ")\n", - "\n", - "print(\"Banking Test Dataset Initialized in ValidMind!\")\n", - "print(f\"Dataset ID: {vm_test_dataset.input_id}\")\n", - "print(f\"Dataset columns: {vm_test_dataset._df.columns}\")\n", - "vm_test_dataset._df" - ] - }, - { - "cell_type": "markdown", - "id": "7b01021c", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Assign predictions\n", - "\n", - "Now that both the model object and the dataset have been registered, we'll assign predictions to capture the banking agent's responses for evaluation:\n", - "\n", - "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", - "- Here, the method links the banking agent's predictions to our `vm_test_dataset` dataset.\n", - "\n", - "If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1d462663", - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_dataset.assign_predictions(vm_banking_model)\n", - "\n", - "print(\"Banking Agent Predictions Generated Successfully!\")\n", - "print(f\"Predictions assigned to {len(vm_test_dataset._df)} test cases\")\n", - "vm_test_dataset._df.head()" - ] - }, - { - "cell_type": "markdown", - "id": "4e56f556", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Running accuracy tests\n", - "\n", - "Using [`@vm.test`](https://docs.validmind.ai/validmind/validmind.html#test), let's implement some reusable custom 
*inline tests* to assess the accuracy of our banking agent:\n", - "\n", - "- An inline test refers to a test written and executed within the same environment as the code being tested — in this case, right in this Jupyter Notebook — without requiring a separate test file or framework.\n", - "- You'll note that the custom test functions are just regular Python functions that can include and require any Python library as you see fit." - ] - }, - { - "cell_type": "markdown", - "id": "1bce9258", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Response accuracy test\n", - "\n", - "We'll create a custom test that evaluates the banking agent's ability to provide accurate responses by:\n", - "\n", - "- Testing against a dataset of predefined banking questions and expected answers.\n", - "- Checking if responses contain expected keywords and banking terminology.\n", - "- Providing detailed test results including pass/fail status.\n", - "- Helping identify any gaps in the agent's banking knowledge or response quality." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "90232066", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "@vm.test(\"my_custom_tests.banking_accuracy_test\")\n", - "def banking_accuracy_test(model, dataset, list_of_columns):\n", - " \"\"\"\n", - " The Banking Accuracy Test evaluates whether the agent’s responses include \n", - " critical domain-specific keywords and phrases that indicate accurate, compliant,\n", - " and contextually appropriate banking information. 
This test ensures that the agent\n", - " provides responses containing the expected banking terminology, risk classifications,\n", - " account details, or other domain-relevant information required for regulatory compliance,\n", - " customer safety, and operational accuracy.\n", - " \"\"\"\n", - " df = dataset._df\n", - " \n", - " # Pre-compute responses for all tests\n", - " y_true = dataset.y.tolist()\n", - " y_pred = dataset.y_pred(model).tolist()\n", - "\n", - " # Check each response for at least one expected keyword\n", - " test_results = []\n", - " for response, keywords in zip(y_pred, y_true):\n", - " # Convert keywords to list if not already a list\n", - " if not isinstance(keywords, list):\n", - " keywords = [keywords]\n", - " test_results.append(any(str(keyword).lower() in str(response).lower() for keyword in keywords))\n", - " \n", - " results = pd.DataFrame()\n", - " column_names = [col + \"_details\" for col in list_of_columns]\n", - " results[column_names] = df[list_of_columns]\n", - " results[\"actual\"] = y_pred\n", - " results[\"expected\"] = y_true\n", - " results[\"passed\"] = test_results\n", - " # Record an error message only for the rows that failed\n", - " results[\"error\"] = [None if passed else f'Response did not contain any expected keywords: {expected}' for passed, expected in zip(test_results, y_true)]\n", - " \n", - " return results" - ] - }, - { - "cell_type": "markdown", - "id": "2a7f71f8", - "metadata": {}, - "source": [ - "Now that we've defined our custom response accuracy test, we can run it with the same `run_test()` function we used earlier to validate the system prompt, this time passing our sample dataset and agentic model as inputs, and log the test results to the ValidMind Platform with the [`log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#log):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e68884d5", - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.tests.run_test(\n", - " \"my_custom_tests.banking_accuracy_test\",\n", - " inputs={\n", - " \"dataset\": vm_test_dataset,\n", - " \"model\": 
vm_banking_model\n", - " },\n", - " params={\n", - " \"list_of_columns\": [\"input\"]\n", - " }\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "94a717e7", - "metadata": {}, - "source": [ - "Let's review the first five rows of the test dataset to inspect the results to see how well the banking agent performed. Each column in the output serves a specific purpose in evaluating agent performance:\n", - "\n", - "| Column header | Description | Importance |\n", - "|--------------|-------------|------------|\n", - "| **`input`** | Original user query or request | Essential for understanding the context of each test case and tracing which inputs led to specific agent behaviors. |\n", - "| **`expected_tools`** | Banking tools that should be invoked for this request | Enables validation of correct tool selection, which is critical for agentic AI systems where choosing the right tool is a key success metric. |\n", - "| **`expected_output`** | Expected output or keywords that should appear in the response | Defines the success criteria for each test case, enabling objective evaluation of whether the agent produced the correct result. |\n", - "| **`session_id`** | Unique identifier for each test session | Allows tracking and correlation of related test runs, debugging specific sessions, and maintaining audit trails. |\n", - "| **`category`** | Classification of the request type | Helps organize test results by domain and identify performance patterns across different banking use cases. |\n", - "| **`banking_agent_model_output`** | Complete agent response including all messages and reasoning | Allows you to examine the full output to assess response quality, completeness, and correctness beyond just keyword matching. |\n", - "| **`banking_agent_model_tool_messages`** | Messages exchanged with the banking tools | Critical for understanding how the agent interacted with tools, what parameters were passed, and what tool outputs were received. 
|\n", - "| **`banking_agent_model_tool_called`** | Specific tool that was invoked | Enables validation that the agent selected the correct tool for each request, which is fundamental to agentic AI validation. |\n", - "| **`possible_outputs`** | Alternative valid outputs or keywords that could appear in the response | Provides flexibility in evaluation by accounting for multiple acceptable response formats or variations. |" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "78f7edb1", - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_dataset.df.head(5)" - ] - }, - { - "cell_type": "markdown", - "id": "1cb3e8bd", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Tool selection accuracy test\n", - "\n", - "We'll also create a custom test that evaluates the banking agent's ability to select the correct tools for different requests by:\n", - "\n", - "- Testing against a dataset of predefined banking queries with expected tool selections.\n", - "- Comparing the tools actually invoked by the agent against the expected tools for each request.\n", - "- Providing quantitative accuracy scores that measure the proportion of expected tools correctly selected.\n", - "- Helping identify gaps in the agent's understanding of user needs and tool selection logic." - ] - }, - { - "cell_type": "markdown", - "id": "69263d62", - "metadata": {}, - "source": [ - "First, we'll define a helper function that extracts tool calls from the agent's messages and compares them against the expected tools. 
This function handles different message formats (dictionary or object) and calculates accuracy scores:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e68798be", - "metadata": {}, - "outputs": [], - "source": [ - "def validate_tool_calls_simple(messages, expected_tools):\n", - " \"\"\"Simple validation of tool calls without RAGAS dependency issues.\"\"\"\n", - " \n", - " tool_calls_found = []\n", - " \n", - " for message in messages:\n", - " if hasattr(message, 'tool_calls') and message.tool_calls:\n", - " for tool_call in message.tool_calls:\n", - " # Handle both dictionary and object formats\n", - " if isinstance(tool_call, dict):\n", - " tool_calls_found.append(tool_call['name'])\n", - " else:\n", - " # ToolCall object - use attribute access\n", - " tool_calls_found.append(tool_call.name)\n", - " \n", - " # Check if expected tools were called\n", - " accuracy = 0.0\n", - " matches = 0\n", - " if expected_tools:\n", - " matches = sum(1 for tool in expected_tools if tool in tool_calls_found)\n", - " accuracy = matches / len(expected_tools)\n", - " \n", - " return {\n", - " 'expected_tools': expected_tools,\n", - " 'found_tools': tool_calls_found,\n", - " 'matches': matches,\n", - " 'total_expected': len(expected_tools) if expected_tools else 0,\n", - " 'accuracy': accuracy,\n", - " }" - ] - }, - { - "cell_type": "markdown", - "id": "8f494fd3", - "metadata": {}, - "source": [ - "Now we'll define the main test function that uses the helper function to evaluate tool selection accuracy across all test cases in the dataset:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "604d7313", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.BankingToolCallAccuracy\")\n", - "def BankingToolCallAccuracy(dataset, agent_output_column, expected_tools_column):\n", - " \"\"\"\n", - " Evaluates the tool selection accuracy of a LangGraph-powered banking agent.\n", - "\n", - " This test measures whether the 
agent correctly identifies and invokes the required banking tools\n", - " for each user query scenario.\n", - " For each case, the outputs generated by the agent (including its tool calls) are compared against an\n", - " expected set of tools. The test considers both coverage and exactness: it computes the proportion of\n", - " expected tools correctly called by the agent for each instance.\n", - "\n", - " Parameters:\n", - " dataset (VMDataset): The dataset containing user queries, agent outputs, and ground-truth tool expectations.\n", - " agent_output_column (str): Dataset column name containing agent outputs (should include tool call details in 'messages').\n", - " expected_tools_column (str): Dataset column specifying the true expected tools (as lists).\n", - "\n", - " Returns:\n", - " List[dict]: Per-row dictionaries with details: expected tools, found tools, match count, total expected, and accuracy score.\n", - "\n", - " Purpose:\n", - " Provides diagnostic evidence of the banking agent's core reasoning ability—specifically, its capacity to\n", - " interpret user needs and select the correct banking actions. Useful for diagnosing gaps in tool coverage,\n", - " misclassifications, or breakdowns in agent logic.\n", - "\n", - " Interpretation:\n", - " - An accuracy of 1.0 signals perfect tool selection for that example.\n", - " - Lower scores may indicate partial or complete failures to invoke required tools.\n", - " - Review 'found_tools' vs. 
'expected_tools' to understand the source of discrepancies.\n", - "\n", - " Strengths:\n", - " - Directly tests a core capability of compositional tool-use agents.\n", - " - Framework-agnostic; robust to tool call output format (object or dict).\n", - " - Supports batch validation and result logging for systematic documentation.\n", - "\n", - " Limitations:\n", - " - Does not penalize extra, unnecessary tool calls.\n", - " - Does not assess result quality—only correct invocation.\n", - "\n", - " \"\"\"\n", - " df = dataset._df\n", - " \n", - " results = []\n", - " for i, row in df.iterrows():\n", - " result = validate_tool_calls_simple(row[agent_output_column]['messages'], row[expected_tools_column])\n", - " results.append(result)\n", - " \n", - " return results" - ] - }, - { - "cell_type": "markdown", - "id": "57ab606b", - "metadata": {}, - "source": [ - "Finally, we can call our function with `run_test()` and log the test results to the ValidMind Platform:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dd14115e", - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.tests.run_test(\n", - " \"my_custom_tests.BankingToolCallAccuracy\",\n", - " inputs={\n", - " \"dataset\": vm_test_dataset,\n", - " },\n", - " params={\n", - " \"agent_output_column\": \"banking_agent_model_output\",\n", - " \"expected_tools_column\": \"expected_tools\"\n", - " }\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "be8d5270", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Assigning AI evaluation metric scores\n", - "\n", - "*AI agent evaluation metrics* are specialized measurements designed to assess how well autonomous LLM-based agents reason, plan, select and execute tools, and ultimately complete user tasks by analyzing the *full execution trace* — including reasoning steps, tool calls, intermediate decisions, and outcomes, rather than just single input–output pairs. 
These metrics are essential because agent failures often occur in ways traditional LLM metrics miss — for example, choosing the right tool with wrong arguments, creating a good plan but not following it, or completing a task inefficiently.\n", - "\n", - "In this section, we'll evaluate our banking agent's outputs and add scoring to our sample dataset against metrics defined in [DeepEval’s AI agent evaluation framework](https://deepeval.com/guides/guides-ai-agent-evaluation-metrics) which breaks down AI agent evaluation into three layers with corresponding subcategories: **reasoning**, **action**, and **execution**.\n", - "\n", - "Together, these three metrics enable granular diagnosis of agent behavior, help pinpoint where failures occur (reasoning, action, or execution), and support both development benchmarking and production monitoring." - ] - }, - { - "cell_type": "markdown", - "id": "25828bef", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Identify relevant DeepEval scorers\n", - "\n", - "*Scorers* are evaluation metrics that analyze model outputs and store their results in the dataset:\n", - "\n", - "- Each scorer adds a new column to the dataset with format: `{scorer_name}_{metric_name}`\n", - "- The column contains the numeric score (typically `0`-`1`) for each example\n", - "- Multiple scorers can be run on the same dataset, each adding their own column\n", - "- Scores are persisted in the dataset for later analysis and visualization\n", - "- Common scorer patterns include:\n", - " - Model performance metrics (accuracy, F1, etc.)\n", - " - Output quality metrics (relevance, faithfulness)\n", - " - Task-specific metrics (completion, correctness)\n", - "\n", - "Use `list_scorers()` from [`validmind.scorers`](https://docs.validmind.ai/validmind/validmind/tests.html#scorer) to discover all available scoring methods and their IDs that can be used with `assign_scores()`. 
We'll filter these results to return only DeepEval scorers for our desired three metrics in a formatted table with descriptions:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "730c70ec", - "metadata": {}, - "outputs": [], - "source": [ - "# Load all DeepEval scorers\n", - "llm_scorers_dict = vm.tests.load._load_tests([s for s in vm.scorer.list_scorers() if \"deepeval\" in s.lower()])\n", - "\n", - "# Categorize scorers by metric layer\n", - "reasoning_scorers = {}\n", - "action_scorers = {}\n", - "execution_scorers = {}\n", - "\n", - "for scorer_id, scorer_func in llm_scorers_dict.items():\n", - " tags = getattr(scorer_func, \"__tags__\", [])\n", - " scorer_name = scorer_id.split(\".\")[-1]\n", - "\n", - " if \"reasoning_layer\" in tags:\n", - " reasoning_scorers[scorer_id] = scorer_func\n", - " elif \"action_layer\" in tags:\n", - " action_scorers[scorer_id] = scorer_func\n", - " elif \"TaskCompletion\" in scorer_name:\n", - " execution_scorers[scorer_id] = scorer_func\n", - "\n", - "# Display scorers by category\n", - "print(\"=\" * 80)\n", - "print(\"REASONING LAYER\")\n", - "print(\"=\" * 80)\n", - "if reasoning_scorers:\n", - " reasoning_df = vm.tests.load._pretty_list_tests(reasoning_scorers, truncate=True)\n", - " display(reasoning_df)\n", - "else:\n", - " print(\"No reasoning layer scorers found.\")\n", - "\n", - "print(\"\\n\" + \"=\" * 80)\n", - "print(\"ACTION LAYER\")\n", - "print(\"=\" * 80)\n", - "if action_scorers:\n", - " action_df = vm.tests.load._pretty_list_tests(action_scorers, truncate=True)\n", - " display(action_df)\n", - "else:\n", - " print(\"No action layer scorers found.\")\n", - "\n", - "print(\"\\n\" + \"=\" * 80)\n", - "print(\"EXECUTION LAYER\")\n", - "print(\"=\" * 80)\n", - "if execution_scorers:\n", - " execution_df = vm.tests.load._pretty_list_tests(execution_scorers, truncate=True)\n", - " display(execution_df)\n", - "else:\n", - " print(\"No execution layer scorers found.\")" - ] - }, - { - "cell_type": 
"markdown", - "id": "e5fb739b", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Assign reasoning scores\n", - "\n", - "*Reasoning* evaluates planning and strategy generation:\n", - "\n", - "- **Plan quality** – How logical, complete, and efficient the agent’s plan is.\n", - "- **Plan adherence** – Whether the agent follows its own plan during execution." - ] - }, - { - "cell_type": "markdown", - "id": "fde94d01", - "metadata": {}, - "source": [ - "<a id='toc6_2_1__'></a>\n", - "\n", - "#### Plan quality score\n", - "\n", - "Let's measure how well our banking agent generates a plan before acting. A high score means the plan is logical, complete, and efficient." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "52f362ba", - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_dataset.assign_scores(\n", - " metrics = \"validmind.scorers.llm.deepeval.PlanQuality\",\n", - " model = vm_banking_model,\n", - " input_column = \"input\",\n", - ")\n", - "vm_test_dataset._df[['banking_agent_model_PlanQuality_score','banking_agent_model_PlanQuality_reason']]" - ] - }, - { - "cell_type": "markdown", - "id": "d631fd12", - "metadata": {}, - "source": [ - "<a id='toc6_2_2__'></a>\n", - "\n", - "#### Plan adherence score\n", - "\n", - "Let's check whether our banking agent follows the plan it created. Deviations lower this score and indicate gaps between reasoning and execution." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "4124a7c2", - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_dataset.assign_scores(\n", - " metrics = \"validmind.scorers.llm.deepeval.PlanAdherence\",\n", - " input_column = \"input\",\n", - " model = vm_banking_model,\n", - ")\n", - "vm_test_dataset._df[['banking_agent_model_PlanAdherence_score','banking_agent_model_PlanAdherence_reason']]" - ] - }, - { - "cell_type": "markdown", - "id": "82e5e6f1", - "metadata": {}, - "source": [ - "<a id='toc6_3__'></a>\n", - "\n", - "### Assign action scores\n", - "\n", - "*Action* assesses tool usage and argument generation:\n", - "\n", - "- **Tool correctness** – Whether the agent selects and calls the right tools.\n", - "- **Argument correctness** – Whether the agent generates correct tool arguments." - ] - }, - { - "cell_type": "markdown", - "id": "e641c9f2", - "metadata": {}, - "source": [ - "<a id='toc6_3_1__'></a>\n", - "\n", - "#### Tool correctness score\n", - "\n", - "Let's evaluate if our banking agent selects the appropriate tool for the task. Choosing the wrong tool reduces performance even if reasoning was correct." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8d2e8a25", - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_dataset.assign_scores(\n", - " metrics = \"validmind.scorers.llm.deepeval.ToolCorrectness\",\n", - " input_column = \"input\",\n", - " model = vm_banking_model,\n", - " expected_tools_called_column = \"expected_tools\",\n", - " actual_tools_called_column = \"banking_agent_model_tool_called\",\n", - ")\n", - "vm_test_dataset._df[['banking_agent_model_ToolCorrectness_score','banking_agent_model_ToolCorrectness_reason']]" - ] - }, - { - "cell_type": "markdown", - "id": "dd758ba5", - "metadata": {}, - "source": [ - "<a id='toc6_3_2__'></a>\n", - "\n", - "#### Argument correctness score\n", - "\n", - "Let's assesses whether our banking agent provides correct inputs or arguments to the selected tool. Incorrect arguments can lead to failed or unexpected results." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "04f90489", - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_dataset.assign_scores(\n", - " metrics = \"validmind.scorers.llm.deepeval.ArgumentCorrectness\",\n", - " input_column = \"input\",\n", - " model = vm_banking_model,\n", - " actual_tools_called_column = \"banking_agent_model_tool_called\",\n", - ")\n", - "vm_test_dataset._df[['banking_agent_model_ArgumentCorrectness_score','banking_agent_model_ArgumentCorrectness_reason']]" - ] - }, - { - "cell_type": "markdown", - "id": "1aeec2f5", - "metadata": {}, - "source": [ - "<a id='toc6_4__'></a>\n", - "\n", - "### Assign execution score\n", - "\n", - "*Execution* measures end-to-end performance:\n", - "\n", - "- **Task completion** – Whether the agent successfully completes the intended task." 
- ] - }, - { - "cell_type": "markdown", - "id": "eb9ab8de", - "metadata": {}, - "source": [ - "<a id='toc6_4_1__'></a>\n", - "\n", - "#### Task completion score\n", - "\n", - "Let's evaluate whether our banking agent successfully completes the requested tasks. Incomplete task execution can lead to user dissatisfaction and failed banking operations." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "05024f1f", - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_dataset.assign_scores(\n", - " metrics = \"validmind.scorers.llm.deepeval.TaskCompletion\",\n", - " input_column = \"input\",\n", - " model = vm_banking_model,\n", - " actual_tools_called_column = \"banking_agent_model_tool_called\",\n", - ")\n", - "vm_test_dataset._df[['banking_agent_model_TaskCompletion_score','banking_agent_model_TaskCompletion_reason']]" - ] - }, - { - "cell_type": "markdown", - "id": "b577c282", - "metadata": {}, - "source": [ - "As you recall from the beginning of this section, when we run scorers through `assign_scores()`, the return values are automatically processed and added as new columns with the format `{scorer_name}_{metric_name}`. 
Note that the task completion scorer has added a new column `TaskCompletion_score` to our dataset.\n", - "\n", - "We'll use this column to visualize the distribution of task completion scores across our test cases through the [BoxPlot test](https://docs.validmind.ai/validmind/validmind/tests/plots/BoxPlot.html#boxplot):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7f6d08ca", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.plots.BoxPlot\",\n", - " inputs={\"dataset\": vm_test_dataset},\n", - " params={\n", - " \"columns\": \"banking_agent_model_TaskCompletion_score\",\n", - " \"title\": \"Distribution of Task Completion Scores\",\n", - " \"ylabel\": \"Score\",\n", - " \"figsize\": (8, 6)\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "30d9ec62", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Running RAGAS tests\n", - "\n", - "Next, let's run some out-of-the-box *Retrieval-Augmented Generation Assessment* (RAGAS) tests available in the ValidMind Library. RAGAS provides specialized metrics for evaluating retrieval-augmented generation systems and conversational AI agents. These metrics analyze different aspects of agent performance by assessing how well systems integrate retrieved information with generated responses.\n", - "\n", - "Our banking agent uses tools to retrieve information and generates responses based on that context, making it similar to a RAG system. RAGAS metrics help evaluate the quality of this integration by analyzing the relationship between retrieved tool outputs, user queries, and generated responses.\n", - "\n", - "These tests provide insights into how well our banking agent integrates tool usage with conversational abilities, ensuring it provides accurate, relevant, and helpful responses to banking users while maintaining fidelity to retrieved information." 
- ] - }, - { - "cell_type": "markdown", - "id": "8288f6c3", - "metadata": {}, - "source": [ - "<a id='toc7_1__'></a>\n", - "\n", - "### Identify relevant RAGAS tests\n", - "\n", - "Let's explore some of ValidMind's available tests. Using ValidMind’s repository of tests streamlines your development testing, and helps you ensure that your records are being documented and evaluated appropriately.\n", - "\n", - "You can pass `tasks` and `tags` as parameters to the [`vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to filter the tests based on the tags and task types:\n", - "\n", - "- **`tasks`** represent the kind of modeling task associated with a test. Here we'll focus on `text_qa` tasks.\n", - "- **`tags`** are free-form descriptions providing more details about the test, for example, what category the test falls into. Here we'll focus on the `ragas` tag.\n", - "\n", - "We'll then run three of these tests returned as examples below." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0701f5a9", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(task=\"text_qa\", tags=[\"ragas\"])" - ] - }, - { - "cell_type": "markdown", - "id": "2ce24ba0", - "metadata": {}, - "source": [ - "<a id='toc7_1_1__'></a>\n", - "\n", - "#### Faithfulness\n", - "\n", - "Let's evaluate whether the banking agent's responses accurately reflect the information retrieved from tools. Unfaithful responses can misreport credit analysis, financial calculations, and compliance results—undermining user trust in the banking agent." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "92044533", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.model_validation.ragas.Faithfulness\",\n", - " inputs={\"dataset\": vm_test_dataset},\n", - " param_grid={\n", - " \"user_input_column\": [\"input\"],\n", - " \"response_column\": [\"banking_agent_model_prediction\"],\n", - " \"retrieved_contexts_column\": [\"banking_agent_model_tool_messages\"],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "4d1fcfcd", - "metadata": {}, - "source": [ - "<a id='toc7_1_2__'></a>\n", - "\n", - "#### Response Relevancy\n", - "\n", - "Let's evaluate whether the banking agent's answers address the user's original question or request. Irrelevant or off-topic responses can frustrate users and fail to deliver the banking information they need." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d7483bc3", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.model_validation.ragas.ResponseRelevancy\",\n", - " inputs={\"dataset\": vm_test_dataset},\n", - " params={\n", - " \"user_input_column\": \"input\",\n", - " \"response_column\": \"banking_agent_model_prediction\",\n", - " \"retrieved_contexts_column\": \"banking_agent_model_tool_messages\",\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "38c1dfb5", - "metadata": {}, - "source": [ - "<a id='toc7_1_3__'></a>\n", - "\n", - "#### Context Recall\n", - "\n", - "Let's evaluate how well the banking agent uses the information retrieved from tools when generating its responses. Poor context recall can lead to incomplete or underinformed answers even when the right tools were selected." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e5dc00ce", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.model_validation.ragas.ContextRecall\",\n", - " inputs={\"dataset\": vm_test_dataset},\n", - " param_grid={\n", - " \"user_input_column\": [\"input\"],\n", - " \"retrieved_contexts_column\": [\"banking_agent_model_tool_messages\"],\n", - " \"reference_column\": [\"banking_agent_model_prediction\"],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "95e1e16a", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Running safety tests\n", - "\n", - "Finally, let's run some out-of-the-box *safety* tests available in the ValidMind Library. Safety tests provide specialized metrics for evaluating whether AI agents operate reliably and securely. These metrics analyze different aspects of agent behavior by assessing adherence to safety guidelines, consistency of outputs, and resistance to harmful or inappropriate requests.\n", - "\n", - "Our banking agent handles sensitive financial information and user requests, making safety and reliability essential. Safety tests help evaluate whether the agent maintains appropriate boundaries, responds consistently and correctly to inputs, and avoids generating harmful, biased, or unprofessional content.\n", - "\n", - "These tests provide insights into how well our banking agent upholds standards of fairness and professionalism, ensuring it operates reliably and securely for banking users." - ] - }, - { - "cell_type": "markdown", - "id": "e0972afa", - "metadata": {}, - "source": [ - "<a id='toc8_1_1__'></a>\n", - "\n", - "#### AspectCritic\n", - "\n", - "Let's evaluate our banking agent's responses across multiple quality dimensions — conciseness, coherence, correctness, harmfulness, and maliciousness. 
Weak performance on these dimensions can degrade user experience, fall short of professional banking standards, or introduce safety risks. \n", - "\n", - "We'll use the `AspectCritic` we identified earlier:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "148daa2b", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.model_validation.ragas.AspectCritic\",\n", - " inputs={\"dataset\": vm_test_dataset},\n", - " param_grid={\n", - " \"user_input_column\": [\"input\"],\n", - " \"response_column\": [\"banking_agent_model_prediction\"],\n", - " \"retrieved_contexts_column\": [\"banking_agent_model_tool_messages\"],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "16f29c8d", - "metadata": {}, - "source": [ - "<a id='toc8_1_2__'></a>\n", - "\n", - "#### Bias\n", - "\n", - "Let's evaluate whether our banking agent's prompts contain unintended biases that could affect banking decisions. Biased prompts can lead to unfair or discriminatory outcomes — undermining customer trust and exposing the institution to compliance risk.\n", - "\n", - "We'll first use `list_tests()` again to filter for tests relating to `prompt_validation`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "74eba86c", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(filter=\"prompt_validation\")" - ] - }, - { - "cell_type": "markdown", - "id": "e9413803", - "metadata": {}, - "source": [ - "And then run the identified `Bias` test:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "062cf8e7", - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.prompt_validation.Bias\",\n", - " inputs={\n", - " \"model\": vm_banking_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "8f3f2dbe", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can 
look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation." - ] - }, - { - "cell_type": "markdown", - "id": "8716165d", - "metadata": {}, - "source": [ - "<a id='toc9_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - " What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "3. Click into any section related to the tests we ran in this notebook, for example: **4.3. Prompt Evaluation** to review the results of the tests we logged." - ] - }, - { - "cell_type": "markdown", - "id": "7c4a78ce", - "metadata": {}, - "source": [ - "<a id='toc9_2__'></a>\n", - "\n", - "### Customize the banking agent for your use case\n", - "\n", - "You've now built an agentic AI system designed for banking use cases that supports compliance with supervisory guidance such as SR 26-2 and SS1/23. While SR 26-2 explicitly excludes generative and agentic AI from its scope, underlying principles — materiality, ongoing monitoring, and effective challenge — still apply to governance of these systems. The example covers credit and fraud risk assessment for both retail and commercial banking. 
Extend this example agent to real-world banking scenarios and production deployment by:\n", - "\n", - "- Adapting the banking tools to your organization's specific requirements\n", - "- Adding more banking scenarios and edge cases to your test set\n", - "- Connecting the agent to your banking systems and databases\n", - "- Implementing additional banking-specific tools and workflows" - ] - }, - { - "cell_type": "markdown", - "id": "7f9385d3", - "metadata": {}, - "source": [ - "<a id='toc9_3__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "Learn more about the ValidMind Library tools we used in this notebook:\n", - "\n", - "- [Custom prompts](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/customize_test_result_descriptions.html)\n", - "- [Custom tests](https://docs.validmind.ai/notebooks/how_to/tests/custom_tests/implement_custom_tests.html)\n", - "- [ValidMind scorers](https://docs.validmind.ai/notebooks/how_to/scoring/assign_scores_complete_tutorial.html)\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "id": "fdd5c0db", - "metadata": {}, - "source": [ - "<a id='toc10__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9733adff", - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "id": "829429fd", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "id": "55339760", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-b9e82bcf4e364c4f8e5ae4bb0e4b2865", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-1QuffXMV-py3.11", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.9" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document an agentic AI system\n", + "\n", + "Build and document an agentic AI system with the ValidMind Library. Construct a LangGraph-based banking agent, assign AI evaluation metric scores to your agent, and run accuracy, RAGAS, and safety tests, then log those test results to the ValidMind Platform.\n", + "\n", + "An _AI agent_ is an autonomous system that interprets inputs, selects from available tools or actions, and executes multi-step behaviors to achieve defined goals. 
In this notebook, the agent acts as a banking assistant that analyzes user requests and automatically selects and invokes the appropriate specialized banking tool to deliver accurate, compliant, and actionable responses.\n", + "\n", + "- This agent enables financial institutions to automate complex banking workflows where different customer requests require different specialized tools and knowledge bases.\n", + "- Effective validation of agentic AI systems reduces the risks of agents misinterpreting inputs, failing to extract required parameters, or producing incorrect assessments or actions — such as selecting the wrong tool.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For the LLM components in this notebook to function properly, you'll need access to OpenAI.</b></span>\n", + "<br></br>\n", + "Before you continue, ensure that a valid <code>OPENAI_API_KEY</code> is set in your <code>.env</code> file.</div>" + ], + "id": "eee6b64c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the documentation template](#toc2_2_4__) \n", + " - [Verify OpenAI API access](#toc2_3__) \n", + " - [Initialize the Python environment](#toc2_4__) \n", + "- [Building the LangGraph agent](#toc3__) \n", + " - [Test available banking 
tools](#toc3_1__) \n", + " - [Create LangGraph banking agent](#toc3_2__) \n", + " - [Define system prompt](#toc3_2_1__) \n", + " - [Initialize the LLM](#toc3_2_2__) \n", + " - [Define agent state structure](#toc3_2_3__) \n", + " - [Create agent workflow function](#toc3_2_4__) \n", + " - [Instantiate the banking agent](#toc3_2_5__) \n", + " - [Integrate agent with ValidMind](#toc3_3__) \n", + " - [Import ValidMind components](#toc3_3_1__) \n", + " - [Create agent wrapper function](#toc3_3_2__) \n", + " - [Initialize the ValidMind model object](#toc3_3_3__) \n", + " - [Store the agent reference](#toc3_3_4__) \n", + " - [Verify integration](#toc3_3_5__) \n", + " - [Validate the system prompt](#toc3_4__) \n", + "- [Initializing the ValidMind dataset](#toc4__) \n", + " - [Assign predictions](#toc4_1__) \n", + "- [Running accuracy tests](#toc5__) \n", + " - [Response accuracy test](#toc5_1__) \n", + " - [Tool selection accuracy test](#toc5_2__) \n", + "- [Assigning AI evaluation metric scores](#toc6__) \n", + " - [Identify relevant DeepEval scorers](#toc6_1__) \n", + " - [Assign reasoning scores](#toc6_2__) \n", + " - [Plan quality score](#toc6_2_1__) \n", + " - [Plan adherence score](#toc6_2_2__) \n", + " - [Assign action scores](#toc6_3__) \n", + " - [Tool correctness score](#toc6_3_1__) \n", + " - [Argument correctness score](#toc6_3_2__) \n", + " - [Assign execution score](#toc6_4__) \n", + " - [Task completion score](#toc6_4_1__) \n", + "- [Running RAGAS tests](#toc7__) \n", + " - [Identify relevant RAGAS tests](#toc7_1__) \n", + " - [Faithfulness](#toc7_1_1__) \n", + " - [Response Relevancy](#toc7_1_2__) \n", + " - [Context Recall](#toc7_1_3__) \n", + "- [Running safety tests](#toc8__) \n", + " - [AspectCritic](#toc8_1_1__) \n", + " - [Bias](#toc8_1_2__) \n", + "- [Next steps](#toc9__) \n", + " - [Work with your model documentation](#toc9_1__) \n", + " - [Customize the banking agent for your use case](#toc9_2__) \n", + " - [Discover more learning 
resources](#toc9_3__) \n", + "- [Upgrade ValidMind](#toc10__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "30927b2b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models. \n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators." + ], + "id": "b58139db" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." 
+ ], + "id": "7e30d36b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ], + "id": "1cba586e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ], + "id": "5c46f003" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ], + "id": "11a2d7a5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.9 <= x <= 3.14</div>\n", + "\n", + "Let's begin by installing the ValidMind Library with large language model (LLM) support:" + ], + "id": "fbab0edf" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q \"validmind[llm]\" \"langgraph==0.3.21\"" + ], + "execution_count": null, + "outputs": [], + "id": "1982a118" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "14578e26" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook.\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. 
Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ], + "id": "83d47d89" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Agentic AI`.\n", + "\n", + "3. Click **Use Template** to apply the template." 
+ ], + "id": "bb2c5670" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Can't select this template?</b></span>\n", + "<br></br>\n", + "Your organization administrators may need to add it to your template library:\n", + "<ul>\n", + "<li><a href=\"agentic_ai_template.yaml\" style=\"color: #DE257E;\"><b>Download Template YAML</b></a></li>\n", + "<li><a href=\"https://docs.validmind.ai/guide/templates/customize-document-templates.html\" style=\"color: #DE257E;\"><b>Customize Document Templates</b></a></li>\n", + "</ul>\n", + "</div>" + ], + "id": "98e475c1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "0d1a13ca" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "d6ccbefc" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ], + "id": "3605df4f" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "dffdaa6f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Verify OpenAI API access\n", + "\n", + "Verify that a valid `OPENAI_API_KEY` is set in your `.env` file:" + ], + "id": "d467c1d2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load environment variables if using .env file\n", + "try:\n", + " from dotenv import load_dotenv\n", + " load_dotenv()\n", + "except ImportError:\n", + " print(\"dotenv not installed. Make sure OPENAI_API_KEY is set in your environment.\")" + ], + "execution_count": null, + "outputs": [], + "id": "22cc39cb" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Let's import all the necessary libraries to prepare for building our banking LangGraph agentic system:\n", + "\n", + "- **Standard libraries** for data handling and environment management.\n", + "- **pandas**, a Python library for data manipulation and analytics, imported under the alias `pd`. We'll also configure pandas to show all columns and all rows at full width for easier debugging and inspection.\n", + "- **LangChain** components for LLM integration and tool management.\n", + "- **LangGraph** for building stateful, multi-step agent workflows.\n", + "- **Banking tools** for specialized financial services as defined in [banking_tools.py](banking_tools.py)."
+ ], + "id": "b56c3f39" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from typing import TypedDict, Annotated, Sequence\n", + "\n", + "from langchain_core.messages import BaseMessage, HumanMessage, SystemMessage\n", + "from langchain_openai import ChatOpenAI\n", + "from langgraph.checkpoint.memory import MemorySaver\n", + "from langgraph.graph import StateGraph, END, START\n", + "from langgraph.graph.message import add_messages\n", + "from langgraph.prebuilt import ToolNode\n", + "\n", + "# LOCAL IMPORTS FROM banking_tools.py\n", + "from banking_tools import AVAILABLE_TOOLS\n", + "\n", + "import pandas as pd\n", + "# Configure pandas to show all columns and all rows at full width\n", + "pd.set_option('display.max_columns', None)\n", + "pd.set_option('display.max_colwidth', None)\n", + "pd.set_option('display.width', None)\n", + "pd.set_option('display.max_rows', None)" + ], + "execution_count": null, + "outputs": [], + "id": "2058d1ac" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Building the LangGraph agent" + ], + "id": "cc1d3265" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Test available banking tools\n", + "\n", + "We'll use the demo banking tools defined in `banking_tools.py` that provide use cases of financial services:\n", + "\n", + "- **Credit Risk Analyzer** - Loan applications and credit decisions\n", + "- **Customer Account Manager** - Account services and customer support\n", + "- **Fraud Detection System** - Security and fraud prevention" + ], + "id": "a3c421c4" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(f\"Available tools: {len(AVAILABLE_TOOLS)}\")\n", + "print(\"\\nTool Details:\")\n", + "for i, tool in enumerate(AVAILABLE_TOOLS, 1):\n", + " print(f\" - {tool.name}\")" + ], + "execution_count": null, + "outputs": [], + "id": "1e0a120c" + }, + { + "cell_type": "markdown", 
+ "metadata": {}, + "source": [ + "Let's test each banking tool individually to ensure they're working correctly before integrating them into our agent:" + ], + "id": "53906630" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Test 1: Credit Risk Analyzer\n", + "print(\"TEST 1: Credit Risk Analyzer\")\n", + "print(\"-\" * 40)\n", + "try:\n", + " # Access the underlying function using .func\n", + " credit_result = AVAILABLE_TOOLS[0].func(\n", + " customer_income=75000,\n", + " customer_debt=1200,\n", + " credit_score=720,\n", + " loan_amount=50000,\n", + " loan_type=\"personal\"\n", + " )\n", + " print(credit_result)\n", + " print(\"Credit Risk Analyzer test PASSED\")\n", + "except Exception as e:\n", + " print(f\"Credit Risk Analyzer test FAILED: {e}\")\n", + "\n", + "print(\"\" + \"=\" * 60)" + ], + "execution_count": null, + "outputs": [], + "id": "dc0caff2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "# Test 2: Customer Account Manager\n", + "print(\"TEST 2: Customer Account Manager\")\n", + "print(\"-\" * 40)\n", + "try:\n", + " # Test checking balance\n", + " account_result = AVAILABLE_TOOLS[1].func(\n", + " account_type=\"checking\",\n", + " customer_id=\"12345\",\n", + " action=\"check_balance\"\n", + " )\n", + " print(account_result)\n", + "\n", + " # Test getting account info\n", + " info_result = AVAILABLE_TOOLS[1].func(\n", + " account_type=\"all\",\n", + " customer_id=\"12345\", \n", + " action=\"get_info\"\n", + " )\n", + " print(info_result)\n", + " print(\"Customer Account Manager test PASSED\")\n", + "except Exception as e:\n", + " print(f\"Customer Account Manager test FAILED: {e}\")\n", + "\n", + "print(\"\" + \"=\" * 60)" + ], + "execution_count": null, + "outputs": [], + "id": "b6b227db" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "# Test 3: Fraud Detection System\n", + "print(\"TEST 3: Fraud Detection System\")\n", + "print(\"-\" * 40)\n", + "try:\n", + " 
fraud_result = AVAILABLE_TOOLS[2].func(\n", + " transaction_id=\"TX123\",\n", + " customer_id=\"12345\",\n", + " transaction_amount=500.00,\n", + " transaction_type=\"withdrawal\",\n", + " location=\"Miami, FL\",\n", + " device_id=\"DEVICE_001\"\n", + " )\n", + " print(fraud_result)\n", + " print(\"Fraud Detection System test PASSED\")\n", + "except Exception as e:\n", + " print(f\"Fraud Detection System test FAILED: {e}\")\n", + "\n", + "print(\"\" + \"=\" * 60)" + ], + "execution_count": null, + "outputs": [], + "id": "a983b30d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Create LangGraph banking agent\n", + "\n", + "With our tools ready to go, we'll create our intelligent banking agent with LangGraph that automatically selects and uses the appropriate banking tool based on a user request." + ], + "id": "1424baed" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_1__'></a>\n", + "\n", + "#### Define system prompt\n", + "\n", + "We'll begin by defining our system prompt, which provides the LLM with context about its role as a banking assistant and guidance on when to use each available tool:" + ], + "id": "3469d656" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "# Enhanced banking system prompt with tool selection guidance\n", + "system_context = \"\"\"You are a professional banking AI assistant with access to specialized banking tools.\n", + " Analyze the user's banking request and directly use the most appropriate tools to help them.\n", + " \n", + " AVAILABLE BANKING TOOLS:\n", + " \n", + " credit_risk_analyzer - Analyze credit risk for loan applications and credit decisions\n", + " - Use for: loan applications, credit assessments, risk analysis, mortgage eligibility\n", + " - Examples: \"Analyze credit risk for $50k personal loan\", \"Assess mortgage eligibility for $300k home purchase\"\n", + " - Parameters: customer_income, 
customer_debt, credit_score, loan_amount, loan_type\n", + "\n", + " customer_account_manager - Manage customer accounts and provide banking services\n", + " - Use for: account information, transaction processing, product recommendations, customer service\n", + " - Examples: \"Check balance for checking account 12345\", \"Recommend products for customer with high balance\"\n", + " - Parameters: account_type, customer_id, action, amount, account_details\n", + "\n", + " fraud_detection_system - Analyze transactions for potential fraud and security risks\n", + " - Use for: transaction monitoring, fraud prevention, risk assessment, security alerts\n", + " - Examples: \"Analyze fraud risk for $500 ATM withdrawal in Miami\", \"Check security for $2000 online purchase\"\n", + " - Parameters: transaction_id, customer_id, transaction_amount, transaction_type, location, device_id\n", + "\n", + " BANKING INSTRUCTIONS:\n", + " - Analyze the user's banking request carefully and identify the primary need\n", + " - If they need credit analysis → use credit_risk_analyzer\n", + " - If they need account services → use customer_account_manager\n", + " - If they need security analysis → use fraud_detection_system\n", + " - Extract relevant parameters from the user's request\n", + " - Provide helpful, accurate banking responses based on tool outputs\n", + " - Always consider banking regulations, risk management, and best practices\n", + " - Be professional and thorough in your analysis\n", + "\n", + " Choose and use tools wisely to provide the most helpful banking assistance.\n", + " Describe the response in user friendly manner with details describing the tool output. 
\n", + " Provide the response in at least 500 words.\n", + " Generate a concise execution plan for the banking request.\n", + " \"\"\"" + ], + "execution_count": null, + "outputs": [], + "id": "7971c427" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_2__'></a>\n", + "\n", + "#### Initialize the LLM\n", + "\n", + "Let's initialize the LLM that will power our banking agent:" + ], + "id": "b66c1ac4" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the main LLM for banking responses\n", + "main_llm = ChatOpenAI(\n", + " model=\"gpt-5-mini\",\n", + " reasoning={\n", + " \"effort\": \"low\",\n", + " \"summary\": \"auto\"\n", + " }\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "866066e7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then bind the available banking tools to the LLM, enabling the model to automatically recognize and invoke each tool when appropriate based on request input and the system prompt we defined above:" + ], + "id": "8220afd6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Bind all banking tools to the main LLM\n", + "llm_with_tools = main_llm.bind_tools(AVAILABLE_TOOLS)" + ], + "execution_count": null, + "outputs": [], + "id": "906d8132" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_3__'></a>\n", + "\n", + "#### Define agent state structure\n", + "\n", + "The agent state defines the data structure that flows through the LangGraph workflow. 
It includes:\n", + "\n", + "- **messages** — The conversation history between the user and agent\n", + "- **user_input** — The current user request\n", + "- **session_id** — A unique identifier for the conversation session\n", + "- **context** — Additional context that can be passed between nodes\n", + "\n", + "Defining this state structure keeps the shape of the data consistent throughout the agent's execution and allows for multi-turn conversations with memory:" + ], + "id": "43f56651" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Banking Agent State Definition\n", + "class BankingAgentState(TypedDict):\n", + " messages: Annotated[Sequence[BaseMessage], add_messages]\n", + " user_input: str\n", + " session_id: str\n", + " context: dict" + ], + "execution_count": null, + "outputs": [], + "id": "6b926ddf" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_4__'></a>\n", + "\n", + "#### Create agent workflow function\n", + "\n", + "We'll build the LangGraph agent workflow with two main components:\n", + "\n", + "1. **LLM node** — Processes user requests, applies the system prompt, and decides whether to use tools.\n", + "2. **Tools node** — Executes the selected banking tools when the LLM determines they're needed.\n", + "\n", + "The workflow begins with the LLM analyzing the request. If the LLM requests tools, the tools node runs them and control returns to the LLM to generate the final response; otherwise, the workflow ends."
+ ], + "id": "387ba780" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def create_banking_langgraph_agent():\n", + " \"\"\"Create a comprehensive LangGraph banking agent with intelligent tool selection.\"\"\"\n", + " def llm_node(state: BankingAgentState) -> BankingAgentState:\n", + " \"\"\"Main LLM node that processes banking requests and selects appropriate tools.\"\"\"\n", + " messages = state[\"messages\"]\n", + " # Add system context to messages\n", + " enhanced_messages = [SystemMessage(content=system_context)] + list(messages)\n", + " # Get LLM response with tool selection\n", + " response = llm_with_tools.invoke(enhanced_messages)\n", + " return {\n", + " **state,\n", + " \"messages\": messages + [response]\n", + " }\n", + " \n", + " def should_continue(state: BankingAgentState) -> str:\n", + " \"\"\"Decide whether to use tools or end the conversation.\"\"\"\n", + " last_message = state[\"messages\"][-1]\n", + " # Check if the LLM wants to use tools\n", + " if hasattr(last_message, 'tool_calls') and last_message.tool_calls:\n", + " return \"tools\"\n", + " return END\n", + " \n", + " # Create the banking state graph\n", + " workflow = StateGraph(BankingAgentState)\n", + " # Add nodes\n", + " workflow.add_node(\"llm\", llm_node)\n", + " workflow.add_node(\"tools\", ToolNode(AVAILABLE_TOOLS))\n", + " # Simplified entry point - go directly to LLM\n", + " workflow.add_edge(START, \"llm\")\n", + " # From LLM, decide whether to use tools or end\n", + " workflow.add_conditional_edges(\n", + " \"llm\",\n", + " should_continue,\n", + " {\"tools\": \"tools\", END: END}\n", + " )\n", + " # Tool execution flows back to LLM for final response\n", + " workflow.add_edge(\"tools\", \"llm\")\n", + " # Set up memory\n", + " memory = MemorySaver()\n", + " # Compile the graph\n", + " agent = workflow.compile(checkpointer=memory)\n", + " return agent" + ], + "execution_count": null, + "outputs": [], + "id": "2c9bf585" + }, + { + "cell_type": "markdown", + 
"metadata": {}, + "source": [ + "<a id='toc3_2_5__'></a>\n", + "\n", + "#### Instantiate the banking agent\n", + "\n", + "Now, we'll create an instance of the banking agent by calling the workflow creation function.\n", + "\n", + "This compiled agent is ready to process banking requests and will automatically select and use the appropriate tools based on user queries:" + ], + "id": "765242e9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Create the banking intelligent agent\n", + "banking_agent = create_banking_langgraph_agent()\n", + "\n", + "print(\"Banking LangGraph Agent Created Successfully!\")\n", + "print(\"\\nFeatures:\")\n", + "print(\" - Intelligent banking tool selection\")\n", + "print(\" - Comprehensive banking system prompt\")\n", + "print(\" - Streamlined workflow: LLM → Tools → Response\")\n", + "print(\" - Automatic tool parameter extraction\")\n", + "print(\" - Professional banking assistance\")" + ], + "execution_count": null, + "outputs": [], + "id": "455b8ee4" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Integrate agent with ValidMind\n", + "\n", + "To integrate our LangGraph banking agent with ValidMind, we need to create a wrapper function that ValidMind can use to invoke the agent and extract the necessary information for testing and documentation, allowing ValidMind to run validation tests on the agent's behavior, tool usage, and responses." 
+ ], + "id": "e00dac77" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3_1__'></a>\n", + "\n", + "#### Import ValidMind components\n", + "\n", + "We'll start with importing the necessary ValidMind components for integrating our agent:\n", + "\n", + "- `Prompt` from `validmind.models` for handling prompt-based model inputs\n", + "- `extract_tool_calls_from_agent_output` and `_convert_to_tool_call_list` from `validmind.scorers.llm.deepeval` for extracting and converting tool calls from agent outputs" + ], + "id": "a124857e" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.models import Prompt\n", + "from validmind.scorers.llm.deepeval import extract_tool_calls_from_agent_output, _convert_to_tool_call_list\n", + "from deepeval.tracing import observe, update_current_span\n", + "from deepeval.test_case import LLMTestCase" + ], + "execution_count": null, + "outputs": [], + "id": "9aeb8969" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3_2__'></a>\n", + "\n", + "#### Create agent wrapper function\n", + "\n", + "We'll then create a wrapper function that:\n", + "\n", + "- Accepts input in ValidMind's expected format (with `input` and `session_id` fields)\n", + "- Invokes the banking agent with the proper state initialization\n", + "- Captures tool outputs and tool calls for evaluation\n", + "- Returns a standardized response format that includes the prediction, full output, tool messages, and tool call information\n", + "- Handles errors gracefully with fallback responses" + ], + "id": "ed72903f" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@observe(type=\"agent\")\n", + "def banking_agent_fn(input):\n", + " \"\"\"\n", + " Invoke the banking agent with the given input.\n", + " \"\"\"\n", + " try:\n", + " # Initial state for banking agent\n", + " initial_state = {\n", + " \"user_input\": input[\"input\"],\n", + " \"messages\": 
[HumanMessage(content=input[\"input\"])],\n", + " \"session_id\": input[\"session_id\"],\n", + " \"context\": {}\n", + " }\n", + " session_config = {\"configurable\": {\"thread_id\": input[\"session_id\"]}}\n", + " result = banking_agent.invoke(initial_state, config=session_config)\n", + "\n", + " from utils import capture_tool_output_messages\n", + "\n", + " # Capture all tool outputs and metadata\n", + " captured_data = capture_tool_output_messages(result)\n", + " \n", + " # Access specific tool outputs, this will be used for RAGAS tests\n", + " tool_message = \"\"\n", + " for output in captured_data[\"tool_outputs\"]:\n", + " tool_message += output['content']\n", + " \n", + " tool_calls_found = []\n", + " messages = result['messages']\n", + " for message in messages:\n", + " if hasattr(message, 'tool_calls') and message.tool_calls:\n", + " for tool_call in message.tool_calls:\n", + " # Handle both dictionary and object formats\n", + " if isinstance(tool_call, dict):\n", + " tool_calls_found.append(tool_call['name'])\n", + " else:\n", + " # ToolCall object - use attribute access\n", + " tool_calls_found.append(tool_call.name)\n", + "\n", + " prediction_text = result['messages'][-1].content[0]['text']\n", + " tools_called_value = _convert_to_tool_call_list(extract_tool_calls_from_agent_output(result))\n", + " expected_tools_value = _convert_to_tool_call_list(input.get(\"expected_tools\", []))\n", + "\n", + " # Feed trace data for DeepEval metrics (e.g. 
PlanQuality) that require tracing\n", + " update_current_span(\n", + " test_case=LLMTestCase(\n", + " input=input[\"input\"],\n", + " actual_output=prediction_text,\n", + " tools_called=tools_called_value,\n", + " expected_tools=expected_tools_value\n", + " )\n", + " )\n", + "\n", + " return {\n", + " \"prediction\": prediction_text,\n", + " \"output\": result,\n", + " \"tool_messages\": [tool_message],\n", + " # \"tool_calls\": tool_calls_found,\n", + " \"tool_called\": tools_called_value\n", + " }\n", + " except Exception as e:\n", + " # Return a fallback response if the agent fails\n", + " error_message = f\"\"\"I apologize, but I encountered an error while processing your banking request: {str(e)}.\n", + " Please try rephrasing your question or contact support if the issue persists.\"\"\"\n", + " return {\n", + " \"prediction\": error_message, \n", + " \"output\": {\n", + " \"messages\": [HumanMessage(content=input[\"input\"]), SystemMessage(content=error_message)],\n", + " \"error\": str(e)\n", + " }\n", + " }" + ], + "execution_count": null, + "outputs": [], + "id": "0e4d5a82" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3_3__'></a>\n", + "\n", + "#### Initialize the ValidMind model\n", + "\n", + "We'll also need to register the banking agent as a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with 
[`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model) which:\n", + "\n", + "- Associates the wrapper function with the model for prediction\n", + "- Stores the system prompt template for documentation\n", + "- Provides a unique `input_id` for tracking and identification\n", + "- Enables the agent to be used with ValidMind's testing and documentation features" + ], + "id": "fda87401" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the agent as a model\n", + "vm_banking_model = vm.init_model(\n", + " input_id=\"banking_agent_model\",\n", + " predict_fn=banking_agent_fn,\n", + " prompt=Prompt(template=system_context)\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "60a2ce7a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3_4__'></a>\n", + "\n", + "#### Store the agent reference\n", + "\n", + "We'll also store a reference to the original banking agent object in the ValidMind model. This allows us to access the full agent functionality directly if needed, while still maintaining the wrapper function interface for ValidMind's testing framework." 
+ ], + "id": "949bcf53" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Add the banking agent to the vm model\n", + "vm_banking_model.model = banking_agent" + ], + "execution_count": null, + "outputs": [], + "id": "2c653471" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3_5__'></a>\n", + "\n", + "#### Verify integration\n", + "\n", + "Let's confirm that the banking agent has been successfully integrated with ValidMind:" + ], + "id": "d8d0c1c1" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "print(\"Banking Agent Successfully Integrated with ValidMind!\")\n", + "print(f\"Model ID: {vm_banking_model.input_id}\")" + ], + "execution_count": null, + "outputs": [], + "id": "8e101b0f" + }, + { + "cell_type": "markdown", + "id": "2a5f874e", + "metadata": {}, + "source": [ + "<a id='toc3_4__'></a>\n", + "\n", + "### Validate the system prompt\n", + "\n", + "Let's get an initial sense of how well our defined system prompt meets a few best practices for prompt engineering by running a few tests — we'll run evaluation tests later on our agent's performance.\n", + "\n", + "You run individual tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module. 
Passing in our agentic model as an input, the tests below rate the prompt on a scale of 1-10 against the following criteria:\n", + "\n", + "- **prompt_validation.Clarity** — How clearly the prompt states the task.\n", + "- **prompt_validation.Conciseness** — How succinctly the prompt states the task.\n", + "- **prompt_validation.Delimitation** — Whether a complex prompt containing examples, contextual information, or other elements is formatted so that each element is clearly separated.\n", + "- **prompt_validation.NegativeInstruction** — Whether the prompt contains negative instructions.\n", + "- **prompt_validation.Specificity** — How specifically the prompt defines the task.\n" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.prompt_validation.Clarity\",\n", + " inputs={\n", + " \"model\": vm_banking_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "f52dceb1" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.prompt_validation.Conciseness\",\n", + " inputs={\n", + " \"model\": vm_banking_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "70d52333" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.prompt_validation.Delimitation\",\n", + " inputs={\n", + " \"model\": vm_banking_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "5aa89976" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.prompt_validation.NegativeInstruction\",\n", + " inputs={\n", + " \"model\": vm_banking_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "8630197e" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " 
\"validmind.prompt_validation.Specificity\",\n", + " inputs={\n", + " \"model\": vm_banking_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "bba99915" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Initializing the ValidMind dataset\n", + "\n", + "After validating our system prompt, let's import our sample dataset ([banking_test_dataset.py](banking_test_dataset.py)), which we'll use in the next section to evaluate our agent's performance across different banking scenarios:" + ], + "id": "51d61141" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from banking_test_dataset import banking_test_dataset" + ], + "execution_count": null, + "outputs": [], + "id": "0c70ca2c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The next step is to connect your data with a ValidMind `Dataset` object. **This step is necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", + "\n", + "Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. For this example, we'll pass in the following arguments:\n", + "\n", + "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", + "- **`text_column`** — The name of the column containing the text input data.\n", + "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset."
+ ], + "id": "442ab66d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_dataset = vm.init_dataset(\n", + " input_id=\"banking_test_dataset\",\n", + " dataset=banking_test_dataset,\n", + " text_column=\"input\",\n", + " target_column=\"possible_outputs\",\n", + ")\n", + "\n", + "print(\"Banking Test Dataset Initialized in ValidMind!\")\n", + "print(f\"Dataset ID: {vm_test_dataset.input_id}\")\n", + "print(f\"Dataset columns: {vm_test_dataset._df.columns}\")\n", + "vm_test_dataset._df" + ], + "execution_count": null, + "outputs": [], + "id": "a7e9d158" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Assign predictions\n", + "\n", + "Now that both the model object and the dataset have been registered, we'll assign predictions to capture the banking agent's responses for evaluation:\n", + "\n", + "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", + "- Here, the method links the banking agent's responses to our `vm_test_dataset` dataset.\n", + "\n", + "If no prediction values are passed, the method will compute predictions automatically:" + ], + "id": "7b01021c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_dataset.assign_predictions(vm_banking_model)\n", + "\n", + "print(\"Banking Agent Predictions Generated Successfully!\")\n", + "print(f\"Predictions assigned to {len(vm_test_dataset._df)} test cases\")\n", + "vm_test_dataset._df.head()" + ], + "execution_count": null, + "outputs": [], + "id": "1d462663" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Running accuracy tests\n", + "\n", + "Using [`@vm.test`](https://docs.validmind.ai/validmind/validmind.html#test), let's implement some reusable custom 
*inline tests* to assess the accuracy of our banking agent:\n", + "\n", + "- An inline test refers to a test written and executed within the same environment as the code being tested — in this case, right in this Jupyter Notebook — without requiring a separate test file or framework.\n", + "- You'll note that the custom test functions are just regular Python functions that can include and require any Python library as you see fit." + ], + "id": "4e56f556" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Response accuracy test\n", + "\n", + "We'll create a custom test that evaluates the banking agent's ability to provide accurate responses by:\n", + "\n", + "- Testing against a dataset of predefined banking questions and expected answers.\n", + "- Checking if responses contain expected keywords and banking terminology.\n", + "- Providing detailed test results including pass/fail status.\n", + "- Helping identify any gaps in the agent's banking knowledge or response quality." + ], + "id": "1bce9258" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "@vm.test(\"my_custom_tests.banking_accuracy_test\")\n", + "def banking_accuracy_test(model, dataset, list_of_columns):\n", + " \"\"\"\n", + " The Banking Accuracy Test evaluates whether the agent’s responses include \n", + " critical domain-specific keywords and phrases that indicate accurate, compliant,\n", + " and contextually appropriate banking information. 
This test ensures that the agent\n", + " provides responses containing the expected banking terminology, risk classifications,\n", + " account details, or other domain-relevant information required for regulatory compliance,\n", + " customer safety, and operational accuracy.\n", + " \"\"\"\n", + " df = dataset._df\n", + " \n", + " # Pre-compute responses for all tests\n", + " y_true = dataset.y.tolist()\n", + " y_pred = dataset.y_pred(model).tolist()\n", + "\n", + " # Check each response for at least one of its expected keywords\n", + " test_results = []\n", + " for response, keywords in zip(y_pred, y_true):\n", + " # Convert keywords to list if not already a list\n", + " if not isinstance(keywords, list):\n", + " keywords = [keywords]\n", + " test_results.append(any(str(keyword).lower() in str(response).lower() for keyword in keywords))\n", + " \n", + " results = pd.DataFrame()\n", + " column_names = [col + \"_details\" for col in list_of_columns]\n", + " results[column_names] = df[list_of_columns]\n", + " results[\"actual\"] = y_pred\n", + " results[\"expected\"] = y_true\n", + " results[\"passed\"] = test_results\n", + " # Record an error message per row, only for rows that failed\n", + " results[\"error\"] = [None if passed else f'Response did not contain any expected keywords: {expected}' for passed, expected in zip(test_results, y_true)]\n", + " \n", + " return results" + ], + "execution_count": null, + "outputs": [], + "id": "90232066" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now that we've defined our custom response accuracy test, we can run it with the same `run_test()` function we used earlier to validate the system prompt, this time passing in our sample dataset and agentic model as inputs, and then log the test results to the ValidMind Platform with the [`log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#log):" + ], + "id": "2a7f71f8" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.tests.run_test(\n", + " \"my_custom_tests.banking_accuracy_test\",\n", + " inputs={\n", + " \"dataset\": vm_test_dataset,\n", + " \"model\": 
vm_banking_model\n", + " },\n", + " params={\n", + " \"list_of_columns\": [\"input\"]\n", + " }\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "e68884d5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's review the first five rows of the test dataset to see how well the banking agent performed. Each column in the output serves a specific purpose in evaluating agent performance:\n", + "\n", + "| Column header | Description | Importance |\n", + "|--------------|-------------|------------|\n", + "| **`input`** | Original user query or request | Essential for understanding the context of each test case and tracing which inputs led to specific agent behaviors. |\n", + "| **`expected_tools`** | Banking tools that should be invoked for this request | Enables validation of correct tool selection, which is critical for agentic AI systems where choosing the right tool is a key success metric. |\n", + "| **`expected_output`** | Expected output or keywords that should appear in the response | Defines the success criteria for each test case, enabling objective evaluation of whether the agent produced the correct result. |\n", + "| **`session_id`** | Unique identifier for each test session | Allows tracking and correlation of related test runs, debugging specific sessions, and maintaining audit trails. |\n", + "| **`category`** | Classification of the request type | Helps organize test results by domain and identify performance patterns across different banking use cases. |\n", + "| **`banking_agent_model_output`** | Complete agent response including all messages and reasoning | Allows you to examine the full output to assess response quality, completeness, and correctness beyond just keyword matching. 
|\n", + "| **`banking_agent_model_tool_messages`** | Messages exchanged with the banking tools | Critical for understanding how the agent interacted with tools, what parameters were passed, and what tool outputs were received. |\n", + "| **`banking_agent_model_tool_called`** | Specific tool that was invoked | Enables validation that the agent selected the correct tool for each request, which is fundamental to agentic AI validation. |\n", + "| **`possible_outputs`** | Alternative valid outputs or keywords that could appear in the response | Provides flexibility in evaluation by accounting for multiple acceptable response formats or variations. |" + ], + "id": "94a717e7" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_dataset.df.head(5)" + ], + "execution_count": null, + "outputs": [], + "id": "78f7edb1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Tool selection accuracy test\n", + "\n", + "We'll also create a custom test that evaluates the banking agent's ability to select the correct tools for different requests by:\n", + "\n", + "- Testing against a dataset of predefined banking queries with expected tool selections.\n", + "- Comparing the tools actually invoked by the agent against the expected tools for each request.\n", + "- Providing quantitative accuracy scores that measure the proportion of expected tools correctly selected.\n", + "- Helping identify gaps in the agent's understanding of user needs and tool selection logic." + ], + "id": "1cb3e8bd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "First, we'll define a helper function that extracts tool calls from the agent's messages and compares them against the expected tools. 
This function handles different message formats (dictionary or object) and calculates accuracy scores:" + ], + "id": "69263d62" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def validate_tool_calls_simple(messages, expected_tools):\n", + " \"\"\"Simple validation of tool calls without RAGAS dependency issues.\"\"\"\n", + " \n", + " tool_calls_found = []\n", + " \n", + " for message in messages:\n", + " if hasattr(message, 'tool_calls') and message.tool_calls:\n", + " for tool_call in message.tool_calls:\n", + " # Handle both dictionary and object formats\n", + " if isinstance(tool_call, dict):\n", + " tool_calls_found.append(tool_call['name'])\n", + " else:\n", + " # ToolCall object - use attribute access\n", + " tool_calls_found.append(tool_call.name)\n", + " \n", + " # Check if expected tools were called\n", + " accuracy = 0.0\n", + " matches = 0\n", + " if expected_tools:\n", + " matches = sum(1 for tool in expected_tools if tool in tool_calls_found)\n", + " accuracy = matches / len(expected_tools)\n", + " \n", + " return {\n", + " 'expected_tools': expected_tools,\n", + " 'found_tools': tool_calls_found,\n", + " 'matches': matches,\n", + " 'total_expected': len(expected_tools) if expected_tools else 0,\n", + " 'accuracy': accuracy,\n", + " }" + ], + "execution_count": null, + "outputs": [], + "id": "e68798be" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we'll define the main test function that uses the helper function to evaluate tool selection accuracy across all test cases in the dataset:" + ], + "id": "8f494fd3" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.BankingToolCallAccuracy\")\n", + "def BankingToolCallAccuracy(dataset, agent_output_column, expected_tools_column):\n", + " \"\"\"\n", + " Evaluates the tool selection accuracy of a LangGraph-powered banking agent.\n", + "\n", + " This test measures whether the agent correctly identifies and invokes the 
required banking tools\n", + " for each user query scenario.\n", + " For each case, the outputs generated by the agent (including its tool calls) are compared against an\n", + " expected set of tools. The test measures coverage only: it computes the proportion of\n", + " expected tools that the agent actually called for each instance.\n", + "\n", + " Parameters:\n", + " dataset (VMDataset): The dataset containing user queries, agent outputs, and ground-truth tool expectations.\n", + " agent_output_column (str): Dataset column name containing agent outputs (should include tool call details in 'messages').\n", + " expected_tools_column (str): Dataset column specifying the true expected tools (as lists).\n", + "\n", + " Returns:\n", + " List[dict]: Per-row dictionaries with details: expected tools, found tools, match count, total expected, and accuracy score.\n", + "\n", + " Purpose:\n", + " Provides diagnostic evidence of the banking agent's core reasoning ability—specifically, its capacity to\n", + " interpret user needs and select the correct banking actions. Useful for diagnosing gaps in tool coverage,\n", + " misclassifications, or breakdowns in agent logic.\n", + "\n", + " Interpretation:\n", + " - An accuracy of 1.0 signals perfect tool selection for that example.\n", + " - Lower scores may indicate partial or complete failures to invoke required tools.\n", + " - Review 'found_tools' vs. 
'expected_tools' to understand the source of discrepancies.\n", + "\n", + " Strengths:\n", + " - Directly tests a core capability of compositional tool-use agents.\n", + " - Framework-agnostic; robust to tool call output format (object or dict).\n", + " - Supports batch validation and result logging for systematic documentation.\n", + "\n", + " Limitations:\n", + " - Does not penalize extra, unnecessary tool calls.\n", + " - Does not assess result quality—only correct invocation.\n", + "\n", + " \"\"\"\n", + " df = dataset._df\n", + " \n", + " results = []\n", + " for i, row in df.iterrows():\n", + " result = validate_tool_calls_simple(row[agent_output_column]['messages'], row[expected_tools_column])\n", + " results.append(result)\n", + " \n", + " return results" + ], + "execution_count": null, + "outputs": [], + "id": "604d7313" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, we can call our function with `run_test()` and log the test results to the ValidMind Platform:" + ], + "id": "57ab606b" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.tests.run_test(\n", + " \"my_custom_tests.BankingToolCallAccuracy\",\n", + " inputs={\n", + " \"dataset\": vm_test_dataset,\n", + " },\n", + " params={\n", + " \"agent_output_column\": \"banking_agent_model_output\",\n", + " \"expected_tools_column\": \"expected_tools\"\n", + " }\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "dd14115e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Assigning AI evaluation metric scores\n", + "\n", + "*AI agent evaluation metrics* are specialized measurements designed to assess how well autonomous LLM-based agents reason, plan, select and execute tools, and ultimately complete user tasks by analyzing the *full execution trace* — including reasoning steps, tool calls, intermediate decisions, and outcomes, rather than just single 
input–output pairs. These metrics are essential because agent failures often occur in ways traditional LLM metrics miss — for example, choosing the right tool with wrong arguments, creating a good plan but not following it, or completing a task inefficiently.\n", + "\n", + "In this section, we'll score our banking agent's outputs in our sample dataset against metrics defined in [DeepEval’s AI agent evaluation framework](https://deepeval.com/guides/guides-ai-agent-evaluation-metrics), which breaks down AI agent evaluation into three layers with corresponding subcategories: **reasoning**, **action**, and **execution**.\n", + "\n", + "Together, these three layers enable granular diagnosis of agent behavior, help pinpoint where failures occur (reasoning, action, or execution), and support both development benchmarking and production monitoring." + ], + "id": "be8d5270" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### Identify relevant DeepEval scorers\n", + "\n", + "*Scorers* are evaluation metrics that analyze model outputs and store their results in the dataset:\n", + "\n", + "- Each scorer adds a new column to the dataset with format: `{scorer_name}_{metric_name}`\n", + "- The column contains the numeric score (typically `0`-`1`) for each example\n", + "- Multiple scorers can be run on the same dataset, each adding its own column\n", + "- Scores are persisted in the dataset for later analysis and visualization\n", + "- Common scorer patterns include:\n", + " - Model performance metrics (accuracy, F1, etc.)\n", + " - Output quality metrics (relevance, faithfulness)\n", + " - Task-specific metrics (completion, correctness)\n", + "\n", + "Use `list_scorers()` from [`validmind.scorers`](https://docs.validmind.ai/validmind/validmind/tests.html#scorer) to discover all available scoring methods and their IDs that can be used with `assign_scores()`. 
We'll filter these results to return only DeepEval scorers for our desired three metrics in a formatted table with descriptions:" + ], + "id": "25828bef" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load all DeepEval scorers\n", + "llm_scorers_dict = vm.tests.load._load_tests([s for s in vm.scorer.list_scorers() if \"deepeval\" in s.lower()])\n", + "\n", + "# Categorize scorers by metric layer\n", + "reasoning_scorers = {}\n", + "action_scorers = {}\n", + "execution_scorers = {}\n", + "\n", + "for scorer_id, scorer_func in llm_scorers_dict.items():\n", + " tags = getattr(scorer_func, \"__tags__\", [])\n", + " scorer_name = scorer_id.split(\".\")[-1]\n", + "\n", + " if \"reasoning_layer\" in tags:\n", + " reasoning_scorers[scorer_id] = scorer_func\n", + " elif \"action_layer\" in tags:\n", + " action_scorers[scorer_id] = scorer_func\n", + " elif \"TaskCompletion\" in scorer_name:\n", + " execution_scorers[scorer_id] = scorer_func\n", + "\n", + "# Display scorers by category\n", + "print(\"=\" * 80)\n", + "print(\"REASONING LAYER\")\n", + "print(\"=\" * 80)\n", + "if reasoning_scorers:\n", + " reasoning_df = vm.tests.load._pretty_list_tests(reasoning_scorers, truncate=True)\n", + " display(reasoning_df)\n", + "else:\n", + " print(\"No reasoning layer scorers found.\")\n", + "\n", + "print(\"\\n\" + \"=\" * 80)\n", + "print(\"ACTION LAYER\")\n", + "print(\"=\" * 80)\n", + "if action_scorers:\n", + " action_df = vm.tests.load._pretty_list_tests(action_scorers, truncate=True)\n", + " display(action_df)\n", + "else:\n", + " print(\"No action layer scorers found.\")\n", + "\n", + "print(\"\\n\" + \"=\" * 80)\n", + "print(\"EXECUTION LAYER\")\n", + "print(\"=\" * 80)\n", + "if execution_scorers:\n", + " execution_df = vm.tests.load._pretty_list_tests(execution_scorers, truncate=True)\n", + " display(execution_df)\n", + "else:\n", + " print(\"No execution layer scorers found.\")" + ], + "execution_count": null, + "outputs": [], + "id": "730c70ec" + 
}, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Assign reasoning scores\n", + "\n", + "*Reasoning* evaluates planning and strategy generation:\n", + "\n", + "- **Plan quality** – How logical, complete, and efficient the agent’s plan is.\n", + "- **Plan adherence** – Whether the agent follows its own plan during execution." + ], + "id": "e5fb739b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2_1__'></a>\n", + "\n", + "#### Plan quality score\n", + "\n", + "Let's measure how well our banking agent generates a plan before acting. A high score means the plan is logical, complete, and efficient." + ], + "id": "fde94d01" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_dataset.assign_scores(\n", + " metrics = \"validmind.scorers.llm.deepeval.PlanQuality\",\n", + " model = vm_banking_model,\n", + " input_column = \"input\",\n", + ")\n", + "vm_test_dataset._df[['banking_agent_model_PlanQuality_score','banking_agent_model_PlanQuality_reason']]" + ], + "execution_count": null, + "outputs": [], + "id": "52f362ba" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2_2__'></a>\n", + "\n", + "#### Plan adherence score\n", + "\n", + "Let's check whether our banking agent follows the plan it created. Deviations lower this score and indicate gaps between reasoning and execution." 
+ ], + "id": "d631fd12" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_dataset.assign_scores(\n", + " metrics = \"validmind.scorers.llm.deepeval.PlanAdherence\",\n", + " input_column = \"input\",\n", + " model = vm_banking_model,\n", + ")\n", + "vm_test_dataset._df[['banking_agent_model_PlanAdherence_score','banking_agent_model_PlanAdherence_reason']]" + ], + "execution_count": null, + "outputs": [], + "id": "4124a7c2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3__'></a>\n", + "\n", + "### Assign action scores\n", + "\n", + "*Action* assesses tool usage and argument generation:\n", + "\n", + "- **Tool correctness** – Whether the agent selects and calls the right tools.\n", + "- **Argument correctness** – Whether the agent generates correct tool arguments." + ], + "id": "82e5e6f1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3_1__'></a>\n", + "\n", + "#### Tool correctness score\n", + "\n", + "Let's evaluate if our banking agent selects the appropriate tool for the task. Choosing the wrong tool reduces performance even if reasoning was correct." 
+ ], + "id": "e641c9f2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_dataset.assign_scores(\n", + " metrics = \"validmind.scorers.llm.deepeval.ToolCorrectness\",\n", + " input_column = \"input\",\n", + " model = vm_banking_model,\n", + " expected_tools_called_column = \"expected_tools\",\n", + " actual_tools_called_column = \"banking_agent_model_tool_called\",\n", + ")\n", + "vm_test_dataset._df[['banking_agent_model_ToolCorrectness_score','banking_agent_model_ToolCorrectness_reason']]" + ], + "execution_count": null, + "outputs": [], + "id": "8d2e8a25" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3_2__'></a>\n", + "\n", + "#### Argument correctness score\n", + "\n", + "Let's assess whether our banking agent provides correct inputs or arguments to the selected tool. Incorrect arguments can lead to failed or unexpected results." + ], + "id": "dd758ba5" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_dataset.assign_scores(\n", + " metrics = \"validmind.scorers.llm.deepeval.ArgumentCorrectness\",\n", + " input_column = \"input\",\n", + " model = vm_banking_model,\n", + " actual_tools_called_column = \"banking_agent_model_tool_called\",\n", + ")\n", + "vm_test_dataset._df[['banking_agent_model_ArgumentCorrectness_score','banking_agent_model_ArgumentCorrectness_reason']]" + ], + "execution_count": null, + "outputs": [], + "id": "04f90489" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4__'></a>\n", + "\n", + "### Assign execution score\n", + "\n", + "*Execution* measures end-to-end performance:\n", + "\n", + "- **Task completion** – Whether the agent successfully completes the intended task." 
+ ], + "id": "1aeec2f5" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4_1__'></a>\n", + "\n", + "#### Task completion score\n", + "\n", + "Let's evaluate whether our banking agent successfully completes the requested tasks. Incomplete task execution can lead to user dissatisfaction and failed banking operations." + ], + "id": "eb9ab8de" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_dataset.assign_scores(\n", + " metrics = \"validmind.scorers.llm.deepeval.TaskCompletion\",\n", + " input_column = \"input\",\n", + " model = vm_banking_model,\n", + " actual_tools_called_column = \"banking_agent_model_tool_called\",\n", + ")\n", + "vm_test_dataset._df[['banking_agent_model_TaskCompletion_score','banking_agent_model_TaskCompletion_reason']]" + ], + "execution_count": null, + "outputs": [], + "id": "05024f1f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As you recall from the beginning of this section, when we run scorers through `assign_scores()`, the return values are automatically processed and added as new columns with the format `{scorer_name}_{metric_name}`. 
Note that the task completion scorer has added a new column, `banking_agent_model_TaskCompletion_score`, to our dataset.\n", + "\n", + "We'll use this column to visualize the distribution of task completion scores across our test cases through the [BoxPlot test](https://docs.validmind.ai/validmind/validmind/tests/plots/BoxPlot.html#boxplot):" + ], + "id": "b577c282" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.plots.BoxPlot\",\n", + " inputs={\"dataset\": vm_test_dataset},\n", + " params={\n", + " \"columns\": \"banking_agent_model_TaskCompletion_score\",\n", + " \"title\": \"Distribution of Task Completion Scores\",\n", + " \"ylabel\": \"Score\",\n", + " \"figsize\": (8, 6)\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "7f6d08ca" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Running RAGAS tests\n", + "\n", + "Next, let's run some out-of-the-box *Retrieval-Augmented Generation Assessment* (RAGAS) tests available in the ValidMind Library. RAGAS provides specialized metrics for evaluating retrieval-augmented generation systems and conversational AI agents. These metrics analyze different aspects of agent performance by assessing how well systems integrate retrieved information with generated responses.\n", + "\n", + "Our banking agent uses tools to retrieve information and generates responses based on that context, making it similar to a RAG system. RAGAS metrics help evaluate the quality of this integration by analyzing the relationship between retrieved tool outputs, user queries, and generated responses.\n", + "\n", + "These tests provide insights into how well our banking agent integrates tool usage with conversational abilities, ensuring it provides accurate, relevant, and helpful responses to banking users while maintaining fidelity to retrieved information." 
+ ], + "id": "30d9ec62" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1__'></a>\n", + "\n", + "### Identify relevant RAGAS tests\n", + "\n", + "Let's explore some of ValidMind's available tests. Using ValidMind’s repository of tests streamlines your development testing, and helps you ensure that your records are being documented and evaluated appropriately.\n", + "\n", + "You can pass `task` and `tags` as parameters to the [`vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to filter the tests based on the tags and task types:\n", + "\n", + "- **`task`** represents the kind of modeling task associated with a test. Here we'll focus on `text_qa` tasks.\n", + "- **`tags`** are free-form descriptions providing more details about the test, for example, what category the test falls into. Here we'll focus on the `ragas` tag.\n", + "\n", + "We'll then run three of the returned tests as examples below." + ], + "id": "8288f6c3" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(task=\"text_qa\", tags=[\"ragas\"])" + ], + "execution_count": null, + "outputs": [], + "id": "0701f5a9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1_1__'></a>\n", + "\n", + "#### Faithfulness\n", + "\n", + "Let's evaluate whether the banking agent's responses accurately reflect the information retrieved from tools. Unfaithful responses can misreport credit analysis, financial calculations, and compliance results—undermining user trust in the banking agent." 
+ ], + "id": "2ce24ba0" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.model_validation.ragas.Faithfulness\",\n", + " inputs={\"dataset\": vm_test_dataset},\n", + " param_grid={\n", + " \"user_input_column\": [\"input\"],\n", + " \"response_column\": [\"banking_agent_model_prediction\"],\n", + " \"retrieved_contexts_column\": [\"banking_agent_model_tool_messages\"],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "92044533" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1_2__'></a>\n", + "\n", + "#### Response Relevancy\n", + "\n", + "Let's evaluate whether the banking agent's answers address the user's original question or request. Irrelevant or off-topic responses can frustrate users and fail to deliver the banking information they need." + ], + "id": "4d1fcfcd" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.model_validation.ragas.ResponseRelevancy\",\n", + " inputs={\"dataset\": vm_test_dataset},\n", + " params={\n", + " \"user_input_column\": \"input\",\n", + " \"response_column\": \"banking_agent_model_prediction\",\n", + " \"retrieved_contexts_column\": \"banking_agent_model_tool_messages\",\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "d7483bc3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7_1_3__'></a>\n", + "\n", + "#### Context Recall\n", + "\n", + "Let's evaluate how well the banking agent uses the information retrieved from tools when generating its responses. Poor context recall can lead to incomplete or underinformed answers even when the right tools were selected." 
+ ], + "id": "38c1dfb5" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.model_validation.ragas.ContextRecall\",\n", + " inputs={\"dataset\": vm_test_dataset},\n", + " param_grid={\n", + " \"user_input_column\": [\"input\"],\n", + " \"retrieved_contexts_column\": [\"banking_agent_model_tool_messages\"],\n", + " \"reference_column\": [\"banking_agent_model_prediction\"],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "e5dc00ce" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Running safety tests\n", + "\n", + "Finally, let's run some out-of-the-box *safety* tests available in the ValidMind Library. Safety tests provide specialized metrics for evaluating whether AI agents operate reliably and securely. These metrics analyze different aspects of agent behavior by assessing adherence to safety guidelines, consistency of outputs, and resistance to harmful or inappropriate requests.\n", + "\n", + "Our banking agent handles sensitive financial information and user requests, making safety and reliability essential. Safety tests help evaluate whether the agent maintains appropriate boundaries, responds consistently and correctly to inputs, and avoids generating harmful, biased, or unprofessional content.\n", + "\n", + "These tests provide insights into how well our banking agent upholds standards of fairness and professionalism, ensuring it operates reliably and securely for banking users." + ], + "id": "95e1e16a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_1_1__'></a>\n", + "\n", + "#### AspectCritic\n", + "\n", + "Let's evaluate our banking agent's responses across multiple quality dimensions — conciseness, coherence, correctness, harmfulness, and maliciousness. 
Weak performance on these dimensions can degrade user experience, fall short of professional banking standards, or introduce safety risks. \n", + "\n", + "We'll use the `AspectCritic` we identified earlier:" + ], + "id": "e0972afa" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.model_validation.ragas.AspectCritic\",\n", + " inputs={\"dataset\": vm_test_dataset},\n", + " param_grid={\n", + " \"user_input_column\": [\"input\"],\n", + " \"response_column\": [\"banking_agent_model_prediction\"],\n", + " \"retrieved_contexts_column\": [\"banking_agent_model_tool_messages\"],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "148daa2b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8_1_2__'></a>\n", + "\n", + "#### Bias\n", + "\n", + "Let's evaluate whether our banking agent's prompts contain unintended biases that could affect banking decisions. Biased prompts can lead to unfair or discriminatory outcomes — undermining customer trust and exposing the institution to compliance risk.\n", + "\n", + "We'll first use `list_tests()` again to filter for tests relating to `prompt_validation`:" + ], + "id": "16f29c8d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(filter=\"prompt_validation\")" + ], + "execution_count": null, + "outputs": [], + "id": "74eba86c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "And then run the identified `Bias` test:" + ], + "id": "e9413803" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.prompt_validation.Bias\",\n", + " inputs={\n", + " \"model\": vm_banking_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "062cf8e7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can 
look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your documentation." + ], + "id": "8f3f2dbe" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + " What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "3. Click into any section related to the tests we ran in this notebook, for example: **4.3. Prompt Evaluation** to review the results of the tests we logged." + ], + "id": "8716165d" + }, + { + "cell_type": "markdown", + "id": "7c4a78ce", + "metadata": {}, + "source": [ + "<a id='toc9_2__'></a>\n", + "\n", + "### Customize the banking agent for your use case\n", + "\n", + "You've now built an agentic AI system designed for banking use cases that supports compliance with supervisory guidance such as SR 26-2 and SS1/23. While SR 26-2 explicitly excludes generative and agentic AI from its scope, underlying principles — materiality, ongoing monitoring, and effective challenge — still apply to governance of these systems. The example covers credit and fraud risk assessment for both retail and commercial banking. 
Extend this example agent to real-world banking scenarios and production deployment by:\n", + "\n", + "- Adapting the banking tools to your organization's specific requirements\n", + "- Adding more banking scenarios and edge cases to your test set\n", + "- Connecting the agent to your banking systems and databases\n", + "- Implementing additional banking-specific tools and workflows" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9_3__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "Learn more about the ValidMind Library tools we used in this notebook:\n", + "\n", + "- [Custom prompts](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/customize_test_result_descriptions.html)\n", + "- [Custom tests](https://docs.validmind.ai/notebooks/how_to/tests/custom_tests/implement_custom_tests.html)\n", + "- [ValidMind scorers](https://docs.validmind.ai/notebooks/how_to/scoring/assign_scores_complete_tutorial.html)\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ], + "id": "7f9385d3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc10__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ], + "id": "fdd5c0db" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [], + "id": "9733adff" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ], + "id": "829429fd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after upgrading the package for the changes to take effect." + ], + "id": "55339760" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-b9e82bcf4e364c4f8e5ae4bb0e4b2865" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-1QuffXMV-py3.11", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/site/notebooks/use_cases/capital_markets/capital_markets_template.yaml b/site/notebooks/use_cases/capital_markets/capital_markets_template.yaml index 9cb561dc27..1ae9f6e4fa 100644 --- a/site/notebooks/use_cases/capital_markets/capital_markets_template.yaml +++ b/site/notebooks/use_cases/capital_markets/capital_markets_template.yaml @@ -40,7 +40,7 @@ with business goals. - Include specific use cases, outputs, and highlight regulatory expectations to demonstrate compliance. - - Specify compliance requirements, such as IFRS, Basel III or SR11-7, as + - Specify compliance requirements, such as IFRS, Basel III or SR 26-2, as applicable. 
- id: products_and_risks title: Products and Risks diff --git a/site/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb b/site/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb index a7ee3c3724..4b4ae386c0 100644 --- a/site/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb +++ b/site/notebooks/use_cases/capital_markets/quickstart_option_pricing_models.ipynb @@ -1,2109 +1,2115 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "87056cee", - "metadata": {}, - "source": [ - "# Quickstart for knockout option pricing model documentation\n", - "\n", - "Welcome! Let's get you started with the basic process of documenting models with ValidMind.\n", - "\n", - "A knockout option is a barrier option that ceases to exist if the underlying asset hits a predetermined price, known as the \"barrier.\" This barrier level, set above or below the current market price, determines whether the option will \"knock out\" before its expiration date. There are two types: \"up-and-out\" and \"down-and-out.\" In an up-and-out knockout option, the option expires if the asset price rises above the barrier, while in a down-and-out, it expires if the asset price falls below. Knockout options generally offer a lower premium than standard options since there is a chance they will expire worthless if the barrier is reached.\n", - "\n", - "Pricing knockout options involves accounting for the proximity of the asset's price to the barrier, as well as market volatility and the option’s time to expiration. High volatility and longer expiry increase the likelihood of the barrier being triggered, which reduces the option’s value. Models like modified Black-Scholes are used for simpler cases, while Monte Carlo simulations or binomial trees handle complex scenarios. 
Knockout options are useful for hedging or cost-effective investment strategies, allowing investors to save on premiums but with the risk of losing the option entirely if the barrier is hit.\n", - "\n", - "You will learn how to initialize the ValidMind Library, develop a option pricing model, and then write custom tests that can be used for sensitivity and stress testing to quickly generate documentation about model." - ] - }, - { - "cell_type": "markdown", - "id": "7417dfe1", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - " - [Preview the documentation template](#toc2_4__) \n", - "- [Model development](#toc3__) \n", - "- [Data Preparation](#toc4__) \n", - " - [Synthetic data generation](#toc4_1__) \n", - " - [Initialize the ValidMind datasets](#toc4_2__) \n", - " - [Data Quality](#toc4_3__) \n", - " - [Outliers detection using IQR method](#toc4_3_1__) \n", - " - [Isolation Forest Outliers Test](#toc4_3_2__) \n", - " - [Model Calibration](#toc4_4__) \n", - " - [Synthetic Data Calibration Test](#toc4_5__) \n", - " - [Model Evaluation](#toc4_6__) \n", - " - [Benchmark Testing](#toc4_6_1__) \n", - " - [Sensitivity Testing](#toc4_6_2__) \n", - " - [Greeks](#toc4_6_3__) \n", - " - [Delta](#toc4_7__) \n", - " - [Gamma](#toc4_8__) \n", - " - [Theta](#toc4_9__) \n", - " - [Vega](#toc4_10__) \n", - " - [Rho](#toc4_11__) \n", - " - [Stress Testing](#toc4_11_1__) \n", - "- [Next steps](#toc5__) \n", - " - [Work with your 
model documentation](#toc5_1__) \n", - " - [Discover more learning resources](#toc5_2__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "1426d212", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. 
These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "id": "f8812717", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "b792f6a9", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c3d26e61", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "f3db6c9b", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "e1865b8d", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. 
(Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "214572ff", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Capital markets`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "id": "8b9547ad", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0cc9c04c", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "e928f7e5", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9edb42a2", - "metadata": {}, - "outputs": [], - "source": [ - "%matplotlib inline\n", - "import pandas as pd\n", - "import numpy as np\n", - "import matplotlib.pyplot as plt\n", - "from scipy.optimize import minimize\n", - "\n", - "from validmind.tests import run_test" - ] - }, - { - "cell_type": "markdown", - "id": "a2403294", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3dfd04dd", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "d79d9953", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Model development" - ] - }, - { - "cell_type": "code", - "execution_count": 32, - "id": "c3f5b0b9", - "metadata": {}, - "outputs": [], - "source": [ - "class OptionPricing:\n", - " def __init__(self, S0, K, T, r):\n", - " self.S0 = S0\n", - " self.K = K\n", - " self.T = T\n", - " self.r = r\n", - "\n", - " def monte_carlo_simulation(self, N, M):\n", - " raise NotImplementedError(\"Must be implemented by subclasses\")\n", - "\n", - " def price_option(self, N, M):\n", - " raise NotImplementedError(\"Must be implemented by subclasses\")\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a9d7f832", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "class BlackScholesModel(OptionPricing):\n", - " def __init__(self, S0, K, T, r, sigma):\n", - " super().__init__(S0, K, T, r)\n", - " self.sigma = sigma\n", - " def monte_carlo_simulation(self, N, M):\n", - " dt = self.T / M\n", - " price_paths = np.zeros((N, M + 1))\n", - " price_paths[:, 0] = self.S0\n", - " for t in range(1, M + 1):\n", - " Z = np.random.standard_normal(N)\n", - " price_paths[:, t] = price_paths[:, t - 1] * np.exp((self.r - 0.5 * self.sigma**2) * dt + self.sigma * np.sqrt(dt) * Z)\n", - " return price_paths\n", - "\n", - " def price_option(self, N, M):\n", - " price_paths = self.monte_carlo_simulation(N, M)\n", - " payoffs = np.maximum(price_paths[:, -1] - self.K, 0)\n", - " return np.exp(-self.r * self.T) * np.mean(payoffs)\n", - " \n", - " def 
calibrate(self, market_prices, strikes, maturities):\n", - " def objective_function(params):\n", - " self.sigma = params[0]\n", - " for K, T in zip(strikes, maturities):\n", - " self.K = K\n", - " self.T = T\n", - " model_prices.append(self.price_option(10000, 100))\n", - " return np.sum((np.array(market_prices) - np.array(model_prices))**2)\n", - " result = minimize(objective_function, [self.sigma], bounds=[(0.01, 1.0)])\n", - " self.sigma = result.x[0]\n", - "\n", - "class StochasticVolatilityModel(OptionPricing):\n", - " def __init__(self, S0, K, T, r, v0, kappa, theta, xi, rho):\n", - " super().__init__(S0, K, T, r)\n", - " self.v0 = v0\n", - " self.kappa = kappa\n", - " self.theta = theta\n", - " self.xi = xi\n", - " self.rho = rho\n", - " def monte_carlo_simulation(self, N, M):\n", - " dt = self.T / M\n", - " price_paths = np.zeros((N, M + 1))\n", - " vol_paths = np.zeros((N, M + 1))\n", - " price_paths[:, 0] = self.S0\n", - " vol_paths[:, 0] = self.v0\n", - " for t in range(1, M + 1):\n", - " Z1 = np.random.standard_normal(N)\n", - " Z2 = np.random.standard_normal(N)\n", - " W1 = Z1\n", - " W2 = self.rho * Z1 + np.sqrt(1 - self.rho**2) * Z2\n", - " vol_paths[:, t] = np.abs(vol_paths[:, t - 1] + self.kappa * (self.theta - vol_paths[:, t - 1]) * dt + self.xi * np.sqrt(vol_paths[:, t - 1] * dt) * W1)\n", - " price_paths[:, t] = price_paths[:, t - 1] * np.exp((self.r - 0.5 * vol_paths[:, t - 1]) * dt + np.sqrt(vol_paths[:, t - 1] * dt) * W2)\n", - " return price_paths\n", - "\n", - " def price_option(self, N, M):\n", - " price_paths = self.monte_carlo_simulation(N, M)\n", - " payoffs = np.maximum(price_paths[:, -1] - self.K, 0)\n", - " return np.exp(-self.r * self.T) * np.mean(payoffs)\n", - " \n", - " def calibrate(self, market_prices, strikes, maturities):\n", - " def objective_function(params):\n", - " self.v0, self.kappa, self.theta, self.xi, self.rho = params\n", - " model_prices = []\n", - " for K, T in zip(strikes, maturities):\n", - " self.K = K\n", - " 
self.T = T\n", - " model_prices.append(self.price_option(10000, 100))\n", - "\n", - " return np.sum((np.array(market_prices) - np.array(model_prices))**2)\n", - " \n", - " initial_guess = [self.v0, self.kappa, self.theta, self.xi, self.rho]\n", - " bounds = [(0.01, 1.0), (0.01, 5.0), (0.01, 1.0), (0.01, 1.0), (-1.0, 1.0)]\n", - " result = minimize(objective_function, initial_guess, bounds=bounds)\n", - " self.v0, self.kappa, self.theta, self.xi, self.rho = result.x\n", - "\n", - "\n", - "class KnockoutOption:\n", - " def __init__(self, model, S0, K, T, r, barrier):\n", - " self.model = model\n", - " self.S0 = S0\n", - " self.K = K\n", - " self.T = T\n", - " self.r = r\n", - " self.barrier = barrier\n", - "\n", - " def price_knockout_option(self, N, M):\n", - " dt = self.T / M\n", - " price_paths = np.zeros((N, M + 1))\n", - " vol_paths = np.zeros((N, M + 1)) if isinstance(self.model, StochasticVolatilityModel) else None\n", - " price_paths[:, 0] = self.S0\n", - " if vol_paths is not None:\n", - " vol_paths[:, 0] = self.model.v0\n", - " \n", - " for t in range(1, M + 1):\n", - " Z1 = np.random.standard_normal(N)\n", - " if vol_paths is None:\n", - " # Black-Scholes Model\n", - " price_paths[:, t] = price_paths[:, t - 1] * np.exp(\n", - " (self.r - 0.5 * self.model.sigma**2) * dt + self.model.sigma * np.sqrt(dt) * Z1\n", - " )\n", - " else:\n", - " # Stochastic Volatility Model\n", - " Z2 = np.random.standard_normal(N)\n", - " W1 = Z1\n", - " W2 = self.model.rho * Z1 + np.sqrt(1 - self.model.rho**2) * Z2\n", - " vol_paths[:, t] = np.abs(vol_paths[:, t - 1] + self.model.kappa * (self.model.theta - vol_paths[:, t - 1]) * dt + self.model.xi * np.sqrt(vol_paths[:, t - 1] * dt) * W1)\n", - " price_paths[:, t] = price_paths[:, t - 1] * np.exp(\n", - " (self.r - 0.5 * vol_paths[:, t - 1]) * dt + np.sqrt(vol_paths[:, t - 1] * dt) * W2\n", - " )\n", - " \n", - " # Knockout condition\n", - " price_paths[:, t][price_paths[:, t] >= self.barrier] = 0\n", - " payoffs = 
np.maximum(price_paths[:, -1] - self.K, 0)\n", - " return np.exp(-self.r * self.T) * np.mean(payoffs)" - ] - }, - { - "cell_type": "markdown", - "id": "14bcdbb9", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Data Preparation" - ] - }, - { - "cell_type": "markdown", - "id": "f655dc9c", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Synthetic data generation" - ] - }, - { - "cell_type": "code", - "execution_count": 34, - "id": "42cb9070", - "metadata": {}, - "outputs": [], - "source": [ - "def generate_synthetic_market_data(model, strikes, maturities):\n", - " market_prices = []\n", - " market_data = []\n", - " for K, T in zip(strikes, maturities):\n", - " model.K = K\n", - " model.T = T\n", - " market_prices.append(model.price_option(10000, 100))\n", - " market_data.append({\"strike\": K, \"option_price\": model.price_option(10000, 100)})\n", - " return market_prices, market_data\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2854fbe3", - "metadata": {}, - "outputs": [], - "source": [ - "N = 10000\n", - "M = 100\n", - "\n", - "# Parameters for synthetic data\n", - "S0 = 100\n", - "K = 100\n", - "T = 1\n", - "r = 0.05\n", - "# BlackSholes\n", - "true_sigma = 0.2\n", - "\n", - "# Stochastic Volatility\n", - "true_v0 = 0.2\n", - "true_kappa = 2.0\n", - "true_theta = 0.2\n", - "true_xi = 0.1\n", - "true_rho = -0.5\n", - "\n", - "# Synthetic data generation parameters\n", - "strikes = list(np.linspace(75, 130, 25))\n", - "maturities = list(np.linspace(0.2, 3.0, 25))\n", - "\n", - "# Generate synthetic market data using the true parameters\n", - "bs_model = BlackScholesModel(S0, K, T, r, true_sigma)\n", - "bs_market_prices, bs_market_data = generate_synthetic_market_data(bs_model, strikes, maturities)\n", - "\n", - "\n", - "sv_model = StochasticVolatilityModel(S0, K, T, r, true_v0, true_kappa, true_theta, true_xi, true_rho)\n", - "sv_market_prices, sv_market_data = 
generate_synthetic_market_data(sv_model, strikes, maturities)\n" - ] - }, - { - "cell_type": "markdown", - "id": "b54c4950", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7f3498dd", - "metadata": {}, - "outputs": [], - "source": [ - "bs_market_data_df = pd.DataFrame(bs_market_data)\n", - "vm_bs_market_data = vm.init_dataset(\n", - " dataset=bs_market_data_df,\n", - " input_id=\"bs_market_data\",\n", - ")\n", - "\n", - "sv_market_data_df = pd.DataFrame(sv_market_data)\n", - "vm_sv_market_data = vm.init_dataset(\n", - " dataset=sv_market_data_df,\n", - " input_id=\"sv_market_data\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "7b36b59c", - "metadata": {}, - "source": [ - "<a id='toc4_3__'></a>\n", - "\n", - "### Data Quality\n", - "Let's check the quality of the data using outlier and missing value tests." - ] - }, - { - "cell_type": "markdown", - "id": "671330b1", - "metadata": {}, - "source": [ - "<a id='toc4_3_1__'></a>\n", - "\n", - "#### Outliers detection using IQR method\n", - "Let's visualize the distribution of outliers in the `option_price` feature using the Interquartile Range (IQR) method." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f1c1ab6f", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.IQROutliersBarPlot:BlackScholes\",\n", - " inputs={\n", - " \"dataset\": vm_bs_market_data,\n", - " },\n", - " title=\"Outliers detection using IQR method for BlackScholes\",\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6b5e8654", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.IQROutliersTable:BlackScholes\",\n", - " inputs={\n", - " \"dataset\": vm_bs_market_data,\n", - " },\n", - " title=\"Outliers table using IQR method for BlackScholes\",\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d96f10c7", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.IQROutliersBarPlot:StochasticVolatility\",\n", - " inputs={\n", - " \"dataset\": vm_sv_market_data,\n", - " },\n", - " title=\"Outliers detection using IQR method for StochasticVolatility\",\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "758c4c57", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.IQROutliersTable:StochasticVolatility\",\n", - " inputs={\n", - " \"dataset\": vm_sv_market_data,\n", - " },\n", - " title=\"Outliers table using IQR method for StochasticVolatility\",\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "b1430200", - "metadata": {}, - "source": [ - "<a id='toc4_3_2__'></a>\n", - "\n", - "#### Isolation Forest Outliers Test\n", - "Let's detect anomalies in the dataset using the Isolation Forest algorithm and visualize them with scatter plots." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9eb91453", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.IsolationForestOutliers:BlackScholes\",\n", - " inputs={\n", - " \"dataset\": vm_bs_market_data,\n", - " },\n", - " title=\"Outliers detection using Isolation Forest for BlackScholes\",\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "12940f8e", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.IsolationForestOutliers:StochasticVolatility\",\n", - " inputs={\n", - " \"dataset\": vm_sv_market_data,\n", - " },\n", - " title=\"Outliers detection using Isolation Forest for StochasticVolatility\",\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "f30e5579", - "metadata": {}, - "source": [ - "##### Missing Values Test\n", - "Let's evaluate dataset quality by checking that the missing value ratio across all features does not exceed a set threshold." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "805ddb1c", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.MissingValues:BlackScholes\",\n", - " inputs={\n", - " \"dataset\": vm_bs_market_data,\n", - " },\n", - " title=\"Missing Values detection for BlackScholes\",\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e69e0039", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "result = run_test(\n", - " \"validmind.data_validation.MissingValues:StochasticVolatility\",\n", - " inputs={\n", - " \"dataset\": vm_sv_market_data,\n", - " },\n", - " title=\"MissingValues detection for StochasticVolatility\",\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "09628809", - "metadata": {}, - "source": [ - "<a id='toc4_4__'></a>\n", - "\n", - "### Model Calibration\n", - "* Clearly state the purpose of the calibration process. For example, in the context of an option pricing model, calibration aims to adjust model parameters to fit market data (e.g., market option prices, volatility surfaces).\n", - "* Specify whether the calibration is to historical data, current market data, or a blend of both." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6802c26e", - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "\n", - "@vm.test(\"my_custom_tests.SyntheticDataCalibrationTest\")\n", - "def generate_synthetic_data_summary(option_pricing_model, strikes, maturities, synthetic_prices):\n", - " \"\"\"\n", - " This function will use synthetic prices to calibrate each model\n", - " and then generate derived prices based on the calibrated parameters.\n", - " It will output a DataFrame summarizing the strikes, maturities,\n", - " synthetic and derived prices, and the model parameters.\n", - "\n", - " \"\"\"\n", - " derived_prices = []\n", - " for K, T in zip(strikes, maturities):\n", - " option_pricing_model.K = K\n", - " option_pricing_model.T = T\n", - " derived_prices.append(option_pricing_model.price_option(10000, 100))\n", - " \n", - " model_type = type(option_pricing_model).__name__\n", - " data = {\n", - " \"Strike\": strikes,\n", - " \"Maturity\": maturities,\n", - " \"Synthetic_Price\": synthetic_prices,\n", - " \"Derived_Price\": derived_prices,\n", - " \"Model_Type\": model_type,\n", - " \"S0\": [option_pricing_model.S0] * len(strikes),\n", - " \"K\": [option_pricing_model.K] * len(strikes),\n", - " \"T\": [option_pricing_model.T] * len(strikes),\n", - " \"r\": [option_pricing_model.r] * len(strikes)\n", - " }\n", - " \n", - " if model_type == \"BlackScholesModel\":\n", - " data[\"sigma\"] = [option_pricing_model.sigma] * len(strikes)\n", - " elif model_type == \"StochasticVolatilityModel\":\n", - " data[\"v0\"] = [option_pricing_model.v0] * len(strikes)\n", - " data[\"kappa\"] = [option_pricing_model.kappa] * len(strikes)\n", - " data[\"theta\"] = [option_pricing_model.theta] * len(strikes)\n", - " data[\"xi\"] = [option_pricing_model.xi] * len(strikes)\n", - " data[\"rho\"] = [option_pricing_model.rho] * len(strikes)\n", - " \n", - " df = pd.DataFrame(data)\n", - " return df\n" - ] - }, - { - "cell_type": 
"markdown", - "id": "3bf04d21", - "metadata": {}, - "source": [ - "<a id='toc4_5__'></a>\n", - "\n", - "### Synthetic Data Calibration Test\n", - "Let's evaluates the accuracy of a stochastic volatility model by comparing synthetic prices with derived prices after model calibration." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "4345cb5c", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.SyntheticDataCalibrationTest\",\n", - " params={\n", - " \"option_pricing_model\": sv_model,\n", - " \"strikes\": strikes,\n", - " \"maturities\": maturities,\n", - " \"synthetic_prices\": sv_market_prices\n", - " },\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "4d48f107", - "metadata": {}, - "source": [ - "<a id='toc4_6__'></a>\n", - "\n", - "### Model Evaluation" - ] - }, - { - "cell_type": "markdown", - "id": "8ec8b5a3", - "metadata": {}, - "source": [ - "<a id='toc4_6_1__'></a>\n", - "\n", - "#### Benchmark Testing\n", - "* Compare the model’s performance with alternative models or industry-standard models to assess its relative effectiveness.\n", - "* Ensure that the model is competitive in pricing, accuracy, and computational efficiency." 
- ] - }, - { - "cell_type": "code", - "execution_count": 47, - "id": "ac733262", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.BenchmarkTest\")\n", - "def benchmark_test(bs_model, sv_model, strikes, maturities):\n", - " \"\"\"\n", - " Comparison between Black Scholes and stochastic volatility model\n", - "\n", - " \"\"\"\n", - " bs_model_type = type(bs_model).__name__\n", - " sv_model_type = type(sv_model).__name__\n", - "\n", - " bs_derived_prices = []\n", - " sv_derived_prices = []\n", - " for K in strikes:\n", - " bs_model.K = K\n", - " bs_derived_prices.append(bs_model.price_option(10000, 100))\n", - " sv_model.K = K\n", - " sv_derived_prices.append(sv_model.price_option(10000, 100))\n", - "\n", - " data = {\n", - " \"Strike\": strikes,\n", - " \"Maturities\": [sv_model.T] * len(strikes),\n", - " \"bs_model_price\": bs_derived_prices,\n", - " \"sv_model_price\": sv_derived_prices,\n", - "\n", - " }\n", - " df1 = pd.DataFrame(data)\n", - "\n", - " bs_derived_prices = []\n", - " sv_derived_prices = []\n", - " for T in maturities:\n", - " bs_model.T = T\n", - " bs_derived_prices.append(bs_model.price_option(10000, 100))\n", - " sv_model.T = T\n", - " sv_derived_prices.append(sv_model.price_option(10000, 100))\n", - "\n", - " data = {\n", - " \"Strike\": [sv_model.K] * len(maturities),\n", - " \"Maturities\": maturities,\n", - " \"bs_model_price\": bs_derived_prices,\n", - " \"sv_model_price\": sv_derived_prices,\n", - " }\n", - "\n", - " df2 = pd.DataFrame(data)\n", - "\n", - " return {\"strikes variation benchmarking\": df1}, {\"maturities variation benchmarking\": df2}" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "20de9858", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.BenchmarkTest\",\n", - " params={\n", - " \"sv_model\": sv_model,\n", - " \"bs_model\": bs_model,\n", - " \"strikes\": strikes,\n", - " \"maturities\": maturities,\n", - " },\n", - 
")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "d9ad15b8", - "metadata": {}, - "source": [ - "##### Surface Volatility Test\n", - "Let's calculates the implied volatility across different strikes and maturities based on market prices" - ] - }, - { - "cell_type": "code", - "execution_count": 49, - "id": "46e275e3", - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import pandas as pd\n", - "from scipy.optimize import minimize\n", - "import plotly.graph_objects as go\n", - "\n", - "@vm.test(\"my_custom_tests.ImpliedVolSurface\")\n", - "def implied_vol_surface(market_prices, strikes, maturities, S0, r, barrier, N=10000, M=100):\n", - " \"\"\"\n", - " This is a test to compute the implied volatility surface for a given set of market prices,\n", - " strikes, and maturities.\n", - " \"\"\"\n", - " def implied_volatility(market_price, N, M, initial_guess=0.2):\n", - " def objective_function(sigma):\n", - " model.sigma = sigma\n", - " model_price = model.price_option(N, M)\n", - " return (model_price - market_price) ** 2\n", - "\n", - " result = minimize(objective_function, initial_guess, bounds=[(0.01, 1.0)])\n", - " return result.x[0]\n", - " \n", - " implied_vols = np.zeros((len(strikes), len(maturities)))\n", - "\n", - " for i, K in enumerate(strikes):\n", - " for j, T in enumerate(maturities):\n", - " market_price = market_prices[i]\n", - " model = BlackScholesModel(S0, K, T, r, sigma=0.2)\n", - "\n", - " implied_vol = implied_volatility(market_price, N, M)\n", - " implied_vols[i, j] = implied_vol\n", - "\n", - " # Create the 3D surface plot\n", - " X, Y = np.meshgrid(strikes, maturities)\n", - " Z = implied_vols.T # Transpose to match the meshgrid orientation\n", - "\n", - " fig = go.Figure(data=[go.Surface(x=X, y=Y, z=Z)])\n", - " \n", - " # Update the layout\n", - " fig.update_layout(\n", - " title=f'3D Surface Plot of Implied Volatility',\n", - " scene=dict(\n", - " xaxis_title='Strike',\n", - " 
yaxis_title='Maturity',\n", - " zaxis_title='Implied Volatility',\n", - " camera=dict(\n", - " up=dict(x=0, y=0, z=1),\n", - " center=dict(x=0, y=0, z=0),\n", - " eye=dict(x=1.5, y=1.5, z=1.5)\n", - " )\n", - " ),\n", - " width=900,\n", - " height=700,\n", - " margin=dict(l=65, r=50, b=65, t=90)\n", - " )\n", - "\n", - " return fig" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "66ca002a", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.ImpliedVolSurface\",\n", - " params={\n", - " \"market_prices\": sv_market_prices,\n", - " \"strikes\": strikes,\n", - " \"maturities\": maturities,\n", - " \"S0\": S0,\n", - " \"r\": r,\n", - " \"barrier\": 120\n", - " }\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "a49d8a1e", - "metadata": {}, - "source": [ - "<a id='toc4_6_2__'></a>\n", - "\n", - "#### Sensitivity Testing" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "784a5e7c", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "@vm.test(\"my_custom_tests.Sensitivity\")\n", - "def sensitivity_test(model_type, S0, T, r, N, M, strike=None, barrier=None, sigma=None, v0=None, kappa=None,theta=None, xi=None, rho=None):\n", - " \"\"\"\n", - " This is sensitivity test\n", - "\"\"\"\n", - " if model_type == 'BS':\n", - " model = BlackScholesModel(S0, strike, T, r, sigma)\n", - " else:\n", - " model = StochasticVolatilityModel(S0, strike, T, r, v0, kappa, theta, xi, rho)\n", - " \n", - " knockout_option = KnockoutOption(model, S0, strike, T, r, barrier)\n", - " price = knockout_option.price_knockout_option(N, M)\n", - "\n", - " return pd.DataFrame({\"Option price\": [price]})" - ] - }, - { - "cell_type": "markdown", - "id": "d4be30e6", - "metadata": {}, - "source": [ - "##### Initialise parameters" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "46878b84", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - 
"strike_range = (min(strikes), max(strikes))\n", - "barrier_range = (100, 120)" - ] - }, - { - "cell_type": "markdown", - "id": "205c46ce", - "metadata": {}, - "source": [ - "##### Common plot function\n", - "Let's create a line plot using the default result output data and log it by passing the function through the `post_process_fn` parameter in the `run_test()` method." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d4b9ea2f", - "metadata": {}, - "outputs": [], - "source": [ - "from plotly.express import bar\n", - "from validmind.vm_models.figure import Figure\n", - "from validmind.vm_models.result import TestResult\n", - "import plotly.graph_objects as go\n", - "import random\n", - "\n", - "def process_results(result: TestResult):\n", - "\n", - " # Convert to DataFrame\n", - " df = pd.DataFrame(result.tables[0].data)\n", - " \n", - " # Get the first two column names\n", - " x_col = df.columns[0]\n", - " y_col = df.columns[1]\n", - " \n", - " # Create figure\n", - " fig = go.Figure()\n", - " fig.add_trace(\n", - " go.Scatter(\n", - " x=df[x_col],\n", - " y=df[y_col],\n", - " mode='lines',\n", - " name=y_col # Use y-axis column name as trace name\n", - " )\n", - " )\n", - " \n", - " fig.update_layout(\n", - " xaxis_title=x_col,\n", - " yaxis_title=y_col,\n", - " showlegend=True,\n", - " template=\"plotly_white\"\n", - " )\n", - "\n", - " result.add_figure(\n", - " Figure(\n", - " figure=fig,\n", - " key=\"sensitivity_plot_\" + str(random.randint(0, 1000000)),\n", - " ref_id=result.ref_id,\n", - " )\n", - " )\n", - "\n", - " return result" - ] - }, - { - "cell_type": "markdown", - "id": "528b409c", - "metadata": {}, - "source": [ - "##### Strike sensitivity Test\n", - "Let's evaluates the sensitivity of a model's output value to changes in the strike price, while keeping other parameters constant.\n", - "This test is crucial for understanding how variations in strike prices affect the valuation of financial derivatives, particularly 
options." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bb8f1cab", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Sensitivity:S0\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\":[strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " },\n", - " post_process_fn=process_results\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e566a681", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Sensitivity:ToStrike\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": list(np.linspace(strike_range[0], strike_range[1], 20)),\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " },\n", - " post_process_fn=process_results\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "0f288663", - "metadata": {}, - "source": [ - "##### Barrier Sensitivity Test\n", - "Let's evaluate the sensitivity of the model's output to changes in the barrier level of a barrier option. This test shows how small changes in the barrier affect the option's valuation, which matters for risk management and pricing strategies." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "95f81283", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Sensitivity:ToBarrier\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": [strike_range[0]],\n", - " \"barrier\": list(np.linspace(barrier_range[0], barrier_range[1], 20)),\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " },\n", - " post_process_fn=process_results\n", - "\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "3201aa09", - "metadata": {}, - "source": [ - "<a id='toc4_6_3__'></a>\n", - "\n", - "#### Greeks\n", - "The Greeks are crucial for traders and risk managers: they quantify the risk and potential price movements of options and derivatives, which supports more informed decision-making and risk management strategies." - ] - }, - { - "cell_type": "markdown", - "id": "f31afc73", - "metadata": {}, - "source": [ - "<a id='toc4_7__'></a>\n", - "\n", - "### Delta\n", - "Delta measures the sensitivity of the option's price to a change in the price of the underlying asset. It indicates how much the price of an option is expected to move per $1 change in the underlying asset's price." 
- ] - }, - { - "cell_type": "code", - "execution_count": 30, - "id": "31befc58", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.GreeksDelta\")\n", - "def calculate_delta(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", - " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", - " h=0.001): # h is the step size for finite difference\n", - " \"\"\"\n", - " Calculate delta using finite difference method.\n", - " Delta = (V(S0 + h) - V(S0 - h)) / (2h)\n", - " where V is the option price and h is a small increment\n", - " \"\"\"\n", - " # Initialize the models with S0 + h and S0 - h\n", - " if model_type == 'BS':\n", - " model_up = BlackScholesModel(S0 + h, strike, T, r, sigma)\n", - " model_down = BlackScholesModel(S0 - h, strike, T, r, sigma)\n", - " else:\n", - " model_up = StochasticVolatilityModel(S0 + h, strike, T, r, v0, kappa, theta, xi, rho)\n", - " model_down = StochasticVolatilityModel(S0 - h, strike, T, r, v0, kappa, theta, xi, rho)\n", - " \n", - "\n", - " # Calculate option prices for up and down moves\n", - " knockout_up = KnockoutOption(model_up, S0 + h, strike, T, r, barrier)\n", - " knockout_down = KnockoutOption(model_down, S0 - h, strike, T, r, barrier)\n", - " \n", - " price_up = knockout_up.price_knockout_option(N, M)\n", - " price_down = knockout_down.price_knockout_option(N, M)\n", - " \n", - " # Calculate delta using central difference\n", - " delta = (price_up - price_down) / (2 * h)\n", - " df = pd.DataFrame({\"Delta\": [delta], \"Price_Up\": [price_up], \"Price_Down\": [price_down], \"h\": [h]})\n", - " return df\n", - "\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a033dd96", - "metadata": {}, - "outputs": [], - "source": [ - "# To analyze delta sensitivity to underlying price changes\n", - "result = run_test(\n", - " \"my_custom_tests.GreeksDelta\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [1000000],\n", - " \"M\": [M],\n", - " 
\"strike\":[strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " \"h\": [0.001]\n", - " },\n", - "post_process_fn=process_results # Using the plotting function defined earlier\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "0826d4dc", - "metadata": {}, - "source": [ - "<a id='toc4_8__'></a>\n", - "\n", - "### Gamma\n", - "Gamma measures the rate of change of Delta with respect to changes in the underlying asset's price. It indicates the curvature of the option's price relative to the underlying asset's price." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ccf54452", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.GreeksGamma\")\n", - "def calculate_gamma(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", - " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", - " h=0.01): # h is the step size for finite difference\n", - " \"\"\"\n", - " Calculate gamma using finite difference method.\n", - " Gamma = (V(S0 + h) - 2V(S0) + V(S0 - h)) / h^2\n", - " where V is the option price and h is a small increment\n", - " \"\"\"\n", - " # Initialize the models with S0 + h, S0, and S0 - h\n", - " if model_type == 'BS':\n", - " model_up = BlackScholesModel(S0 + h, strike, T, r, sigma)\n", - " model_center = BlackScholesModel(S0, strike, T, r, sigma)\n", - " model_down = BlackScholesModel(S0 - h, strike, T, r, sigma)\n", - " else:\n", - " model_up = StochasticVolatilityModel(S0 + h, strike, T, r, v0, kappa, theta, xi, rho)\n", - " model_center = StochasticVolatilityModel(S0, strike, T, r, v0, kappa, theta, xi, rho)\n", - " model_down = StochasticVolatilityModel(S0 - h, strike, T, r, v0, kappa, theta, xi, rho)\n", - " \n", - " # Calculate option 
prices for up, center, and down moves\n", - " knockout_up = KnockoutOption(model_up, S0 + h, strike, T, r, barrier)\n", - " knockout_center = KnockoutOption(model_center, S0, strike, T, r, barrier)\n", - " knockout_down = KnockoutOption(model_down, S0 - h, strike, T, r, barrier)\n", - " \n", - " price_up = knockout_up.price_knockout_option(N, M)\n", - " price_center = knockout_center.price_knockout_option(N, M)\n", - " price_down = knockout_down.price_knockout_option(N, M)\n", - " \n", - " # Calculate gamma using second-order central difference\n", - " gamma = (price_up - 2*price_center + price_down) / (h * h)\n", - " \n", - " df = pd.DataFrame({\n", - " \"Gamma\": [gamma], \n", - " \"Price_Up\": [price_up], \n", - " \"Price_Center\": [price_center],\n", - " \"Price_Down\": [price_down], \n", - " \"h\": [h]\n", - " })\n", - " return df\n", - "\n", - "# To analyze gamma sensitivity to underlying price changes\n", - "result = run_test(\n", - " \"my_custom_tests.GreeksGamma\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [1000000],\n", - " \"M\": [M],\n", - " \"strike\":[strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " \"h\": [0.1]\n", - " },\n", - " post_process_fn=process_results # Using the plotting function defined earlier\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "df0eaa72", - "metadata": {}, - "source": [ - "<a id='toc4_9__'></a>\n", - "\n", - "### Theta\n", - "Theta measures the sensitivity of the option's price to the passage of time, also known as time decay. It indicates how much the price of an option is expected to decrease as the option approaches its expiration date." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0e9810b1", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.GreeksTheta\")\n", - "def calculate_theta(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", - " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", - " dt=1/365): # dt is typically one day\n", - " \"\"\"\n", - " Calculate theta using finite difference method.\n", - " Theta = -(V(T + dt) - V(T)) / dt\n", - " where V is the option price as a function of the time to maturity T\n", - " and dt is a small time increment (typically 1 day)\n", - " \"\"\"\n", - " # Initialize the models with T and T + dt\n", - " if model_type == 'BS':\n", - " model_current = BlackScholesModel(S0, strike, T, r, sigma)\n", - " model_future = BlackScholesModel(S0, strike, T + dt, r, sigma)\n", - " else:\n", - " model_current = StochasticVolatilityModel(S0, strike, T, r, v0, kappa, theta, xi, rho)\n", - " model_future = StochasticVolatilityModel(S0, strike, T + dt, r, v0, kappa, theta, xi, rho)\n", - " \n", - " # Calculate option prices for maturities T and T + dt\n", - " knockout_current = KnockoutOption(model_current, S0, strike, T, r, barrier)\n", - " knockout_future = KnockoutOption(model_future, S0, strike, T + dt, r, barrier)\n", - " \n", - " price_current = knockout_current.price_knockout_option(N, M)\n", - " price_future = knockout_future.price_knockout_option(N, M)\n", - " \n", - " # Calculate theta using forward difference\n", - " # Note: We divide by dt and multiply by -1 since theta represents the negative rate of change\n", - " theta_value = -1 * (price_future - price_current) / dt\n", - " \n", - " df = pd.DataFrame({\n", - " \"Theta\": [theta_value], \n", - " \"Price_Current\": [price_current],\n", - " \"Price_Future\": [price_future],\n", - " \"dt\": [dt]\n", - " })\n", - " return df\n", - "\n", - "# Example usage to analyze theta sensitivity across different underlying prices\n", - "result = run_test(\n", - " 
\"my_custom_tests.GreeksTheta\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [1000000],\n", - " \"M\": [M],\n", - " \"strike\":[strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " \"dt\": [1/365] # One day time step\n", - " },\n", - " post_process_fn=process_results # Using the plotting function defined earlier\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "28c60e1d", - "metadata": {}, - "source": [ - "<a id='toc4_10__'></a>\n", - "\n", - "### Vega\n", - "Vega measures the sensitivity of the option's price to changes in the volatility of the underlying asset. It indicates how much the price of an option is expected to change with a 1% change in the underlying asset's volatility." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1dbc6632", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.GreeksVega\")\n", - "def calculate_vega(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", - " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", - " h=0.001): # h is the step size for finite difference\n", - " \"\"\"\n", - " Calculate vega using finite difference method.\n", - " For Black-Scholes: Vega = (V(σ + h) - V(σ - h)) / (2h)\n", - " For Stochastic Vol: Vega = (V(v0 + h) - V(v0 - h)) / (2h)\n", - " where V is the option price and h is a small increment in volatility\n", - " \"\"\"\n", - " if model_type == 'BS':\n", - " # For Black-Scholes, perturb sigma\n", - " model_up = BlackScholesModel(S0, strike, T, r, sigma + h)\n", - " model_down = BlackScholesModel(S0, strike, T, r, sigma - h)\n", - " else:\n", - " # For Stochastic Volatility, perturb v0\n", - " model_up = StochasticVolatilityModel(S0, strike, T, r, v0 + h, 
kappa, theta, xi, rho)\n", - " model_down = StochasticVolatilityModel(S0, strike, T, r, v0 - h, kappa, theta, xi, rho)\n", - " \n", - " # Calculate option prices for up and down moves in volatility\n", - " knockout_up = KnockoutOption(model_up, S0, strike, T, r, barrier)\n", - " knockout_down = KnockoutOption(model_down, S0, strike, T, r, barrier)\n", - " \n", - " price_up = knockout_up.price_knockout_option(N, M)\n", - " price_down = knockout_down.price_knockout_option(N, M)\n", - " \n", - " # Calculate vega using central difference\n", - " vega = (price_up - price_down) / (2 * h)\n", - " \n", - " df = pd.DataFrame({\n", - " \"Vega\": [vega], \n", - " \"Price_Up\": [price_up], \n", - " \"Price_Down\": [price_down], \n", - " \"h\": [h]\n", - " })\n", - " return df\n", - "\n", - "# Example usage to analyze vega sensitivity across different underlying prices\n", - "result = run_test(\n", - " \"my_custom_tests.GreeksVega\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [1000000],\n", - " \"M\": [M],\n", - " \"strike\":[strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " \"h\": [0.0001] # Small step size for better accuracy\n", - " },\n", - " post_process_fn=process_results # Using the plotting function defined earlier\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "1ec51eba", - "metadata": {}, - "source": [ - "<a id='toc4_11__'></a>\n", - "\n", - "### Rho\n", - "Rho measures the sensitivity of the option's price to changes in the interest rate. It indicates how much the price of an option is expected to change with a 1% change in interest rates." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2f497b5f", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.GreeksRho\")\n", - "def calculate_rho(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", - " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", - " h=0.0001): # h is the step size for finite difference\n", - " \"\"\"\n", - " Calculate rho using finite difference method.\n", - " Rho = (V(r + h) - V(r - h)) / (2h)\n", - " where V is the option price and h is a small increment in interest rate\n", - " \"\"\"\n", - " # Initialize the models with r + h and r - h\n", - " if model_type == 'BS':\n", - " model_up = BlackScholesModel(S0, strike, T, r + h, sigma)\n", - " model_down = BlackScholesModel(S0, strike, T, r - h, sigma)\n", - " else:\n", - " model_up = StochasticVolatilityModel(S0, strike, T, r + h, v0, kappa, theta, xi, rho)\n", - " model_down = StochasticVolatilityModel(S0, strike, T, r - h, v0, kappa, theta, xi, rho)\n", - " \n", - " # Calculate option prices for up and down moves in interest rate\n", - " knockout_up = KnockoutOption(model_up, S0, strike, T, r + h, barrier)\n", - " knockout_down = KnockoutOption(model_down, S0, strike, T, r - h, barrier)\n", - " \n", - " price_up = knockout_up.price_knockout_option(N, M)\n", - " price_down = knockout_down.price_knockout_option(N, M)\n", - " \n", - " # Calculate rho using central difference\n", - " rho_value = (price_up - price_down) / (2 * h)\n", - " \n", - " df = pd.DataFrame({\n", - " \"Rho\": [rho_value], \n", - " \"Price_Up\": [price_up], \n", - " \"Price_Down\": [price_down], \n", - " \"h\": [h]\n", - " })\n", - " return df\n", - "\n", - "# Example usage to analyze rho sensitivity across different underlying prices\n", - "result = run_test(\n", - " \"my_custom_tests.GreeksRho\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [1000000],\n", - " \"M\": [M],\n", - " \"strike\":[strike_range[0]],\n", - " 
\"barrier\": [barrier_range[0]],\n", - " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " \"h\": [0.0001] # Small step size for better accuracy\n", - " },\n", - " post_process_fn=process_results # Using the plotting function defined earlier\n", - ")\n", - "result.log()" - ] - }, - { - "cell_type": "markdown", - "id": "0cdd1b1b", - "metadata": {}, - "source": [ - "<a id='toc4_11_1__'></a>\n", - "\n", - "#### Stress Testing" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c98ff396", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.Stressing\")\n", - "def sensitivity_test(model_type, S0, T, r, N, M, strike=None, barrier=None, sigma=None, v0=None, kappa=None,theta=None, xi=None, rho=None):\n", - " \"\"\"\n", - " This is stress test\n", - " \"\"\"\n", - " if model_type == 'BS':\n", - " model = BlackScholesModel(S0, strike, T, r, sigma)\n", - " else:\n", - " model = StochasticVolatilityModel(S0, strike, T, r, v0, kappa, theta, xi, rho)\n", - " \n", - " knockout_option = KnockoutOption(model, S0, strike, T, r, barrier)\n", - " price = knockout_option.price_knockout_option(N, M)\n", - "\n", - " return pd.DataFrame({\"Option price\": [price]})" - ] - }, - { - "cell_type": "markdown", - "id": "b6f0a179", - "metadata": {}, - "source": [ - "##### Rho (correlation) and Theta (long term vol) stress test\n", - "First, we create a surface plot to visualize the option price with respect to two variables." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b408de0f", - "metadata": {}, - "outputs": [], - "source": [ - "def two_parameters_stress_surface_plot(result: TestResult):\n", - " import plotly.graph_objects as go\n", - " import numpy as np\n", - " import pandas as pd\n", - " # Convert to DataFrame\n", - " data = pd.DataFrame(result.tables[0].data)\n", - " \n", - " # Get column names (assuming first column is x, next two are y1 and y2)\n", - " z_col = data.columns[2]\n", - " x_col = data.columns[0]\n", - " y_col = data.columns[1]\n", - " \n", - " # Get unique values for x and y\n", - " x_unique = np.sort(data[x_col].unique())\n", - " y_unique = np.sort(data[y_col].unique())\n", - " \n", - " # Create meshgrid\n", - " X, Y = np.meshgrid(x_unique, y_unique)\n", - " \n", - " # Create Z matrix\n", - " Z = np.zeros_like(X)\n", - " for i, x_val in enumerate(x_unique):\n", - " for j, y_val in enumerate(y_unique):\n", - " mask = (data[x_col] == x_val) & (data[y_col] == y_val)\n", - " if mask.any():\n", - " Z[j, i] = data.loc[mask, z_col].iloc[0]\n", - " \n", - " # Create the 3D surface plot\n", - " fig = go.Figure(data=[go.Surface(x=X, y=Y, z=Z)])\n", - " \n", - " # Update the layout\n", - " fig.update_layout(\n", - " title=f'3D Surface Plot of {z_col}',\n", - " scene=dict(\n", - " xaxis_title=x_col,\n", - " yaxis_title=y_col,\n", - " zaxis_title=z_col,\n", - " camera=dict(\n", - " up=dict(x=0, y=0, z=1),\n", - " center=dict(x=0, y=0, z=0),\n", - " eye=dict(x=1.5, y=1.5, z=1.5)\n", - " )\n", - " ),\n", - " width=900,\n", - " height=700,\n", - " margin=dict(l=65, r=50, b=65, t=90)\n", - " )\n", - "\n", - " result.add_figure(\n", - " Figure(\n", - " figure=fig,\n", - " key=\"sensitivity_plot_\" + str(random.randint(0, 1000000)),\n", - " ref_id=result.ref_id,\n", - " )\n", - " )\n", - "\n", - " return result" - ] - }, - { - "cell_type": "markdown", - "id": "87289ee6", - "metadata": {}, - "source": [ - "Let's evaluates the sensitivity of a model's output 
to changes in the correlation parameter (rho) and the long-term variance parameter (theta) within a stochastic volatility framework.\n", - "\n", - "This test is useful for understanding how variations in these parameters affect the model's valuation, which is crucial for risk management and model validation." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "5c0ec52d", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "\n", - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheRhoAndThetaParameters\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": [strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": list(np.linspace(0,0.8, 10)),\n", - " \"xi\": [0.1],\n", - " \"rho\": list(np.linspace(-1,0.8, 10)),\n", - " },\n", - " post_process_fn=two_parameters_stress_surface_plot\n", - ")\n", - "result.log()\n" - ] - }, - { - "cell_type": "markdown", - "id": "44be4c61", - "metadata": {}, - "source": [ - "##### Rho (correlation) and Xi (vol of vol) stress test" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e0a2996e", - "metadata": {}, - "outputs": [], - "source": [ - "\n", - "\n", - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheRhoAndXiParameters\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": [strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": list(np.linspace(0,0.8, 10)),\n", - " \"rho\": list(np.linspace(-1,0.8, 10)),\n", - " },\n", - " post_process_fn=two_parameters_stress_surface_plot\n", - ")\n", - "result.log()\n" - ] - }, - { - "cell_type": "markdown", - "id": "5fed568d", - 
"metadata": {}, - "source": [ - "##### Sigma stress test\n", - "evaluates the sensitivity of a model's output to changes in the volatility parameter, sigma. This test is crucial for understanding how variations in market volatility impact the model's valuation of financial instruments, particularly options.\n", - "\n", - "This test is useful for risk management and model validation, as it helps identify the robustness of the model under different market conditions. By analyzing the changes in the model's output as sigma varies, stakeholders can assess the model's stability and reliability." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d49e2e37", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheSigmaParameter\",\n", - " param_grid={\n", - " \"model_type\": ['BS'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": [strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"sigma\": list(np.linspace(0.2, 0.8, 10)),\n", - " },\n", - " post_process_fn=process_results\n", - ")\n", - "result.log()\n" - ] - }, - { - "cell_type": "markdown", - "id": "4e7a1f00", - "metadata": {}, - "source": [ - "##### Stress kappa\n", - "Let's evaluates the sensitivity of a model's output to changes in the kappa parameter, which is a mean reversion rate in stochastic volatility models." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e995f6ae", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheKappaParameter\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": [strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": list(np.linspace(0, 8, 10)),\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " },\n", - " post_process_fn=process_results\n", - ")\n", - "result.log()\n" - ] - }, - { - "cell_type": "markdown", - "id": "40d1c9e2", - "metadata": {}, - "source": [ - "##### Stress theta\n", - "Stress Theta evaluates the sensitivity of a model's output to changes in the parameter theta, which represents the long-term variance in a stochastic volatility model" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7e371aee", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheThetaParameter\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": [strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": list(np.linspace(0, 0.8, 10)),\n", - " \"xi\": [0.1],\n", - " \"rho\": [-0.5],\n", - " },\n", - " post_process_fn=process_results\n", - ")\n", - "result.log()\n" - ] - }, - { - "cell_type": "markdown", - "id": "e20d074f", - "metadata": {}, - "source": [ - "##### Stress xi\n", - "Stress Xi evaluates the sensitivity of a model's output to changes in the parameter xi, which represents the volatility of volatility in a stochastic volatility model. 
This test is crucial for understanding how variations in xi impact the model's valuation, particularly in financial derivatives pricing." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "9c545090", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheXiParameter\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": [strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": list(np.linspace(0.05, 0.95, 10)),\n", - " \"rho\": [-0.5],\n", - " },\n", - " post_process_fn=process_results\n", - ")\n", - "result.log()\n" - ] - }, - { - "cell_type": "markdown", - "id": "f0360e20", - "metadata": {}, - "source": [ - "##### Stress rho\n", - "Stress rho test evaluates the sensitivity of a model's output to changes in the correlation parameter, rho, within a stochastic volatility (SV) model framework. This test is crucial for understanding how variations in rho, which represents the correlation between the asset price and its volatility, impact the model's valuation output." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e2c5dfb1", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheRhoParameter\",\n", - " param_grid={\n", - " \"model_type\": ['SV'],\n", - " \"N\": [N],\n", - " \"M\": [M],\n", - " \"strike\": [strike_range[0]],\n", - " \"barrier\": [barrier_range[0]],\n", - " \"S0\": [S0],\n", - " \"T\": [T],\n", - " \"r\": [r],\n", - " \"v0\": [0.2],\n", - " \"kappa\": [2],\n", - " \"theta\": [0.2],\n", - " \"xi\": [0.1],\n", - " \"rho\": list(np.linspace(-1.0, 1.0, 20)),\n", - " },\n", - " post_process_fn=process_results\n", - ")\n", - "result.log()\n" - ] - }, - { - "cell_type": "markdown", - "id": "61d4e596", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc5_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc5_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-a23adf093a60485ea005cf8fc18545a5", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-1QuffXMV-py3.10", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.14" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quickstart for knockout option pricing model documentation\n", + "\n", + "Welcome! 
Let's get you started with the basic process of documenting models with ValidMind.\n", + "\n", + "A knockout option is a barrier option that ceases to exist if the underlying asset hits a predetermined price, known as the \"barrier.\" This barrier level, set above or below the current market price, determines whether the option will \"knock out\" before its expiration date. There are two types: \"up-and-out\" and \"down-and-out.\" In an up-and-out knockout option, the option expires if the asset price rises above the barrier, while in a down-and-out, it expires if the asset price falls below. Knockout options generally offer a lower premium than standard options since there is a chance they will expire worthless if the barrier is reached.\n", + "\n", + "Pricing knockout options involves accounting for the proximity of the asset's price to the barrier, as well as market volatility and the option’s time to expiration. High volatility and longer expiry increase the likelihood of the barrier being triggered, which reduces the option’s value. Models like modified Black-Scholes are used for simpler cases, while Monte Carlo simulations or binomial trees handle complex scenarios. Knockout options are useful for hedging or cost-effective investment strategies, allowing investors to save on premiums but with the risk of losing the option entirely if the barrier is hit.\n", + "\n", + "You will learn how to initialize the ValidMind Library, develop an option pricing model, and then write custom tests that can be used for sensitivity and stress testing to quickly generate documentation about the model."
+ ], + "id": "87056cee" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + " - [Preview the documentation template](#toc2_4__) \n", + "- [Model development](#toc3__) \n", + "- [Data Preparation](#toc4__) \n", + " - [Synthetic data generation](#toc4_1__) \n", + " - [Initialize the ValidMind datasets](#toc4_2__) \n", + " - [Data Quality](#toc4_3__) \n", + " - [Outliers detection using IQR method](#toc4_3_1__) \n", + " - [Isolation Forest Outliers Test](#toc4_3_2__) \n", + " - [Model Calibration](#toc4_4__) \n", + " - [Synthetic Data Calibration Test](#toc4_5__) \n", + " - [Model Evaluation](#toc4_6__) \n", + " - [Benchmark Testing](#toc4_6_1__) \n", + " - [Sensitivity Testing](#toc4_6_2__) \n", + " - [Greeks](#toc4_6_3__) \n", + " - [Delta](#toc4_7__) \n", + " - [Gamma](#toc4_8__) \n", + " - [Theta](#toc4_9__) \n", + " - [Vega](#toc4_10__) \n", + " - [Rho](#toc4_11__) \n", + " - [Stress Testing](#toc4_11_1__) \n", + "- [Next steps](#toc5__) \n", + " - [Work with your model documentation](#toc5_1__) \n", + " - [Discover more learning resources](#toc5_2__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "7417dfe1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ], + "id": "1426d212" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ], + "id": "f8812717" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ], + "id": "b792f6a9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "c3d26e61" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "f3db6c9b" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ], + "id": "e1865b8d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. 
A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Capital markets`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "214572ff" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "8b9547ad" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "0cc9c04c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ], + "id": "e928f7e5" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%matplotlib inline\n", + "import pandas as pd\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "from scipy.optimize import minimize\n", + "\n", + "from validmind.tests import run_test" + ], + "execution_count": null, + "outputs": [], + "id": "9edb42a2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ], + "id": "a2403294" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "3dfd04dd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Model development" + ], + "id": "d79d9953" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "class OptionPricing:\n", + " def __init__(self, S0, K, T, r):\n", + " self.S0 = S0\n", + " self.K = K\n", + " self.T = T\n", + " self.r = r\n", + "\n", + " def monte_carlo_simulation(self, N, M):\n", + " raise NotImplementedError(\"Must be implemented by subclasses\")\n", + "\n", + " def price_option(self, N, M):\n", + " raise NotImplementedError(\"Must be implemented by subclasses\")\n" + ], + "execution_count": 32, + "outputs": [], + "id": "c3f5b0b9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "class BlackScholesModel(OptionPricing):\n", + " def __init__(self, S0, K, T, r, sigma):\n", + " super().__init__(S0, K, T, r)\n", + " self.sigma = sigma\n", + " def monte_carlo_simulation(self, N, M):\n", + " dt = self.T / M\n", + " price_paths = np.zeros((N, M + 1))\n", + " price_paths[:, 0] = self.S0\n", + " for t in range(1, M + 1):\n", + " Z = np.random.standard_normal(N)\n", + " price_paths[:, t] = price_paths[:, t - 1] * np.exp((self.r - 0.5 * self.sigma**2) * dt + self.sigma * np.sqrt(dt) * Z)\n", + " return price_paths\n", + "\n", + " def price_option(self, N, M):\n", + " price_paths = self.monte_carlo_simulation(N, M)\n", + " payoffs = np.maximum(price_paths[:, -1] - self.K, 0)\n", + " return np.exp(-self.r * self.T) * np.mean(payoffs)\n", + " \n", + " def calibrate(self, market_prices, strikes,
maturities):\n", + " def objective_function(params):\n", + " self.sigma = params[0]\n", + " model_prices = []\n", + " for K, T in zip(strikes, maturities):\n", + " self.K = K\n", + " self.T = T\n", + " model_prices.append(self.price_option(10000, 100))\n", + " return np.sum((np.array(market_prices) - np.array(model_prices))**2)\n", + " result = minimize(objective_function, [self.sigma], bounds=[(0.01, 1.0)])\n", + " self.sigma = result.x[0]\n", + "\n", + "class StochasticVolatilityModel(OptionPricing):\n", + " def __init__(self, S0, K, T, r, v0, kappa, theta, xi, rho):\n", + " super().__init__(S0, K, T, r)\n", + " self.v0 = v0\n", + " self.kappa = kappa\n", + " self.theta = theta\n", + " self.xi = xi\n", + " self.rho = rho\n", + " def monte_carlo_simulation(self, N, M):\n", + " dt = self.T / M\n", + " price_paths = np.zeros((N, M + 1))\n", + " vol_paths = np.zeros((N, M + 1))\n", + " price_paths[:, 0] = self.S0\n", + " vol_paths[:, 0] = self.v0\n", + " for t in range(1, M + 1):\n", + " Z1 = np.random.standard_normal(N)\n", + " Z2 = np.random.standard_normal(N)\n", + " W1 = Z1\n", + " W2 = self.rho * Z1 + np.sqrt(1 - self.rho**2) * Z2\n", + " vol_paths[:, t] = np.abs(vol_paths[:, t - 1] + self.kappa * (self.theta - vol_paths[:, t - 1]) * dt + self.xi * np.sqrt(vol_paths[:, t - 1] * dt) * W1)\n", + " price_paths[:, t] = price_paths[:, t - 1] * np.exp((self.r - 0.5 * vol_paths[:, t - 1]) * dt + np.sqrt(vol_paths[:, t - 1] * dt) * W2)\n", + " return price_paths\n", + "\n", + " def price_option(self, N, M):\n", + " price_paths = self.monte_carlo_simulation(N, M)\n", + " payoffs = np.maximum(price_paths[:, -1] - self.K, 0)\n", + " return np.exp(-self.r * self.T) * np.mean(payoffs)\n", + " \n", + " def calibrate(self, market_prices, strikes, maturities):\n", + " def objective_function(params):\n", + " self.v0, self.kappa, self.theta, self.xi, self.rho = params\n", + " model_prices = []\n", + " for K, T in zip(strikes, maturities):\n", + " self.K = K\n", + " self.T = T\n", + "
model_prices.append(self.price_option(10000, 100))\n", + "\n", + " return np.sum((np.array(market_prices) - np.array(model_prices))**2)\n", + " \n", + " initial_guess = [self.v0, self.kappa, self.theta, self.xi, self.rho]\n", + " bounds = [(0.01, 1.0), (0.01, 5.0), (0.01, 1.0), (0.01, 1.0), (-1.0, 1.0)]\n", + " result = minimize(objective_function, initial_guess, bounds=bounds)\n", + " self.v0, self.kappa, self.theta, self.xi, self.rho = result.x\n", + "\n", + "\n", + "class KnockoutOption:\n", + " def __init__(self, model, S0, K, T, r, barrier):\n", + " self.model = model\n", + " self.S0 = S0\n", + " self.K = K\n", + " self.T = T\n", + " self.r = r\n", + " self.barrier = barrier\n", + "\n", + " def price_knockout_option(self, N, M):\n", + " dt = self.T / M\n", + " price_paths = np.zeros((N, M + 1))\n", + " vol_paths = np.zeros((N, M + 1)) if isinstance(self.model, StochasticVolatilityModel) else None\n", + " price_paths[:, 0] = self.S0\n", + " if vol_paths is not None:\n", + " vol_paths[:, 0] = self.model.v0\n", + " \n", + " for t in range(1, M + 1):\n", + " Z1 = np.random.standard_normal(N)\n", + " if vol_paths is None:\n", + " # Black-Scholes Model\n", + " price_paths[:, t] = price_paths[:, t - 1] * np.exp(\n", + " (self.r - 0.5 * self.model.sigma**2) * dt + self.model.sigma * np.sqrt(dt) * Z1\n", + " )\n", + " else:\n", + " # Stochastic Volatility Model\n", + " Z2 = np.random.standard_normal(N)\n", + " W1 = Z1\n", + " W2 = self.model.rho * Z1 + np.sqrt(1 - self.model.rho**2) * Z2\n", + " vol_paths[:, t] = np.abs(vol_paths[:, t - 1] + self.model.kappa * (self.model.theta - vol_paths[:, t - 1]) * dt + self.model.xi * np.sqrt(vol_paths[:, t - 1] * dt) * W1)\n", + " price_paths[:, t] = price_paths[:, t - 1] * np.exp(\n", + " (self.r - 0.5 * vol_paths[:, t - 1]) * dt + np.sqrt(vol_paths[:, t - 1] * dt) * W2\n", + " )\n", + " \n", + " # Knockout condition\n", + " price_paths[:, t][price_paths[:, t] >= self.barrier] = 0\n", + " payoffs = np.maximum(price_paths[:, -1] 
- self.K, 0)\n", + " return np.exp(-self.r * self.T) * np.mean(payoffs)" + ], + "execution_count": null, + "outputs": [], + "id": "a9d7f832" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Data Preparation" + ], + "id": "14bcdbb9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Synthetic data generation" + ], + "id": "f655dc9c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def generate_synthetic_market_data(model, strikes, maturities):\n", + " market_prices = []\n", + " market_data = []\n", + " for K, T in zip(strikes, maturities):\n", + " model.K = K\n", + " model.T = T\n", + " price = model.price_option(10000, 100) # price once so both outputs agree\n", + " market_prices.append(price)\n", + " market_data.append({\"strike\": K, \"option_price\": price})\n", + " return market_prices, market_data\n" + ], + "execution_count": 34, + "outputs": [], + "id": "42cb9070" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "N = 10000\n", + "M = 100\n", + "\n", + "# Parameters for synthetic data\n", + "S0 = 100\n", + "K = 100\n", + "T = 1\n", + "r = 0.05\n", + "# Black-Scholes\n", + "true_sigma = 0.2\n", + "\n", + "# Stochastic Volatility\n", + "true_v0 = 0.2\n", + "true_kappa = 2.0\n", + "true_theta = 0.2\n", + "true_xi = 0.1\n", + "true_rho = -0.5\n", + "\n", + "# Synthetic data generation parameters\n", + "strikes = list(np.linspace(75, 130, 25))\n", + "maturities = list(np.linspace(0.2, 3.0, 25))\n", + "\n", + "# Generate synthetic market data using the true parameters\n", + "bs_model = BlackScholesModel(S0, K, T, r, true_sigma)\n", + "bs_market_prices, bs_market_data = generate_synthetic_market_data(bs_model, strikes, maturities)\n", + "\n", + "\n", + "sv_model = StochasticVolatilityModel(S0, K, T, r, true_v0, true_kappa, true_theta, true_xi, true_rho)\n", + "sv_market_prices, sv_market_data = generate_synthetic_market_data(sv_model, strikes, 
maturities)\n" + ], + "execution_count": null, + "outputs": [], + "id": "2854fbe3" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module." + ], + "id": "b54c4950" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "bs_market_data_df = pd.DataFrame(bs_market_data)\n", + "vm_bs_market_data = vm.init_dataset(\n", + " dataset=bs_market_data_df,\n", + " input_id=\"bs_market_data\",\n", + ")\n", + "\n", + "sv_market_data_df = pd.DataFrame(sv_market_data)\n", + "vm_sv_market_data = vm.init_dataset(\n", + " dataset=sv_market_data_df,\n", + " input_id=\"sv_market_data\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "7f3498dd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3__'></a>\n", + "\n", + "### Data Quality\n", + "Let's check the quality of the data using outlier and missing-value tests." + ], + "id": "7b36b59c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3_1__'></a>\n", + "\n", + "#### Outliers detection using IQR method\n", + "Let's visualize the distribution of outliers in the `option_price` feature using the interquartile range (IQR) method." 
+ ], + "id": "671330b1" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.IQROutliersBarPlot:BlackScholes\",\n", + " inputs={\n", + " \"dataset\": vm_bs_market_data,\n", + " },\n", + " title=\"Outliers detection using IQR method for BlackScholes\",\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "f1c1ab6f" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.IQROutliersTable:BlackScholes\",\n", + " inputs={\n", + " \"dataset\": vm_bs_market_data,\n", + " },\n", + " title=\"Outliers table using IQR method for BlackScholes\",\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "6b5e8654" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.IQROutliersBarPlot:StochasticVolatility\",\n", + " inputs={\n", + " \"dataset\": vm_sv_market_data,\n", + " },\n", + " title=\"Outliers detection using IQR method for StochasticVolatility\",\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "d96f10c7" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.IQROutliersTable:StochasticVolatility\",\n", + " inputs={\n", + " \"dataset\": vm_sv_market_data,\n", + " },\n", + " title=\"Outliers table using IQR method for StochasticVolatility\",\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "758c4c57" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3_2__'></a>\n", + "\n", + "#### Isolation Forest Outliers Test\n", + "Let's detect anomalies in the dataset using the Isolation Forest algorithm and visualize them with scatter plots." 
+ ], + "id": "b1430200" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.IsolationForestOutliers:BlackScholes\",\n", + " inputs={\n", + " \"dataset\": vm_bs_market_data,\n", + " },\n", + " title=\"Outliers detection using Isolation Forest for BlackScholes\",\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "9eb91453" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.IsolationForestOutliers:StochasticVolatility\",\n", + " inputs={\n", + " \"dataset\": vm_sv_market_data,\n", + " },\n", + " title=\"Outliers detection using Isolation Forest for StochasticVolatility\",\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "12940f8e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Missing Values Test\n", + "Let's evaluate dataset quality by checking that the missing value ratio across all features does not exceed a set threshold." 
+ ], + "id": "f30e5579" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.MissingValues:BlackScholes\",\n", + " inputs={\n", + " \"dataset\": vm_bs_market_data,\n", + " },\n", + " title=\"Missing Values detection for BlackScholes\",\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "805ddb1c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "result = run_test(\n", + " \"validmind.data_validation.MissingValues:StochasticVolatility\",\n", + " inputs={\n", + " \"dataset\": vm_sv_market_data,\n", + " },\n", + " title=\"Missing Values detection for StochasticVolatility\",\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "e69e0039" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_4__'></a>\n", + "\n", + "### Model Calibration\n", + "* Clearly state the purpose of the calibration process. For example, in the context of an option pricing model, calibration aims to adjust model parameters to fit market data (e.g., market option prices, volatility surfaces).\n", + "* Specify whether the calibration is to historical data, current market data, or a blend of both." 
+ ], + "id": "09628809" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "\n", + "@vm.test(\"my_custom_tests.SyntheticDataCalibrationTest\")\n", + "def generate_synthetic_data_summary(option_pricing_model, strikes, maturities, synthetic_prices):\n", + " \"\"\"\n", + " This function will use synthetic prices to calibrate each model\n", + " and then generate derived prices based on the calibrated parameters.\n", + " It will output a DataFrame summarizing the strikes, maturities,\n", + " synthetic and derived prices, and the model parameters.\n", + "\n", + " \"\"\"\n", + " derived_prices = []\n", + " for K, T in zip(strikes, maturities):\n", + " option_pricing_model.K = K\n", + " option_pricing_model.T = T\n", + " derived_prices.append(option_pricing_model.price_option(10000, 100))\n", + " \n", + " model_type = type(option_pricing_model).__name__\n", + " data = {\n", + " \"Strike\": strikes,\n", + " \"Maturity\": maturities,\n", + " \"Synthetic_Price\": synthetic_prices,\n", + " \"Derived_Price\": derived_prices,\n", + " \"Model_Type\": model_type,\n", + " \"S0\": [option_pricing_model.S0] * len(strikes),\n", + " \"K\": [option_pricing_model.K] * len(strikes),\n", + " \"T\": [option_pricing_model.T] * len(strikes),\n", + " \"r\": [option_pricing_model.r] * len(strikes)\n", + " }\n", + " \n", + " if model_type == \"BlackScholesModel\":\n", + " data[\"sigma\"] = [option_pricing_model.sigma] * len(strikes)\n", + " elif model_type == \"StochasticVolatilityModel\":\n", + " data[\"v0\"] = [option_pricing_model.v0] * len(strikes)\n", + " data[\"kappa\"] = [option_pricing_model.kappa] * len(strikes)\n", + " data[\"theta\"] = [option_pricing_model.theta] * len(strikes)\n", + " data[\"xi\"] = [option_pricing_model.xi] * len(strikes)\n", + " data[\"rho\"] = [option_pricing_model.rho] * len(strikes)\n", + " \n", + " df = pd.DataFrame(data)\n", + " return df\n" + ], + "execution_count": null, + "outputs": [], + "id": "6802c26e" + }, + { 
+ "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_5__'></a>\n", + "\n", + "### Synthetic Data Calibration Test\n", + "Let's evaluate the accuracy of a stochastic volatility model by comparing synthetic prices with derived prices after model calibration." + ], + "id": "3bf04d21" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.SyntheticDataCalibrationTest\",\n", + " params={\n", + " \"option_pricing_model\": sv_model,\n", + " \"strikes\": strikes,\n", + " \"maturities\": maturities,\n", + " \"synthetic_prices\": sv_market_prices\n", + " },\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "4345cb5c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_6__'></a>\n", + "\n", + "### Model Evaluation" + ], + "id": "4d48f107" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_6_1__'></a>\n", + "\n", + "#### Benchmark Testing\n", + "* Compare the model’s performance with alternative models or industry-standard models to assess its relative effectiveness.\n", + "* Ensure that the model is competitive in pricing, accuracy, and computational efficiency." 
+ ], + "id": "8ec8b5a3" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.BenchmarkTest\")\n", + "def benchmark_test(bs_model, sv_model, strikes, maturities):\n", + " \"\"\"\n", + " Comparison between Black Scholes and stochastic volatility model\n", + "\n", + " \"\"\"\n", + " bs_model_type = type(bs_model).__name__\n", + " sv_model_type = type(sv_model).__name__\n", + "\n", + " bs_derived_prices = []\n", + " sv_derived_prices = []\n", + " for K in strikes:\n", + " bs_model.K = K\n", + " bs_derived_prices.append(bs_model.price_option(10000, 100))\n", + " sv_model.K = K\n", + " sv_derived_prices.append(sv_model.price_option(10000, 100))\n", + "\n", + " data = {\n", + " \"Strike\": strikes,\n", + " \"Maturities\": [sv_model.T] * len(strikes),\n", + " \"bs_model_price\": bs_derived_prices,\n", + " \"sv_model_price\": sv_derived_prices,\n", + "\n", + " }\n", + " df1 = pd.DataFrame(data)\n", + "\n", + " bs_derived_prices = []\n", + " sv_derived_prices = []\n", + " for T in maturities:\n", + " bs_model.T = T\n", + " bs_derived_prices.append(bs_model.price_option(10000, 100))\n", + " sv_model.T = T\n", + " sv_derived_prices.append(sv_model.price_option(10000, 100))\n", + "\n", + " data = {\n", + " \"Strike\": [sv_model.K] * len(maturities),\n", + " \"Maturities\": maturities,\n", + " \"bs_model_price\": bs_derived_prices,\n", + " \"sv_model_price\": sv_derived_prices,\n", + " }\n", + "\n", + " df2 = pd.DataFrame(data)\n", + "\n", + " return {\"strikes variation benchmarking\": df1}, {\"maturities variation benchmarking\": df2}" + ], + "execution_count": 47, + "outputs": [], + "id": "ac733262" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.BenchmarkTest\",\n", + " params={\n", + " \"sv_model\": sv_model,\n", + " \"bs_model\": bs_model,\n", + " \"strikes\": strikes,\n", + " \"maturities\": maturities,\n", + " },\n", + ")\n", + "result.log()" + ], + "execution_count": 
null, + "outputs": [], + "id": "20de9858" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Surface Volatility Test\n", + "Let's calculate the implied volatility across different strikes and maturities based on market prices." + ], + "id": "d9ad15b8" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "from scipy.optimize import minimize\n", + "import plotly.graph_objects as go\n", + "\n", + "@vm.test(\"my_custom_tests.ImpliedVolSurface\")\n", + "def implied_vol_surface(market_prices, strikes, maturities, S0, r, barrier, N=10000, M=100):\n", + " \"\"\"\n", + " This is a test to compute the implied volatility surface for a given set of market prices,\n", + " strikes, and maturities.\n", + " \"\"\"\n", + " def implied_volatility(market_price, N, M, initial_guess=0.2):\n", + " def objective_function(sigma):\n", + " model.sigma = sigma\n", + " model_price = model.price_option(N, M)\n", + " return (model_price - market_price) ** 2\n", + "\n", + " result = minimize(objective_function, initial_guess, bounds=[(0.01, 1.0)])\n", + " return result.x[0]\n", + " \n", + " implied_vols = np.zeros((len(strikes), len(maturities)))\n", + "\n", + " for i, K in enumerate(strikes):\n", + " for j, T in enumerate(maturities):\n", + " market_price = market_prices[i]\n", + " model = BlackScholesModel(S0, K, T, r, sigma=0.2)\n", + "\n", + " implied_vol = implied_volatility(market_price, N, M)\n", + " implied_vols[i, j] = implied_vol\n", + "\n", + " # Create the 3D surface plot\n", + " X, Y = np.meshgrid(strikes, maturities)\n", + " Z = implied_vols.T # Transpose to match the meshgrid orientation\n", + "\n", + " fig = go.Figure(data=[go.Surface(x=X, y=Y, z=Z)])\n", + " \n", + " # Update the layout\n", + " fig.update_layout(\n", + " title='3D Surface Plot of Implied Volatility',\n", + " scene=dict(\n", + " xaxis_title='Strike',\n", + " yaxis_title='Maturity',\n", + " zaxis_title='Implied 
Volatility',\n", + " camera=dict(\n", + " up=dict(x=0, y=0, z=1),\n", + " center=dict(x=0, y=0, z=0),\n", + " eye=dict(x=1.5, y=1.5, z=1.5)\n", + " )\n", + " ),\n", + " width=900,\n", + " height=700,\n", + " margin=dict(l=65, r=50, b=65, t=90)\n", + " )\n", + "\n", + " return fig" + ], + "execution_count": 49, + "outputs": [], + "id": "46e275e3" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.ImpliedVolSurface\",\n", + " params={\n", + " \"market_prices\": sv_market_prices,\n", + " \"strikes\": strikes,\n", + " \"maturities\": maturities,\n", + " \"S0\": S0,\n", + " \"r\": r,\n", + " \"barrier\": 120\n", + " }\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "66ca002a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_6_2__'></a>\n", + "\n", + "#### Sensitivity Testing" + ], + "id": "a49d8a1e" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "@vm.test(\"my_custom_tests.Sensitivity\")\n", + "def sensitivity_test(model_type, S0, T, r, N, M, strike=None, barrier=None, sigma=None, v0=None, kappa=None,theta=None, xi=None, rho=None):\n", + " \"\"\"\n", + " This is sensitivity test\n", + "\"\"\"\n", + " if model_type == 'BS':\n", + " model = BlackScholesModel(S0, strike, T, r, sigma)\n", + " else:\n", + " model = StochasticVolatilityModel(S0, strike, T, r, v0, kappa, theta, xi, rho)\n", + " \n", + " knockout_option = KnockoutOption(model, S0, strike, T, r, barrier)\n", + " price = knockout_option.price_knockout_option(N, M)\n", + "\n", + " return pd.DataFrame({\"Option price\": [price]})" + ], + "execution_count": null, + "outputs": [], + "id": "784a5e7c" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Initialise parameters" + ], + "id": "d4be30e6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "strike_range = (min(strikes), max(strikes))\n", + "barrier_range 
= (100, 120)" + ], + "execution_count": null, + "outputs": [], + "id": "46878b84" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Common plot function\n", + "Let's create a line plot using the default result output data and log it by passing the function through the `post_process_fn` parameter in the `run_test()` method." + ], + "id": "205c46ce" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from plotly.express import bar\n", + "from validmind.vm_models.figure import Figure\n", + "from validmind.vm_models.result import TestResult\n", + "import plotly.graph_objects as go\n", + "import random\n", + "\n", + "def process_results(result: TestResult):\n", + "\n", + " # Convert to DataFrame\n", + " df = pd.DataFrame(result.tables[0].data)\n", + " \n", + " # Get the first two column names\n", + " x_col = df.columns[0]\n", + " y_col = df.columns[1]\n", + " \n", + " # Create figure\n", + " fig = go.Figure()\n", + " fig.add_trace(\n", + " go.Scatter(\n", + " x=df[x_col],\n", + " y=df[y_col],\n", + " mode='lines',\n", + " name=y_col # Use y-axis column name as trace name\n", + " )\n", + " )\n", + " \n", + " fig.update_layout(\n", + " xaxis_title=x_col,\n", + " yaxis_title=y_col,\n", + " showlegend=True,\n", + " template=\"plotly_white\"\n", + " )\n", + "\n", + " result.add_figure(\n", + " Figure(\n", + " figure=fig,\n", + " key=\"sensitivity_plot_\" + str(random.randint(0, 1000000)),\n", + " ref_id=result.ref_id,\n", + " )\n", + " )\n", + "\n", + " return result" + ], + "execution_count": null, + "outputs": [], + "id": "d4b9ea2f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Strike sensitivity Test\n", + "Let's evaluate the sensitivity of a model's output value to changes in the strike price, while keeping other parameters constant.\n", + "This test is crucial for understanding how variations in strike prices affect the valuation of financial derivatives, particularly options." 
+ ], + "id": "528b409c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Sensitivity:S0\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\":[strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " },\n", + " post_process_fn= process_results\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "bb8f1cab" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Sensitivity:ToStrike\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": list(np.linspace(strike_range[0], strike_range[1], 20)),\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " },\n", + " post_process_fn= process_results\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "e566a681" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Barrier Sensitivity Test\n", + "Let's evaluate the sensitivity of a model's output to changes in the barrier level of a financial derivative, specifically a barrier option. This test is crucial for understanding how small changes in the barrier can impact the option's valuation, which is essential for risk management and pricing strategies." 
+ ], + "id": "0f288663" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Sensitivity:ToBarrier\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": list(np.linspace(barrier_range[0], barrier_range[1], 20)),\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " },\n", + " post_process_fn=process_results\n", + "\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "95f81283" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_6_3__'></a>\n", + "\n", + "#### Greeks\n", + "The Greeks (Delta, Gamma, Theta, Vega, and Rho) are crucial for traders and risk managers: they provide insights into the risk and potential price movements of options and derivatives, allowing for more informed decision-making and risk management strategies." + ], + "id": "3201aa09" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_7__'></a>\n", + "\n", + "### Delta\n", + "Delta measures the sensitivity of the option's price to a change in the price of the underlying asset. It indicates how much the price of an option is expected to move per $1 change in the underlying asset's price." 
+ ], + "id": "f31afc73" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.GreeksDelta\")\n", + "def calculate_delta(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", + " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", + " h=0.001): # h is the step size for finite difference\n", + " \"\"\"\n", + " Calculate delta using finite difference method.\n", + " Delta = (V(S0 + h) - V(S0 - h)) / (2h)\n", + " where V is the option price and h is a small increment\n", + " \"\"\"\n", + " # Initialize the model with S0 + h\n", + " if model_type == 'BS':\n", + " model_up = BlackScholesModel(S0 + h, strike, T, r, sigma)\n", + " model_down = BlackScholesModel(S0 - h, strike, T, r, sigma)\n", + " else:\n", + " model_up = StochasticVolatilityModel(S0 + h, strike, T, r, v0, kappa, theta, xi, rho)\n", + " model_down = StochasticVolatilityModel(S0 - h, strike, T, r, v0, kappa, theta, xi, rho)\n", + " \n", + "\n", + " # Calculate option prices for up and down moves\n", + " knockout_up = KnockoutOption(model_up, S0 + h, strike, T, r, barrier)\n", + " knockout_down = KnockoutOption(model_down, S0 - h, strike, T, r, barrier)\n", + " \n", + " price_up = knockout_up.price_knockout_option(N, M)\n", + " price_down = knockout_down.price_knockout_option(N, M)\n", + " \n", + " # Calculate delta using central difference\n", + " delta = (price_up - price_down) / (2 * h)\n", + " df = pd.DataFrame({\"Delta\": [delta], \"Price_Up\": [price_up], \"Price_Down\": [price_down], \"h\": [h]})\n", + " return df\n", + "\n" + ], + "execution_count": 30, + "outputs": [], + "id": "31befc58" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# To analyze delta sensitivity to underlying price changes\n", + "result = run_test(\n", + " \"my_custom_tests.GreeksDelta\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [1000000],\n", + " \"M\": [M],\n", + " \"strike\":[strike_range[0]],\n", + " \"barrier\": 
[barrier_range[0]],\n", + " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " \"h\": [0.001]\n", + " },\n", + "post_process_fn=process_results # Using the plotting function defined earlier\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "a033dd96" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_8__'></a>\n", + "\n", + "### Gamma\n", + "Gamma measures the rate of change of Delta with respect to changes in the underlying asset's price. It indicates the curvature of the option's price relative to the underlying asset's price." + ], + "id": "0826d4dc" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.GreeksGamma\")\n", + "def calculate_gamma(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", + " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", + " h=0.01): # h is the step size for finite difference\n", + " \"\"\"\n", + " Calculate gamma using finite difference method.\n", + " Gamma = (V(S0 + h) - 2V(S0) + V(S0 - h)) / h^2\n", + " where V is the option price and h is a small increment\n", + " \"\"\"\n", + " # Initialize the models with S0 + h, S0, and S0 - h\n", + " if model_type == 'BS':\n", + " model_up = BlackScholesModel(S0 + h, strike, T, r, sigma)\n", + " model_center = BlackScholesModel(S0, strike, T, r, sigma)\n", + " model_down = BlackScholesModel(S0 - h, strike, T, r, sigma)\n", + " else:\n", + " model_up = StochasticVolatilityModel(S0 + h, strike, T, r, v0, kappa, theta, xi, rho)\n", + " model_center = StochasticVolatilityModel(S0, strike, T, r, v0, kappa, theta, xi, rho)\n", + " model_down = StochasticVolatilityModel(S0 - h, strike, T, r, v0, kappa, theta, xi, rho)\n", + " \n", + " # Calculate option prices for up, center, and down moves\n", + " 
knockout_up = KnockoutOption(model_up, S0 + h, strike, T, r, barrier)\n", + " knockout_center = KnockoutOption(model_center, S0, strike, T, r, barrier)\n", + " knockout_down = KnockoutOption(model_down, S0 - h, strike, T, r, barrier)\n", + " \n", + " price_up = knockout_up.price_knockout_option(N, M)\n", + " price_center = knockout_center.price_knockout_option(N, M)\n", + " price_down = knockout_down.price_knockout_option(N, M)\n", + " \n", + " # Calculate gamma using second-order central difference\n", + " gamma = (price_up - 2*price_center + price_down) / (h * h)\n", + " \n", + " df = pd.DataFrame({\n", + " \"Gamma\": [gamma], \n", + " \"Price_Up\": [price_up], \n", + " \"Price_Center\": [price_center],\n", + " \"Price_Down\": [price_down], \n", + " \"h\": [h]\n", + " })\n", + " return df\n", + "\n", + "# To analyze gamma sensitivity to underlying price changes\n", + "result = run_test(\n", + " \"my_custom_tests.GreeksGamma\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [1000000],\n", + " \"M\": [M],\n", + " \"strike\":[strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " \"h\": [0.1]\n", + " },\n", + " post_process_fn=process_results # Using the plotting function defined earlier\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "ccf54452" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_9__'></a>\n", + "\n", + "### Theta\n", + "Theta measures the sensitivity of the option's price to the passage of time, also known as time decay. It indicates how much the price of an option is expected to decrease as the option approaches its expiration date." 
+ ], + "id": "df0eaa72" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.GreeksTheta\")\n", + "def calculate_theta(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", + " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", + " dt=1/365): # dt is typically one day\n", + " \"\"\"\n", + " Calculate theta using finite difference method.\n", + " Theta = (V(t + dt) - V(t)) / dt\n", + " where V is the option price and dt is a small time increment (typically 1 day)\n", + " \"\"\"\n", + " # Initialize the models with T and T + dt\n", + " if model_type == 'BS':\n", + " model_current = BlackScholesModel(S0, strike, T, r, sigma)\n", + " model_future = BlackScholesModel(S0, strike, T + dt, r, sigma)\n", + " else:\n", + " model_current = StochasticVolatilityModel(S0, strike, T, r, v0, kappa, theta, xi, rho)\n", + " model_future = StochasticVolatilityModel(S0, strike, T + dt, r, v0, kappa, theta, xi, rho)\n", + " \n", + " # Calculate option prices for current and future time\n", + " knockout_current = KnockoutOption(model_current, S0, strike, T, r, barrier)\n", + " knockout_future = KnockoutOption(model_future, S0, strike, T + dt, r, barrier)\n", + " \n", + " price_current = knockout_current.price_knockout_option(N, M)\n", + " price_future = knockout_future.price_knockout_option(N, M)\n", + " \n", + " # Calculate theta using forward difference\n", + " # Note: We divide by dt and multiply by -1 since theta represents the negative rate of change\n", + " theta_value = -1 * (price_future - price_current) / dt\n", + " \n", + " df = pd.DataFrame({\n", + " \"Theta\": [theta_value], \n", + " \"Price_Current\": [price_current],\n", + " \"Price_Future\": [price_future],\n", + " \"dt\": [dt]\n", + " })\n", + " return df\n", + "\n", + "# Example usage to analyze theta sensitivity across different underlying prices\n", + "result = run_test(\n", + " \"my_custom_tests.GreeksTheta\",\n", + " param_grid={\n", + " 
\"model_type\": ['SV'],\n", + " \"N\": [1000000],\n", + " \"M\": [M],\n", + " \"strike\":[strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " \"dt\": [1/365] # One day time step\n", + " },\n", + " post_process_fn=process_results # Using the plotting function defined earlier\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "0e9810b1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_10__'></a>\n", + "\n", + "### Vega\n", + "Vega measures the sensitivity of the option's price to changes in the volatility of the underlying asset. It indicates how much the price of an option is expected to change with a 1% change in the underlying asset's volatility." + ], + "id": "28c60e1d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.GreeksVega\")\n", + "def calculate_vega(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", + " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", + " h=0.001): # h is the step size for finite difference\n", + " \"\"\"\n", + " Calculate vega using finite difference method.\n", + " For Black-Scholes: Vega = (V(σ + h) - V(σ - h)) / (2h)\n", + " For Stochastic Vol: Vega = (V(v0 + h) - V(v0 - h)) / (2h)\n", + " where V is the option price and h is a small increment in volatility\n", + " \"\"\"\n", + " if model_type == 'BS':\n", + " # For Black-Scholes, perturb sigma\n", + " model_up = BlackScholesModel(S0, strike, T, r, sigma + h)\n", + " model_down = BlackScholesModel(S0, strike, T, r, sigma - h)\n", + " else:\n", + " # For Stochastic Volatility, perturb v0\n", + " model_up = StochasticVolatilityModel(S0, strike, T, r, v0 + h, kappa, theta, xi, rho)\n", + " model_down = 
StochasticVolatilityModel(S0, strike, T, r, v0 - h, kappa, theta, xi, rho)\n", + " \n", + " # Calculate option prices for up and down moves in volatility\n", + " knockout_up = KnockoutOption(model_up, S0, strike, T, r, barrier)\n", + " knockout_down = KnockoutOption(model_down, S0, strike, T, r, barrier)\n", + " \n", + " price_up = knockout_up.price_knockout_option(N, M)\n", + " price_down = knockout_down.price_knockout_option(N, M)\n", + " \n", + " # Calculate vega using central difference\n", + " vega = (price_up - price_down) / (2 * h)\n", + " \n", + " df = pd.DataFrame({\n", + " \"Vega\": [vega], \n", + " \"Price_Up\": [price_up], \n", + " \"Price_Down\": [price_down], \n", + " \"h\": [h]\n", + " })\n", + " return df\n", + "\n", + "# Example usage to analyze vega sensitivity across different underlying prices\n", + "result = run_test(\n", + " \"my_custom_tests.GreeksVega\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [1000000],\n", + " \"M\": [M],\n", + " \"strike\":[strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " \"h\": [0.0001] # Small step size for better accuracy\n", + " },\n", + " post_process_fn=process_results # Using the plotting function defined earlier\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "1dbc6632" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_11__'></a>\n", + "\n", + "### Rho\n", + "Rho measures the sensitivity of the option's price to changes in the interest rate. It indicates how much the price of an option is expected to change with a 1% change in interest rates." 
+ ], + "id": "1ec51eba" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.GreeksRho\")\n", + "def calculate_rho(model_type, S0, T, r, N, M, strike=None, barrier=None, \n", + " sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None, \n", + " h=0.0001): # h is the step size for finite difference\n", + " \"\"\"\n", + " Calculate rho using finite difference method.\n", + " Rho = (V(r + h) - V(r - h)) / (2h)\n", + " where V is the option price and h is a small increment in interest rate\n", + " \"\"\"\n", + " # Initialize the models with r + h and r - h\n", + " if model_type == 'BS':\n", + " model_up = BlackScholesModel(S0, strike, T, r + h, sigma)\n", + " model_down = BlackScholesModel(S0, strike, T, r - h, sigma)\n", + " else:\n", + " model_up = StochasticVolatilityModel(S0, strike, T, r + h, v0, kappa, theta, xi, rho)\n", + " model_down = StochasticVolatilityModel(S0, strike, T, r - h, v0, kappa, theta, xi, rho)\n", + " \n", + " # Calculate option prices for up and down moves in interest rate\n", + " knockout_up = KnockoutOption(model_up, S0, strike, T, r + h, barrier)\n", + " knockout_down = KnockoutOption(model_down, S0, strike, T, r - h, barrier)\n", + " \n", + " price_up = knockout_up.price_knockout_option(N, M)\n", + " price_down = knockout_down.price_knockout_option(N, M)\n", + " \n", + " # Calculate rho using central difference\n", + " rho_value = (price_up - price_down) / (2 * h)\n", + " \n", + " df = pd.DataFrame({\n", + " \"Rho\": [rho_value], \n", + " \"Price_Up\": [price_up], \n", + " \"Price_Down\": [price_down], \n", + " \"h\": [h]\n", + " })\n", + " return df\n", + "\n", + "# Example usage to analyze rho sensitivity across different underlying prices\n", + "result = run_test(\n", + " \"my_custom_tests.GreeksRho\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [1000000],\n", + " \"M\": [M],\n", + " \"strike\":[strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " 
\"S0\": list(np.linspace(S0-20, S0+20, 20)),\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " \"h\": [0.0001] # Small step size for better accuracy\n", + " },\n", + " post_process_fn=process_results # Using the plotting function defined earlier\n", + ")\n", + "result.log()" + ], + "execution_count": null, + "outputs": [], + "id": "2f497b5f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_11_1__'></a>\n", + "\n", + "#### Stress Testing" + ], + "id": "0cdd1b1b" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.Stressing\")\n", + "def sensitivity_test(model_type, S0, T, r, N, M, strike=None, barrier=None, sigma=None, v0=None, kappa=None, theta=None, xi=None, rho=None):\n", + " \"\"\"\n", + " Price the knockout option for a single parameter set. The stress tests below run this over a parameter grid.\n", + " \"\"\"\n", + " if model_type == 'BS':\n", + " model = BlackScholesModel(S0, strike, T, r, sigma)\n", + " else:\n", + " model = StochasticVolatilityModel(S0, strike, T, r, v0, kappa, theta, xi, rho)\n", + " \n", + " knockout_option = KnockoutOption(model, S0, strike, T, r, barrier)\n", + " price = knockout_option.price_knockout_option(N, M)\n", + "\n", + " return pd.DataFrame({\"Option price\": [price]})" + ], + "execution_count": null, + "outputs": [], + "id": "c98ff396" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Rho (correlation) and Theta (long term vol) stress test\n", + "First, we create a surface plot to visualize the option price with respect to two variables." 
+ ], + "id": "b6f0a179" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def two_parameters_stress_surface_plot(result: TestResult):\n", + " import plotly.graph_objects as go\n", + " import numpy as np\n", + " import pandas as pd\n", + " # Convert to DataFrame\n", + " data = pd.DataFrame(result.tables[0].data)\n", + " \n", + " # Get column names (assuming the first two columns are x and y, and the third is z)\n", + " z_col = data.columns[2]\n", + " x_col = data.columns[0]\n", + " y_col = data.columns[1]\n", + " \n", + " # Get unique values for x and y\n", + " x_unique = np.sort(data[x_col].unique())\n", + " y_unique = np.sort(data[y_col].unique())\n", + " \n", + " # Create meshgrid\n", + " X, Y = np.meshgrid(x_unique, y_unique)\n", + " \n", + " # Create Z matrix\n", + " Z = np.zeros_like(X)\n", + " for i, x_val in enumerate(x_unique):\n", + " for j, y_val in enumerate(y_unique):\n", + " mask = (data[x_col] == x_val) & (data[y_col] == y_val)\n", + " if mask.any():\n", + " Z[j, i] = data.loc[mask, z_col].iloc[0]\n", + " \n", + " # Create the 3D surface plot\n", + " fig = go.Figure(data=[go.Surface(x=X, y=Y, z=Z)])\n", + " \n", + " # Update the layout\n", + " fig.update_layout(\n", + " title=f'3D Surface Plot of {z_col}',\n", + " scene=dict(\n", + " xaxis_title=x_col,\n", + " yaxis_title=y_col,\n", + " zaxis_title=z_col,\n", + " camera=dict(\n", + " up=dict(x=0, y=0, z=1),\n", + " center=dict(x=0, y=0, z=0),\n", + " eye=dict(x=1.5, y=1.5, z=1.5)\n", + " )\n", + " ),\n", + " width=900,\n", + " height=700,\n", + " margin=dict(l=65, r=50, b=65, t=90)\n", + " )\n", + "\n", + " result.add_figure(\n", + " Figure(\n", + " figure=fig,\n", + " key=\"sensitivity_plot_\" + str(random.randint(0, 1000000)),\n", + " ref_id=result.ref_id,\n", + " )\n", + " )\n", + "\n", + " return result" + ], + "execution_count": null, + "outputs": [], + "id": "b408de0f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This test evaluates the sensitivity of a 
model's output 
to changes in the correlation parameter (rho) and the long-term variance parameter (theta) within a stochastic volatility framework.\n", + "\n", + "This test is useful for understanding how variations in these parameters affect the model's valuation, which is crucial for risk management and model validation." + ], + "id": "87289ee6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "\n", + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheRhoAndThetaParameters\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": list(np.linspace(0,0.8, 10)),\n", + " \"xi\": [0.1],\n", + " \"rho\": list(np.linspace(-1,0.8, 10)),\n", + " },\n", + " post_process_fn=two_parameters_stress_surface_plot\n", + ")\n", + "result.log()\n" + ], + "execution_count": null, + "outputs": [], + "id": "5c0ec52d" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Rho (correlation) and Xi (vol of vol) stress test" + ], + "id": "44be4c61" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "\n", + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheRhoAndXiParameters\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": list(np.linspace(0,0.8, 10)),\n", + " \"rho\": list(np.linspace(-1,0.8, 10)),\n", + " },\n", + " post_process_fn=two_parameters_stress_surface_plot\n", + ")\n", + "result.log()\n" + ], + "execution_count": null, + "outputs": [], + "id": "e0a2996e" + }, + { + "cell_type": "markdown", + 
"metadata": {}, + "source": [ + "##### Sigma stress test\n", + "This test evaluates the sensitivity of a model's output to changes in the volatility parameter, sigma. It is crucial for understanding how variations in market volatility impact the model's valuation of financial instruments, particularly options.\n", + "\n", + "This test is useful for risk management and model validation, as it helps identify the robustness of the model under different market conditions. By analyzing the changes in the model's output as sigma varies, stakeholders can assess the model's stability and reliability." + ], + "id": "5fed568d" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheSigmaParameter\",\n", + " param_grid={\n", + " \"model_type\": ['BS'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"sigma\": list(np.linspace(0.2, 0.8, 10)),\n", + " },\n", + " post_process_fn=process_results\n", + ")\n", + "result.log()\n" + ], + "execution_count": null, + "outputs": [], + "id": "d49e2e37" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Stress kappa\n", + "This test evaluates the sensitivity of a model's output to changes in the kappa parameter, which is the mean reversion rate in stochastic volatility models." 
+ ], + "id": "4e7a1f00" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheKappaParameter\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": list(np.linspace(0, 8, 10)),\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " },\n", + " post_process_fn=process_results\n", + ")\n", + "result.log()\n" + ], + "execution_count": null, + "outputs": [], + "id": "e995f6ae" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Stress theta\n", + "Stress Theta evaluates the sensitivity of a model's output to changes in the parameter theta, which represents the long-term variance in a stochastic volatility model." + ], + "id": "40d1c9e2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheThetaParameter\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": list(np.linspace(0, 0.8, 10)),\n", + " \"xi\": [0.1],\n", + " \"rho\": [-0.5],\n", + " },\n", + " post_process_fn=process_results\n", + ")\n", + "result.log()\n" + ], + "execution_count": null, + "outputs": [], + "id": "7e371aee" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Stress xi\n", + "Stress Xi evaluates the sensitivity of a model's output to changes in the parameter xi, which represents the volatility of volatility in a stochastic volatility model. 
This test is crucial for understanding how variations in xi impact the model's valuation, particularly in financial derivatives pricing." + ], + "id": "e20d074f" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheXiParameter\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": list(np.linspace(0.05, 0.95, 10)),\n", + " \"rho\": [-0.5],\n", + " },\n", + " post_process_fn=process_results\n", + ")\n", + "result.log()\n" + ], + "execution_count": null, + "outputs": [], + "id": "9c545090" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Stress rho\n", + "Stress rho test evaluates the sensitivity of a model's output to changes in the correlation parameter, rho, within a stochastic volatility (SV) model framework. This test is crucial for understanding how variations in rho, which represents the correlation between the asset price and its volatility, impact the model's valuation output." 
+ ], + "id": "f0360e20" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheRhoParameter\",\n", + " param_grid={\n", + " \"model_type\": ['SV'],\n", + " \"N\": [N],\n", + " \"M\": [M],\n", + " \"strike\": [strike_range[0]],\n", + " \"barrier\": [barrier_range[0]],\n", + " \"S0\": [S0],\n", + " \"T\": [T],\n", + " \"r\": [r],\n", + " \"v0\": [0.2],\n", + " \"kappa\": [2],\n", + " \"theta\": [0.2],\n", + " \"xi\": [0.1],\n", + " \"rho\": list(np.linspace(-1.0, 1.0, 20)),\n", + " },\n", + " post_process_fn=process_results\n", + ")\n", + "result.log()\n" + ], + "execution_count": null, + "outputs": [], + "id": "e2c5dfb1" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc5_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc5_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ], + "id": "61d4e596" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-a23adf093a60485ea005cf8fc18545a5" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-1QuffXMV-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.14" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/site/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb b/site/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb index e0d7c1a11f..eccfb8fc3b 100644 --- 
a/site/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb +++ b/site/notebooks/use_cases/capital_markets/quickstart_option_pricing_models_quantlib.ipynb @@ -1,1354 +1,1360 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "1e2a4689", - "metadata": {}, - "source": [ - "# Quickstart for Heston option pricing model using QuantLib\n", - "\n", - "Welcome! Let's get you started with the basic process of documenting models with ValidMind.\n", - "\n", - "The Heston option pricing model is a popular stochastic volatility model used to price options. Developed by Steven Heston in 1993, the model assumes that the asset's volatility follows a mean-reverting square-root process, allowing it to capture the empirical observation of volatility \"clustering\" in financial markets. This model is particularly useful for assets where volatility is not constant, making it a favored approach in quantitative finance for pricing complex derivatives.\n", - "\n", - "Here’s an overview of the Heston model as implemented in QuantLib, a powerful library for quantitative finance:\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Model Assumptions and Characteristics\n", - "1. **Stochastic Volatility**: The volatility is modeled as a stochastic process, following a mean-reverting square-root process (Cox-Ingersoll-Ross process).\n", - "2. **Correlated Asset and Volatility Processes**: The asset price and volatility are assumed to be correlated, allowing the model to capture the \"smile\" effect observed in implied volatilities.\n", - "3. 
**Risk-Neutral Dynamics**: The Heston model is typically calibrated under a risk-neutral measure, which allows for direct application to pricing.\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### Heston Model Parameters\n", - "The model is governed by a set of key parameters:\n", - "- **S0**: Initial stock price\n", - "- **v0**: Initial variance of the asset price\n", - "- **kappa**: Speed of mean reversion of the variance\n", - "- **theta**: Long-term mean level of variance\n", - "- **sigma**: Volatility of volatility (vol of vol)\n", - "- **rho**: Correlation between the asset price and variance processes\n", - "\n", - "The dynamics of the asset price \\( S \\) and the variance \\( v \\) under the Heston model are given by:\n", - "\n", - "$$\n", - "dS_t = r S_t \\, dt + \\sqrt{v_t} S_t \\, dW^S_t\n", - "$$\n", - "\n", - "$$\n", - "dv_t = \\kappa (\\theta - v_t) \\, dt + \\sigma \\sqrt{v_t} \\, dW^v_t\n", - "$$\n", - "\n", - "where \\( $dW^S$ \\) and \\( $dW^v$ \\) are Wiener processes with correlation \\( $\\rho$ \\).\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Advantages and Limitations\n", - "- **Advantages**:\n", - " - Ability to capture volatility smiles and skews.\n", - " - More realistic pricing for options on assets with stochastic volatility.\n", - "- **Limitations**:\n", - " - Calibration can be complex due to the number of parameters.\n", - " - Computationally intensive compared to simpler models like Black-Scholes.\n", - "\n", - "This setup provides a robust framework for pricing and analyzing options with stochastic volatility dynamics. QuantLib’s implementation makes it easy to experiment with different parameter configurations and observe their effects on pricing.\n", - "\n", - "You will learn how to initialize the ValidMind Library, develop a option pricing model, and then write custom tests that can be used for sensitivity and stress testing to quickly generate documentation about model." 
- ] - }, - { - "cell_type": "markdown", - "id": "69ec219a", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - " - [Model Assumptions and Characteristics](#toc1_1__) \n", - " - [Heston Model Parameters](#toc1_2__) \n", - " - [Advantages and Limitations](#toc1_3__) \n", - "- [About ValidMind](#toc2__) \n", - " - [Before you begin](#toc2_1__) \n", - " - [New to ValidMind?](#toc2_2__) \n", - " - [Key concepts](#toc2_3__) \n", - "- [Setting up](#toc3__) \n", - " - [Install the ValidMind Library](#toc3_1__) \n", - " - [Initialize the ValidMind Library](#toc3_2__) \n", - " - [Register sample model](#toc3_2_1__) \n", - " - [Apply documentation template](#toc3_2_2__) \n", - " - [Get your code snippet](#toc3_2_3__) \n", - " - [Initialize the Python environment](#toc3_3__) \n", - " - [Preview the documentation template](#toc3_4__) \n", - "- [Data Preparation](#toc4__) \n", - " - [Helper functions](#toc4_1_1__) \n", - " - [Market Data Quality and Availability](#toc4_2__) \n", - " - [Initialize the ValidMind datasets](#toc4_3__) \n", - " - [Data Quality](#toc4_4__) \n", - " - [Isolation Forest Outliers Test](#toc4_4_1__) \n", - " - [Model parameters](#toc4_4_2__) \n", - "- [Model development - Heston Option price](#toc5__) \n", - " - [Model Calibration](#toc5_1__) \n", - " - [Model Evaluation](#toc5_2__) \n", - " - [Benchmark Testing](#toc5_2_1__) \n", - " - [Sensitivity Testing](#toc5_2_2__) \n", - " - [Stress Testing](#toc5_2_3__) \n", - "- [Next steps](#toc6__) \n", - " - [Work with your model documentation](#toc6_1__) \n", - " - [Discover more learning resources](#toc6_2__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "id": "b9fb5d17", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc2_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc2_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc2_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. 
These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "id": "f2dccf35", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "id": "5a5ce085", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "409352bf", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "id": "65e870b2", - "metadata": {}, - "source": [ - "To install the QuantLib library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3a34debf", - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q QuantLib" - ] - }, - { - "cell_type": "markdown", - "id": "fb30ae07", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "id": "c6f87017", - "metadata": {}, - "source": [ - "<a id='toc3_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. 
In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "id": "cbb2e2c9", - "metadata": {}, - "source": [ - "<a id='toc3_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Capital Markets`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "id": "41c4edca", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Can't select this template?</b></span>\n", - "<br></br>\n", - "Your organization administrators may need to add it to your template library:\n", - "<ul>\n", - "<li><a href=\"capital_markets_template.yaml\" style=\"color: #DE257E;\"><b>Download Template YAML</b></a></li>\n", - "<li><a href=\"https://docs.validmind.ai/guide/templates/customize-document-templates.html\" style=\"color: #DE257E;\"><b>Customize Document Templates</b></a></li>\n", - "</ul>\n", - "</div>" - ] - }, - { - "cell_type": "markdown", - "id": "2012eb82", - "metadata": {}, - "source": [ - "<a id='toc3_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0cd3f67e", - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")\n" - ] - }, - { - "cell_type": "markdown", - "id": "6d944cc9", - "metadata": {}, - "source": [ - "<a id='toc3_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f8cf2746", - "metadata": {}, - "outputs": [], - "source": [ - "%matplotlib inline\n", - "\n", - "import pandas as pd\n", - "import numpy as np\n", - "import matplotlib.pyplot as plt\n", - "from scipy.optimize import minimize\n", - "import yfinance as yf\n", - "import QuantLib as ql\n", - "from validmind.tests import run_test" - ] - }, - { - "cell_type": "markdown", - "id": "bc431ee0", - "metadata": {}, - "source": [ - "<a id='toc3_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "7e844028", - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "id": "0c0ee8b9", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Data Preparation" - ] - }, - { - "cell_type": "markdown", - "id": "5a4d2c36", - "metadata": {}, - "source": [ - "### Market Data Sources\n", - "\n", - "<a id='toc4_1_1__'></a>\n", - "\n", - "#### Helper functions\n", - "Let's define a helper function to retrieve option data from Yahoo Finance." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b96a500f", - "metadata": {}, - "outputs": [], - "source": [ - "def get_market_data(ticker, expiration_date_str):\n", - " \"\"\"\n", - " Fetch option market data from Yahoo Finance for the given ticker and expiration date.\n", - " Returns a DataFrame with columns: strike, maturity_date, option_price.\n", - " \"\"\"\n", - " # Create a Ticker object for the specified stock\n", - " stock = yf.Ticker(ticker)\n", - "\n", - " # Get all available expiration dates for options\n", - " option_dates = stock.options\n", - "\n", - " # Check if the requested expiration date is available\n", - " if expiration_date_str not in option_dates:\n", - " raise ValueError(f\"Expiration date {expiration_date_str} not available for {ticker}. 
Available dates: {option_dates}\")\n", - "\n", - " # Get the option chain for the specified expiration date\n", - " option_chain = stock.option_chain(expiration_date_str)\n", - "\n", - " # Get call options (or you can use puts as well based on your requirement)\n", - " calls = option_chain.calls\n", - "\n", - " # Convert expiration_date_str to QuantLib Date\n", - " expiry_date_parts = list(map(int, expiration_date_str.split('-'))) # Split YYYY-MM-DD\n", - " maturity_date = ql.Date(expiry_date_parts[2], expiry_date_parts[1], expiry_date_parts[0]) # Convert to QuantLib Date\n", - "\n", - " # Create a list to store strike prices, maturity dates, and option prices\n", - " market_data = []\n", - " for index, row in calls.iterrows():\n", - " strike = row['strike']\n", - " option_price = row['lastPrice'] # You can also use 'bid', 'ask', 'mid', etc.\n", - " market_data.append((strike, maturity_date, option_price))\n", - " df = pd.DataFrame(market_data, columns = ['strike', 'maturity_date', 'option_price'])\n", - " return df" - ] - }, - { - "cell_type": "markdown", - "id": "c7769b73", - "metadata": {}, - "source": [ - "Let's define a helper function to retrieve stock data from Yahoo Finance. This helper function calculates the spot price, dividend yield, volatility, and risk-free rate using the underlying stock data." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dc44c448", - "metadata": {}, - "outputs": [], - "source": [ - "def get_option_parameters(ticker):\n", - " # Fetch historical data for the stock\n", - " stock_data = yf.Ticker(ticker)\n", - " \n", - " # Get the current spot price\n", - " spot_price = stock_data.history(period=\"1d\")['Close'].iloc[-1]\n", - " \n", - " # Get dividend yield\n", - " dividend_rate = stock_data.dividends.mean() / spot_price if not stock_data.dividends.empty else 0.0\n", - " \n", - " # Estimate volatility (standard deviation of log returns)\n", - " hist_data = stock_data.history(period=\"1y\")['Close']\n", - " log_returns = np.log(hist_data / hist_data.shift(1)).dropna()\n", - " volatility = np.std(log_returns) * np.sqrt(252) # Annualized volatility\n", - " \n", - " # Assume a risk-free rate from some known data (can be fetched from market data, here we use 0.001)\n", - " risk_free_rate = 0.001\n", - " \n", - " # Return the calculated parameters\n", - " return {\n", - " \"spot_price\": spot_price,\n", - " \"volatility\": volatility,\n", - " \"dividend_rate\": dividend_rate,\n", - " \"risk_free_rate\": risk_free_rate\n", - " }" - ] - }, - { - "cell_type": "markdown", - "id": "c7b739d3", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Market Data Quality and Availability\n", - "Next, let's specify ticker and expiration date to get market data." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "50225fde", - "metadata": {}, - "outputs": [], - "source": [ - "ticker = \"MSFT\"\n", - "expiration_date = \"2024-12-13\" # Example expiration date in 'YYYY-MM-DD' form\n", - "\n", - "market_data = get_market_data(ticker=ticker, expiration_date_str=expiration_date)" - ] - }, - { - "cell_type": "markdown", - "id": "c539b95e", - "metadata": {}, - "source": [ - "<a id='toc4_3__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "113f9c17", - "metadata": {}, - "outputs": [], - "source": [ - "vm_market_data = vm.init_dataset(\n", - " dataset=market_data,\n", - " input_id=\"market_data\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "185beb24", - "metadata": {}, - "source": [ - "<a id='toc4_4__'></a>\n", - "\n", - "### Data Quality\n", - "Let's check the quality of the data using outlier and missing-value tests." - ] - }, - { - "cell_type": "markdown", - "id": "7f14464c", - "metadata": {}, - "source": [ - "<a id='toc4_4_1__'></a>\n", - "\n", - "#### Isolation Forest Outliers Test\n", - "Let's detect anomalies in the dataset using the Isolation Forest algorithm, visualized through scatter plots." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "56c919ec", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.IsolationForestOutliers\",\n", - " inputs={\n", - " \"dataset\": vm_market_data,\n", - " },\n", - " title=\"Outliers detection using Isolation Forest\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "e4d0e5ca", - "metadata": {}, - "source": [ - "##### Missing Values Test\n", - "Let's evaluate dataset quality by ensuring the missing value ratio across all features does not exceed a set threshold." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e95c825f", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"validmind.data_validation.MissingValues\",\n", - " inputs={\n", - " \"dataset\": vm_market_data,\n", - " },\n", - " title=\"Missing Values detection\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "829403a3", - "metadata": {}, - "source": [ - "<a id='toc4_4_2__'></a>\n", - "\n", - "#### Model parameters\n", - "Let's calculate the model parameters from the stock data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "25936449", - "metadata": {}, - "outputs": [], - "source": [ - "option_params = get_option_parameters(ticker=ticker)" - ] - }, - { - "cell_type": "markdown", - "id": "0a0948b6", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Model development - Heston Option price" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e15b8221", - "metadata": {}, - "outputs": [], - "source": [ - "class HestonModel:\n", - "\n", - " def __init__(self, ticker, expiration_date_str, calculation_date, spot_price, dividend_rate, risk_free_rate):\n", - " self.ticker = ticker\n", - " self.expiration_date_str = expiration_date_str\n", - " self.calculation_date = calculation_date\n", - " self.spot_price = spot_price\n", - " self.dividend_rate = 
dividend_rate\n", - " self.risk_free_rate = risk_free_rate\n", - " \n", - " def predict_option_price(self, strike, maturity_date, spot_price, v0=None, theta=None, kappa=None, sigma=None, rho=None):\n", - " # Set the evaluation date\n", - " ql.Settings.instance().evaluationDate = self.calculation_date\n", - "\n", - " # Construct the European Option\n", - " payoff = ql.PlainVanillaPayoff(ql.Option.Call, strike)\n", - " exercise = ql.EuropeanExercise(maturity_date)\n", - " european_option = ql.VanillaOption(payoff, exercise)\n", - "\n", - " # Yield term structures for risk-free rate and dividend\n", - " riskFreeTS = ql.YieldTermStructureHandle(ql.FlatForward(calculation_date, self.risk_free_rate, ql.Actual365Fixed()))\n", - " dividendTS = ql.YieldTermStructureHandle(ql.FlatForward(calculation_date, self.dividend_rate, ql.Actual365Fixed()))\n", - "\n", - " # Initial stock price\n", - " initialValue = ql.QuoteHandle(ql.SimpleQuote(spot_price))\n", - "\n", - " # Heston process parameters\n", - " heston_process = ql.HestonProcess(riskFreeTS, dividendTS, initialValue, v0, kappa, theta, sigma, rho)\n", - " hestonModel = ql.HestonModel(heston_process)\n", - "\n", - " # Use the Heston analytic engine\n", - " engine = ql.AnalyticHestonEngine(hestonModel)\n", - " european_option.setPricingEngine(engine)\n", - "\n", - " # Calculate the Heston model price\n", - " h_price = european_option.NPV()\n", - "\n", - " return h_price\n", - "\n", - " def predict_american_option_price(self, strike, maturity_date, spot_price, v0=None, theta=None, kappa=None, sigma=None, rho=None):\n", - " # Set the evaluation date\n", - " ql.Settings.instance().evaluationDate = self.calculation_date\n", - "\n", - " # Construct the American Option\n", - " payoff = ql.PlainVanillaPayoff(ql.Option.Call, strike)\n", - " exercise = ql.AmericanExercise(self.calculation_date, maturity_date)\n", - " american_option = ql.VanillaOption(payoff, exercise)\n", - "\n", - " # Yield term structures for risk-free rate and 
dividend\n", - " riskFreeTS = ql.YieldTermStructureHandle(ql.FlatForward(self.calculation_date, self.risk_free_rate, ql.Actual365Fixed()))\n", - " dividendTS = ql.YieldTermStructureHandle(ql.FlatForward(self.calculation_date, self.dividend_rate, ql.Actual365Fixed()))\n", - "\n", - " # Initial stock price\n", - " initialValue = ql.QuoteHandle(ql.SimpleQuote(spot_price))\n", - "\n", - " # Heston process parameters\n", - " heston_process = ql.HestonProcess(riskFreeTS, dividendTS, initialValue, v0, kappa, theta, sigma, rho)\n", - " heston_model = ql.HestonModel(heston_process)\n", - "\n", - "\n", - " payoff = ql.PlainVanillaPayoff(ql.Option.Call, strike)\n", - " exercise = ql.AmericanExercise(self.calculation_date, maturity_date)\n", - " american_option = ql.VanillaOption(payoff, exercise)\n", - " heston_fd_engine = ql.FdHestonVanillaEngine(heston_model)\n", - " american_option.setPricingEngine(heston_fd_engine)\n", - " option_price = american_option.NPV()\n", - "\n", - " return option_price\n", - "\n", - " def objective_function(self, params, market_data, spot_price, dividend_rate, risk_free_rate):\n", - " v0, theta, kappa, sigma, rho = params\n", - "\n", - " # Sum of squared differences between market prices and model prices\n", - " error = 0.0\n", - " for i, row in market_data.iterrows():\n", - " model_price = self.predict_option_price(row['strike'], row['maturity_date'], spot_price, \n", - " v0, theta, kappa, sigma, rho)\n", - " error += (model_price - row['option_price']) ** 2\n", - " \n", - " return error\n", - "\n", - " def calibrate_model(self, ticker, expiration_date_str):\n", - " # Get the option market data dynamically from Yahoo Finance\n", - " market_data = get_market_data(ticker, expiration_date_str)\n", - "\n", - " # Initial guesses for Heston parameters\n", - " initial_params = [0.04, 0.04, 0.1, 0.1, -0.75]\n", - "\n", - " # Bounds for the parameters to ensure realistic values\n", - " bounds = [(0.0001, 1.0), # v0\n", - " (0.0001, 1.0), # theta\n", - " 
(0.001, 2.0), # kappa\n", - " (0.001, 1.0), # sigma\n", - " (-0.75, 0.0)] # rho\n", - "\n", - " # Optimize the parameters to minimize the error between model and market prices\n", - " result = minimize(self.objective_function, initial_params, args=(market_data, self.spot_price, self.dividend_rate, self.risk_free_rate),\n", - " bounds=bounds, method='L-BFGS-B')\n", - "\n", - " # Optimized Heston parameters\n", - " v0_opt, theta_opt, kappa_opt, sigma_opt, rho_opt = result.x\n", - "\n", - " return v0_opt, theta_opt, kappa_opt, sigma_opt, rho_opt\n" - ] - }, - { - "cell_type": "markdown", - "id": "a941aa32", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Model Calibration\n", - "* The calibration process aims to optimize the Heston model parameters (v0, theta, kappa, sigma, rho) by minimizing the difference between model-predicted option prices and observed market prices.\n", - "* In this implementation, the model is calibrated to current market data, specifically using option prices from the selected ticker and expiration date.\n", - "\n", - "Let's specify `calculation_date` and `strike_price` as input parameters for the model to verify its functionality and confirm it operates as expected." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1d61dfca", - "metadata": {}, - "outputs": [], - "source": [ - "calculation_date = ql.Date(26, 11, 2024)\n", - "# Convert expiration date string to QuantLib.Date\n", - "expiry_date_parts = list(map(int, expiration_date.split('-')))\n", - "maturity_date = ql.Date(expiry_date_parts[2], expiry_date_parts[1], expiry_date_parts[0])\n", - "strike_price = 460.0\n", - "\n", - "hm = HestonModel(\n", - " ticker=ticker,\n", - " expiration_date_str= expiration_date,\n", - " calculation_date= calculation_date,\n", - " spot_price= option_params['spot_price'],\n", - " dividend_rate = option_params['dividend_rate'],\n", - " risk_free_rate = option_params['risk_free_rate']\n", - ")\n", - "\n", - "# Let's calibrate model\n", - "v0_opt, theta_opt, kappa_opt, sigma_opt, rho_opt = hm.calibrate_model(ticker, expiration_date)\n", - "print(f\"Optimized Heston parameters: v0={v0_opt}, theta={theta_opt}, kappa={kappa_opt}, sigma={sigma_opt}, rho={rho_opt}\")\n", - "\n", - "\n", - "# option price\n", - "h_price = hm.predict_option_price(strike_price, maturity_date, option_params['spot_price'], v0_opt, theta_opt, kappa_opt, sigma_opt, rho_opt)\n", - "print(\"The Heston model price for the option is:\", h_price)" - ] - }, - { - "cell_type": "markdown", - "id": "75313272", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Model Evaluation" - ] - }, - { - "cell_type": "markdown", - "id": "2e6471ef", - "metadata": {}, - "source": [ - "<a id='toc5_2_1__'></a>\n", - "\n", - "#### Benchmark Testing\n", - "The benchmark testing framework provides a robust way to validate the Heston model implementation and understand the relationships between European and American option prices under stochastic volatility conditions.\n", - "Let's compare European and American option prices using the Heston model." 
- ] - }, - { - "cell_type": "code", - "execution_count": 15, - "id": "810cf887", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.BenchmarkTest\")\n", - "def benchmark_test(hm_model, strikes, maturity_date, spot_price, v0=None, theta=None, kappa=None, sigma=None, rho=None):\n", - " \"\"\"\n", - " Compares European and American option prices using the Heston model.\n", - "\n", - " This test evaluates the price differences between European and American options\n", - " across multiple strike prices while keeping other parameters constant. The comparison\n", - " helps understand the early exercise premium of American options over their European\n", - " counterparts under stochastic volatility conditions.\n", - "\n", - " Args:\n", - " hm_model: HestonModel instance for option pricing calculations\n", - " strikes (list[float]): List of strike prices to test\n", - " maturity_date (ql.Date): Option expiration date in QuantLib format\n", - " spot_price (float): Current price of the underlying asset\n", - " v0 (float, optional): Initial variance. Defaults to None.\n", - " theta (float, optional): Long-term variance. Defaults to None.\n", - " kappa (float, optional): Mean reversion rate. Defaults to None.\n", - " sigma (float, optional): Volatility of variance. Defaults to None.\n", - " rho (float, optional): Correlation between asset and variance. 
Defaults to None.\n", - "\n", - " Returns:\n", - " dict: Contains a DataFrame with the following columns:\n", - " - Strike: Strike prices tested\n", - " - Maturity date: Expiration date for all options\n", - " - Spot price: Current underlying price\n", - " - european model price: Prices for European options\n", - " - american model price: Prices for American options\n", - "\"\"\"\n", - " american_derived_prices = []\n", - " european_derived_prices = []\n", - " for K in strikes:\n", - " european_derived_prices.append(hm_model.predict_option_price(K, maturity_date, spot_price, v0, theta, kappa, sigma, rho))\n", - " american_derived_prices.append(hm_model.predict_american_option_price(K, maturity_date, spot_price, v0, theta, kappa, sigma, rho))\n", - "\n", - " data = {\n", - " \"Strike\": strikes,\n", - " \"Maturity date\": [maturity_date] * len(strikes),\n", - " \"Spot price\": [spot_price] * len(strikes),\n", - " \"european model price\": european_derived_prices,\n", - " \"american model price\": american_derived_prices,\n", - "\n", - " }\n", - " df1 = pd.DataFrame(data)\n", - " return {\"strikes variation benchmarking\": df1}" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3fdd6705", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.BenchmarkTest\",\n", - " params={\n", - " \"hm_model\": hm,\n", - " \"strikes\": [400, 425, 460, 495, 520],\n", - " \"maturity_date\": maturity_date,\n", - " \"spot_price\": option_params['spot_price'],\n", - " \"v0\":v0_opt,\n", - " \"theta\": theta_opt,\n", - " \"kappa\":kappa_opt ,\n", - " \"sigma\": sigma_opt,\n", - " \"rho\":rho_opt\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "e359b503", - "metadata": {}, - "source": [ - "<a id='toc5_2_2__'></a>\n", - "\n", - "#### Sensitivity Testing\n", - "The sensitivity testing framework provides a systematic approach to understanding how the Heston model responds to parameter changes, which 
is crucial for both model validation and practical application in trading and risk management." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "51922313", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_test_provider.Sensitivity\")\n", - "def SensitivityTest(\n", - " model,\n", - " strike_price,\n", - " maturity_date,\n", - " spot_price,\n", - " v0_opt,\n", - " theta_opt,\n", - " kappa_opt,\n", - " sigma_opt,\n", - " rho_opt,\n", - "):\n", - " \"\"\"\n", - " Evaluates the sensitivity of American option prices to changes in model parameters.\n", - "\n", - " This test calculates option prices using the Heston model with optimized parameters.\n", - " It's designed to analyze how changes in various model inputs affect the option price,\n", - " which is crucial for understanding model behavior and risk management.\n", - "\n", - " Args:\n", - " model (HestonModel): Initialized Heston model instance wrapped in ValidMind model object\n", - " strike_price (float): Strike price of the option\n", - " maturity_date (ql.Date): Expiration date of the option in QuantLib format\n", - " spot_price (float): Current price of the underlying asset\n", - " v0_opt (float): Optimized initial variance parameter\n", - " theta_opt (float): Optimized long-term variance parameter\n", - " kappa_opt (float): Optimized mean reversion rate parameter\n", - " sigma_opt (float): Optimized volatility of variance parameter\n", - " rho_opt (float): Optimized correlation parameter between asset price and variance\n", - " \"\"\"\n", - " price = model.model.predict_american_option_price(\n", - " strike_price,\n", - " maturity_date,\n", - " spot_price,\n", - " v0_opt,\n", - " theta_opt,\n", - " kappa_opt,\n", - " sigma_opt,\n", - " rho_opt,\n", - " )\n", - "\n", - " return price\n" - ] - }, - { - "cell_type": "markdown", - "id": "408a05ef", - "metadata": {}, - "source": [ - "##### Common plot function" - ] - }, - { - "cell_type": "code", - "execution_count": null, - 
"id": "104ca6dd", - "metadata": {}, - "outputs": [], - "source": [ - "def plot_results(df, params: dict = None):\n", - " fig2 = plt.figure(figsize=(10, 6))\n", - " plt.plot(df[params[\"x\"]], df[params[\"y\"]], label=params[\"label\"])\n", - " plt.xlabel(params[\"xlabel\"])\n", - " plt.ylabel(params[\"ylabel\"])\n", - " \n", - " plt.title(params[\"title\"])\n", - " plt.legend()\n", - " plt.grid(True)\n", - " plt.show() # close the plot to avoid displaying it" - ] - }, - { - "cell_type": "markdown", - "id": "ca72b9e5", - "metadata": {}, - "source": [ - "Let's create ValidMind model object" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ae7093fa", - "metadata": {}, - "outputs": [], - "source": [ - "hm_model = vm.init_model(model=hm, input_id=\"HestonModel\")" - ] - }, - { - "cell_type": "markdown", - "id": "b2141640", - "metadata": {}, - "source": [ - "##### Strike sensitivity\n", - "Let's analyzes how option prices change as the strike price varies. We create a range of strike prices around the current strike (460) and observe the impact on option prices while keeping all other parameters constant." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ea7f1cbe", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_test_provider.Sensitivity:ToStrike\",\n", - " inputs = {\n", - " \"model\": hm_model\n", - " },\n", - " param_grid={\n", - " \"strike_price\": list(np.linspace(460-50, 460+50, 10)),\n", - " \"maturity_date\": [maturity_date],\n", - " \"spot_price\": [option_params[\"spot_price\"]],\n", - " \"v0_opt\": [v0_opt],\n", - " \"theta_opt\": [theta_opt],\n", - " \"kappa_opt\": [kappa_opt],\n", - " \"sigma_opt\": [sigma_opt],\n", - " \"rho_opt\":[rho_opt]\n", - " },\n", - ")\n", - "result.log()\n", - "# Visualize how option prices change with different strike prices\n", - "plot_results(\n", - " pd.DataFrame(result.tables[0].data),\n", - " params={\n", - " \"x\": \"strike_price\",\n", - " \"y\":\"Value\",\n", - " \"label\":\"Strike price\",\n", - " \"xlabel\":\"Strike price\",\n", - " \"ylabel\":\"option price\",\n", - " \"title\":\"Heston option - Strike price Sensitivity\",\n", - " }\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "be143012", - "metadata": {}, - "source": [ - "<a id='toc5_2_3__'></a>\n", - "\n", - "#### Stress Testing\n", - "This stress testing framework provides a comprehensive view of how the Heston model behaves under different market conditions and helps identify potential risks in option pricing." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f2f01a40", - "metadata": {}, - "outputs": [], - "source": [ - "@vm.test(\"my_custom_tests.Stressing\")\n", - "def StressTest(\n", - " model,\n", - " strike_price,\n", - " maturity_date,\n", - " spot_price,\n", - " v0_opt,\n", - " theta_opt,\n", - " kappa_opt,\n", - " sigma_opt,\n", - " rho_opt,\n", - "):\n", - " \"\"\"\n", - " Performs stress testing on Heston model parameters to evaluate option price sensitivity.\n", - "\n", - " This test evaluates how the American option price responds to stressed market conditions\n", - " by varying key model parameters. It's designed to:\n", - " 1. Identify potential model vulnerabilities\n", - " 2. Understand price behavior under extreme scenarios\n", - " 3. Support risk management decisions\n", - " 4. Validate model stability across parameter ranges\n", - "\n", - " Args:\n", - " model (HestonModel): Initialized Heston model instance wrapped in ValidMind model object\n", - " strike_price (float): Option strike price\n", - " maturity_date (ql.Date): Option expiration date in QuantLib format\n", - " spot_price (float): Current price of the underlying asset\n", - " v0_opt (float): Initial variance parameter under stress testing\n", - " theta_opt (float): Long-term variance parameter under stress testing\n", - " kappa_opt (float): Mean reversion rate parameter under stress testing\n", - " sigma_opt (float): Volatility of variance parameter under stress testing\n", - " rho_opt (float): Correlation parameter under stress testing\n", - " \"\"\"\n", - " price = model.model.predict_american_option_price(\n", - " strike_price,\n", - " maturity_date,\n", - " spot_price,\n", - " v0_opt,\n", - " theta_opt,\n", - " kappa_opt,\n", - " sigma_opt,\n", - " rho_opt,\n", - " )\n", - "\n", - " return price\n" - ] - }, - { - "cell_type": "markdown", - "id": "31fcbe9c", - "metadata": {}, - "source": [ - "##### Rho (correlation) and Theta (long term vol) stress test\n", - "Next, 
let's evaluate the sensitivity of a model's output to changes in the correlation parameter (rho) and the long-term variance parameter (theta) within a stochastic volatility framework." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "6119b5d9", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheRhoAndThetaParameters\",\n", - " inputs = {\n", - " \"model\": hm_model,\n", - " },\n", - " param_grid={\n", - " \"strike_price\": [460],\n", - " \"maturity_date\": [maturity_date],\n", - " \"spot_price\": [option_params[\"spot_price\"]],\n", - " \"v0_opt\": [v0_opt],\n", - " \"theta_opt\": list(np.linspace(0.1, theta_opt+0.4, 5)),\n", - " \"kappa_opt\": [kappa_opt],\n", - " \"sigma_opt\": [sigma_opt],\n", - " \"rho_opt\":list(np.linspace(rho_opt-0.2, rho_opt+0.2, 5))\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "be39cb3a", - "metadata": {}, - "source": [ - "##### Sigma stress test\n", - "Let's evaluate the sensitivity of a model's output to changes in the volatility parameter, sigma. This test is crucial for understanding how variations in market volatility impact the model's valuation of financial instruments, particularly options." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0dc189b7", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheSigmaParameter\",\n", - " inputs = {\n", - " \"model\": hm_model,\n", - " },\n", - " param_grid={\n", - " \"strike_price\": [460],\n", - " \"maturity_date\": [maturity_date],\n", - " \"spot_price\": [option_params[\"spot_price\"]],\n", - " \"v0_opt\": [v0_opt],\n", - " \"theta_opt\": [theta_opt],\n", - " \"kappa_opt\": [kappa_opt],\n", - " \"sigma_opt\": list(np.linspace(0.1, sigma_opt+0.6, 5)),\n", - " \"rho_opt\": [rho_opt]\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "173a5294", - "metadata": {}, - "source": [ - "##### Stress kappa\n", - "Let's evaluate the sensitivity of a model's output to changes in the kappa parameter, which is the mean reversion rate in stochastic volatility models." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dae9714f", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheKappaParameter\",\n", - " inputs = {\n", - " \"model\": hm_model,\n", - " },\n", - " param_grid={\n", - " \"strike_price\": [460],\n", - " \"maturity_date\": [maturity_date],\n", - " \"spot_price\": [option_params[\"spot_price\"]],\n", - " \"v0_opt\": [v0_opt],\n", - " \"theta_opt\": [theta_opt],\n", - " \"kappa_opt\": list(np.linspace(kappa_opt, kappa_opt+0.2, 5)),\n", - " \"sigma_opt\": [sigma_opt],\n", - " \"rho_opt\": [rho_opt]\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "b4d1d968", - "metadata": {}, - "source": [ - "##### Stress theta\n", - "Let's evaluate the sensitivity of a model's output to changes in the parameter theta, which represents the long-term variance in a stochastic volatility model." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e68df3db", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheThetaParameter\",\n", - " inputs = {\n", - " \"model\": hm_model,\n", - " },\n", - " param_grid={\n", - " \"strike_price\": [460],\n", - " \"maturity_date\": [maturity_date],\n", - " \"spot_price\": [option_params[\"spot_price\"]],\n", - " \"v0_opt\": [v0_opt],\n", - " \"theta_opt\": list(np.linspace(0.1, theta_opt+0.9, 5)),\n", - " \"kappa_opt\": [kappa_opt],\n", - " \"sigma_opt\": [sigma_opt],\n", - " \"rho_opt\": [rho_opt]\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "32e70456", - "metadata": {}, - "source": [ - "##### Stress rho\n", - "Let's evaluate the sensitivity of a model's output to changes in the correlation parameter, rho, within a stochastic volatility (SV) model framework. This test is crucial for understanding how variations in rho, which represents the correlation between the asset price and its volatility, impact the model's valuation output." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b5ca3fc2", - "metadata": {}, - "outputs": [], - "source": [ - "result = run_test(\n", - " \"my_custom_tests.Stressing:TheRhoParameter\",\n", - " inputs = {\n", - " \"model\": hm_model,\n", - " },\n", - " param_grid={\n", - " \"strike_price\": [460],\n", - " \"maturity_date\": [maturity_date],\n", - " \"spot_price\": [option_params[\"spot_price\"]],\n", - " \"v0_opt\": [v0_opt],\n", - " \"theta_opt\": [theta_opt],\n", - " \"kappa_opt\": [kappa_opt],\n", - " \"sigma_opt\": [sigma_opt],\n", - " \"rho_opt\": list(np.linspace(rho_opt-0.2, rho_opt+0.2, 5))\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "id": "892c5347", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc6_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc6_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-de5d1e182b09403abddabc2850f2dd05", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-1QuffXMV-py3.10", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.14" - } - }, - "nbformat": 4, - "nbformat_minor": 5 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quickstart for Heston option pricing model using QuantLib\n", + "\n", + "Welcome! 
Let's get you started with the basic process of documenting models with ValidMind.\n", + "\n", + "The Heston option pricing model is a popular stochastic volatility model used to price options. Developed by Steven Heston in 1993, the model assumes that the asset's volatility follows a mean-reverting square-root process, allowing it to capture the empirical observation of volatility \"clustering\" in financial markets. This model is particularly useful for assets where volatility is not constant, making it a favored approach in quantitative finance for pricing complex derivatives.\n", + "\n", + "Here’s an overview of the Heston model as implemented in QuantLib, a powerful library for quantitative finance:\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Model Assumptions and Characteristics\n", + "1. **Stochastic Volatility**: The volatility is modeled as a stochastic process, following a mean-reverting square-root process (Cox-Ingersoll-Ross process).\n", + "2. **Correlated Asset and Volatility Processes**: The asset price and volatility are assumed to be correlated, allowing the model to capture the \"smile\" effect observed in implied volatilities.\n", + "3. 
**Risk-Neutral Dynamics**: The Heston model is typically calibrated under a risk-neutral measure, which allows for direct application to pricing.\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### Heston Model Parameters\n", + "The model is governed by a set of key parameters:\n", + "- **S0**: Initial stock price\n", + "- **v0**: Initial variance of the asset price\n", + "- **kappa**: Speed of mean reversion of the variance\n", + "- **theta**: Long-term mean level of variance\n", + "- **sigma**: Volatility of volatility (vol of vol)\n", + "- **rho**: Correlation between the asset price and variance processes\n", + "\n", + "The dynamics of the asset price $S_t$ and the variance $v_t$ under the Heston model are given by:\n", + "\n", + "$$\n", + "dS_t = r S_t \\, dt + \\sqrt{v_t} S_t \\, dW^S_t\n", + "$$\n", + "\n", + "$$\n", + "dv_t = \\kappa (\\theta - v_t) \\, dt + \\sigma \\sqrt{v_t} \\, dW^v_t\n", + "$$\n", + "\n", + "where $r$ is the risk-free rate and $W^S_t$ and $W^v_t$ are Wiener processes with correlation $\\rho$.\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Advantages and Limitations\n", + "- **Advantages**:\n", + " - Ability to capture volatility smiles and skews.\n", + " - More realistic pricing for options on assets with stochastic volatility.\n", + "- **Limitations**:\n", + " - Calibration can be complex due to the number of parameters.\n", + " - Computationally intensive compared to simpler models like Black-Scholes.\n", + "\n", + "This setup provides a robust framework for pricing and analyzing options with stochastic volatility dynamics. QuantLib’s implementation makes it easy to experiment with different parameter configurations and observe their effects on pricing.\n", + "\n", + "You will learn how to initialize the ValidMind Library, develop an option pricing model, and then write custom tests for sensitivity and stress testing to quickly generate documentation about your model."
+ ], + "id": "1e2a4689" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + " - [Model Assumptions and Characteristics](#toc1_1__) \n", + " - [Heston Model Parameters](#toc1_2__) \n", + " - [Advantages and Limitations](#toc1_3__) \n", + "- [About ValidMind](#toc2__) \n", + " - [Before you begin](#toc2_1__) \n", + " - [New to ValidMind?](#toc2_2__) \n", + " - [Key concepts](#toc2_3__) \n", + "- [Setting up](#toc3__) \n", + " - [Install the ValidMind Library](#toc3_1__) \n", + " - [Initialize the ValidMind Library](#toc3_2__) \n", + " - [Register sample model](#toc3_2_1__) \n", + " - [Apply documentation template](#toc3_2_2__) \n", + " - [Get your code snippet](#toc3_2_3__) \n", + " - [Initialize the Python environment](#toc3_3__) \n", + " - [Preview the documentation template](#toc3_4__) \n", + "- [Data Preparation](#toc4__) \n", + " - [Helper functions](#toc4_1_1__) \n", + " - [Market Data Quality and Availability](#toc4_2__) \n", + " - [Initialize the ValidMind datasets](#toc4_3__) \n", + " - [Data Quality](#toc4_4__) \n", + " - [Isolation Forest Outliers Test](#toc4_4_1__) \n", + " - [Model parameters](#toc4_4_2__) \n", + "- [Model development - Heston Option price](#toc5__) \n", + " - [Model Calibration](#toc5_1__) \n", + " - [Model Evaluation](#toc5_2__) \n", + " - [Benchmark Testing](#toc5_2_1__) \n", + " - [Sensitivity Testing](#toc5_2_2__) \n", + " - [Stress Testing](#toc5_2_3__) \n", + "- [Next steps](#toc6__) \n", + " - [Work with your model documentation](#toc6_1__) \n", + " - [Discover more learning resources](#toc6_2__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ], + "id": "69ec219a" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc2_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc2_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc2_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ], + "id": "b9fb5d17" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Setting up" + ], + "id": "f2dccf35" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ], + "id": "5a5ce085" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [], + "id": "409352bf" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To install the QuantLib library:" + ], + "id": "65e870b2" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q QuantLib" + ], + "execution_count": null, + "outputs": [], + "id": "3a34debf" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ], + "id": "fb30ae07" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." 
+ ], + "id": "c6f87017" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Capital Markets`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ], + "id": "cbb2e2c9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Can't select this template?</b></span>\n", + "<br></br>\n", + "Your organization administrators may need to add it to your template library:\n", + "<ul>\n", + "<li><a href=\"capital_markets_template.yaml\" style=\"color: #DE257E;\"><b>Download Template YAML</b></a></li>\n", + "<li><a href=\"https://docs.validmind.ai/guide/templates/customize-document-templates.html\" style=\"color: #DE257E;\"><b>Customize Document Templates</b></a></li>\n", + "</ul>\n", + "</div>" + ], + "id": "41c4edca" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per 
document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ], + "id": "2012eb82" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")\n" + ], + "execution_count": null, + "outputs": [], + "id": "0cd3f67e" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ], + "id": "6d944cc9" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%matplotlib inline\n", + "\n", + "import pandas as pd\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "from scipy.optimize import minimize\n", + "import yfinance as yf\n", + "import QuantLib as ql\n", + "from validmind.tests import run_test" + ], + "execution_count": null, + "outputs": [], + "id": "f8cf2746" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + 
"Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ], + "id": "bc431ee0" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [], + "id": "7e844028" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Data Preparation" + ], + "id": "0c0ee8b9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Market Data Sources\n", + "\n", + "<a id='toc4_1_1__'></a>\n", + "\n", + "#### Helper functions\n", + "Let's define a helper function to retrieve option data from Yahoo Finance." + ], + "id": "5a4d2c36" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def get_market_data(ticker, expiration_date_str):\n", + " \"\"\"\n", + " Fetch option market data from Yahoo Finance for the given ticker and expiration date.\n", + " Returns a DataFrame with columns: strike, maturity_date, option_price.\n", + " \"\"\"\n", + " # Create a Ticker object for the specified stock\n", + " stock = yf.Ticker(ticker)\n", + "\n", + " # Get all available expiration dates for options\n", + " option_dates = stock.options\n", + "\n", + " # Check if the requested expiration date is available\n", + " if expiration_date_str not in option_dates:\n", + " raise ValueError(f\"Expiration date {expiration_date_str} not available for {ticker}.
Available dates: {option_dates}\")\n", + "\n", + " # Get the option chain for the specified expiration date\n", + " option_chain = stock.option_chain(expiration_date_str)\n", + "\n", + " # Get call options (or you can use puts as well based on your requirement)\n", + " calls = option_chain.calls\n", + "\n", + " # Convert expiration_date_str to QuantLib Date\n", + " expiry_date_parts = list(map(int, expiration_date_str.split('-'))) # Split YYYY-MM-DD\n", + " maturity_date = ql.Date(expiry_date_parts[2], expiry_date_parts[1], expiry_date_parts[0]) # Convert to QuantLib Date\n", + "\n", + " # Create a list to store strike prices, maturity dates, and option prices\n", + " market_data = []\n", + " for index, row in calls.iterrows():\n", + " strike = row['strike']\n", + " option_price = row['lastPrice'] # You can also use 'bid' or 'ask'\n", + " market_data.append((strike, maturity_date, option_price))\n", + " df = pd.DataFrame(market_data, columns = ['strike', 'maturity_date', 'option_price'])\n", + " return df" + ], + "execution_count": null, + "outputs": [], + "id": "b96a500f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's define a helper function to retrieve stock data from Yahoo Finance. This helper function calculates the spot price, dividend yield, and volatility from the underlying stock data, and uses a fixed placeholder for the risk-free rate."
+ ], + "id": "c7769b73" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def get_option_parameters(ticker):\n", + " # Fetch historical data for the stock\n", + " stock_data = yf.Ticker(ticker)\n", + " \n", + " # Get the current spot price\n", + " spot_price = stock_data.history(period=\"1d\")['Close'].iloc[-1]\n", + " \n", + " # Get dividend yield\n", + " dividend_rate = stock_data.dividends.mean() / spot_price if not stock_data.dividends.empty else 0.0\n", + " \n", + " # Estimate volatility (standard deviation of log returns)\n", + " hist_data = stock_data.history(period=\"1y\")['Close']\n", + " log_returns = np.log(hist_data / hist_data.shift(1)).dropna()\n", + " volatility = np.std(log_returns) * np.sqrt(252) # Annualized volatility\n", + " \n", + " # Assume a risk-free rate from some known data (can be fetched from market data, here we use 0.001)\n", + " risk_free_rate = 0.001\n", + " \n", + " # Return the calculated parameters\n", + " return {\n", + " \"spot_price\": spot_price,\n", + " \"volatility\": volatility,\n", + " \"dividend_rate\": dividend_rate,\n", + " \"risk_free_rate\": risk_free_rate\n", + " }" + ], + "execution_count": null, + "outputs": [], + "id": "dc44c448" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Market Data Quality and Availability\n", + "Next, let's specify ticker and expiration date to get market data." 
+ ], + "id": "c7b739d3" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "ticker = \"MSFT\"\n", + "expiration_date = \"2024-12-13\" # Example expiration date in 'YYYY-MM-DD' form\n", + "\n", + "market_data = get_market_data(ticker=ticker, expiration_date_str=expiration_date)" + ], + "execution_count": null, + "outputs": [], + "id": "50225fde" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module." + ], + "id": "c539b95e" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_market_data = vm.init_dataset(\n", + " dataset=market_data,\n", + " input_id=\"market_data\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "113f9c17" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_4__'></a>\n", + "\n", + "### Data Quality\n", + "Let's check the quality of the data using outlier and missing-value tests." + ], + "id": "185beb24" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_4_1__'></a>\n", + "\n", + "#### Isolation Forest Outliers Test\n", + "Let's detect anomalies in the dataset using the Isolation Forest algorithm, visualized through scatter plots."
+ ], + "id": "7f14464c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.IsolationForestOutliers\",\n", + " inputs={\n", + " \"dataset\": vm_market_data,\n", + " },\n", + " title=\"Outliers detection using Isolation Forest\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "56c919ec" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Missing Values Test\n", + "Let's evaluate dataset quality by checking that the missing value ratio across all features does not exceed a set threshold." + ], + "id": "e4d0e5ca" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"validmind.data_validation.MissingValues\",\n", + " inputs={\n", + " \"dataset\": vm_market_data,\n", + " },\n", + " title=\"Missing Values detection\",\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "e95c825f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_4_2__'></a>\n", + "\n", + "#### Model parameters\n", + "Let's calculate the model parameters from the stock data:" + ], + "id": "829403a3" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "option_params = get_option_parameters(ticker=ticker)" + ], + "execution_count": null, + "outputs": [], + "id": "25936449" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Model development - Heston Option price" + ], + "id": "0a0948b6" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "class HestonModel:\n", + "\n", + " def __init__(self, ticker, expiration_date_str, calculation_date, spot_price, dividend_rate, risk_free_rate):\n", + " self.ticker = ticker\n", + " self.expiration_date_str = expiration_date_str\n", + " self.calculation_date = calculation_date\n", + " self.spot_price = spot_price\n", + " self.dividend_rate = dividend_rate\n", + " self.risk_free_rate = 
risk_free_rate\n", + " \n", + " def predict_option_price(self, strike, maturity_date, spot_price, v0=None, theta=None, kappa=None, sigma=None, rho=None):\n", + " # Set the evaluation date\n", + " ql.Settings.instance().evaluationDate = self.calculation_date\n", + "\n", + " # Construct the European Option\n", + " payoff = ql.PlainVanillaPayoff(ql.Option.Call, strike)\n", + " exercise = ql.EuropeanExercise(maturity_date)\n", + " european_option = ql.VanillaOption(payoff, exercise)\n", + "\n", + " # Yield term structures for risk-free rate and dividend\n", + " riskFreeTS = ql.YieldTermStructureHandle(ql.FlatForward(self.calculation_date, self.risk_free_rate, ql.Actual365Fixed()))\n", + " dividendTS = ql.YieldTermStructureHandle(ql.FlatForward(self.calculation_date, self.dividend_rate, ql.Actual365Fixed()))\n", + "\n", + " # Initial stock price\n", + " initialValue = ql.QuoteHandle(ql.SimpleQuote(spot_price))\n", + "\n", + " # Heston process parameters\n", + " heston_process = ql.HestonProcess(riskFreeTS, dividendTS, initialValue, v0, kappa, theta, sigma, rho)\n", + " hestonModel = ql.HestonModel(heston_process)\n", + "\n", + " # Use the Heston analytic engine\n", + " engine = ql.AnalyticHestonEngine(hestonModel)\n", + " european_option.setPricingEngine(engine)\n", + "\n", + " # Calculate the Heston model price\n", + " h_price = european_option.NPV()\n", + "\n", + " return h_price\n", + "\n", + " def predict_american_option_price(self, strike, maturity_date, spot_price, v0=None, theta=None, kappa=None, sigma=None, rho=None):\n", + " # Set the evaluation date\n", + " ql.Settings.instance().evaluationDate = self.calculation_date\n", + "\n", + " # Construct the American Option\n", + " payoff = ql.PlainVanillaPayoff(ql.Option.Call, strike)\n", + " exercise = ql.AmericanExercise(self.calculation_date, maturity_date)\n", + " american_option = ql.VanillaOption(payoff, exercise)\n", + "\n", + " # Yield term structures for risk-free rate and dividend\n", + " riskFreeTS = 
ql.YieldTermStructureHandle(ql.FlatForward(self.calculation_date, self.risk_free_rate, ql.Actual365Fixed()))\n", + " dividendTS = ql.YieldTermStructureHandle(ql.FlatForward(self.calculation_date, self.dividend_rate, ql.Actual365Fixed()))\n", + "\n", + " # Initial stock price\n", + " initialValue = ql.QuoteHandle(ql.SimpleQuote(spot_price))\n", + "\n", + " # Heston process parameters\n", + " heston_process = ql.HestonProcess(riskFreeTS, dividendTS, initialValue, v0, kappa, theta, sigma, rho)\n", + " heston_model = ql.HestonModel(heston_process)\n", + "\n", + "\n", + " payoff = ql.PlainVanillaPayoff(ql.Option.Call, strike)\n", + " exercise = ql.AmericanExercise(self.calculation_date, maturity_date)\n", + " american_option = ql.VanillaOption(payoff, exercise)\n", + " heston_fd_engine = ql.FdHestonVanillaEngine(heston_model)\n", + " american_option.setPricingEngine(heston_fd_engine)\n", + " option_price = american_option.NPV()\n", + "\n", + " return option_price\n", + "\n", + " def objective_function(self, params, market_data, spot_price, dividend_rate, risk_free_rate):\n", + " v0, theta, kappa, sigma, rho = params\n", + "\n", + " # Sum of squared differences between market prices and model prices\n", + " error = 0.0\n", + " for i, row in market_data.iterrows():\n", + " model_price = self.predict_option_price(row['strike'], row['maturity_date'], spot_price, \n", + " v0, theta, kappa, sigma, rho)\n", + " error += (model_price - row['option_price']) ** 2\n", + " \n", + " return error\n", + "\n", + " def calibrate_model(self, ticker, expiration_date_str):\n", + " # Get the option market data dynamically from Yahoo Finance\n", + " market_data = get_market_data(ticker, expiration_date_str)\n", + "\n", + " # Initial guesses for Heston parameters\n", + " initial_params = [0.04, 0.04, 0.1, 0.1, -0.75]\n", + "\n", + " # Bounds for the parameters to ensure realistic values\n", + " bounds = [(0.0001, 1.0), # v0\n", + " (0.0001, 1.0), # theta\n", + " (0.001, 2.0), # kappa\n", + " 
(0.001, 1.0), # sigma\n", + " (-0.75, 0.0)] # rho\n", + "\n", + " # Optimize the parameters to minimize the error between model and market prices\n", + " result = minimize(self.objective_function, initial_params, args=(market_data, self.spot_price, self.dividend_rate, self.risk_free_rate),\n", + " bounds=bounds, method='L-BFGS-B')\n", + "\n", + " # Optimized Heston parameters\n", + " v0_opt, theta_opt, kappa_opt, sigma_opt, rho_opt = result.x\n", + "\n", + " return v0_opt, theta_opt, kappa_opt, sigma_opt, rho_opt\n" + ], + "execution_count": null, + "outputs": [], + "id": "e15b8221" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Model Calibration\n", + "* The calibration process aims to optimize the Heston model parameters (v0, theta, kappa, sigma, rho) by minimizing the difference between model-predicted option prices and observed market prices.\n", + "* In this implementation, the model is calibrated to current market data, specifically using option prices from the selected ticker and expiration date.\n", + "\n", + "Let's specify `calculation_date` and `strike_price` as input parameters for the model to verify its functionality and confirm it operates as expected." 
+ ], + "id": "a941aa32" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "calculation_date = ql.Date(26, 11, 2024)\n", + "# Convert expiration date string to QuantLib.Date\n", + "expiry_date_parts = list(map(int, expiration_date.split('-')))\n", + "maturity_date = ql.Date(expiry_date_parts[2], expiry_date_parts[1], expiry_date_parts[0])\n", + "strike_price = 460.0\n", + "\n", + "hm = HestonModel(\n", + " ticker=ticker,\n", + " expiration_date_str= expiration_date,\n", + " calculation_date= calculation_date,\n", + " spot_price= option_params['spot_price'],\n", + " dividend_rate = option_params['dividend_rate'],\n", + " risk_free_rate = option_params['risk_free_rate']\n", + ")\n", + "\n", + "# Calibrate the model to market data\n", + "v0_opt, theta_opt, kappa_opt, sigma_opt, rho_opt = hm.calibrate_model(ticker, expiration_date)\n", + "print(f\"Optimized Heston parameters: v0={v0_opt}, theta={theta_opt}, kappa={kappa_opt}, sigma={sigma_opt}, rho={rho_opt}\")\n", + "\n", + "\n", + "# Price the option with the calibrated parameters\n", + "h_price = hm.predict_option_price(strike_price, maturity_date, option_params['spot_price'], v0_opt, theta_opt, kappa_opt, sigma_opt, rho_opt)\n", + "print(\"The Heston model price for the option is:\", h_price)" + ], + "execution_count": null, + "outputs": [], + "id": "1d61dfca" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Model Evaluation" + ], + "id": "75313272" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2_1__'></a>\n", + "\n", + "#### Benchmark Testing\n", + "The benchmark testing framework provides a robust way to validate the Heston model implementation and understand the relationships between European and American option prices under stochastic volatility conditions.\n", + "Let's compare European and American option prices using the Heston model."
+ ], + "id": "2e6471ef" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.BenchmarkTest\")\n", + "def benchmark_test(hm_model, strikes, maturity_date, spot_price, v0=None, theta=None, kappa=None, sigma=None, rho=None):\n", + " \"\"\"\n", + " Compares European and American option prices using the Heston model.\n", + "\n", + " This test evaluates the price differences between European and American options\n", + " across multiple strike prices while keeping other parameters constant. The comparison\n", + " helps understand the early exercise premium of American options over their European\n", + " counterparts under stochastic volatility conditions.\n", + "\n", + " Args:\n", + " hm_model: HestonModel instance for option pricing calculations\n", + " strikes (list[float]): List of strike prices to test\n", + " maturity_date (ql.Date): Option expiration date in QuantLib format\n", + " spot_price (float): Current price of the underlying asset\n", + " v0 (float, optional): Initial variance. Defaults to None.\n", + " theta (float, optional): Long-term variance. Defaults to None.\n", + " kappa (float, optional): Mean reversion rate. Defaults to None.\n", + " sigma (float, optional): Volatility of variance. Defaults to None.\n", + " rho (float, optional): Correlation between asset and variance. 
Defaults to None.\n", + "\n", + " Returns:\n", + " dict: Contains a DataFrame with the following columns:\n", + " - Strike: Strike prices tested\n", + " - Maturity date: Expiration date for all options\n", + " - Spot price: Current underlying price\n", + " - european model price: Prices for European options\n", + " - american model price: Prices for American options\n", + "\"\"\"\n", + " american_derived_prices = []\n", + " european_derived_prices = []\n", + " for K in strikes:\n", + " european_derived_prices.append(hm_model.predict_option_price(K, maturity_date, spot_price, v0, theta, kappa, sigma, rho))\n", + " american_derived_prices.append(hm_model.predict_american_option_price(K, maturity_date, spot_price, v0, theta, kappa, sigma, rho))\n", + "\n", + " data = {\n", + " \"Strike\": strikes,\n", + " \"Maturity date\": [maturity_date] * len(strikes),\n", + " \"Spot price\": [spot_price] * len(strikes),\n", + " \"european model price\": european_derived_prices,\n", + " \"american model price\": american_derived_prices,\n", + "\n", + " }\n", + " df1 = pd.DataFrame(data)\n", + " return {\"strikes variation benchmarking\": df1}" + ], + "execution_count": 15, + "outputs": [], + "id": "810cf887" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.BenchmarkTest\",\n", + " params={\n", + " \"hm_model\": hm,\n", + " \"strikes\": [400, 425, 460, 495, 520],\n", + " \"maturity_date\": maturity_date,\n", + " \"spot_price\": option_params['spot_price'],\n", + " \"v0\":v0_opt,\n", + " \"theta\": theta_opt,\n", + " \"kappa\":kappa_opt ,\n", + " \"sigma\": sigma_opt,\n", + " \"rho\":rho_opt\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "3fdd6705" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2_2__'></a>\n", + "\n", + "#### Sensitivity Testing\n", + "The sensitivity testing framework provides a systematic approach to understanding how the Heston 
model responds to parameter changes, which is crucial for both model validation and practical application in trading and risk management." + ], + "id": "e359b503" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_test_provider.Sensitivity\")\n", + "def SensitivityTest(\n", + " model,\n", + " strike_price,\n", + " maturity_date,\n", + " spot_price,\n", + " v0_opt,\n", + " theta_opt,\n", + " kappa_opt,\n", + " sigma_opt,\n", + " rho_opt,\n", + "):\n", + " \"\"\"\n", + " Evaluates the sensitivity of American option prices to changes in model parameters.\n", + "\n", + " This test calculates option prices using the Heston model with optimized parameters.\n", + " It's designed to analyze how changes in various model inputs affect the option price,\n", + " which is crucial for understanding model behavior and risk management.\n", + "\n", + " Args:\n", + " model (HestonModel): Initialized Heston model instance wrapped in a ValidMind model object\n", + " strike_price (float): Strike price of the option\n", + " maturity_date (ql.Date): Expiration date of the option in QuantLib format\n", + " spot_price (float): Current price of the underlying asset\n", + " v0_opt (float): Optimized initial variance parameter\n", + " theta_opt (float): Optimized long-term variance parameter\n", + " kappa_opt (float): Optimized mean reversion rate parameter\n", + " sigma_opt (float): Optimized volatility of variance parameter\n", + " rho_opt (float): Optimized correlation parameter between asset price and variance\n", + " \"\"\"\n", + " price = model.model.predict_american_option_price(\n", + " strike_price,\n", + " maturity_date,\n", + " spot_price,\n", + " v0_opt,\n", + " theta_opt,\n", + " kappa_opt,\n", + " sigma_opt,\n", + " rho_opt,\n", + " )\n", + "\n", + " return price\n" + ], + "execution_count": null, + "outputs": [], + "id": "51922313" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Common plot function" + ], + "id":
"408a05ef" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def plot_results(df, params: dict = None):\n", + " fig2 = plt.figure(figsize=(10, 6))\n", + " plt.plot(df[params[\"x\"]], df[params[\"y\"]], label=params[\"label\"])\n", + " plt.xlabel(params[\"xlabel\"])\n", + " plt.ylabel(params[\"ylabel\"])\n", + " \n", + " plt.title(params[\"title\"])\n", + " plt.legend()\n", + " plt.grid(True)\n", + " plt.show() # display the plot" + ], + "execution_count": null, + "outputs": [], + "id": "104ca6dd" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's create a ValidMind model object:" + ], + "id": "ca72b9e5" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "hm_model = vm.init_model(model=hm, input_id=\"HestonModel\")" + ], + "execution_count": null, + "outputs": [], + "id": "ae7093fa" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Strike sensitivity\n", + "Let's analyze how option prices change as the strike price varies. We create a range of strike prices around the current strike (460) and observe the impact on option prices while keeping all other parameters constant."
+ ], + "id": "b2141640" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_test_provider.Sensitivity:ToStrike\",\n", + " inputs = {\n", + " \"model\": hm_model\n", + " },\n", + " param_grid={\n", + " \"strike_price\": list(np.linspace(460-50, 460+50, 10)),\n", + " \"maturity_date\": [maturity_date],\n", + " \"spot_price\": [option_params[\"spot_price\"]],\n", + " \"v0_opt\": [v0_opt],\n", + " \"theta_opt\": [theta_opt],\n", + " \"kappa_opt\": [kappa_opt],\n", + " \"sigma_opt\": [sigma_opt],\n", + " \"rho_opt\":[rho_opt]\n", + " },\n", + ")\n", + "result.log()\n", + "# Visualize how option prices change with different strike prices\n", + "plot_results(\n", + " pd.DataFrame(result.tables[0].data),\n", + " params={\n", + " \"x\": \"strike_price\",\n", + " \"y\":\"Value\",\n", + " \"label\":\"Strike price\",\n", + " \"xlabel\":\"Strike price\",\n", + " \"ylabel\":\"option price\",\n", + " \"title\":\"Heston option - Strike price Sensitivity\",\n", + " }\n", + ")" + ], + "execution_count": null, + "outputs": [], + "id": "ea7f1cbe" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2_3__'></a>\n", + "\n", + "#### Stress Testing\n", + "This stress testing framework provides a comprehensive view of how the Heston model behaves under different market conditions and helps identify potential risks in option pricing." + ], + "id": "be143012" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "@vm.test(\"my_custom_tests.Stressing\")\n", + "def StressTest(\n", + " model,\n", + " strike_price,\n", + " maturity_date,\n", + " spot_price,\n", + " v0_opt,\n", + " theta_opt,\n", + " kappa_opt,\n", + " sigma_opt,\n", + " rho_opt,\n", + "):\n", + " \"\"\"\n", + " Performs stress testing on Heston model parameters to evaluate option price sensitivity.\n", + "\n", + " This test evaluates how the American option price responds to stressed market conditions\n", + " by varying key model parameters. 
It's designed to:\n", + " 1. Identify potential model vulnerabilities\n", + " 2. Understand price behavior under extreme scenarios\n", + " 3. Support risk management decisions\n", + " 4. Validate model stability across parameter ranges\n", + "\n", + " Args:\n", + " model (HestonModel): Initialized Heston model instance wrapped in a ValidMind model object\n", + " strike_price (float): Option strike price\n", + " maturity_date (ql.Date): Option expiration date in QuantLib format\n", + " spot_price (float): Current price of the underlying asset\n", + " v0_opt (float): Initial variance parameter under stress testing\n", + " theta_opt (float): Long-term variance parameter under stress testing\n", + " kappa_opt (float): Mean reversion rate parameter under stress testing\n", + " sigma_opt (float): Volatility of variance parameter under stress testing\n", + " rho_opt (float): Correlation parameter under stress testing\n", + " \"\"\"\n", + " price = model.model.predict_american_option_price(\n", + " strike_price,\n", + " maturity_date,\n", + " spot_price,\n", + " v0_opt,\n", + " theta_opt,\n", + " kappa_opt,\n", + " sigma_opt,\n", + " rho_opt,\n", + " )\n", + "\n", + " return price\n" + ], + "execution_count": null, + "outputs": [], + "id": "f2f01a40" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Rho (correlation) and Theta (long-term variance) stress test\n", + "Next, let's evaluate the sensitivity of the model's output to joint changes in the correlation parameter (rho) and the long-term variance parameter (theta) within a stochastic volatility framework."
+ ], + "id": "31fcbe9c" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheRhoAndThetaParameters\",\n", + " inputs = {\n", + " \"model\": hm_model,\n", + " },\n", + " param_grid={\n", + " \"strike_price\": [460],\n", + " \"maturity_date\": [maturity_date],\n", + " \"spot_price\": [option_params[\"spot_price\"]],\n", + " \"v0_opt\": [v0_opt],\n", + " \"theta_opt\": list(np.linspace(0.1, theta_opt+0.4, 5)),\n", + " \"kappa_opt\": [kappa_opt],\n", + " \"sigma_opt\": [sigma_opt],\n", + " \"rho_opt\":list(np.linspace(rho_opt-0.2, rho_opt+0.2, 5))\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "6119b5d9" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Sigma stress test\n", + "Let's evaluate the sensitivity of the model's output to changes in the volatility-of-variance parameter, sigma. This test is crucial for understanding how variations in market volatility impact the model's valuation of financial instruments, particularly options." + ], + "id": "be39cb3a" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheSigmaParameter\",\n", + " inputs = {\n", + " \"model\": hm_model,\n", + " },\n", + " param_grid={\n", + " \"strike_price\": [460],\n", + " \"maturity_date\": [maturity_date],\n", + " \"spot_price\": [option_params[\"spot_price\"]],\n", + " \"v0_opt\": [v0_opt],\n", + " \"theta_opt\": [theta_opt],\n", + " \"kappa_opt\": [kappa_opt],\n", + " \"sigma_opt\": list(np.linspace(0.1, sigma_opt+0.6, 5)),\n", + " \"rho_opt\": [rho_opt]\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "0dc189b7" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Stress kappa\n", + "Let's evaluate the sensitivity of the model's output to changes in the kappa parameter, the mean reversion rate of the variance process in stochastic volatility models."
+ ], + "id": "173a5294" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheKappaParameter\",\n", + " inputs = {\n", + " \"model\": hm_model,\n", + " },\n", + " param_grid={\n", + " \"strike_price\": [460],\n", + " \"maturity_date\": [maturity_date],\n", + " \"spot_price\": [option_params[\"spot_price\"]],\n", + " \"v0_opt\": [v0_opt],\n", + " \"theta_opt\": [theta_opt],\n", + " \"kappa_opt\": list(np.linspace(kappa_opt, kappa_opt+0.2, 5)),\n", + " \"sigma_opt\": [sigma_opt],\n", + " \"rho_opt\": [rho_opt]\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "dae9714f" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Stress theta\n", + "Let's evaluate the sensitivity of the model's output to changes in the parameter theta, which represents the long-term variance in a stochastic volatility model." + ], + "id": "b4d1d968" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheThetaParameter\",\n", + " inputs = {\n", + " \"model\": hm_model,\n", + " },\n", + " param_grid={\n", + " \"strike_price\": [460],\n", + " \"maturity_date\": [maturity_date],\n", + " \"spot_price\": [option_params[\"spot_price\"]],\n", + " \"v0_opt\": [v0_opt],\n", + " \"theta_opt\": list(np.linspace(0.1, theta_opt+0.9, 5)),\n", + " \"kappa_opt\": [kappa_opt],\n", + " \"sigma_opt\": [sigma_opt],\n", + " \"rho_opt\": [rho_opt]\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "e68df3db" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Stress rho\n", + "Let's evaluate the sensitivity of the model's output to changes in the correlation parameter, rho, within a stochastic volatility (SV) model framework.
This test is crucial for understanding how variations in rho, which represents the correlation between the asset price and its volatility, impact the model's valuation output." + ], + "id": "32e70456" + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_test(\n", + " \"my_custom_tests.Stressing:TheRhoParameter\",\n", + " inputs = {\n", + " \"model\": hm_model,\n", + " },\n", + " param_grid={\n", + " \"strike_price\": [460],\n", + " \"maturity_date\": [maturity_date],\n", + " \"spot_price\": [option_params[\"spot_price\"]],\n", + " \"v0_opt\": [v0_opt],\n", + " \"theta_opt\": [theta_opt],\n", + " \"kappa_opt\": [kappa_opt],\n", + " \"sigma_opt\": [sigma_opt],\n", + " \"rho_opt\": list(np.linspace(rho_opt-0.2, rho_opt+0.2, 5))\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [], + "id": "b5ca3fc2" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc6_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc6_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ], + "id": "892c5347" + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-de5d1e182b09403abddabc2850f2dd05" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-1QuffXMV-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.14" + } + }, + "nbformat": 4, + "nbformat_minor": 5 } diff --git a/site/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb b/site/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb index 3da618e570..4f912501fe 100644 --- a/site/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb +++ 
b/site/notebooks/use_cases/code_explainer/quickstart_code_explainer_demo.ipynb @@ -1,882 +1,888 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Quickstart for model code documentation\n", - "\n", - "Welcome! This notebook demonstrates how to use the ValidMind code explainer to automatically generate comprehensive documentation for your codebase. The code explainer analyzes your source code and provides detailed explanations across various aspects of your implementation.\n", - "\n", - "<a id='toc1__'></a>\n", - "\n", - "## About Code Explainer\n", - "The ValidMind code explainer is a powerful tool that automatically analyzes your source code and generates comprehensive documentation. It helps you:\n", - "\n", - "- Understand the structure and organization of your codebase\n", - "- Document dependencies and environment setup\n", - "- Explain data processing and model implementation details\n", - "- Document training, evaluation, and inference pipelines\n", - "- Track configuration, testing, and security measures\n", - "\n", - "This tool is particularly useful for:\n", - "- Onboarding new team members\n", - "- Maintaining up-to-date documentation\n", - "- Ensuring code quality and best practices\n", - "- Facilitating code reviews and audits" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About Code Explainer](#toc1__) \n", - "- [About ValidMind](#toc2__) \n", - " - [Before you begin](#toc2_1__) \n", - " - [New to ValidMind?](#toc2_2__) \n", - " - [Key concepts](#toc2_3__) \n", - "- [Setting up](#toc3__) \n", - " - [Install the ValidMind Library](#toc3_1__) \n", - " - [Initialize the ValidMind Library](#toc3_2__) \n", - " - [Register sample model](#toc3_2_1__) \n", - " - [Apply documentation template](#toc3_2_2__) \n", - " - [Get your code snippet](#toc3_2_3__) \n", - " - [Preview the documentation template](#toc3_3__) \n", - "- 
[Common function](#toc4__) \n", - "- [Default Behavior](#toc5__) \n", - "- [Codebase Overview](#toc6__) \n", - "- [Environment and Dependencies ('environment_setup')](#toc7__) \n", - "- [Data Ingestion and Preprocessing](#toc8__) \n", - "- [Model Implementation Details](#toc9__) \n", - "- [Model Training Pipeline](#toc10__) \n", - "- [Evaluation and Validation Code](#toc11__) \n", - "- [Inference and Scoring Logic](#toc12__) \n", - "- [Configuration and Parameters](#toc13__) \n", - "- [Unit and Integration Testing](#toc14__) \n", - "- [Logging and Monitoring Hooks](#toc15__) \n", - "- [Code and Model Versioning](#toc16__) \n", - "- [Security and Access Control](#toc17__) \n", - "- [Example Runs and Scripts](#toc18__) \n", - "- [Known Issues and Future Improvements](#toc19__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc2_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. 
\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc2_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc2_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. 
Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Model Source Code Documentation`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Can't select this template?</b></span>\n", - "<br></br>\n", - "Your organization administrators may need to add it to your template library:\n", - "<ul>\n", - "<li><a href=\"model_source_code_documentation_template.yaml\" style=\"color: #DE257E;\"><b>Download Template YAML</b></a></li>\n", - "<li><a href=\"https://docs.validmind.ai/guide/templates/customize-document-templates.html\" style=\"color: #DE257E;\"><b>Customize Document Templates</b></a></li>\n", - "</ul>\n", - "</div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_3__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Common function\n", - "The code above defines two key functions:\n", - "1. A function to read source code from 'customer_churn_full_suite.py' file\n", - "2. An 'explain_code' function that uses ValidMind's experimental agents to analyze and explain code." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "source_code=\"\"\n", - "with open(\"customer_churn_full_suite.py\", \"r\") as f:\n", - " source_code = f.read()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The `vm.experimental.agents.run_task` function is used to execute AI agent tasks.\n", - "\n", - "It requires:\n", - "- task: The type of task to run (e.g. `code_explainer`)\n", - "- input: A dictionary containing task-specific parameters\n", - " - For `code_explainer`, this includes:\n", - " - **source_code** (str): The code to be analyzed\n", - " - **user_instructions** (str): Instructions for how to analyze the code" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def explain_code(content_id: str, user_instructions: str):\n", - " \"\"\"Run code explanation task and log the results.\n", - " By default, the code explainer includes sections for:\n", - " - Main Purpose and Overall Functionality\n", - " - Breakdown of Key Functions or Components\n", - " - Potential Risks or Failure Points \n", - " - Assumptions or Limitations\n", - " If you want default sections, specify user_instructions as an empty string.\n", - " \n", - " Args:\n", - " user_instructions (str): Instructions for how to analyze the code\n", - " content_id (str): ID to use when logging the results\n", - " \n", - " Returns:\n", - " The result object from running the code explanation task\n", - " \"\"\"\n", - " result = vm.experimental.agents.run_task(\n", - " task=\"code_explainer\",\n", - " input={\n", - " \"source_code\": source_code,\n", - " \"user_instructions\": user_instructions\n", - " }\n", - " )\n", - " result.log(content_id=content_id)\n", - " return result" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='defaultBehavior'></a>\n", - "\n", - "<a id='toc5__'></a>\n", - "\n", - "## Default Behavior" 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "By default, the code explainer includes sections for:\n", - "- Main Purpose and Overall Functionality\n", - "- Breakdown of Key Functions or Components\n", - "- Potential Risks or Failure Points \n", - "- Assumptions or Limitations\n", - "\n", - "If you want default sections, specify `user_instructions` as an empty string. For example:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.experimental.agents.run_task(\n", - " task=\"code_explainer\",\n", - " input={\n", - " \"source_code\": source_code,\n", - " \"user_instructions\": \"\"\n", - " }\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='overview'></a>\n", - "\n", - "<a id='toc6__'></a>\n", - "\n", - "## Codebase Overview\n", - "\n", - "Let's analyze your codebase structure to understand the main modules, components, entry points and their relationships. We'll also examine the technology stack and frameworks that are being utilized in the implementation." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Describe the overall structure of the source code repository.\n", - " - Identify main modules, folders, and scripts.\n", - " - Highlight entry points for training, inference, and evaluation.\n", - " - State the main programming languages and frameworks used.\n", - " \"\"\",\n", - " content_id=\"code_structure_summary\"\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\",\n", - " content_id=\"code_structure_summary\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='environment'></a>\n", - "\n", - "<a id='toc7__'></a>\n", - "\n", - "## Environment and Dependencies ('environment_setup')\n", - "Let's document the technical requirements and setup needed to run your code, including Python packages, system dependencies, and environment configuration files. Understanding these requirements is essential for proper development environment setup and consistent deployments across different environments." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - List Python packages and system dependencies (OS, compilers, etc.).\n", - " - Reference environment files (requirements.txt, environment.yml, Dockerfile).\n", - " - Include setup instructions using Conda, virtualenv, or containers.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. 
Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"setup_instructions\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='data'></a>\n", - "\n", - "<a id='toc8__'></a>\n", - "\n", - "## Data Ingestion and Preprocessing\n", - "Let's document how your code handles data, including data sources, validation procedures, and preprocessing steps. We'll examine the data pipeline architecture, covering everything from initial data loading through feature engineering and quality checks." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Specify data input formats and sources.\n", - " - Document ingestion, validation, and transformation logic.\n", - " - Explain how raw data is preprocessed and features are generated.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections. \"\"\",\n", - " content_id=\"data_handling_notes\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='model'> </a>\n", - "\n", - "<a id='toc9__'></a>\n", - "\n", - "## Model Implementation Details\n", - "Let's document the core implementation details of your model, including its architecture, components, and key algorithms. Understanding the technical implementation is crucial for maintenance, debugging, and future improvements to the codebase. We'll examine how theoretical concepts are translated into working code." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Describe the core model code structure (classes, functions).\n", - " - Link code to theoretical models or equations when applicable.\n", - " - Note custom components like loss functions or feature selectors.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"model_code_description\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='training'></a>\n", - "\n", - "<a id='toc10__'></a>\n", - "\n", - "## Model Training Pipeline\n", - "\n", - "Let's document the training pipeline implementation, including how models are trained, optimized and evaluated. We'll examine the training process workflow, hyperparameter tuning approach, and model checkpointing mechanisms. This section provides insights into how the model learns from data and achieves optimal performance." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Explain the training process, optimization strategy, and hyperparameters.\n", - " - Describe logging, checkpointing, and early stopping mechanisms.\n", - " - Include references to training config files or tuning logic.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. 
Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"training_logic_details\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='evaluation'></a>\n", - "\n", - "<a id='toc11__'></a>\n", - "\n", - "## Evaluation and Validation Code\n", - "Let's examine how the model's validation and evaluation code is implemented, including the metrics calculation and validation processes. We'll explore the diagnostic tools and visualization methods used to assess model performance. This section will also cover how validation results are logged and stored for future reference." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Describe how validation is implemented and metrics are calculated.\n", - " - Include plots and diagnostic tools (e.g., ROC, SHAP, confusion matrix).\n", - " - State how outputs are logged and persisted.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"evaluation_logic_notes\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='inference'></a>\n", - "\n", - "<a id='toc12__'></a>\n", - "\n", - "## Inference and Scoring Logic\n", - "Let's examine how the model performs inference and scoring on new data. This section will cover the implementation details of loading trained models, making predictions, and any required pre/post-processing steps. We'll also look at the APIs and interfaces available for both real-time serving and batch scoring scenarios." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Detail how the trained model is loaded and used for predictions.\n", - " - Explain I/O formats and APIs for serving or batch scoring.\n", - " - Include any preprocessing/postprocessing logic required.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"inference_mechanism\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='config'></a>\n", - "\n", - "<a id='toc13__'></a>\n", - "\n", - "## Configuration and Parameters\n", - "Let's explore how configuration and parameters are managed in the codebase. We'll examine the configuration files, command-line arguments, environment variables, and other mechanisms used to control model behavior. This section will also cover parameter versioning and how different configurations are tracked across model iterations." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Describe configuration management (files, CLI args, env vars).\n", - " - Highlight default parameters and override mechanisms.\n", - " - Reference versioning practices for config files.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. 
Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"config_control_notes\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='testing'></a>\n", - "\n", - "<a id='toc14__'></a>\n", - "\n", - "## Unit and Integration Testing\n", - "Let's examine the testing strategy and implementation in the codebase. We'll analyze the unit tests, integration tests, and testing frameworks used to ensure code quality and reliability. This section will also cover test coverage metrics and continuous integration practices." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - List unit and integration tests and what they cover.\n", - " - Mention testing frameworks and coverage tools used.\n", - " - Explain testing strategy for production-readiness.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"test_strategy_overview\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='logging'></a>\n", - "\n", - "<a id='toc15__'></a>\n", - "\n", - "## Logging and Monitoring Hooks\n", - "Let's analyze how logging and monitoring are implemented in the codebase. We'll examine the logging configuration, monitoring hooks, and key metrics being tracked. This section will also cover any real-time observability integrations and alerting mechanisms in place." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Describe logging configuration and structure.\n", - " - Highlight real-time monitoring or observability integrations.\n", - " - List key events, metrics, or alerts tracked.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"logging_monitoring_notes\"\n", - ")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='versioning'></a>\n", - "\n", - "<a id='toc16__'></a>\n", - "\n", - "## Code and Model Versioning\n", - "Let's examine how code and model versioning is managed in the codebase. This section will cover version control practices, including Git workflows and model artifact versioning tools like DVC or MLflow. We'll also look at how versioning integrates with the CI/CD pipeline." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Describe Git usage, branching, tagging, and commit standards.\n", - " - Include model artifact versioning practices (e.g., DVC, MLflow).\n", - " - Reference any automation in CI/CD.\n", - " Please remove the following sections: \n", - " - Potential Risks or Failure Points\n", - " - Assumptions or Limitations\n", - " - Breakdown of Key Functions or Components\n", - " Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"version_tracking_description\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='security'></a>\n", - "\n", - "<a id='toc17__'></a>\n", - "\n", - "## Security and Access Control\n", - "Let's analyze the security and access control measures implemented in the codebase. We'll examine how sensitive data and code are protected through access controls, encryption, and compliance measures. Additionally, we'll review secure deployment practices and any specific handling of PII data." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Document access controls for source code and data.\n", - " - Include any encryption, PII handling, or compliance measures.\n", - " - Mention secure deployment practices.\n", - " Please remove the following sections: \n", - " - Potential Risks or Failure Points\n", - " - Assumptions or Limitations\n", - " - Breakdown of Key Functions or Components\n", - " Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"security_policies_notes\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='examples'></a>\n", - "\n", - "<a id='toc18__'></a>\n", - "\n", - "## Example Runs and Scripts\n", - "Let's explore example runs and scripts that demonstrate how to use this codebase in practice. We'll look at working examples, command-line usage, and sample notebooks that showcase the core functionality. This section will also point to demo datasets and test scenarios that can help new users get started quickly." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - Provide working script examples.\n", - " - Include CLI usage instructions or sample notebooks.\n", - " - Link to demo datasets or test scenarios.\n", - " Please remove the following sections: \n", - " - Potential Risks or Failure Points\n", - " - Assumptions or Limitations\n", - " - Breakdown of Key Functions or Components\n", - " Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"runnable_examples\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='issues'></a>\n", - "\n", - "<a id='toc19__'></a>\n", - "\n", - "## Known Issues and Future Improvements\n", - "Let's examine the current limitations and areas for improvement in the codebase. This section will document known technical debt, bugs, and feature gaps that need to be addressed. We'll also outline proposed enhancements and reference any existing tickets or GitHub issues tracking these improvements." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = explain_code(\n", - " user_instructions=\"\"\"\n", - " Please provide a summary of the following bullet points only.\n", - " - List current limitations or technical debt.\n", - " - Outline proposed enhancements or refactors.\n", - " - Reference relevant tickets, GitHub issues, or roadmap items.\n", - " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. 
Please don't add any other sections.\n", - " \"\"\",\n", - " content_id=\"issues_and_improvements_log\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "copyright-72ed6e2a48984af3aca5888b96d1f6b6", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-1QuffXMV-py3.11", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.9" - } - }, - "nbformat": 4, - "nbformat_minor": 4 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quickstart for model code documentation\n", + "\n", + "Welcome! This notebook demonstrates how to use the ValidMind code explainer to automatically generate comprehensive documentation for your codebase. The code explainer analyzes your source code and provides detailed explanations across various aspects of your implementation.\n", + "\n", + "<a id='toc1__'></a>\n", + "\n", + "## About Code Explainer\n", + "The ValidMind code explainer is a powerful tool that automatically analyzes your source code and generates comprehensive documentation. 
It helps you:\n", + "\n", + "- Understand the structure and organization of your codebase\n", + "- Document dependencies and environment setup\n", + "- Explain data processing and model implementation details\n", + "- Document training, evaluation, and inference pipelines\n", + "- Track configuration, testing, and security measures\n", + "\n", + "This tool is particularly useful for:\n", + "- Onboarding new team members\n", + "- Maintaining up-to-date documentation\n", + "- Ensuring code quality and best practices\n", + "- Facilitating code reviews and audits" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About Code Explainer](#toc1__) \n", + "- [About ValidMind](#toc2__) \n", + " - [Before you begin](#toc2_1__) \n", + " - [New to ValidMind?](#toc2_2__) \n", + " - [Key concepts](#toc2_3__) \n", + "- [Setting up](#toc3__) \n", + " - [Install the ValidMind Library](#toc3_1__) \n", + " - [Initialize the ValidMind Library](#toc3_2__) \n", + " - [Register sample model](#toc3_2_1__) \n", + " - [Apply documentation template](#toc3_2_2__) \n", + " - [Get your code snippet](#toc3_2_3__) \n", + " - [Preview the documentation template](#toc3_3__) \n", + "- [Common function](#toc4__) \n", + "- [Default Behavior](#toc5__) \n", + "- [Codebase Overview](#toc6__) \n", + "- [Environment and Dependencies ('environment_setup')](#toc7__) \n", + "- [Data Ingestion and Preprocessing](#toc8__) \n", + "- [Model Implementation Details](#toc9__) \n", + "- [Model Training Pipeline](#toc10__) \n", + "- [Evaluation and Validation Code](#toc11__) \n", + "- [Inference and Scoring Logic](#toc12__) \n", + "- [Configuration and Parameters](#toc13__) \n", + "- [Unit and Integration Testing](#toc14__) \n", + "- [Logging and Monitoring Hooks](#toc15__) \n", + "- [Code and Model Versioning](#toc16__) \n", + "- [Security and Access Control](#toc17__) \n", + "- [Example Runs and Scripts](#toc18__) \n", + 
"- [Known Issues and Future Improvements](#toc19__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc2_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc2_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc2_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Model Source Code Documentation`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Can't select this template?</b></span>\n", + "<br></br>\n", + "Your organization administrators may need to add it to your template library:\n", + "<ul>\n", + "<li><a href=\"model_source_code_documentation_template.yaml\" style=\"color: #DE257E;\"><b>Download Template YAML</b></a></li>\n", + "<li><a href=\"https://docs.validmind.ai/guide/templates/customize-document-templates.html\" style=\"color: #DE257E;\"><b>Customize Document Templates</b></a></li>\n", + "</ul>\n", + "</div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Common function\n", + "The cells below set up two shared pieces used throughout this notebook:\n", + "1. A snippet that reads the source code from the `customer_churn_full_suite.py` file\n", + "2. An `explain_code` function that uses ValidMind's experimental agents to analyze and explain code."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Read the source code that the code explainer will analyze\n", + "source_code = \"\"\n", + "with open(\"customer_churn_full_suite.py\", \"r\") as f:\n", + " source_code = f.read()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `vm.experimental.agents.run_task` function is used to execute AI agent tasks.\n", + "\n", + "It requires:\n", + "- `task`: The type of task to run (e.g. `code_explainer`)\n", + "- `input`: A dictionary containing task-specific parameters\n", + " - For `code_explainer`, this includes:\n", + " - **source_code** (str): The code to be analyzed\n", + " - **user_instructions** (str): Instructions for how to analyze the code" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def explain_code(content_id: str, user_instructions: str):\n", + " \"\"\"Run code explanation task and log the results.\n", + " By default, the code explainer includes sections for:\n", + " - Main Purpose and Overall Functionality\n", + " - Breakdown of Key Functions or Components\n", + " - Potential Risks or Failure Points\n", + " - Assumptions or Limitations\n", + " If you want default sections, specify user_instructions as an empty string.\n", + " The code to analyze is taken from the notebook-level `source_code` variable.\n", + " \n", + " Args:\n", + " content_id (str): ID to use when logging the results\n", + " user_instructions (str): Instructions for how to analyze the code\n", + " \n", + " Returns:\n", + " The result object from running the code explanation task\n", + " \"\"\"\n", + " result = vm.experimental.agents.run_task(\n", + " task=\"code_explainer\",\n", + " input={\n", + " \"source_code\": source_code,\n", + " \"user_instructions\": user_instructions\n", + " }\n", + " )\n", + " result.log(content_id=content_id)\n", + " return result" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='defaultBehavior'></a>\n", + "\n", + "<a id='toc5__'></a>\n", + "\n", + "## Default Behavior"
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "By default, the code explainer includes sections for:\n", + "- Main Purpose and Overall Functionality\n", + "- Breakdown of Key Functions or Components\n", + "- Potential Risks or Failure Points \n", + "- Assumptions or Limitations\n", + "\n", + "If you want default sections, specify `user_instructions` as an empty string. For example:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.experimental.agents.run_task(\n", + " task=\"code_explainer\",\n", + " input={\n", + " \"source_code\": source_code,\n", + " \"user_instructions\": \"\"\n", + " }\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='overview'></a>\n", + "\n", + "<a id='toc6__'></a>\n", + "\n", + "## Codebase Overview\n", + "\n", + "Let's analyze your codebase structure to understand the main modules, components, entry points and their relationships. We'll also examine the technology stack and frameworks that are being utilized in the implementation." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Describe the overall structure of the source code repository.\n", + " - Identify main modules, folders, and scripts.\n", + " - Highlight entry points for training, inference, and evaluation.\n", + " - State the main programming languages and frameworks used.\n", + " \"\"\",\n", + " content_id=\"code_structure_summary\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\",\n", + " content_id=\"code_structure_summary\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='environment'></a>\n", + "\n", + "<a id='toc7__'></a>\n", + "\n", + "## Environment and Dependencies ('environment_setup')\n", + "Let's document the technical requirements and setup needed to run your code, including Python packages, system dependencies, and environment configuration files. Understanding these requirements is essential for proper development environment setup and consistent deployments across different environments." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - List Python packages and system dependencies (OS, compilers, etc.).\n", + " - Reference environment files (requirements.txt, environment.yml, Dockerfile).\n", + " - Include setup instructions using Conda, virtualenv, or containers.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. 
Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"setup_instructions\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='data'></a>\n", + "\n", + "<a id='toc8__'></a>\n", + "\n", + "## Data Ingestion and Preprocessing\n", + "Let's document how your code handles data, including data sources, validation procedures, and preprocessing steps. We'll examine the data pipeline architecture, covering everything from initial data loading through feature engineering and quality checks." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Specify data input formats and sources.\n", + " - Document ingestion, validation, and transformation logic.\n", + " - Explain how raw data is preprocessed and features are generated.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"data_handling_notes\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='model'></a>\n", + "\n", + "<a id='toc9__'></a>\n", + "\n", + "## Model Implementation Details\n", + "Let's document the core implementation details of your model, including its architecture, components, and key algorithms. Understanding the technical implementation is crucial for maintenance, debugging, and future improvements to the codebase. We'll examine how theoretical concepts are translated into working code."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Describe the core model code structure (classes, functions).\n", + " - Link code to theoretical models or equations when applicable.\n", + " - Note custom components like loss functions or feature selectors.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"model_code_description\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='training'></a>\n", + "\n", + "<a id='toc10__'></a>\n", + "\n", + "## Model Training Pipeline\n", + "\n", + "Let's document the training pipeline implementation, including how models are trained, optimized and evaluated. We'll examine the training process workflow, hyperparameter tuning approach, and model checkpointing mechanisms. This section provides insights into how the model learns from data and achieves optimal performance." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Explain the training process, optimization strategy, and hyperparameters.\n", + " - Describe logging, checkpointing, and early stopping mechanisms.\n", + " - Include references to training config files or tuning logic.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. 
Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"training_logic_details\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='evaluation'></a>\n", + "\n", + "<a id='toc11__'></a>\n", + "\n", + "## Evaluation and Validation Code\n", + "Let's examine how the model's validation and evaluation code is implemented, including the metrics calculation and validation processes. We'll explore the diagnostic tools and visualization methods used to assess model performance. This section will also cover how validation results are logged and stored for future reference." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Describe how validation is implemented and metrics are calculated.\n", + " - Include plots and diagnostic tools (e.g., ROC, SHAP, confusion matrix).\n", + " - State how outputs are logged and persisted.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"evaluation_logic_notes\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='inference'></a>\n", + "\n", + "<a id='toc12__'></a>\n", + "\n", + "## Inference and Scoring Logic\n", + "Let's examine how the model performs inference and scoring on new data. This section will cover the implementation details of loading trained models, making predictions, and any required pre/post-processing steps. We'll also look at the APIs and interfaces available for both real-time serving and batch scoring scenarios." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Detail how the trained model is loaded and used for predictions.\n", + " - Explain I/O formats and APIs for serving or batch scoring.\n", + " - Include any preprocessing/postprocessing logic required.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"inference_mechanism\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='config'></a>\n", + "\n", + "<a id='toc13__'></a>\n", + "\n", + "## Configuration and Parameters\n", + "Let's explore how configuration and parameters are managed in the codebase. We'll examine the configuration files, command-line arguments, environment variables, and other mechanisms used to control model behavior. This section will also cover parameter versioning and how different configurations are tracked across model iterations." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Describe configuration management (files, CLI args, env vars).\n", + " - Highlight default parameters and override mechanisms.\n", + " - Reference versioning practices for config files.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. 
Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"config_control_notes\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='testing'></a>\n", + "\n", + "<a id='toc14__'></a>\n", + "\n", + "## Unit and Integration Testing\n", + "Let's examine the testing strategy and implementation in the codebase. We'll analyze the unit tests, integration tests, and testing frameworks used to ensure code quality and reliability. This section will also cover test coverage metrics and continuous integration practices." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - List unit and integration tests and what they cover.\n", + " - Mention testing frameworks and coverage tools used.\n", + " - Explain testing strategy for production-readiness.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"test_strategy_overview\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='logging'></a>\n", + "\n", + "<a id='toc15__'></a>\n", + "\n", + "## Logging and Monitoring Hooks\n", + "Let's analyze how logging and monitoring are implemented in the codebase. We'll examine the logging configuration, monitoring hooks, and key metrics being tracked. This section will also cover any real-time observability integrations and alerting mechanisms in place." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Describe logging configuration and structure.\n", + " - Highlight real-time monitoring or observability integrations.\n", + " - List key events, metrics, or alerts tracked.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"logging_monitoring_notes\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='versioning'></a>\n", + "\n", + "<a id='toc16__'></a>\n", + "\n", + "## Code and Model Versioning\n", + "Let's examine how code and model versioning is managed in the codebase. This section will cover version control practices, including Git workflows and model artifact versioning tools like DVC or MLflow. We'll also look at how versioning integrates with the CI/CD pipeline."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Describe Git usage, branching, tagging, and commit standards.\n", + " - Include model artifact versioning practices (e.g., DVC, MLflow).\n", + " - Reference any automation in CI/CD.\n", + " Please remove the following sections: \n", + " - Potential Risks or Failure Points\n", + " - Assumptions or Limitations\n", + " - Breakdown of Key Functions or Components\n", + " Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"version_tracking_description\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='security'></a>\n", + "\n", + "<a id='toc17__'></a>\n", + "\n", + "## Security and Access Control\n", + "Let's analyze the security and access control measures implemented in the codebase. We'll examine how sensitive data and code are protected through access controls, encryption, and compliance measures. Additionally, we'll review secure deployment practices and any specific handling of PII data." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Document access controls for source code and data.\n", + " - Include any encryption, PII handling, or compliance measures.\n", + " - Mention secure deployment practices.\n", + " Please remove the following sections: \n", + " - Potential Risks or Failure Points\n", + " - Assumptions or Limitations\n", + " - Breakdown of Key Functions or Components\n", + " Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"security_policies_notes\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='examples'></a>\n", + "\n", + "<a id='toc18__'></a>\n", + "\n", + "## Example Runs and Scripts\n", + "Let's explore example runs and scripts that demonstrate how to use this codebase in practice. We'll look at working examples, command-line usage, and sample notebooks that showcase the core functionality. This section will also point to demo datasets and test scenarios that can help new users get started quickly." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - Provide working script examples.\n", + " - Include CLI usage instructions or sample notebooks.\n", + " - Link to demo datasets or test scenarios.\n", + " Please remove the following sections: \n", + " - Potential Risks or Failure Points\n", + " - Assumptions or Limitations\n", + " - Breakdown of Key Functions or Components\n", + " Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"runnable_examples\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='issues'></a>\n", + "\n", + "<a id='toc19__'></a>\n", + "\n", + "## Known Issues and Future Improvements\n", + "Let's examine the current limitations and areas for improvement in the codebase. This section will document known technical debt, bugs, and feature gaps that need to be addressed. We'll also outline proposed enhancements and reference any existing tickets or GitHub issues tracking these improvements." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = explain_code(\n", + " user_instructions=\"\"\"\n", + " Please provide a summary of the following bullet points only.\n", + " - List current limitations or technical debt.\n", + " - Outline proposed enhancements or refactors.\n", + " - Reference relevant tickets, GitHub issues, or roadmap items.\n", + " Please remove Potential Risks or Failure Points and Assumptions or Limitations sections. 
Please don't add any other sections.\n", + " \"\"\",\n", + " content_id=\"issues_and_improvements_log\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-72ed6e2a48984af3aca5888b96d1f6b6" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-1QuffXMV-py3.11", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.9" + } + }, + "nbformat": 4, + "nbformat_minor": 4 } diff --git a/site/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb b/site/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb index 8ef764021e..50f2f0202e 100644 --- a/site/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb +++ b/site/notebooks/use_cases/credit_risk/application_scorecard_executive.ipynb @@ -1,391 +1,397 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Document an application scorecard model\n", - "\n", - "Build and document an *application scorecard model* with the ValidMind Library by using Kaggle's [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) sample dataset to build a simple application scorecard.\n", - "\n", - "An application scorecard model is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score 
based on various characteristics of an applicant — such as credit history, income, employment status, and other relevant financial data. \n", - "\n", - "- This score helps lenders make decisions about whether to approve or reject loan applications, as well as determine the terms of the loan, including interest rates and credit limits. \n", - "- Application scorecard models enable lenders to manage risk efficiently while making the loan application process faster and more transparent for applicants.\n", - "\n", - "This interactive notebook provides a step-by-step guide for loading a demo dataset, preprocessing the raw data, training a model for testing, setting up test inputs, initializing the required ValidMind objects, running the test, and then logging the results to ValidMind." - ] + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document an application scorecard model\n", + "\n", + "Build and document an *application scorecard model* with the ValidMind Library by using Kaggle's [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) sample dataset to build a simple application scorecard.\n", + "\n", + "An application scorecard model is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant — such as credit history, income, employment status, and other relevant financial data. \n", + "\n", + "- This score helps lenders make decisions about whether to approve or reject loan applications, as well as determine the terms of the loan, including interest rates and credit limits. 
\n", + "- Application scorecard models enable lenders to manage risk efficiently while making the loan application process faster and more transparent for applicants.\n", + "\n", + "This interactive notebook provides a step-by-step guide for loading a demo dataset, preprocessing the raw data, training a model for testing, setting up test inputs, initializing the required ValidMind objects, running the test, and then logging the results to ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + "- [Document the model](#toc3__) \n", + "- [Next steps](#toc4__) \n", + " - [Work with your model documentation](#toc4_1__) \n", + " - [Discover more learning resources](#toc4_2__) \n", + "- [Upgrade ValidMind](#toc5__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **[template]{.smallcaps}**, select `Credit Risk Scorecard`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host = \"...\",\n", + " # api_key = \"...\",\n", + " # api_secret = \"...\",\n", + " # model = \"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Document the model" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.credit_risk import lending_club\n", + "from validmind.utils import preview_test_config\n", + "\n", + "scorecard = lending_club.load_scorecard()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "lending_club.init_vm_objects(scorecard)" + ], + "execution_count": 4, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test_config = lending_club.load_test_config(scorecard)\n", + "preview_test_config(test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.run_documentation_tests(config=test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc4_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. 
From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "3. Expand the following sections and take a look around:\n", + "\n", + " - **2. Data Preparation**\n", + " - **3. Model Development**\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation (hint: some of the tests in **2.3. Feature Selection and Engineering** look like they need some attention), view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc4_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-382e83e3fe1d4928ae90c3917480d27d" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-eEL8LtKG-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - "- [Document the model](#toc3__) \n", - "- [Next steps](#toc4__) \n", - " - [Work with your model documentation](#toc4_1__) \n", - " - [Discover more learning resources](#toc4_2__) \n", - "- [Upgrade ValidMind](#toc5__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. 
Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - "- **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - "- **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - "- **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - "- **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: The [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **[template]{.smallcaps}**, select `Credit Risk Scorecard`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host = \"...\",\n", - " # api_key = \"...\",\n", - " # api_secret = \"...\",\n", - " # model = \"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Document the model" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.credit_risk import lending_club\n", - "from validmind.utils import preview_test_config\n", - "\n", - "scorecard = lending_club.load_scorecard()" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [], - "source": [ - "lending_club.init_vm_objects(scorecard)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test_config = lending_club.load_test_config(scorecard)\n", - "preview_test_config(test_config)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.run_documentation_tests(config=test_config)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc4_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. 
From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "3. Expand the following sections and take a look around:\n", - "\n", - " - **2. Data Preparation**\n", - " - **3. Model Development**\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation (hint: some of the tests in **2.3. Feature Selection and Engineering** look like they need some attention), view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc4_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-382e83e3fe1d4928ae90c3917480d27d", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-eEL8LtKG-py3.10", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/site/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb b/site/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb index 73e1726f6a..2b857a03bc 100644 --- a/site/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb +++ b/site/notebooks/use_cases/credit_risk/application_scorecard_full_suite.ipynb @@ -1,916 +1,922 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Document an application scorecard model\n", - "\n", - "Build and document an *application scorecard model* with the ValidMind Library by using Kaggle's [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) sample dataset to build a simple application scorecard.\n", - "\n", - "An application scorecard model is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant — such as credit history, income, employment status, and other relevant financial data. \n", - "\n", - "- This score helps lenders make decisions about whether to approve or reject loan applications, as well as determine the terms of the loan, including interest rates and credit limits. 
\n", - "- Application scorecard models enable lenders to manage risk efficiently while making the loan application process faster and more transparent for applicants.\n", - "\n", - "This interactive notebook provides a step-by-step guide for loading a demo dataset, preprocessing the raw data, training a model for testing, setting up test inputs, initializing the required ValidMind objects, running the test, and then logging the results to ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - " - [Preview the documentation template](#toc2_4__) \n", - "- [Load the sample dataset](#toc3__) \n", - " - [Prepocess the dataset](#toc3_1__) \n", - " - [Feature engineering](#toc3_2__) \n", - "- [Train the model](#toc4__) \n", - " - [Compute probabilities](#toc4_1__) \n", - " - [Compute binary predictions](#toc4_2__) \n", - "- [Document the model](#toc5__) \n", - " - [Initialize the ValidMind datasets](#toc5_1__) \n", - " - [Initialize ValidMind models](#toc5_2__) \n", - " - [Assign prediction values and probabilities to the datasets](#toc5_3__) \n", - " - [Compute credit risk scores](#toc5_4__) \n", - " - [Adding custom context to the LLM descriptions](#toc5_5__) \n", - " - [Run the full suite of tests](#toc5_6__) \n", - "- [Next steps](#toc6__) \n", - " - [Work with your documentation](#toc6_1__) \n", - " - [Discover more learning resources](#toc6_2__) \n", - "- [Upgrade 
ValidMind](#toc7__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. 
Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - "- **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - "- **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - "- **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - "- **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: The [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host = \"...\",\n", - " # api_key = \"...\",\n", - " # api_secret = \"...\",\n", - " # model = \"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "from sklearn.ensemble import RandomForestClassifier\n", - "\n", - "from validmind.datasets.credit_risk import lending_club\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. To be able to use it, you'll need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "df = lending_club.load_data(source=\"offline\")\n", - "\n", - "df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Prepocess the dataset\n", - "\n", - "In the preprocessing step we perform a number of operations to get ready for building our application scorecard. \n", - "\n", - "We use the `lending_club.preprocess` to simplify preprocessing. 
This function performs the following operations: \n", - "- Filters the dataset to include only loans for debt consolidation or credit card purposes\n", - "- Removes loans classified under the riskier grades \"F\" and \"G\"\n", - "- Excludes uncommon home ownership types and standardizes employment length and loan terms into numerical formats\n", - "- Discards unnecessary fields and any entries with missing information to maintain a clean and robust dataset for modeling" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "preprocess_df = lending_club.preprocess(df)\n", - "preprocess_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Feature engineering\n", - "\n", - "In the feature engineering phase, we apply specific transformations to optimize the dataset for predictive modeling in our application scorecard. \n", - "\n", - "Using the `ending_club.feature_engineering()` function, we conduct the following operations:\n", - "- **WoE encoding**: Converts both numerical and categorical features into Weight of Evidence (WoE) values. WoE is a statistical measure used in scorecard modeling that quantifies the relationship between a predictor variable and the binary target variable. It calculates the ratio of the distribution of good outcomes to the distribution of bad outcomes for each category or bin of a feature. This transformation helps to ensure that the features are predictive and consistent in their contribution to the model.\n", - "- **Integration of WoE bins**: Ensures that the WoE transformed values are integrated throughout the dataset, replacing the original feature values while excluding the target variable from this transformation. This transformation is used to maintain a consistent scale and impact of each variable within the model, which helps make the predictions more stable and accurate." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fe_df = lending_club.feature_engineering(preprocess_df)\n", - "fe_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Train the model\n", - "\n", - "In this section, we focus on constructing and refining our predictive model. \n", - "- We begin by dividing our data, which is based on Weight of Evidence (WoE) features, into training and testing sets (`train_df`, `test_df`). \n", - "- With `lending_club.split`, we employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Split the data\n", - "train_df, test_df = lending_club.split(fe_df, test_size=0.2)\n", - "\n", - "x_train = train_df.drop(lending_club.target_column, axis=1)\n", - "y_train = train_df[lending_club.target_column]\n", - "\n", - "x_test = test_df.drop(lending_club.target_column, axis=1)\n", - "y_test = test_df[lending_club.target_column]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Define the XGBoost model\n", - "xgb_model = xgb.XGBClassifier(\n", - " n_estimators=50, \n", - " random_state=42, \n", - " early_stopping_rounds=10\n", - ")\n", - "xgb_model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "\n", - "# Fit the model\n", - "xgb_model.fit(\n", - " x_train, \n", - " y_train,\n", - " eval_set=[(x_test, y_test)],\n", - " verbose=False\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Define the Random Forest model\n", - "rf_model = RandomForestClassifier(\n", - " n_estimators=50, \n", - " random_state=42,\n", - ")\n", - "\n", - "# Fit the model\n", - 
"rf_model.fit(x_train, y_train)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Compute probabilities" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_xgb_prob = xgb_model.predict_proba(x_train)[:, 1]\n", - "test_xgb_prob = xgb_model.predict_proba(x_test)[:, 1]\n", - "\n", - "train_rf_prob = rf_model.predict_proba(x_train)[:, 1]\n", - "test_rf_prob = rf_model.predict_proba(x_test)[:, 1]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Compute binary predictions" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "cut_off_threshold = 0.3\n", - "\n", - "train_xgb_binary_predictions = (train_xgb_prob > cut_off_threshold).astype(int)\n", - "test_xgb_binary_predictions = (test_xgb_prob > cut_off_threshold).astype(int)\n", - "\n", - "train_rf_binary_predictions = (train_rf_prob > cut_off_threshold).astype(int)\n", - "test_rf_binary_predictions = (test_rf_prob > cut_off_threshold).astype(int)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Document the model\n", - "\n", - "To document the model with the ValidMind Library, you'll need to:\n", - "1. Preprocess the raw dataset\n", - "2. Initialize some training and test datasets\n", - "3. Initialize a model object you can use for testing\n", - "4. 
Run the full suite of tests" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset`: The dataset that you want to provide as input to tests.\n", - "- `input_id`: A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- `target_column`: A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", - "\n", - "With all datasets ready, you can now initialize the raw, processed, training and test datasets (`raw_df`, `preprocessed_df`, `fe_df`, `train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_preprocess_dataset = vm.init_dataset(\n", - " dataset=preprocess_df,\n", - " input_id=\"preprocess_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_fe_dataset = vm.init_dataset(\n", - " dataset=fe_df,\n", - " input_id=\"fe_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " 
input_id=\"test_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Initialize ValidMind models\n", - "\n", - "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other functions for analysis and tests on the data for our modelS.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model objects with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_xgb_model = vm.init_model(\n", - " xgb_model,\n", - " input_id=\"xgb_model\",\n", - ")\n", - "\n", - "vm_rf_model = vm.init_model(\n", - " rf_model,\n", - " input_id=\"rf_model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_3__'></a>\n", - "\n", - "### Assign prediction values and probabilities to the datasets\n", - "\n", - "With our model now trained, we'll move on to assigning both the predictive probabilities coming directly from the model's predictions, and the binary prediction after applying the cutoff threshold described in the previous steps. 
\n", - "- These tasks are achieved through the use of the `assign_predictions()` method associated with the VM `dataset` object.\n", - "- This method links the model's class prediction values and probabilities to our VM train and test datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# XGBoost\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_xgb_model,\n", - " prediction_values=train_xgb_binary_predictions,\n", - " prediction_probabilities=train_xgb_prob,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_xgb_model,\n", - " prediction_values=test_xgb_binary_predictions,\n", - " prediction_probabilities=test_xgb_prob,\n", - ")\n", - "\n", - "# Random Forest\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_rf_model,\n", - " prediction_values=train_rf_binary_predictions,\n", - " prediction_probabilities=train_rf_prob,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_rf_model,\n", - " prediction_values=test_rf_binary_predictions,\n", - " prediction_probabilities=test_rf_prob,\n", - ")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_4__'></a>\n", - "\n", - "### Compute credit risk scores\n", - "\n", - "In this phase, we translate model predictions into actionable scores using probability estimates generated by our trained model." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_xgb_scores = lending_club.compute_scores(train_xgb_prob)\n", - "test_xgb_scores = lending_club.compute_scores(test_xgb_prob)\n", - "\n", - "# Assign scores to the datasets\n", - "vm_train_ds.add_extra_column(\"xgb_scores\", train_xgb_scores)\n", - "vm_test_ds.add_extra_column(\"xgb_scores\", test_xgb_scores)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_5__'></a>\n", - "\n", - "### Adding custom context to the LLM descriptions\n", - "\n", - "To enable the LLM descriptions context, you need to set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1`. This will enable the LLM descriptions context, which will be used to provide additional context to the LLM descriptions. This is a global setting that will affect all tests." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", - "\n", - "context = \"\"\"\n", - "FORMAT FOR THE LLM DESCRIPTIONS: \n", - " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", - " extracted from the test description>.\n", - "\n", - " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", - " Include any relevant formulas or methodologies mentioned in the test description.>\n", - "\n", - " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", - " highlighting what makes it particularly useful for specific scenarios.>\n", - "\n", - " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", - " Include both technical limitations and interpretation challenges. 
\n", - " If the test description includes specific signs of high risk, incorporate these here.>\n", - "\n", - " **Key Insights:**\n", - "\n", - " The test results reveal:\n", - "\n", - " - **<insight title>**: <comprehensive description of one aspect of the results>\n", - " - **<insight title>**: <comprehensive description of another aspect>\n", - " ...\n", - "\n", - " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", - " purpose and provides any final recommendations or considerations.>\n", - "\n", - "ADDITIONAL INSTRUCTIONS:\n", - " Present insights in order from general to specific, with each insight as a single bullet point with bold title.\n", - "\n", - " For each metric in the test results, include in the test overview:\n", - " - The metric's purpose and what it measures\n", - " - Its mathematical formula\n", - " - The range of possible values\n", - " - What constitutes good/bad performance\n", - " - How to interpret different values\n", - "\n", - " Each insight should progressively cover:\n", - " 1. Overall scope and distribution\n", - " 2. Complete breakdown of all elements with specific values\n", - " 3. Natural groupings and patterns\n", - " 4. Comparative analysis between datasets/categories\n", - " 5. Stability and variations\n", - " 6. 
Notable relationships or dependencies\n", - "\n", - " Remember:\n", - " - Keep all insights at the same level (no sub-bullets or nested structures)\n", - " - Make each insight complete and self-contained\n", - " - Include specific numerical values and ranges\n", - " - Cover all elements in the results comprehensively\n", - " - Maintain clear, concise language\n", - " - Use only \"- **Title**: Description\" format for insights\n", - " - Progress naturally from general to specific observations\n", - "\n", - "\"\"\".strip()\n", - "\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_6__'></a>\n", - "\n", - "### Run the full suite of tests\n", - "\n", - "This is where it all comes together: you are now ready to run the documentation tests for the model as defined by the documentation template you looked at earlier.\n", - "\n", - "The [`vm.run_documentation_tests`](https://docs.validmind.ai/validmind/validmind.html#run_documentation_tests) function finds and runs every test specified in the template and then uploads all the documentation and test artifacts that get generated to the ValidMind Platform.\n", - "\n", - "The function requires information about the inputs to use on every test. These inputs can be passed as an `inputs` argument if we want to use the same inputs for all tests. It's also possible to pass a `config` argument that has information about the `params` and `inputs` that each test requires. 
The `config` parameter is a dictionary with the following structure:\n", - "\n", - "```python\n", - "config = {\n", - " \"<test-id>\": {\n", - " \"params\": {\n", - " \"param1\": \"value1\",\n", - " \"param2\": \"value2\",\n", - " ...\n", - " },\n", - " \"inputs\": {\n", - " \"input1\": \"value1\",\n", - " \"input2\": \"value2\",\n", - " ...\n", - " }\n", - " },\n", - " ...\n", - "}\n", - "```\n", - "\n", - "Each `<test-id>` above corresponds to the test driven block identifiers shown by `vm.preview_template()`. For this model, we will use the default parameters for all tests, but we'll need to specify the input configuration for each one. The method `get_demo_test_config()` below constructs the default input configuration for our demo." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.utils import preview_test_config\n", - "\n", - "test_config = lending_club.get_demo_test_config(x_test, y_test)\n", - "preview_test_config(test_config)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now we can pass the input configuration to `vm.run_documentation_tests()` and run the full suite of tests. The variable `full_suite` then holds the result of these tests." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.run_documentation_tests(config=test_config)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc6_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. 
(Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "3. Expand the following sections and take a look around:\n", - "\n", - " - **2. Data Preparation**\n", - " - **3. Model Development**\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation (hint: some of the tests in **2.3. Feature Selection and Engineering** look like they need some attention), view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc6_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-a658e3f1bece47cabc255c03460e255f", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-eEL8LtKG-py3.10", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document an application scorecard model\n", + "\n", + "Build and document an *application scorecard model* with the ValidMind Library by using Kaggle's [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) sample dataset to build a simple application scorecard.\n", + "\n", + "An application scorecard model is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant — such as credit history, income, employment status, and other relevant financial data. \n", + "\n", + "- This score helps lenders make decisions about whether to approve or reject loan applications, as well as determine the terms of the loan, including interest rates and credit limits. \n", + "- Application scorecard models enable lenders to manage risk efficiently while making the loan application process faster and more transparent for applicants.\n", + "\n", + "This interactive notebook provides a step-by-step guide for loading a demo dataset, preprocessing the raw data, training a model for testing, setting up test inputs, initializing the required ValidMind objects, running the test, and then logging the results to ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + " - [Preview the documentation template](#toc2_4__) \n", + "- [Load the sample dataset](#toc3__) \n", + " - [Preprocess the dataset](#toc3_1__) \n", + " - [Feature engineering](#toc3_2__) \n", + "- [Train the model](#toc4__) \n", + " - [Compute probabilities](#toc4_1__) \n", + " - [Compute binary predictions](#toc4_2__) \n", + "- [Document the model](#toc5__) \n", + " - [Initialize the ValidMind datasets](#toc5_1__) \n", + " - [Initialize ValidMind models](#toc5_2__) \n", + " - [Assign prediction values and probabilities to the datasets](#toc5_3__) \n", + " - [Compute credit risk scores](#toc5_4__) \n", + " - [Adding custom context to the LLM descriptions](#toc5_5__) \n", + " - [Run the full suite of tests](#toc5_6__) \n", + "- [Next steps](#toc6__) \n", + " - [Work with your documentation](#toc6_1__) \n", + " - [Discover more learning resources](#toc6_2__) \n", + "- [Upgrade ValidMind](#toc7__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. 
Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host = \"...\",\n", + " # api_key = \"...\",\n", + " # api_secret = \"...\",\n", + " # model = \"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "from sklearn.ensemble import RandomForestClassifier\n", + "\n", + "from validmind.datasets.credit_risk import lending_club\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind Library. To use it, import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "df = lending_club.load_data(source=\"offline\")\n", + "\n", + "df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Preprocess the dataset\n", + "\n", + "In the preprocessing step, we perform a number of operations to get ready for building our application scorecard. \n", + "\n", + "We use `lending_club.preprocess()` to simplify preprocessing. 
This function performs the following operations: \n", + "- Filters the dataset to include only loans for debt consolidation or credit card purposes\n", + "- Removes loans classified under the riskier grades \"F\" and \"G\"\n", + "- Excludes uncommon home ownership types and standardizes employment length and loan terms into numerical formats\n", + "- Discards unnecessary fields and any entries with missing information to maintain a clean and robust dataset for modeling" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "preprocess_df = lending_club.preprocess(df)\n", + "preprocess_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Feature engineering\n", + "\n", + "In the feature engineering phase, we apply specific transformations to optimize the dataset for predictive modeling in our application scorecard. \n", + "\n", + "Using the `lending_club.feature_engineering()` function, we conduct the following operations:\n", + "- **WoE encoding**: Converts both numerical and categorical features into Weight of Evidence (WoE) values. WoE is a statistical measure used in scorecard modeling that quantifies the relationship between a predictor variable and the binary target variable. It calculates the natural logarithm of the ratio of the distribution of good outcomes to the distribution of bad outcomes for each category or bin of a feature. This transformation helps to ensure that the features are predictive and consistent in their contribution to the model.\n", + "- **Integration of WoE bins**: Ensures that the WoE-transformed values are integrated throughout the dataset, replacing the original feature values while excluding the target variable from this transformation. This transformation is used to maintain a consistent scale and impact of each variable within the model, which helps make the predictions more stable and accurate." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "fe_df = lending_club.feature_engineering(preprocess_df)\n", + "fe_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Train the model\n", + "\n", + "In this section, we focus on constructing and refining our predictive model. \n", + "- We begin by dividing our data, which is based on Weight of Evidence (WoE) features, into training and testing sets (`train_df`, `test_df`). \n", + "- With `lending_club.split`, we employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Split the data\n", + "train_df, test_df = lending_club.split(fe_df, test_size=0.2)\n", + "\n", + "x_train = train_df.drop(lending_club.target_column, axis=1)\n", + "y_train = train_df[lending_club.target_column]\n", + "\n", + "x_test = test_df.drop(lending_club.target_column, axis=1)\n", + "y_test = test_df[lending_club.target_column]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Define the XGBoost model\n", + "xgb_model = xgb.XGBClassifier(\n", + " n_estimators=50, \n", + " random_state=42, \n", + " early_stopping_rounds=10\n", + ")\n", + "xgb_model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "\n", + "# Fit the model\n", + "xgb_model.fit(\n", + " x_train, \n", + " y_train,\n", + " eval_set=[(x_test, y_test)],\n", + " verbose=False\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Define the Random Forest model\n", + "rf_model = RandomForestClassifier(\n", + " n_estimators=50, \n", + " random_state=42,\n", + ")\n", + "\n", + "# Fit the model\n", + "rf_model.fit(x_train, y_train)" + ], + "execution_count": 
null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Compute probabilities" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_xgb_prob = xgb_model.predict_proba(x_train)[:, 1]\n", + "test_xgb_prob = xgb_model.predict_proba(x_test)[:, 1]\n", + "\n", + "train_rf_prob = rf_model.predict_proba(x_train)[:, 1]\n", + "test_rf_prob = rf_model.predict_proba(x_test)[:, 1]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Compute binary predictions" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "cut_off_threshold = 0.3\n", + "\n", + "train_xgb_binary_predictions = (train_xgb_prob > cut_off_threshold).astype(int)\n", + "test_xgb_binary_predictions = (test_xgb_prob > cut_off_threshold).astype(int)\n", + "\n", + "train_rf_binary_predictions = (train_rf_prob > cut_off_threshold).astype(int)\n", + "test_rf_binary_predictions = (test_rf_prob > cut_off_threshold).astype(int)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Document the model\n", + "\n", + "To document the model with the ValidMind Library, you'll need to:\n", + "1. Preprocess the raw dataset\n", + "2. Initialize some training and test datasets\n", + "3. Initialize a model object you can use for testing\n", + "4. 
Run the full suite of tests" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset`: The dataset that you want to provide as input to tests.\n", + "- `input_id`: A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- `target_column`: A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", + "\n", + "With all datasets ready, you can now initialize the raw, preprocessed, feature-engineered, training, and test datasets (`df`, `preprocess_df`, `fe_df`, `train_df`, and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_preprocess_dataset = vm.init_dataset(\n", + " dataset=preprocess_df,\n", + " input_id=\"preprocess_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_fe_dataset = vm.init_dataset(\n", + " dataset=fe_df,\n", + " input_id=\"fe_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " 
target_column=lending_club.target_column,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Initialize ValidMind models\n", + "\n", + "You'll also need to initialize ValidMind model objects (`vm_xgb_model` and `vm_rf_model`) that can be passed to other functions for analysis and tests on the data for our models.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model objects with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_xgb_model = vm.init_model(\n", + " xgb_model,\n", + " input_id=\"xgb_model\",\n", + ")\n", + "\n", + "vm_rf_model = vm.init_model(\n", + " rf_model,\n", + " input_id=\"rf_model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_3__'></a>\n", + "\n", + "### Assign prediction values and probabilities to the datasets\n", + "\n", + "With our models now trained, we'll move on to assigning both the predicted probabilities that come directly from each model and the binary predictions obtained by applying the cutoff threshold defined in the previous steps. 
\n", + "- These tasks are achieved through the use of the `assign_predictions()` method associated with the VM `dataset` object.\n", + "- This method links the model's class prediction values and probabilities to our VM train and test datasets." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# XGBoost\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_xgb_model,\n", + " prediction_values=train_xgb_binary_predictions,\n", + " prediction_probabilities=train_xgb_prob,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_xgb_model,\n", + " prediction_values=test_xgb_binary_predictions,\n", + " prediction_probabilities=test_xgb_prob,\n", + ")\n", + "\n", + "# Random Forest\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_rf_model,\n", + " prediction_values=train_rf_binary_predictions,\n", + " prediction_probabilities=train_rf_prob,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_rf_model,\n", + " prediction_values=test_rf_binary_predictions,\n", + " prediction_probabilities=test_rf_prob,\n", + ")\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_4__'></a>\n", + "\n", + "### Compute credit risk scores\n", + "\n", + "In this phase, we translate model predictions into actionable scores using probability estimates generated by our trained model." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_xgb_scores = lending_club.compute_scores(train_xgb_prob)\n", + "test_xgb_scores = lending_club.compute_scores(test_xgb_prob)\n", + "\n", + "# Assign scores to the datasets\n", + "vm_train_ds.add_extra_column(\"xgb_scores\", train_xgb_scores)\n", + "vm_test_ds.add_extra_column(\"xgb_scores\", test_xgb_scores)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_5__'></a>\n", + "\n", + "### Adding custom context to the LLM descriptions\n", + "\n", + "To supply custom context for the LLM-generated test descriptions, set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1` and provide the context text in the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT` environment variable. This is a global setting that affects all tests." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import os\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", + "\n", + "context = \"\"\"\n", + "FORMAT FOR THE LLM DESCRIPTIONS: \n", + " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", + " extracted from the test description>.\n", + "\n", + " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", + " Include any relevant formulas or methodologies mentioned in the test description.>\n", + "\n", + " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", + " highlighting what makes it particularly useful for specific scenarios.>\n", + "\n", + " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", + " Include both technical limitations and interpretation challenges. 
\n", + " If the test description includes specific signs of high risk, incorporate these here.>\n", + "\n", + " **Key Insights:**\n", + "\n", + " The test results reveal:\n", + "\n", + " - **<insight title>**: <comprehensive description of one aspect of the results>\n", + " - **<insight title>**: <comprehensive description of another aspect>\n", + " ...\n", + "\n", + " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", + " purpose and provides any final recommendations or considerations.>\n", + "\n", + "ADDITIONAL INSTRUCTIONS:\n", + " Present insights in order from general to specific, with each insight as a single bullet point with bold title.\n", + "\n", + " For each metric in the test results, include in the test overview:\n", + " - The metric's purpose and what it measures\n", + " - Its mathematical formula\n", + " - The range of possible values\n", + " - What constitutes good/bad performance\n", + " - How to interpret different values\n", + "\n", + " Each insight should progressively cover:\n", + " 1. Overall scope and distribution\n", + " 2. Complete breakdown of all elements with specific values\n", + " 3. Natural groupings and patterns\n", + " 4. Comparative analysis between datasets/categories\n", + " 5. Stability and variations\n", + " 6. 
Notable relationships or dependencies\n", + "\n", + " Remember:\n", + " - Keep all insights at the same level (no sub-bullets or nested structures)\n", + " - Make each insight complete and self-contained\n", + " - Include specific numerical values and ranges\n", + " - Cover all elements in the results comprehensively\n", + " - Maintain clear, concise language\n", + " - Use only \"- **Title**: Description\" format for insights\n", + " - Progress naturally from general to specific observations\n", + "\n", + "\"\"\".strip()\n", + "\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_6__'></a>\n", + "\n", + "### Run the full suite of tests\n", + "\n", + "This is where it all comes together: you are now ready to run the documentation tests for the model as defined by the documentation template you looked at earlier.\n", + "\n", + "The [`vm.run_documentation_tests`](https://docs.validmind.ai/validmind/validmind.html#run_documentation_tests) function finds and runs every test specified in the template and then uploads all the documentation and test artifacts that get generated to the ValidMind Platform.\n", + "\n", + "The function requires information about the inputs to use on every test. These inputs can be passed as an `inputs` argument if we want to use the same inputs for all tests. It's also possible to pass a `config` argument that has information about the `params` and `inputs` that each test requires. 
The `config` parameter is a dictionary with the following structure:\n", + "\n", + "```python\n", + "config = {\n", + " \"<test-id>\": {\n", + " \"params\": {\n", + " \"param1\": \"value1\",\n", + " \"param2\": \"value2\",\n", + " ...\n", + " },\n", + " \"inputs\": {\n", + " \"input1\": \"value1\",\n", + " \"input2\": \"value2\",\n", + " ...\n", + " }\n", + " },\n", + " ...\n", + "}\n", + "```\n", + "\n", + "Each `<test-id>` above corresponds to the test-driven block identifiers shown by `vm.preview_template()`. For this model, we will use the default parameters for all tests, but we'll need to specify the input configuration for each one. The method `get_demo_test_config()` below constructs the default input configuration for our demo." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.utils import preview_test_config\n", + "\n", + "test_config = lending_club.get_demo_test_config(x_test, y_test)\n", + "preview_test_config(test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we can pass the input configuration to `vm.run_documentation_tests()` and run the full suite of tests. The variable `full_suite` then holds the result of these tests." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.run_documentation_tests(config=test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way: use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc6_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. 
(Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "3. Expand the following sections and take a look around:\n", + "\n", + " - **2. Data Preparation**\n", + " - **3. Model Development**\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation (hint: some of the tests in **2.3. Feature Selection and Engineering** look like they need some attention), view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc6_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-a658e3f1bece47cabc255c03460e255f" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-eEL8LtKG-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/site/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb b/site/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb index 4824a1144c..6f6d23928e 100644 --- a/site/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb +++ b/site/notebooks/use_cases/credit_risk/application_scorecard_with_bias.ipynb @@ -1,1561 +1,1567 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Document a credit risk model\n", - "\n", - "Build and document an *application scorecard model* with the ValidMind Library by using Kaggle's [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) sample dataset to build a simple application scorecard.\n", - "\n", - "An application scorecard model is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant — such as credit history, income, employment status, and other relevant financial data. \n", - "\n", - "- This score helps lenders make decisions about whether to approve or reject loan applications, as well as determine the terms of the loan, including interest rates and credit limits. 
\n", - "- Application scorecard models enable lenders to manage risk efficiently while making the loan application process faster and more transparent for applicants.\n", - "\n", - "This interactive notebook provides a step-by-step guide for loading a demo dataset, preprocessing the raw data, training a model for testing, setting up test inputs, initializing the required ValidMind objects, running the test, and then logging the results to ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - " - [Preview the documentation template](#toc2_4__) \n", - "- [Load the sample dataset](#toc3__) \n", - " - [Prepocess the dataset](#toc3_1__) \n", - "- [Train the model](#toc4__) \n", - " - [Compute probabilities](#toc4_1__) \n", - " - [Compute binary predictions](#toc4_2__) \n", - "- [Postprocess the dataset](#toc5__) \n", - "- [Document the model](#toc6__) \n", - " - [Initialize the ValidMind datasets](#toc6_1__) \n", - " - [Initialize the ValidMind model](#toc6_2__) \n", - " - [Assign predictions](#toc6_3__) \n", - " - [Run tests](#toc6_4__) \n", - " - [Data description](#toc6_4_1__) \n", - " - [Data quality](#toc6_4_2__) \n", - " - [Correlations](#toc6_4_3__) \n", - " - [Model training](#toc6_4_4__) \n", - " - [Model validation](#toc6_4_5__) \n", - " - [Model explainability](#toc6_4_6__) \n", - " - [Bias and fairness](#toc6_4_7__) \n", - "- [Next steps](#toc7__) \n", - " - 
[Work with your documentation](#toc7_1__) \n", - " - [Discover more learning resources](#toc7_2__) \n", - "- [Upgrade ValidMind](#toc8__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", - "\n", - "- **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - "- **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - "- **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - "- **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: The [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip -q install aequitas fairlearn vl-convert-python" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "\n", - "from sklearn.ensemble import RandomForestClassifier\n", - "from sklearn.preprocessing import OneHotEncoder, StandardScaler\n", - "from sklearn.pipeline import Pipeline\n", - "from sklearn.impute import SimpleImputer\n", - "from sklearn.compose import ColumnTransformer\n", - "from sklearn.compose import make_column_selector as selector\n", - "\n", - "from validmind.tests import run_test\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is 
selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. To be able to use it, you'll need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.credit_risk import lending_club_bias as demo_dataset\n", - "\n", - "df = demo_dataset.load_data()\n", - "\n", - "df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Preprocess the dataset\n", - "\n", - "In the preprocessing step we perform a number of operations to get ready for building our credit decision model. \n", - "\n", - "In this example, we will create new features, fill missing values, and encode categorical variables." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "preprocess_df = demo_dataset.preprocess(df)\n", - "preprocess_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Train the model\n", - "\n", - "In this section, we focus on constructing and refining our predictive model. \n", - "- We begin by dividing our data into training and testing sets (`train_df`, `test_df`). \n", - "- We employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Split the data into training and testing sets\n", - "train_df, test_df = demo_dataset.split(preprocess_df)\n", - "\n", - "X_train = train_df.drop(demo_dataset.target_column, axis=1)\n", - "y_train = train_df[demo_dataset.target_column]\n", - "X_test = test_df.drop(demo_dataset.target_column, axis=1)\n", - "y_test = test_df[demo_dataset.target_column]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Train a Random Forest Classifier\n", - "model = RandomForestClassifier(n_estimators=50, random_state=42)\n", - "model.fit(X_train, y_train)\n", - "\n", - "# Print feature importances\n", - "feature_importances = pd.DataFrame({\n", - " 'feature': X_train.columns,\n", - " 'importance': model.feature_importances_\n", - "}).sort_values('importance', ascending=False)\n", - "\n", - "print(\"Feature Importances:\")\n", - "print(feature_importances)\n", - "\n", - "# Print model parameters\n", - "print(\"\\nModel Parameters:\")\n", - "print(model.get_params())\n", - "\n", - "# Print basic model information\n", - "print(f\"\\nNumber of trees: {model.n_estimators}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### 
Compute probabilities" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_probabilities = model.predict_proba(X_train)[:,1]\n", - "test_probabilities = model.predict_proba(X_test)[:,1]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Compute binary predictions" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "cut_off_threshold = 0.5\n", - "train_binary_predictions = (train_probabilities > cut_off_threshold).astype(int)\n", - "test_binary_predictions = (test_probabilities > cut_off_threshold).astype(int)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Postprocess the dataset" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Save the original labels for the protected classes for visualizations and investigation of biased outcomes\n", - "protected_classes_df = df[demo_dataset.protected_classes]\n", - "\n", - "train_df = train_df.merge(\n", - " protected_classes_df,\n", - " left_index=True,\n", - " right_index=True,\n", - ")\n", - "\n", - "test_df = test_df.merge(\n", - " protected_classes_df,\n", - " left_index=True,\n", - " right_index=True,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Document the model\n", - "\n", - "To document the model with the ValidMind Library, you'll need to:\n", - "1. Preprocess the raw dataset\n", - "2. Initialize some training and test datasets\n", - "3. Initialize a ValidMind model object for use with testing\n", - "4. 
Run the full suite of tests" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset`: The dataset that you want to provide as input to tests.\n", - "- `input_id`: A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- `target_column`: A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", - "\n", - "With all datasets ready, you can now initialize the raw, training and test datasets created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Extract feature columns\n", - "feature_columns = train_df.drop(\n", - " columns=[demo_dataset.target_column] + demo_dataset.protected_classes\n", - ").columns.tolist()\n", - "feature_columns" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_ds= vm.init_dataset(\n", - " dataset=df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=demo_dataset.target_column,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=demo_dataset.target_column,\n", - " feature_columns=feature_columns\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"test_dataset\",\n", - " target_column=demo_dataset.target_column,\n", - " 
feature_columns=feature_columns\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Initialize the ValidMind model\n", - "\n", - "You will also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_model = vm.init_model(\n", - " model,\n", - " input_id=\"random_forest_model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_3__'></a>\n", - "\n", - "### Assign predictions\n", - "\n", - "With our model now trained, we'll move on to assigning both the predictive probabilities coming directly from the model's predictions, and the binary prediction after applying the cutoff threshold described in the previous steps. \n", - "- These tasks are achieved through the use of the `assign_predictions()` method associated with the VM `dataset` object.\n", - "- This method links the model's class prediction values and probabilities to our VM train and test datasets." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model,\n", - " prediction_values=train_binary_predictions,\n", - " prediction_probabilities=train_probabilities,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model,\n", - " prediction_values=test_binary_predictions,\n", - " prediction_probabilities=test_probabilities,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4__'></a>\n", - "\n", - "### Run tests" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4_1__'></a>\n", - "\n", - "#### Data description" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.DatasetDescription\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.DescriptiveStatistics\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TabularNumericalHistograms\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\"\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TargetRateBarPlots\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\"\n", - " },\n", - " params={\n", - " \"default_column\": demo_dataset.target_column,\n", - " \"columns\": None,\n", - " 
},\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4_2__'></a>\n", - "\n", - "#### Data quality" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.ClassImbalance\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"min_percent_threshold\": 10\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.Duplicates\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"min_threshold\": 1\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.HighCardinality\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"num_threshold\": 100,\n", - " \"percent_threshold\": 0.1,\n", - " \"threshold_type\": \"percent\"\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.MissingValues\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"min_percentage_threshold\": 1,\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.Skewness\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"max_threshold\": 1,\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - 
"cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.UniqueRows\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"min_percent_threshold\": 1,\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TooManyZeroValues\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"max_percent_threshold\": 0.03,\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.IQROutliersTable\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"threshold\": 1.5,\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.IQROutliersBarPlot\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"threshold\": 1.5,\n", - " \"fig_width\": 800,\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4_3__'></a>\n", - "\n", - "#### Correlations" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.PearsonCorrelationMatrix\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " 
\"validmind.data_validation.HighPearsonCorrelation\",\n", - " inputs={\n", - " \"dataset\": \"raw_dataset\",\n", - " },\n", - " params={\n", - " \"max_threshold\": 0.3\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4_4__'></a>\n", - "\n", - "#### Model training" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.ModelMetadata\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.DatasetSplit\",\n", - " inputs={\n", - " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4_5__'></a>\n", - "\n", - "#### Model validation" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.PopulationStabilityIndex\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", - " },\n", - " params={\n", - " \"num_bins\": 10,\n", - " \"mode\": \"fixed\"\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.ConfusionMatrix\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"test_dataset\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - 
"test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:in_sample\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"train_dataset\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:out_of_sample\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"test_dataset\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.PrecisionRecallCurve\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"test_dataset\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.ROCCurve\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"test_dataset\",\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.TrainingTestDegradation\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", - " },\n", - " params={\n", - " \"metrics\": [\"accuracy\", \"precision\", \"recall\", \"f1\"],\n", - " \"max_threshold\": 0.1\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " 
\"validmind.model_validation.sklearn.MinimumAccuracy\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"test_dataset\",\n", - " },\n", - " params={\n", - " \"min_threshold\": 0.7\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumF1Score\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"test_dataset\",\n", - " },\n", - " params={\n", - " \"min_threshold\": 0.7\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumROCAUCScore\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"test_dataset\",\n", - " },\n", - " params={\n", - " \"min_threshold\": 0.5\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.statsmodels.GINITable\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_model],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_model],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.statsmodels.CumulativePredictionProbabilities\",\n", - 
" input_grid={\n", - " \"model\": [vm_model],\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4_6__'></a>\n", - "\n", - "#### Model explainability" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.PermutationFeatureImportance\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"test_dataset\",\n", - " },\n", - " params={\n", - " \"fontsize\": None,\n", - " \"figure_height\": 1000\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - "\"validmind.model_validation.sklearn.SHAPGlobalImportance\",\n", - "inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"dataset\": \"train_dataset\",\n", - " },\n", - " params={\n", - " \"kernel_explainer_samples\": 10,\n", - " \"tree_or_linear_explainer_samples\": 200\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.WeakspotsDiagnosis\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", - " },\n", - " params={\n", - " \"features_columns\": None,\n", - " \"thresholds\": {\n", - " \"accuracy\": 0.75,\n", - " \"precision\": 0.5,\n", - " \"recall\": 0.5,\n", - " \"f1\": 0.7\n", - " }\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.OverfitDiagnosis\",\n", - " inputs={\n", - " \"model\": 
\"random_forest_model\",\n", - " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", - " },\n", - " params={\n", - " \"metric\": None,\n", - " \"cut_off_threshold\": 0.04\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.RobustnessDiagnosis\",\n", - " inputs={\n", - " \"model\": \"random_forest_model\",\n", - " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", - " },\n", - " params={\n", - " \"metric\": None,\n", - " \"scaling_factor_std_dev_list\": [0.1, 0.2, 0.3, 0.4, 0.5],\n", - " \"performance_decay_threshold\": 0.05\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4_7__'></a>\n", - "\n", - "#### Bias and fairness" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = run_test(\n", - " \"validmind.data_validation.ProtectedClassesDescription\",\n", - " inputs={\n", - " \"dataset\": \"test_dataset\"\n", - " },\n", - " params={\n", - " 'protected_classes': demo_dataset.protected_classes\n", - " })\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now we are going to focus our analysis on the fairness metric(s) of interest in this case study: FNR/FPR across different groups. The `aequitas` plot module exposes the `disparities_metrics()` plot, which displays both the disparities and the group-wise metric results side by side." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = run_test(\n", - " \"validmind.data_validation.ProtectedClassesDisparity\",\n", - " inputs={\n", - " \"dataset\": \"test_dataset\",\n", - " \"model\": \"random_forest_model\"\n", - " },\n", - " params={\n", - " \"protected_classes\": demo_dataset.protected_classes,\n", - " \"disparity_tolerance\": 1.25,\n", - " \"metrics\": [\"fnr\", \"fpr\", \"tpr\"]\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.ProtectedClassesCombination\",\n", - " inputs={\n", - " \"dataset\": \"test_dataset\",\n", - " \"model\": \"random_forest_model\"\n", - " },\n", - " params={\n", - " \"protected_classes\": demo_dataset.protected_classes\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The following code defines a preprocessing `Pipeline` that handles both numeric and categorical features. Numeric data is imputed and scaled, while categorical data is imputed with the most frequent value and one-hot encoded. The pipelines are then combined using a `ColumnTransformer` and integrated with a classifier." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Define a pipeline for numeric features\n", - "numeric_transformer = Pipeline(\n", - " steps=[\n", - " (\"impute\", SimpleImputer()), # Impute missing values\n", - " (\"scaler\", StandardScaler()), # Scale numeric features\n", - " ]\n", - ")\n", - "\n", - "# Define a pipeline for categorical features\n", - "categorical_transformer = Pipeline(\n", - " [\n", - " (\"impute\", SimpleImputer(strategy=\"most_frequent\")), # Impute missing values with most frequent\n", - " (\"ohe\", OneHotEncoder(handle_unknown=\"ignore\")), # One-hot encode categorical features\n", - " ]\n", - ")\n", - "\n", - "# Combine numeric and categorical pipelines\n", - "preprocessor = ColumnTransformer(\n", - " transformers=[\n", - " (\"num\", numeric_transformer, selector(dtype_exclude=\"category\")), # Apply numeric transformer to non-categorical columns\n", - " (\"cat\", categorical_transformer, selector(dtype_include=\"category\")), # Apply categorical transformer to categorical columns\n", - " ]\n", - ")\n", - "\n", - "# Create the full pipeline including preprocessing and classification\n", - "pipeline = Pipeline(\n", - " steps=[\n", - " (\"preprocessor\", preprocessor), # Apply the preprocessor\n", - " (\n", - " \"classifier\",\n", - " model, # Use the previously defined model for classification\n", - " ),\n", - " ]\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "sensitive_features = ['Gender_encoded','Race_encoded','Marital_Status_encoded']\n", - "\n", - "run_test(\n", - " \"validmind.data_validation.ProtectedClassesThresholdOptimizer\",\n", - " inputs={\n", - " \"dataset\": vm_test_ds\n", - " },\n", - " params={\n", - " \"pipeline\":pipeline,\n", - " \"protected_classes\": sensitive_features,\n", - " \"X_train\":X_train,\n", - " \"y_train\":y_train,\n", - " },\n", - ").log()" - ] - }, - { - 
"cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc7_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "3. Expand the following sections and take a look around:\n", - "\n", - " - **2. Data Preparation**\n", - " - **3. Model Development**\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation (hint: some of the tests in **2.3. Feature Selection and Engineering** look like they need some attention), view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc7_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade 
package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-6a74bc76beda4633a0cfff2eaa20949e", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-eEL8LtKG-py3.10", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document a credit risk model\n", + "\n", + "Build and document an *application scorecard model* with the ValidMind Library by using Kaggle's [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) sample dataset to build a simple application scorecard.\n", + "\n", + "An application scorecard model is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant — such as credit history, income, employment status, and other relevant financial data. \n", + "\n", + "- This score helps lenders make decisions about whether to approve or reject loan applications, as well as determine the terms of the loan, including interest rates and credit limits. 
\n", + "- Application scorecard models enable lenders to manage risk efficiently while making the loan application process faster and more transparent for applicants.\n", + "\n", + "This interactive notebook provides a step-by-step guide for loading a demo dataset, preprocessing the raw data, training a model for testing, setting up test inputs, initializing the required ValidMind objects, running the test, and then logging the results to ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + " - [Preview the documentation template](#toc2_4__) \n", + "- [Load the sample dataset](#toc3__) \n", + " - [Preprocess the dataset](#toc3_1__) \n", + "- [Train the model](#toc4__) \n", + " - [Compute probabilities](#toc4_1__) \n", + " - [Compute binary predictions](#toc4_2__) \n", + "- [Postprocess the dataset](#toc5__) \n", + "- [Document the model](#toc6__) \n", + " - [Initialize the ValidMind datasets](#toc6_1__) \n", + " - [Initialize the ValidMind model](#toc6_2__) \n", + " - [Assign predictions](#toc6_3__) \n", + " - [Run tests](#toc6_4__) \n", + " - [Data description](#toc6_4_1__) \n", + " - [Data quality](#toc6_4_2__) \n", + " - [Correlations](#toc6_4_3__) \n", + " - [Model training](#toc6_4_4__) \n", + " - [Model validation](#toc6_4_5__) \n", + " - [Model explainability](#toc6_4_6__) \n", + " - [Bias and fairness](#toc6_4_7__) \n", + "- [Next steps](#toc7__) \n", + " - 
[Work with your documentation](#toc7_1__) \n", + " - [Discover more learning resources](#toc7_2__) \n", + "- [Upgrade ValidMind](#toc8__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between you and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook, but we recommend further familiarizing yourself with the language.\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. 
Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. 
They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip -q install aequitas fairlearn vl-convert-python" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "\n", + "from sklearn.ensemble import RandomForestClassifier\n", + "from sklearn.preprocessing import OneHotEncoder, StandardScaler\n", + "from sklearn.pipeline import Pipeline\n", + "from sklearn.impute import SimpleImputer\n", + "from sklearn.compose import ColumnTransformer\n", + "from sklearn.compose import make_column_selector as selector\n", + "\n", + "from validmind.tests import run_test\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is 
selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind Library. To be able to use it, you'll need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.credit_risk import lending_club_bias as demo_dataset\n", + "\n", + "df = demo_dataset.load_data()\n", + "\n", + "df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Preprocess the dataset\n", + "\n", + "In the preprocessing step, we perform a number of operations to get ready for building our credit decision model. \n", + "\n", + "In this example, we will create new features, fill missing values, and encode categorical variables."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "preprocess_df = demo_dataset.preprocess(df)\n", + "preprocess_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Train the model\n", + "\n", + "In this section, we focus on constructing and refining our predictive model. \n", + "- We begin by dividing our data into training and testing sets (`train_df`, `test_df`). \n", + "- We employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Split the data into training and testing sets\n", + "train_df, test_df = demo_dataset.split(preprocess_df)\n", + "\n", + "X_train = train_df.drop(demo_dataset.target_column, axis=1)\n", + "y_train = train_df[demo_dataset.target_column]\n", + "X_test = test_df.drop(demo_dataset.target_column, axis=1)\n", + "y_test = test_df[demo_dataset.target_column]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Train a Random Forest Classifier\n", + "model = RandomForestClassifier(n_estimators=50, random_state=42)\n", + "model.fit(X_train, y_train)\n", + "\n", + "# Print feature importances\n", + "feature_importances = pd.DataFrame({\n", + " 'feature': X_train.columns,\n", + " 'importance': model.feature_importances_\n", + "}).sort_values('importance', ascending=False)\n", + "\n", + "print(\"Feature Importances:\")\n", + "print(feature_importances)\n", + "\n", + "# Print model parameters\n", + "print(\"\\nModel Parameters:\")\n", + "print(model.get_params())\n", + "\n", + "# Print basic model information\n", + "print(f\"\\nNumber of trees: {model.n_estimators}\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### 
Compute probabilities" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_probabilities = model.predict_proba(X_train)[:,1]\n", + "test_probabilities = model.predict_proba(X_test)[:,1]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Compute binary predictions" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "cut_off_threshold = 0.5\n", + "train_binary_predictions = (train_probabilities > cut_off_threshold).astype(int)\n", + "test_binary_predictions = (test_probabilities > cut_off_threshold).astype(int)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Postprocess the dataset" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Save the original labels for the protected classes for visualizations and investigation of biased outcomes\n", + "protected_classes_df = df[demo_dataset.protected_classes]\n", + "\n", + "train_df = train_df.merge(\n", + " protected_classes_df,\n", + " left_index=True,\n", + " right_index=True,\n", + ")\n", + "\n", + "test_df = test_df.merge(\n", + " protected_classes_df,\n", + " left_index=True,\n", + " right_index=True,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Document the model\n", + "\n", + "To document the model with the ValidMind Library, you'll need to:\n", + "1. Preprocess the raw dataset\n", + "2. Initialize some training and test datasets\n", + "3. Initialize a ValidMind model object for use with testing\n", + "4. 
Run the full suite of tests" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset`: The dataset that you want to provide as input to tests.\n", + "- `input_id`: A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- `target_column`: A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", + "- `feature_columns`: The list of columns to use as model features. In this notebook, it excludes the target column and the protected class columns.\n", + "\n", + "With all datasets ready, you can now initialize the raw, training, and test datasets created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Extract feature columns\n", + "feature_columns = train_df.drop(\n", + " columns=[demo_dataset.target_column] + demo_dataset.protected_classes\n", + ").columns.tolist()\n", + "feature_columns" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_ds = vm.init_dataset(\n", + " dataset=df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=demo_dataset.target_column,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=demo_dataset.target_column,\n", + " feature_columns=feature_columns\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " target_column=demo_dataset.target_column,\n", + " feature_columns=feature_columns\n", + ")" + ], +
"execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Initialize the ValidMind model\n", + "\n", + "You will also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_model = vm.init_model(\n", + " model,\n", + " input_id=\"random_forest_model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3__'></a>\n", + "\n", + "### Assign predictions\n", + "\n", + "With our model now trained, we'll move on to assigning both the predictive probabilities coming directly from the model's predictions, and the binary prediction after applying the cutoff threshold described in the previous steps. \n", + "- These tasks are achieved through the use of the `assign_predictions()` method associated with the VM `dataset` object.\n", + "- This method links the model's class prediction values and probabilities to our VM train and test datasets." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model,\n", + " prediction_values=train_binary_predictions,\n", + " prediction_probabilities=train_probabilities,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model,\n", + " prediction_values=test_binary_predictions,\n", + " prediction_probabilities=test_probabilities,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4__'></a>\n", + "\n", + "### Run tests" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4_1__'></a>\n", + "\n", + "#### Data description" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.DatasetDescription\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.DescriptiveStatistics\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TabularNumericalHistograms\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\"\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TargetRateBarPlots\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\"\n", + " },\n", + " params={\n", + " \"default_column\": demo_dataset.target_column,\n", + " \"columns\": None,\n", + " },\n", + ")\n", + "test.log()" + ], + 
"execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4_2__'></a>\n", + "\n", + "#### Data quality" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.ClassImbalance\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"min_percent_threshold\": 10\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.Duplicates\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"min_threshold\": 1\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.HighCardinality\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"num_threshold\": 100,\n", + " \"percent_threshold\": 0.1,\n", + " \"threshold_type\": \"percent\"\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.MissingValues\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"min_percentage_threshold\": 1,\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.Skewness\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"max_threshold\": 1,\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { 
+ "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.UniqueRows\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"min_percent_threshold\": 1,\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TooManyZeroValues\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"max_percent_threshold\": 0.03,\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.IQROutliersTable\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"threshold\": 1.5,\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.IQROutliersBarPlot\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"threshold\": 1.5,\n", + " \"fig_width\": 800,\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4_3__'></a>\n", + "\n", + "#### Correlations" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.PearsonCorrelationMatrix\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " 
\"validmind.data_validation.HighPearsonCorrelation\",\n", + " inputs={\n", + " \"dataset\": \"raw_dataset\",\n", + " },\n", + " params={\n", + " \"max_threshold\": 0.3\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4_4__'></a>\n", + "\n", + "#### Model training" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.ModelMetadata\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.DatasetSplit\",\n", + " inputs={\n", + " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4_5__'></a>\n", + "\n", + "#### Model validation" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.PopulationStabilityIndex\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", + " },\n", + " params={\n", + " \"num_bins\": 10,\n", + " \"mode\": \"fixed\"\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.ConfusionMatrix\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"test_dataset\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + 
"test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:in_sample\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"train_dataset\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:out_of_sample\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"test_dataset\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.PrecisionRecallCurve\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"test_dataset\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.ROCCurve\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"test_dataset\",\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.TrainingTestDegradation\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", + " },\n", + " params={\n", + " \"metrics\": [\"accuracy\", \"precision\", \"recall\", \"f1\"],\n", + " \"max_threshold\": 0.1\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " 
\"validmind.model_validation.sklearn.MinimumAccuracy\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"test_dataset\",\n", + " },\n", + " params={\n", + " \"min_threshold\": 0.7\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumF1Score\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"test_dataset\",\n", + " },\n", + " params={\n", + " \"min_threshold\": 0.7\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumROCAUCScore\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"test_dataset\",\n", + " },\n", + " params={\n", + " \"min_threshold\": 0.5\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.statsmodels.GINITable\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_model],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_model],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.statsmodels.CumulativePredictionProbabilities\",\n", + 
" input_grid={\n", + " \"model\": [vm_model],\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4_6__'></a>\n", + "\n", + "#### Model explainability" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.PermutationFeatureImportance\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"test_dataset\",\n", + " },\n", + " params={\n", + " \"fontsize\": None,\n", + " \"figure_height\": 1000\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + "\"validmind.model_validation.sklearn.SHAPGlobalImportance\",\n", + "inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"dataset\": \"train_dataset\",\n", + " },\n", + " params={\n", + " \"kernel_explainer_samples\": 10,\n", + " \"tree_or_linear_explainer_samples\": 200\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.WeakspotsDiagnosis\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", + " },\n", + " params={\n", + " \"features_columns\": None,\n", + " \"thresholds\": {\n", + " \"accuracy\": 0.75,\n", + " \"precision\": 0.5,\n", + " \"recall\": 0.5,\n", + " \"f1\": 0.7\n", + " }\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.OverfitDiagnosis\",\n", + " inputs={\n", + " \"model\": 
\"random_forest_model\",\n", + " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", + " },\n", + " params={\n", + " \"metric\": None,\n", + " \"cut_off_threshold\": 0.04\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.RobustnessDiagnosis\",\n", + " inputs={\n", + " \"model\": \"random_forest_model\",\n", + " \"datasets\": [\"train_dataset\", \"test_dataset\"],\n", + " },\n", + " params={\n", + " \"metric\": None,\n", + " \"scaling_factor_std_dev_list\": [0.1, 0.2, 0.3, 0.4, 0.5],\n", + " \"performance_decay_threshold\": 0.05\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4_7__'></a>\n", + "\n", + "#### Bias and fairness" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = run_test(\n", + " \"validmind.data_validation.ProtectedClassesDescription\",\n", + " inputs={\n", + " \"dataset\": \"test_dataset\"\n", + " },\n", + " params={\n", + " 'protected_classes': demo_dataset.protected_classes\n", + " })\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we are going to focus our analysis on the fairness metric(s) of interest in this case study: FNR/FPR across different groups. The `aequitas` plot module exposes the `disparities_metrics()` plot, which displays both the disparities and the group-wise metric results side by side." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = run_test(\n", + " \"validmind.data_validation.ProtectedClassesDisparity\",\n", + " inputs={\n", + " \"dataset\": \"test_dataset\",\n", + " \"model\": \"random_forest_model\"\n", + " },\n", + " params={\n", + " \"protected_classes\": demo_dataset.protected_classes,\n", + " \"disparity_tolerance\": 1.25,\n", + " \"metrics\": [\"fnr\", \"fpr\", \"tpr\"]\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.ProtectedClassesCombination\",\n", + " inputs={\n", + " \"dataset\": \"test_dataset\",\n", + " \"model\": \"random_forest_model\"\n", + " },\n", + " params={\n", + " \"protected_classes\": demo_dataset.protected_classes\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The following code defines a preprocessing `Pipeline` that handles both numeric and categorical features. Numeric data is imputed and scaled, while categorical data is imputed with the most frequent value and one-hot encoded. The pipelines are then combined using a `ColumnTransformer` and integrated with a classifier." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Define a pipeline for numeric features\n", + "numeric_transformer = Pipeline(\n", + " steps=[\n", + " (\"impute\", SimpleImputer()), # Impute missing values\n", + " (\"scaler\", StandardScaler()), # Scale numeric features\n", + " ]\n", + ")\n", + "\n", + "# Define a pipeline for categorical features\n", + "categorical_transformer = Pipeline(\n", + " [\n", + " (\"impute\", SimpleImputer(strategy=\"most_frequent\")), # Impute missing values with most frequent\n", + " (\"ohe\", OneHotEncoder(handle_unknown=\"ignore\")), # One-hot encode categorical features\n", + " ]\n", + ")\n", + "\n", + "# Combine numeric and categorical pipelines\n", + "preprocessor = ColumnTransformer(\n", + " transformers=[\n", + " (\"num\", numeric_transformer, selector(dtype_exclude=\"category\")), # Apply numeric transformer to non-categorical columns\n", + " (\"cat\", categorical_transformer, selector(dtype_include=\"category\")), # Apply categorical transformer to categorical columns\n", + " ]\n", + ")\n", + "\n", + "# Create the full pipeline including preprocessing and classification\n", + "pipeline = Pipeline(\n", + " steps=[\n", + " (\"preprocessor\", preprocessor), # Apply the preprocessor\n", + " (\n", + " \"classifier\",\n", + " model, # Use the previously defined model for classification\n", + " ),\n", + " ]\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "sensitive_features = ['Gender_encoded','Race_encoded','Marital_Status_encoded']\n", + "\n", + "run_test(\n", + " \"validmind.data_validation.ProtectedClassesThresholdOptimizer\",\n", + " inputs={\n", + " \"dataset\": vm_test_ds\n", + " },\n", + " params={\n", + " \"pipeline\":pipeline,\n", + " \"protected_classes\": sensitive_features,\n", + " \"X_train\":X_train,\n", + " \"y_train\":y_train,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + 
"cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc7_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "3. Expand the following sections and take a look around:\n", + "\n", + " - **2. Data Preparation**\n", + " - **3. Model Development**\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation (hint: some of the tests in **2.3. Feature Selection and Engineering** look like they need some attention), view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc7_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade 
package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-6a74bc76beda4633a0cfff2eaa20949e" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-eEL8LtKG-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/site/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb b/site/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb index ab7c3243ac..a735cbf5b1 100644 --- a/site/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb +++ b/site/notebooks/use_cases/credit_risk/application_scorecard_with_ml.ipynb @@ -1,2011 +1,2017 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Document an application scorecard model\n", - "\n", - "Build and document an *application scorecard model* with the ValidMind Library by using Kaggle's [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) sample dataset to build a simple application scorecard.\n", - "\n", - "An application scorecard model is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant — such as credit history, income, employment status, and other relevant 
financial data. \n", - "\n", - "- This score helps lenders make decisions about whether to approve or reject loan applications, as well as determine the terms of the loan, including interest rates and credit limits. \n", - "- Application scorecard models enable lenders to manage risk efficiently while making the loan application process faster and more transparent for applicants.\n", - "\n", - "This interactive notebook provides a step-by-step guide for loading a demo dataset, preprocessing the raw data, training a model for testing, setting up test inputs, initializing the required ValidMind objects, running the test, and then logging the results to ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - " - [Preview the documentation template](#toc2_4__) \n", - "- [Load the sample dataset](#toc3__) \n", - " - [Prepocess the dataset](#toc3_1__) \n", - " - [Feature engineering](#toc3_2__) \n", - "- [Train the model](#toc4__) \n", - " - [Compute probabilities](#toc4_1__) \n", - " - [Compute binary predictions](#toc4_2__) \n", - "- [Document the model](#toc5__) \n", - " - [Initialize the ValidMind datasets](#toc5_1__) \n", - " - [Initialize the ValidMind models](#toc5_2__) \n", - " - [Assign prediction values and probabilities to the datasets](#toc5_3__) \n", - " - [Compute credit risk scores](#toc5_4__) \n", - " - [Adding custom context to the LLM 
descriptions](#toc5_5__) \n", - " - [Raw data](#toc5_6__) \n", - " - [Pre-processed data](#toc5_7__) \n", - " - [Development data](#toc5_8__) \n", - " - [Feature selection](#toc5_9__) \n", - " - [Model training](#toc5_10__) \n", - " - [Model selection](#toc5_11__) \n", - " - [Class discrimination](#toc5_12__) \n", - " - [Classification accuracy](#toc5_13__) \n", - " - [Model diagnosis](#toc5_14__) \n", - " - [Model explainability](#toc5_15__) \n", - " - [Scoring evaluation](#toc5_16__) \n", - "- [Custom tests](#toc6__) \n", - " - [In-line custom tests](#toc6_1__) \n", - " - [Local test provider](#toc6_2__) \n", - "- [Next steps](#toc7__) \n", - " - [Work with your documentation](#toc7_1__) \n", - " - [Discover more learning resources](#toc7_2__) \n", - "- [Upgrade ValidMind](#toc8__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. 
If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. 
It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - "- **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - "- **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - "- **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - "- **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. 
See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: The [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host = \"...\",\n", - " # api_key = \"...\",\n", - " # api_secret = \"...\",\n", - " # model = \"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "from sklearn.ensemble import RandomForestClassifier\n", - "\n", - "from validmind.tests import run_test\n", - "from validmind.datasets.credit_risk import lending_club\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. To be able to use it, you'll need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "df = lending_club.load_data(source=\"offline\")\n", - "\n", - "df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Prepocess the dataset\n", - "\n", - "In the preprocessing step we perform a number of operations to get ready for building our application scorecard. \n", - "\n", - "We use the `lending_club.preprocess` to simplify preprocessing. 
This function performs the following operations: \n", - "- Filters the dataset to include only loans for debt consolidation or credit card purposes\n", - "- Removes loans classified under the riskier grades \"F\" and \"G\"\n", - "- Excludes uncommon home ownership types and standardizes employment length and loan terms into numerical formats\n", - "- Discards unnecessary fields and any entries with missing information to maintain a clean and robust dataset for modeling" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "preprocess_df = lending_club.preprocess(df)\n", - "preprocess_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Feature engineering\n", - "\n", - "In the feature engineering phase, we apply specific transformations to optimize the dataset for predictive modeling in our application scorecard. \n", - "\n", - "Using the `ending_club.feature_engineering()` function, we conduct the following operations:\n", - "- **WoE encoding**: Converts both numerical and categorical features into Weight of Evidence (WoE) values. WoE is a statistical measure used in scorecard modeling that quantifies the relationship between a predictor variable and the binary target variable. It calculates the ratio of the distribution of good outcomes to the distribution of bad outcomes for each category or bin of a feature. This transformation helps to ensure that the features are predictive and consistent in their contribution to the model.\n", - "- **Integration of WoE bins**: Ensures that the WoE transformed values are integrated throughout the dataset, replacing the original feature values while excluding the target variable from this transformation. This transformation is used to maintain a consistent scale and impact of each variable within the model, which helps make the predictions more stable and accurate." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fe_df = lending_club.feature_engineering(preprocess_df)\n", - "fe_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Train the model\n", - "\n", - "In this section, we focus on constructing and refining our predictive model. \n", - "- We begin by dividing our data, which is based on Weight of Evidence (WoE) features, into training and testing sets (`train_df`, `test_df`). \n", - "- With `lending_club.split`, we employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Split the data\n", - "train_df, test_df = lending_club.split(fe_df, test_size=0.2)\n", - "\n", - "x_train = train_df.drop(lending_club.target_column, axis=1)\n", - "y_train = train_df[lending_club.target_column]\n", - "\n", - "x_test = test_df.drop(lending_club.target_column, axis=1)\n", - "y_test = test_df[lending_club.target_column]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Define the XGBoost model\n", - "xgb_model = xgb.XGBClassifier(\n", - " n_estimators=50, \n", - " random_state=42, \n", - " early_stopping_rounds=10\n", - ")\n", - "xgb_model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "\n", - "# Fit the model\n", - "xgb_model.fit(\n", - " x_train, \n", - " y_train,\n", - " eval_set=[(x_test, y_test)],\n", - " verbose=False\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Define the Random Forest model\n", - "rf_model = RandomForestClassifier(\n", - " n_estimators=50, \n", - " random_state=42,\n", - ")\n", - "\n", - "# Fit the model\n", - 
"rf_model.fit(x_train, y_train)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Compute probabilities" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_xgb_prob = xgb_model.predict_proba(x_train)[:, 1]\n", - "test_xgb_prob = xgb_model.predict_proba(x_test)[:, 1]\n", - "\n", - "train_rf_prob = rf_model.predict_proba(x_train)[:, 1]\n", - "test_rf_prob = rf_model.predict_proba(x_test)[:, 1]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Compute binary predictions" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "cut_off_threshold = 0.3\n", - "\n", - "train_xgb_binary_predictions = (train_xgb_prob > cut_off_threshold).astype(int)\n", - "test_xgb_binary_predictions = (test_xgb_prob > cut_off_threshold).astype(int)\n", - "\n", - "train_rf_binary_predictions = (train_rf_prob > cut_off_threshold).astype(int)\n", - "test_rf_binary_predictions = (test_rf_prob > cut_off_threshold).astype(int)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Document the model\n", - "\n", - "To document the model with the ValidMind Library, you'll need to:\n", - "1. Preprocess the raw dataset\n", - "2. Initialize some training and test datasets\n", - "3. Initialize a ValidMind model object for use with testing\n", - "4. 
Run the full suite of tests" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset`: The dataset that you want to provide as input to tests.\n", - "- `input_id`: A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- `target_column`: A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", - "\n", - "With all datasets ready, you can now initialize the raw, processed, training and test datasets (`raw_df`, `preprocessed_df`, `fe_df`, `train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_preprocess_dataset = vm.init_dataset(\n", - " dataset=preprocess_df,\n", - " input_id=\"preprocess_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_fe_dataset = vm.init_dataset(\n", - " dataset=fe_df,\n", - " input_id=\"fe_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " 
input_id=\"test_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Initialize the ValidMind models\n", - "\n", - "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other functions for analysis and tests on the data for our modelS.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_xgb_model = vm.init_model(\n", - " xgb_model,\n", - " input_id=\"xgb_model\",\n", - ")\n", - "\n", - "vm_rf_model = vm.init_model(\n", - " rf_model,\n", - " input_id=\"rf_model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_3__'></a>\n", - "\n", - "### Assign prediction values and probabilities to the datasets\n", - "\n", - "With our model now trained, we'll move on to assigning both the predictive probabilities coming directly from the model's predictions, and the binary prediction after applying the cutoff threshold described in the previous steps. 
\n", - "- These tasks are achieved through the use of the `assign_predictions()` method associated with the VM `dataset` object.\n", - "- This method links the model's class prediction values and probabilities to our VM train and test datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# XGBoost\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_xgb_model,\n", - " prediction_values=train_xgb_binary_predictions,\n", - " prediction_probabilities=train_xgb_prob,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_xgb_model,\n", - " prediction_values=test_xgb_binary_predictions,\n", - " prediction_probabilities=test_xgb_prob,\n", - ")\n", - "\n", - "# Random Forest\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_rf_model,\n", - " prediction_values=train_rf_binary_predictions,\n", - " prediction_probabilities=train_rf_prob,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_rf_model,\n", - " prediction_values=test_rf_binary_predictions,\n", - " prediction_probabilities=test_rf_prob,\n", - ")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_4__'></a>\n", - "\n", - "### Compute credit risk scores\n", - "\n", - "In this phase, we translate model predictions into actionable scores using probability estimates generated by our trained model." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_xgb_scores = lending_club.compute_scores(train_xgb_prob)\n", - "test_xgb_scores = lending_club.compute_scores(test_xgb_prob)\n", - "\n", - "# Assign scores to the datasets\n", - "vm_train_ds.add_extra_column(\"xgb_scores\", train_xgb_scores)\n", - "vm_test_ds.add_extra_column(\"xgb_scores\", test_xgb_scores)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_5__'></a>\n", - "\n", - "### Adding custom context to the LLM descriptions\n", - "\n", - "To enable the LLM descriptions context, you need to set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1`. This will enable the LLM descriptions context, which will be used to provide additional context to the LLM descriptions. This is a global setting that will affect all tests." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", - "\n", - "context = \"\"\"\n", - "FORMAT FOR THE LLM DESCRIPTIONS: \n", - " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", - " extracted from the test description>.\n", - "\n", - " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", - " Include any relevant formulas or methodologies mentioned in the test description.>\n", - "\n", - " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", - " highlighting what makes it particularly useful for specific scenarios.>\n", - "\n", - " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", - " Include both technical limitations and interpretation challenges. 
\n", - " If the test description includes specific signs of high risk, incorporate these here.>\n", - "\n", - " **Key Insights:**\n", - "\n", - " The test results reveal:\n", - "\n", - " - **<insight title>**: <comprehensive description of one aspect of the results>\n", - " - **<insight title>**: <comprehensive description of another aspect>\n", - " ...\n", - "\n", - " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", - " purpose and provides any final recommendations or considerations.>\n", - "\n", - "ADDITIONAL INSTRUCTIONS:\n", - " Present insights in order from general to specific, with each insight as a single bullet point with bold title.\n", - "\n", - " For each metric in the test results, include in the test overview:\n", - " - The metric's purpose and what it measures\n", - " - Its mathematical formula\n", - " - The range of possible values\n", - " - What constitutes good/bad performance\n", - " - How to interpret different values\n", - "\n", - " Each insight should progressively cover:\n", - " 1. Overall scope and distribution\n", - " 2. Complete breakdown of all elements with specific values\n", - " 3. Natural groupings and patterns\n", - " 4. Comparative analysis between datasets/categories\n", - " 5. Stability and variations\n", - " 6. 
Notable relationships or dependencies\n", - "\n", - " Remember:\n", - " - Keep all insights at the same level (no sub-bullets or nested structures)\n", - " - Make each insight complete and self-contained\n", - " - Include specific numerical values and ranges\n", - " - Cover all elements in the results comprehensively\n", - " - Maintain clear, concise language\n", - " - Use only \"- **Title**: Description\" format for insights\n", - " - Progress naturally from general to specific observations\n", - "\n", - "\"\"\".strip()\n", - "\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_6__'></a>\n", - "\n", - "### Raw data" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.DatasetDescription:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.DescriptiveStatistics:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.MissingValues:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - " params={\n", - " \"min_percentage_threshold\": 1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.ClassImbalance:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - " params={\n", - " \"min_percent_threshold\": 10\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - 
"execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.Duplicates:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - " params={\n", - " \"min_percentage_threshold\": 1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.HighCardinality:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - " params={\n", - " \"num_threshold\": 100,\n", - " \"percent_threshold\": 0.1,\n", - " \"threshold_type\": \"percent\"\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.Skewness:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - " params={\n", - " \"max_threshold\": 1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.UniqueRows:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - " params={\n", - " \"min_percent_threshold\": 1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.TooManyZeroValues:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - " params={\n", - " \"max_percent_threshold\": 0.03\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.IQROutliersTable:raw_data\",\n", - " inputs={\n", - " \"dataset\": vm_raw_dataset,\n", - " },\n", - " params={\n", - " \"threshold\": 5\n", - " }\n", - 
").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_7__'></a>\n", - "\n", - "### Pre-processed data" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.DescriptiveStatistics:preprocessed_data\",\n", - " inputs={\n", - " \"dataset\": vm_preprocess_dataset,\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.TabularDescriptionTables:preprocessed_data\",\n", - " inputs={\n", - " \"dataset\": vm_preprocess_dataset\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.MissingValues:preprocessed_data\",\n", - " inputs={\n", - " \"dataset\": vm_preprocess_dataset,\n", - " },\n", - " params={\n", - " \"min_threshold\": 1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.TabularNumericalHistograms:preprocessed_data\",\n", - " inputs={\n", - " \"dataset\": vm_preprocess_dataset\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.TabularCategoricalBarPlots:preprocessed_data\",\n", - " inputs={\n", - " \"dataset\": vm_preprocess_dataset\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.TargetRateBarPlots:preprocessed_data\",\n", - " inputs={\n", - " \"dataset\": vm_preprocess_dataset\n", - " },\n", - " params={\n", - " \"default_column\": 
lending_club.target_column,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_8__'></a>\n", - "\n", - "### Development data" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.DescriptiveStatistics:development_data\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.TabularDescriptionTables:development_data\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.ClassImbalance:development_data\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - " params={\n", - " \"min_percent_threshold\": 10\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.UniqueRows:development_data\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - " params={\n", - " \"min_percent_threshold\": 1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.TabularNumericalHistograms:development_data\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_9__'></a>\n", - "\n", - "### Feature selection" - ] - }, - { - "cell_type": "code", - "execution_count": null, 
- "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.MutualInformation:development_data\",\n", - " input_grid ={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - " params={\n", - " \"min_threshold\": 0.01,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.PearsonCorrelationMatrix:development_data\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.HighPearsonCorrelation:development_data\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - " params={\n", - " \"max_threshold\": 0.3,\n", - " \"top_n_correlations\": 10\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.WOEBinTable\",\n", - " input_grid={\n", - " \"dataset\": [vm_preprocess_dataset]\n", - " },\n", - " params={\n", - " \"breaks_adj\": lending_club.breaks_adj,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.WOEBinPlots\",\n", - " input_grid={\n", - " \"dataset\": [vm_preprocess_dataset]\n", - " },\n", - " params={\n", - " \"breaks_adj\": lending_club.breaks_adj,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_10__'></a>\n", - "\n", - "### Model training" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.DatasetSplit\",\n", - " 
inputs={\n", - " \"datasets\": [vm_train_ds, vm_test_ds],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.ModelMetadata\",\n", - " input_grid={\n", - " \"model\": [vm_xgb_model, vm_rf_model],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.ModelParameters\",\n", - " input_grid={\n", - " \"model\": [vm_xgb_model, vm_rf_model],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_11__'></a>\n", - "\n", - "### Model selection" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.statsmodels.GINITable\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model, vm_rf_model],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model, vm_rf_model],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.TrainingTestDegradation:XGBoost\",\n", - " inputs={\n", - " \"datasets\": [vm_train_ds, vm_test_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"max_threshold\": 0.1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " 
\"validmind.model_validation.sklearn.TrainingTestDegradation:RandomForest\",\n", - " inputs={\n", - " \"datasets\": [vm_train_ds, vm_test_ds],\n", - " \"model\": vm_rf_model,\n", - " },\n", - " params={\n", - " \"max_threshold\": 0.1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.HyperParametersTuning\",\n", - " inputs={\n", - " \"model\": vm_xgb_model,\n", - " \"dataset\": vm_train_ds,\n", - " },\n", - " params={\n", - " \"param_grid\": {'n_estimators': [50, 100]},\n", - " \"scoring\": ['roc_auc', 'recall'],\n", - " \"fit_params\": {'eval_set': [(x_test, y_test)], 'verbose': False},\n", - " \"thresholds\": [0.3, 0.5],\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_12__'></a>\n", - "\n", - "### Class discrimination" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.ROCCurve\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.MinimumROCAUCScore\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - " params={\n", - " \"min_threshold\": 0.5\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - ").log()" - ] - }, - { - 
"cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.statsmodels.CumulativePredictionProbabilities\",\n", - " input_grid={\n", - " \"model\": [vm_xgb_model],\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.PopulationStabilityIndex\",\n", - " inputs={\n", - " \"datasets\": [vm_train_ds, vm_test_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"num_bins\": 10,\n", - " \"mode\": \"fixed\"\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_13__'></a>\n", - "\n", - "### Classification accuracy" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.ClassifierThresholdOptimization\",\n", - " inputs={\n", - " \"dataset\": vm_train_ds,\n", - " \"model\": vm_xgb_model\n", - " },\n", - " params={\n", - " \"target_recall\": 0.8 # Find a threshold that achieves a recall of 80%\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.CalibrationCurve\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.ConfusionMatrix\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, 
- "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - " params={\n", - " \"min_threshold\": 0.7\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.MinimumF1Score\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - " params={\n", - " \"min_threshold\": 0.5\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.PrecisionRecallCurve\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model]\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_14__'></a>\n", - "\n", - "### Model diagnosis" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.WeakspotsDiagnosis\",\n", - " inputs={\n", - " \"datasets\": [vm_train_ds, vm_test_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.OverfitDiagnosis\",\n", - " inputs={\n", - " \"model\": vm_xgb_model,\n", - " \"datasets\": [vm_train_ds, vm_test_ds],\n", - " },\n", - " params={\n", - " \"cut_off_threshold\": 0.04\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " 
\"validmind.model_validation.sklearn.RobustnessDiagnosis\",\n", - " inputs={\n", - " \"datasets\": [vm_train_ds, vm_test_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"scaling_factor_std_dev_list\": [\n", - " 0.1,\n", - " 0.2,\n", - " 0.3,\n", - " 0.4,\n", - " 0.5\n", - " ],\n", - " \"performance_decay_threshold\": 0.05\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_15__'></a>\n", - "\n", - "### Model explainability" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.PermutationFeatureImportance\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " \"model\": [vm_xgb_model]\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.FeaturesAUC\",\n", - " input_grid={\n", - " \"model\": [vm_xgb_model],\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.SHAPGlobalImportance\",\n", - " input_grid={\n", - " \"model\": [vm_xgb_model],\n", - " \"dataset\": [vm_train_ds, vm_test_ds],\n", - " },\n", - " params={\n", - " \"kernel_explainer_samples\": 10,\n", - " \"tree_or_linear_explainer_samples\": 200,\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_16__'></a>\n", - "\n", - "### Scoring evaluation" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.statsmodels.ScorecardHistogram\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, 
vm_test_ds],\n", - " },\n", - " params={\n", - " \"score_column\": \"xgb_scores\",\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.ScoreBandDefaultRates\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - " params = {\n", - " \"score_column\": \"xgb_scores\",\n", - " \"score_bands\": [500, 540, 570]\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.ScoreProbabilityAlignment\",\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds],\n", - " \"model\": [vm_xgb_model],\n", - " },\n", - " params={\n", - " \"score_column\": \"xgb_scores\",\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Custom tests\n", - "\n", - "Custom tests extend the functionality of ValidMind, allowing you to document any model or use case with added flexibility.\n", - "\n", - "ValidMind provides a comprehensive set of tests out-of-the-box to evaluate and document your models and datasets. We recognize there will be cases where the default tests do not support a model or dataset, or specific documentation is needed. In these cases, you can create and use your own custom code to accomplish what you need. To streamline custom code integration, we support the creation of custom test functions." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### In-line custom tests\n", - "\n", - "The `@vm.test` decorator wraps the function so that the ValidMind Library can run it. It also registers the test so it can be found by the ID `my_custom_tests.ScoreToOdds`. 
The function `score_to_odds_analysis` takes three arguments: `dataset`, `score_column`, and `score_bands`. The first is a `VMDataset`; the other two are parameters that can be passed in." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import pandas as pd\n", - "import plotly.graph_objects as go\n", - "\n", - "\n", - "@vm.test(\"my_custom_tests.ScoreToOdds\")\n", - "def score_to_odds_analysis(dataset, score_column='score', score_bands=[410, 440, 470]):\n", - " \"\"\"\n", - " Analyzes the relationship between score bands and odds (good:bad ratio).\n", - " Good odds = (1 - default_rate) / default_rate\n", - " \n", - " Higher scores should correspond to higher odds of being good.\n", - " \"\"\"\n", - " df = dataset.df\n", - " \n", - " # Create score bands\n", - " df['score_band'] = pd.cut(\n", - " df[score_column],\n", - " bins=[-np.inf] + score_bands + [np.inf],\n", - " labels=[f'<{score_bands[0]}'] + \n", - " [f'{score_bands[i]}-{score_bands[i+1]}' for i in range(len(score_bands)-1)] +\n", - " [f'>{score_bands[-1]}']\n", - " )\n", - " \n", - " # Calculate metrics per band\n", - " results = df.groupby('score_band').agg({\n", - " dataset.target_column: ['mean', 'count']\n", - " })\n", - " \n", - " results.columns = ['Default Rate', 'Total']\n", - " results['Good Count'] = results['Total'] - (results['Default Rate'] * results['Total'])\n", - " results['Bad Count'] = results['Default Rate'] * results['Total']\n", - " results['Odds'] = results['Good Count'] / results['Bad Count']\n", - " \n", - " # Create visualization\n", - " fig = go.Figure()\n", - " \n", - " # Add odds bars\n", - " fig.add_trace(go.Bar(\n", - " name='Odds (Good:Bad)',\n", - " x=results.index,\n", - " y=results['Odds'],\n", - " marker_color='blue'\n", - " ))\n", - " \n", - " fig.update_layout(\n", - " title='Score-to-Odds Analysis',\n", - " yaxis=dict(title='Odds Ratio (Good:Bad)'),\n", - " showlegend=False\n", - " 
)\n", - " \n", - " return fig" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"my_custom_tests.ScoreToOdds\",\n", - " inputs={\n", - " \"dataset\": vm_test_ds,\n", - " },\n", - " params={\n", - " \"score_column\": \"xgb_scores\",\n", - " \"score_bands\": [500, 540, 570],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Local test provider\n", - "\n", - "The ValidMind Library offers the ability to extend the built-in library of tests with custom tests. A test \"Provider\" is a Python class that gets registered with the ValidMind Library and loads tests based on a test ID, for example `my_test_provider.my_test_id`. The built-in suite of tests that ValidMind offers is technically its own test provider. You can use the built-in test provider offered by ValidMind (`validmind.tests.test_providers.LocalTestProvider`) or create your own. More than likely, you'll want to use `LocalTestProvider` to add a directory of custom tests, but you have the flexibility to load tests from any source." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.tests import LocalTestProvider\n", - "\n", - "# Define the folder where your tests are located\n", - "tests_folder = \"custom_tests\"\n", - "\n", - "# Initialize the test provider with the tests folder we created earlier\n", - "my_test_provider = LocalTestProvider(tests_folder)\n", - "\n", - "vm.tests.register_test_provider(\n", - " namespace=\"my_test_provider\",\n", - " test_provider=my_test_provider,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now that we have our test provider set up, we can run any test that's located in our tests folder by using the `run_test()` method. 
This function is your entry point to running single tests in the ValidMind Library. It takes a test ID and runs the test associated with that ID. For our custom tests, the test ID will be the `namespace` specified when registering the provider, followed by the path to the test file relative to the tests folder. For example, the Confusion Matrix test we created earlier will have the test ID `my_test_provider.ConfusionMatrix`. You could organize the tests in subfolders, say `classification` and `regression`, and the test ID for the Confusion Matrix test would then be `my_test_provider.classification.ConfusionMatrix`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"my_test_provider.ScoreBandDiscriminationMetrics\",\n", - " inputs={\n", - " \"dataset\": vm_test_ds,\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"score_column\": \"xgb_scores\",\n", - " \"score_bands\": [500, 540, 570],\n", - " }\n", - ").log(section_id=\"interpretability_insights\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc7_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "3. Expand the following sections and take a look around:\n", - "\n", - " - **2. Data Preparation**\n", - " - **3. 
Model Development**\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation (hint: some of the tests in **2.3. Feature Selection and Engineering** look like they need some attention), view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc7_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-245d3f2bfcad480aa6baa2bde87c76e6", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-eEL8LtKG-py3.10", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document an application scorecard model\n", + "\n", + "Build and document an *application scorecard model* with the ValidMind Library by using Kaggle's [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) sample dataset to build a simple application scorecard.\n", + "\n", + "An application scorecard model is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant — such as credit history, income, employment status, and other relevant financial data. \n", + "\n", + "- This score helps lenders make decisions about whether to approve or reject loan applications, as well as determine the terms of the loan, including interest rates and credit limits. \n", + "- Application scorecard models enable lenders to manage risk efficiently while making the loan application process faster and more transparent for applicants.\n", + "\n", + "This interactive notebook provides a step-by-step guide for loading a demo dataset, preprocessing the raw data, training a model for testing, setting up test inputs, initializing the required ValidMind objects, running the test, and then logging the results to ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + " - [Preview the documentation template](#toc2_4__) \n", + "- [Load the sample dataset](#toc3__) \n", + " - [Preprocess the dataset](#toc3_1__) \n", + " - [Feature engineering](#toc3_2__) \n", + "- [Train the model](#toc4__) \n", + " - [Compute probabilities](#toc4_1__) \n", + " - [Compute binary predictions](#toc4_2__) \n", + "- [Document the model](#toc5__) \n", + " - [Initialize the ValidMind datasets](#toc5_1__) \n", + " - [Initialize the ValidMind models](#toc5_2__) \n", + " - [Assign prediction values and probabilities to the datasets](#toc5_3__) \n", + " - [Compute credit risk scores](#toc5_4__) \n", + " - [Adding custom context to the LLM descriptions](#toc5_5__) \n", + " - [Raw data](#toc5_6__) \n", + " - [Pre-processed data](#toc5_7__) \n", + " - [Development data](#toc5_8__) \n", + " - [Feature selection](#toc5_9__) \n", + " - [Model training](#toc5_10__) \n", + " - [Model selection](#toc5_11__) \n", + " - [Class discrimination](#toc5_12__) \n", + " - [Classification accuracy](#toc5_13__) \n", + " - [Model diagnosis](#toc5_14__) \n", + " - [Model explainability](#toc5_15__) \n", + " - [Scoring evaluation](#toc5_16__) \n", + "- [Custom tests](#toc6__) \n", + " - [In-line custom tests](#toc6_1__) \n", + " - [Local test provider](#toc6_2__) \n", + "- [Next steps](#toc7__) \n", + " - [Work with your 
documentation](#toc7_1__) \n", + " - [Discover more learning resources](#toc7_2__) \n", + "- [Upgrade ValidMind](#toc8__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. 
Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host = \"...\",\n", + " # api_key = \"...\",\n", + " # api_secret = \"...\",\n", + " # model = \"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "from sklearn.ensemble import RandomForestClassifier\n", + "\n", + "from validmind.tests import run_test\n", + "from validmind.datasets.credit_risk import lending_club\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind Library. To use it, import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "df = lending_club.load_data(source=\"offline\")\n", + "\n", + "df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Preprocess the dataset\n", + "\n", + "In the preprocessing step, we perform a number of operations to prepare the data for building our application scorecard. \n", + "\n", + "We use the `lending_club.preprocess()` function to simplify preprocessing. 
This function performs the following operations: \n", + "- Filters the dataset to include only loans for debt consolidation or credit card purposes\n", + "- Removes loans classified under the riskier grades \"F\" and \"G\"\n", + "- Excludes uncommon home ownership types and standardizes employment length and loan terms into numerical formats\n", + "- Discards unnecessary fields and any entries with missing information to maintain a clean and robust dataset for modeling" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "preprocess_df = lending_club.preprocess(df)\n", + "preprocess_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Feature engineering\n", + "\n", + "In the feature engineering phase, we apply specific transformations to optimize the dataset for predictive modeling in our application scorecard. \n", + "\n", + "Using the `lending_club.feature_engineering()` function, we conduct the following operations:\n", + "- **WoE encoding**: Converts both numerical and categorical features into Weight of Evidence (WoE) values. WoE is a statistical measure used in scorecard modeling that quantifies the relationship between a predictor variable and the binary target variable. It is calculated as the natural logarithm of the ratio of the distribution of good outcomes to the distribution of bad outcomes for each category or bin of a feature. This transformation helps to ensure that the features are predictive and consistent in their contribution to the model.\n", + "- **Integration of WoE bins**: Ensures that the WoE transformed values are integrated throughout the dataset, replacing the original feature values while excluding the target variable from this transformation. This transformation is used to maintain a consistent scale and impact of each variable within the model, which helps make the predictions more stable and accurate." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "fe_df = lending_club.feature_engineering(preprocess_df)\n", + "fe_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Train the model\n", + "\n", + "In this section, we focus on constructing and refining our predictive model. \n", + "- We begin by dividing our data, which is based on Weight of Evidence (WoE) features, into training and testing sets (`train_df`, `test_df`). \n", + "- With `lending_club.split`, we employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Split the data\n", + "train_df, test_df = lending_club.split(fe_df, test_size=0.2)\n", + "\n", + "x_train = train_df.drop(lending_club.target_column, axis=1)\n", + "y_train = train_df[lending_club.target_column]\n", + "\n", + "x_test = test_df.drop(lending_club.target_column, axis=1)\n", + "y_test = test_df[lending_club.target_column]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Define the XGBoost model\n", + "xgb_model = xgb.XGBClassifier(\n", + " n_estimators=50, \n", + " random_state=42, \n", + " early_stopping_rounds=10\n", + ")\n", + "xgb_model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "\n", + "# Fit the model\n", + "xgb_model.fit(\n", + " x_train, \n", + " y_train,\n", + " eval_set=[(x_test, y_test)],\n", + " verbose=False\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Define the Random Forest model\n", + "rf_model = RandomForestClassifier(\n", + " n_estimators=50, \n", + " random_state=42,\n", + ")\n", + "\n", + "# Fit the model\n", + "rf_model.fit(x_train, y_train)" + ], + "execution_count": 
null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Compute probabilities" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_xgb_prob = xgb_model.predict_proba(x_train)[:, 1]\n", + "test_xgb_prob = xgb_model.predict_proba(x_test)[:, 1]\n", + "\n", + "train_rf_prob = rf_model.predict_proba(x_train)[:, 1]\n", + "test_rf_prob = rf_model.predict_proba(x_test)[:, 1]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Compute binary predictions" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "cut_off_threshold = 0.3\n", + "\n", + "train_xgb_binary_predictions = (train_xgb_prob > cut_off_threshold).astype(int)\n", + "test_xgb_binary_predictions = (test_xgb_prob > cut_off_threshold).astype(int)\n", + "\n", + "train_rf_binary_predictions = (train_rf_prob > cut_off_threshold).astype(int)\n", + "test_rf_binary_predictions = (test_rf_prob > cut_off_threshold).astype(int)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Document the model\n", + "\n", + "To document the model with the ValidMind Library, you'll need to:\n", + "1. Preprocess the raw dataset\n", + "2. Initialize some training and test datasets\n", + "3. Initialize a ValidMind model object for use with testing\n", + "4. 
Run the full suite of tests" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset`: The dataset that you want to provide as input to tests.\n", + "- `input_id`: A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- `target_column`: A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", + "\n", + "With all datasets ready, you can now initialize the raw, preprocessed, feature-engineered, training, and test datasets (`df`, `preprocess_df`, `fe_df`, `train_df`, and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_preprocess_dataset = vm.init_dataset(\n", + " dataset=preprocess_df,\n", + " input_id=\"preprocess_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_fe_dataset = vm.init_dataset(\n", + " dataset=fe_df,\n", + " input_id=\"fe_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " 
target_column=lending_club.target_column,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Initialize the ValidMind models\n", + "\n", + "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other functions for analysis and tests on the data for our models.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_xgb_model = vm.init_model(\n", + " xgb_model,\n", + " input_id=\"xgb_model\",\n", + ")\n", + "\n", + "vm_rf_model = vm.init_model(\n", + " rf_model,\n", + " input_id=\"rf_model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_3__'></a>\n", + "\n", + "### Assign prediction values and probabilities to the datasets\n", + "\n", + "With our models now trained, we'll assign both the predicted probabilities that come directly from each model and the binary predictions obtained by applying the cutoff threshold defined in the previous steps. 
\n", + "- These tasks are achieved through the use of the `assign_predictions()` method associated with the VM `dataset` object.\n", + "- This method links the model's class prediction values and probabilities to our VM train and test datasets." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# XGBoost\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_xgb_model,\n", + " prediction_values=train_xgb_binary_predictions,\n", + " prediction_probabilities=train_xgb_prob,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_xgb_model,\n", + " prediction_values=test_xgb_binary_predictions,\n", + " prediction_probabilities=test_xgb_prob,\n", + ")\n", + "\n", + "# Random Forest\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_rf_model,\n", + " prediction_values=train_rf_binary_predictions,\n", + " prediction_probabilities=train_rf_prob,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_rf_model,\n", + " prediction_values=test_rf_binary_predictions,\n", + " prediction_probabilities=test_rf_prob,\n", + ")\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_4__'></a>\n", + "\n", + "### Compute credit risk scores\n", + "\n", + "In this phase, we translate model predictions into actionable scores using probability estimates generated by our trained model." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_xgb_scores = lending_club.compute_scores(train_xgb_prob)\n", + "test_xgb_scores = lending_club.compute_scores(test_xgb_prob)\n", + "\n", + "# Assign scores to the datasets\n", + "vm_train_ds.add_extra_column(\"xgb_scores\", train_xgb_scores)\n", + "vm_test_ds.add_extra_column(\"xgb_scores\", test_xgb_scores)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_5__'></a>\n", + "\n", + "### Adding custom context to the LLM descriptions\n", + "\n", + "To add custom context to the LLM-generated test descriptions, set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1` and store your instructions in the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT` environment variable. This is a global setting that affects all tests." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import os\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", + "\n", + "context = \"\"\"\n", + "FORMAT FOR THE LLM DESCRIPTIONS: \n", + " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", + " extracted from the test description>.\n", + "\n", + " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", + " Include any relevant formulas or methodologies mentioned in the test description.>\n", + "\n", + " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", + " highlighting what makes it particularly useful for specific scenarios.>\n", + "\n", + " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", + " Include both technical limitations and interpretation challenges. 
\n", + " If the test description includes specific signs of high risk, incorporate these here.>\n", + "\n", + " **Key Insights:**\n", + "\n", + " The test results reveal:\n", + "\n", + " - **<insight title>**: <comprehensive description of one aspect of the results>\n", + " - **<insight title>**: <comprehensive description of another aspect>\n", + " ...\n", + "\n", + " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", + " purpose and provides any final recommendations or considerations.>\n", + "\n", + "ADDITIONAL INSTRUCTIONS:\n", + " Present insights in order from general to specific, with each insight as a single bullet point with bold title.\n", + "\n", + " For each metric in the test results, include in the test overview:\n", + " - The metric's purpose and what it measures\n", + " - Its mathematical formula\n", + " - The range of possible values\n", + " - What constitutes good/bad performance\n", + " - How to interpret different values\n", + "\n", + " Each insight should progressively cover:\n", + " 1. Overall scope and distribution\n", + " 2. Complete breakdown of all elements with specific values\n", + " 3. Natural groupings and patterns\n", + " 4. Comparative analysis between datasets/categories\n", + " 5. Stability and variations\n", + " 6. 
Notable relationships or dependencies\n", + "\n", + " Remember:\n", + " - Keep all insights at the same level (no sub-bullets or nested structures)\n", + " - Make each insight complete and self-contained\n", + " - Include specific numerical values and ranges\n", + " - Cover all elements in the results comprehensively\n", + " - Maintain clear, concise language\n", + " - Use only \"- **Title**: Description\" format for insights\n", + " - Progress naturally from general to specific observations\n", + "\n", + "\"\"\".strip()\n", + "\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_6__'></a>\n", + "\n", + "### Raw data" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.DatasetDescription:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.DescriptiveStatistics:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.MissingValues:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + " params={\n", + " \"min_percentage_threshold\": 1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.ClassImbalance:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + " params={\n", + " \"min_percent_threshold\": 10\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, 
+ { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.Duplicates:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + " params={\n", + " \"min_percentage_threshold\": 1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.HighCardinality:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + " params={\n", + " \"num_threshold\": 100,\n", + " \"percent_threshold\": 0.1,\n", + " \"threshold_type\": \"percent\"\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.Skewness:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + " params={\n", + " \"max_threshold\": 1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.UniqueRows:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + " params={\n", + " \"min_percent_threshold\": 1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.TooManyZeroValues:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + " params={\n", + " \"max_percent_threshold\": 0.03\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.IQROutliersTable:raw_data\",\n", + " inputs={\n", + " \"dataset\": vm_raw_dataset,\n", + " },\n", + " params={\n", + " \"threshold\": 5\n", + " }\n", + ").log()" + ], 
+ "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_7__'></a>\n", + "\n", + "### Pre-processed data" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.DescriptiveStatistics:preprocessed_data\",\n", + " inputs={\n", + " \"dataset\": vm_preprocess_dataset,\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.TabularDescriptionTables:preprocessed_data\",\n", + " inputs={\n", + " \"dataset\": vm_preprocess_dataset\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.MissingValues:preprocessed_data\",\n", + " inputs={\n", + " \"dataset\": vm_preprocess_dataset,\n", + " },\n", + " params={\n", + " \"min_threshold\": 1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.TabularNumericalHistograms:preprocessed_data\",\n", + " inputs={\n", + " \"dataset\": vm_preprocess_dataset\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.TabularCategoricalBarPlots:preprocessed_data\",\n", + " inputs={\n", + " \"dataset\": vm_preprocess_dataset\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.TargetRateBarPlots:preprocessed_data\",\n", + " inputs={\n", + " \"dataset\": vm_preprocess_dataset\n", + " },\n", + " params={\n", + " \"default_column\": lending_club.target_column,\n", + 
" },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_8__'></a>\n", + "\n", + "### Development data" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.DescriptiveStatistics:development_data\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.TabularDescriptionTables:development_data\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.ClassImbalance:development_data\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + " params={\n", + " \"min_percent_threshold\": 10\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.UniqueRows:development_data\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + " params={\n", + " \"min_percent_threshold\": 1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.TabularNumericalHistograms:development_data\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_9__'></a>\n", + "\n", + "### Feature selection" + ] + }, + { + "cell_type": "code", + "metadata": 
{}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.MutualInformation:development_data\",\n", + " input_grid ={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + " params={\n", + " \"min_threshold\": 0.01,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.PearsonCorrelationMatrix:development_data\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.HighPearsonCorrelation:development_data\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + " params={\n", + " \"max_threshold\": 0.3,\n", + " \"top_n_correlations\": 10\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.WOEBinTable\",\n", + " input_grid={\n", + " \"dataset\": [vm_preprocess_dataset]\n", + " },\n", + " params={\n", + " \"breaks_adj\": lending_club.breaks_adj,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.WOEBinPlots\",\n", + " input_grid={\n", + " \"dataset\": [vm_preprocess_dataset]\n", + " },\n", + " params={\n", + " \"breaks_adj\": lending_club.breaks_adj,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_10__'></a>\n", + "\n", + "### Model training" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.DatasetSplit\",\n", + " inputs={\n", + " \"datasets\": 
[vm_train_ds, vm_test_ds],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.ModelMetadata\",\n", + " input_grid={\n", + " \"model\": [vm_xgb_model, vm_rf_model],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.ModelParameters\",\n", + " input_grid={\n", + " \"model\": [vm_xgb_model, vm_rf_model],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_11__'></a>\n", + "\n", + "### Model selection" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.statsmodels.GINITable\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model, vm_rf_model],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model, vm_rf_model],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.TrainingTestDegradation:XGBoost\",\n", + " inputs={\n", + " \"datasets\": [vm_train_ds, vm_test_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"max_threshold\": 0.1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " 
\"validmind.model_validation.sklearn.TrainingTestDegradation:RandomForest\",\n", + " inputs={\n", + " \"datasets\": [vm_train_ds, vm_test_ds],\n", + " \"model\": vm_rf_model,\n", + " },\n", + " params={\n", + " \"max_threshold\": 0.1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.HyperParametersTuning\",\n", + " inputs={\n", + " \"model\": vm_xgb_model,\n", + " \"dataset\": vm_train_ds,\n", + " },\n", + " params={\n", + " \"param_grid\": {'n_estimators': [50, 100]},\n", + " \"scoring\": ['roc_auc', 'recall'],\n", + " \"fit_params\": {'eval_set': [(x_test, y_test)], 'verbose': False},\n", + " \"thresholds\": [0.3, 0.5],\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_12__'></a>\n", + "\n", + "### Class discrimination" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.ROCCurve\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.MinimumROCAUCScore\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + " params={\n", + " \"min_threshold\": 0.5\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + ").log()" + ], + 
"execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.statsmodels.CumulativePredictionProbabilities\",\n", + " input_grid={\n", + " \"model\": [vm_xgb_model],\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.PopulationStabilityIndex\",\n", + " inputs={\n", + " \"datasets\": [vm_train_ds, vm_test_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"num_bins\": 10,\n", + " \"mode\": \"fixed\"\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_13__'></a>\n", + "\n", + "### Classification accuracy" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.ClassifierThresholdOptimization\",\n", + " inputs={\n", + " \"dataset\": vm_train_ds,\n", + " \"model\": vm_xgb_model\n", + " },\n", + " params={\n", + " \"target_recall\": 0.8 # Find a threshold that achieves a recall of 80%\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.CalibrationCurve\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.ConfusionMatrix\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + 
{ + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + " params={\n", + " \"min_threshold\": 0.7\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.MinimumF1Score\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + " params={\n", + " \"min_threshold\": 0.5\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.PrecisionRecallCurve\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model]\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_14__'></a>\n", + "\n", + "### Model diagnosis" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.WeakspotsDiagnosis\",\n", + " inputs={\n", + " \"datasets\": [vm_train_ds, vm_test_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.OverfitDiagnosis\",\n", + " inputs={\n", + " \"model\": vm_xgb_model,\n", + " \"datasets\": [vm_train_ds, vm_test_ds],\n", + " },\n", + " params={\n", + " \"cut_off_threshold\": 0.04\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " 
\"validmind.model_validation.sklearn.RobustnessDiagnosis\",\n", + " inputs={\n", + " \"datasets\": [vm_train_ds, vm_test_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"scaling_factor_std_dev_list\": [\n", + " 0.1,\n", + " 0.2,\n", + " 0.3,\n", + " 0.4,\n", + " 0.5\n", + " ],\n", + " \"performance_decay_threshold\": 0.05\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_15__'></a>\n", + "\n", + "### Model explainability" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.PermutationFeatureImportance\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " \"model\": [vm_xgb_model]\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.FeaturesAUC\",\n", + " input_grid={\n", + " \"model\": [vm_xgb_model],\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.SHAPGlobalImportance\",\n", + " input_grid={\n", + " \"model\": [vm_xgb_model],\n", + " \"dataset\": [vm_train_ds, vm_test_ds],\n", + " },\n", + " params={\n", + " \"kernel_explainer_samples\": 10,\n", + " \"tree_or_linear_explainer_samples\": 200,\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_16__'></a>\n", + "\n", + "### Scoring evaluation" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.statsmodels.ScorecardHistogram\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, 
vm_test_ds],\n", + " },\n", + " params={\n", + " \"score_column\": \"xgb_scores\",\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.ScoreBandDefaultRates\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + " params = {\n", + " \"score_column\": \"xgb_scores\",\n", + " \"score_bands\": [500, 540, 570]\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.ScoreProbabilityAlignment\",\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds],\n", + " \"model\": [vm_xgb_model],\n", + " },\n", + " params={\n", + " \"score_column\": \"xgb_scores\",\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Custom tests\n", + "\n", + "Custom tests extend the functionality of ValidMind, allowing you to document any model or use case with added flexibility.\n", + "\n", + "ValidMind provides a comprehensive set of tests out-of-the-box to evaluate and document your models and datasets. We recognize there will be cases where the default tests do not support a model or dataset, or specific documentation is needed. In these cases, you can create and use your own custom code to accomplish what you need. To streamline custom code integration, we support the creation of custom test functions." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### In-line custom tests\n", + "\n", + "The `@vm.test` decorator is doing the work of creating a wrapper around the function that will allow it to be run by the ValidMind Library. 
It also registers the test so it can be found by the ID `my_custom_tests.ScoreToOdds`. The function `score_to_odds_analysis` takes three arguments: `dataset`, `score_column`, and `score_bands`. The first is a `VMDataset`; the other two are parameters that can be passed in." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "import plotly.graph_objects as go\n", + "\n", + "\n", + "@vm.test(\"my_custom_tests.ScoreToOdds\")\n", + "def score_to_odds_analysis(dataset, score_column='score', score_bands=[410, 440, 470]):\n", + " \"\"\"\n", + " Analyzes the relationship between score bands and odds (good:bad ratio).\n", + " Good odds = (1 - default_rate) / default_rate\n", + " \n", + " Higher scores should correspond to higher odds of being good.\n", + " \"\"\"\n", + " df = dataset.df\n", + " \n", + " # Create score bands\n", + " df['score_band'] = pd.cut(\n", + " df[score_column],\n", + " bins=[-np.inf] + score_bands + [np.inf],\n", + " labels=[f'<{score_bands[0]}'] + \n", + " [f'{score_bands[i]}-{score_bands[i+1]}' for i in range(len(score_bands)-1)] +\n", + " [f'>{score_bands[-1]}']\n", + " )\n", + " \n", + " # Calculate metrics per band\n", + " results = df.groupby('score_band').agg({\n", + " dataset.target_column: ['mean', 'count']\n", + " })\n", + " \n", + " results.columns = ['Default Rate', 'Total']\n", + " results['Good Count'] = results['Total'] - (results['Default Rate'] * results['Total'])\n", + " results['Bad Count'] = results['Default Rate'] * results['Total']\n", + " results['Odds'] = results['Good Count'] / results['Bad Count']\n", + " \n", + " # Create visualization\n", + " fig = go.Figure()\n", + " \n", + " # Add odds bars\n", + " fig.add_trace(go.Bar(\n", + " name='Odds (Good:Bad)',\n", + " x=results.index,\n", + " y=results['Odds'],\n", + " marker_color='blue'\n", + " ))\n", + " \n", + " fig.update_layout(\n", + " title='Score-to-Odds Analysis',\n", + " yaxis=dict(title='Odds Ratio 
(Good:Bad)'),\n", + " showlegend=False\n", + " )\n", + " \n", + " return fig" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"my_custom_tests.ScoreToOdds\",\n", + " inputs={\n", + " \"dataset\": vm_test_ds,\n", + " },\n", + " params={\n", + " \"score_column\": \"xgb_scores\",\n", + " \"score_bands\": [500, 540, 570],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Local test provider\n", + "\n", + "The ValidMind Library offers the ability to extend the built-in library of tests with custom tests. A test \"Provider\" is a Python class that gets registered with the ValidMind Library and loads tests based on a test ID, for example `my_test_provider.my_test_id`. The built-in suite of tests that ValidMind offers is technically its own test provider. You can use the built-in test provider offered by ValidMind (`validmind.tests.test_providers.LocalTestProvider`) or create your own. More than likely, you'll want to use the `LocalTestProvider` to add a directory of custom tests, but you have the flexibility to load tests from any source."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.tests import LocalTestProvider\n", + "\n", + "# Define the folder where your tests are located\n", + "tests_folder = \"custom_tests\"\n", + "\n", + "# initialize the test provider with the tests folder we created earlier\n", + "my_test_provider = LocalTestProvider(tests_folder)\n", + "\n", + "vm.tests.register_test_provider(\n", + " namespace=\"my_test_provider\",\n", + " test_provider=my_test_provider,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now that we have our test provider set up, we can run any test that's located in our tests folder by using the `run_test()` method. This function is your entry point to running single tests in the ValidMind Library. It takes a test ID and runs the test associated with that ID. For our custom tests, the test ID will be the `namespace` specified when registering the provider, followed by the path to the test file relative to the tests folder. For example, the Confusion Matrix test we created earlier will have the test ID `my_test_provider.ConfusionMatrix`. You could organize the tests in subfolders, say `classification` and `regression`, and the test ID for the Confusion Matrix test would then be `my_test_provider.classification.ConfusionMatrix`." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"my_test_provider.ScoreBandDiscriminationMetrics\",\n", + " inputs={\n", + " \"dataset\": vm_test_ds,\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"score_column\": \"xgb_scores\",\n", + " \"score_bands\": [500, 540, 570],\n", + " }\n", + ").log(section_id=\"interpretability_insights\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc7_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "3. Expand the following sections and take a look around:\n", + "\n", + " - **2. Data Preparation**\n", + " - **3. Model Development**\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation (hint: some of the tests in **2.3. Feature Selection and Engineering** look like they need some attention), view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc7_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after upgrading the 
package for the changes to take effect." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-245d3f2bfcad480aa6baa2bde87c76e6" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-eEL8LtKG-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/site/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb b/site/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb index 8949b55a5e..fa8e86113a 100644 --- a/site/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb +++ b/site/notebooks/use_cases/credit_risk/document_excel_application_scorecard.ipynb @@ -1,1016 +1,1022 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Document an Excel-based application scorecard model\n", - "\n", - "Build and document an Excel-based application scorecard model with the ValidMind Library. 
Learn how to load an Excel-based model, prepare your datasets and model for testing, run tests and log those test results to the ValidMind Platform.\n", - "\n", - "An *application scorecard model* is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant such as credit history, income, employment status, and other relevant financial data.\n", - "\n", - " - This score assists lenders in making informed decisions about whether to approve or reject loan applications, as well as in determining the terms of the loan, including interest rates and credit limits.\n", - " - Effective validation of application scorecard models ensures that lenders can manage risk efficiently while maintaining a fast and transparent loan application process for applicants." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - " - [Preview the documentation template](#toc2_4__) \n", - "- [Loading the sample datasets](#toc3__) \n", - " - [Load the raw dataset](#toc3_1__) \n", - " - [Load the preprocessed dataset](#toc3_2__) \n", - " - [Load the training and test datasets](#toc3_3__) \n", - "- [Initialize the ValidMind datasets](#toc4__) \n", - "- [Initialize the ValidMind model](#toc5__) \n", - " - [Link predictions](#toc5_1__) \n", - "- [Running tests](#toc6__) \n", - " - [Enable 
custom context for test descriptions](#toc6_1__) \n", - " - [Define tests to run](#toc6_2__) \n", - " - [Run defined tests](#toc6_3__) \n", - "- [Next steps](#toc7__) \n", - " - [Work with your documentation](#toc7_1__) \n", - " - [Add individual test results to documentation](#toc7_1_1__) \n", - " - [Discover more learning resources](#toc7_2__) \n", - "- [Upgrade ValidMind](#toc8__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. 
For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. 
Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - "- **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - "- **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - "- **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - "- **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: The [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2); border-radius: 5px;\">\n", - " <span style=\"color: #083E44;\"><b>Recommended Python versions</b></span><br />\n", - " Python 3.8 ≤ x ≤ 3.11\n", - "</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. 
In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Then, let's import the necessary libraries and set up your Python environment for data analysis:\n", - "\n", - "- Install **OpenPyPL** (openpyxl) which will allow us to read and write `.xlsx` files.\n", - "- Import `pandas`, a Python library for data manipulation and analytics, as an alias.\n", - "- Enable `matplotlib`, a plotting library used for visualizing data. Ensures that any plots you generate will render inline in our notebook output rather than opening in a separate window." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install openpyxl\n", - "\n", - "import pandas as pd\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Loading the sample datasets\n", - "\n", - "Let's import our sample dataset in the form of an Excel workbook ([CreditRiskData.xlsx](CreditRiskData.xlsx)) with five sheets indexed 0 to 3, each representing a different stage of data preparation:\n", - "\n", - "0. **Raw Data** – The original unprocessed dataset.\n", - "1. **Preprocessed Data** – A cleaned and prepared version of the raw data.\n", - "2. **Train Data** – A training subset used to fit your model.\n", - "3. **Test Data** – A testing subset used to evaluate model performance." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Load the raw dataset\n", - "\n", - "We'll start by loading the **Raw Data** sheet (index `0`) into a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "df = pd.read_excel('CreditRiskData.xlsx', sheet_name=0,engine='openpyxl')\n", - "\n", - "df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Load the preprocessed dataset\n", - "\n", - "Next, load the **Preprocessed Data** sheet (index `1`), containing cleaned inputs ready for scoring:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "preprocess_df = pd.read_excel('CreditRiskData.xlsx', sheet_name=1,engine='openpyxl')\n", - "preprocess_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_3__'></a>\n", - "\n", - "### Load the training and test datasets\n", - "\n", - "Finally, load the split training (**Train Data**, index `2`) and testing (**Test Data**, index `3`) sets:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_df = pd.read_excel('CreditRiskData.xlsx', sheet_name=2,engine='openpyxl')\n", - "test_df = pd.read_excel('CreditRiskData.xlsx', sheet_name=3,engine='openpyxl')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests with your loaded datasets, you must first initialize a ValidMind `Dataset` object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module. 
**This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", - "\n", - "For this example, we'll pass in the following arguments:\n", - "\n", - "- **`dataset`:** The input DataFrame to test.\n", - "- **`input_id`:** A unique identifier for tracking test inputs.\n", - "- **`target_column`:** Required for tests that compare predictions to actual outcomes; specify the name of the column with the true values." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the raw dataset\n", - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=df,\n", - " input_id=\"raw_dataset\",\n", - " target_column='loan_status',\n", - ")\n", - "\n", - "# Initialize the preprocessed dataset\n", - "vm_preprocess_dataset = vm.init_dataset(\n", - " dataset=preprocess_df,\n", - " input_id=\"preprocess_dataset\",\n", - " target_column='loan_status',\n", - ")\n", - "\n", - "# Initialize the training dataset\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column='loan_status',\n", - ")\n", - "\n", - "# Initialize the testing dataset\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"test_dataset\",\n", - " target_column='loan_status',\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Initialize the ValidMind model\n", - "\n", - "In this Excel-based use case, predictions are precomputed and included in the Excel file. 
While there's no model logic to run, a ValidMind model object (`vm_model`) is still required for passing to other functions for analysis and tests on the data.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Prediction logic placeholder\n", - "def dummy(X, **kwargs):\n", - " return None\n", - "\n", - "xgb_model = vm.init_model(\n", - " input_id=\"xgb_model\",\n", - " predict_fn=dummy\n", - " )" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Link predictions\n", - "\n", - "Once the model has been registered, you can assign model predictions to the training and testing datasets.\n", - "\n", - "Use the [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object to link the prediction values and probabilities from the relevant columns on our Excel spreadsheet to the training and testing datasets:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(model=xgb_model, prediction_column=\"xgb_model_prediction\",probability_column='xgb_model_probabilities')\n", - "vm_test_ds.assign_predictions(model=xgb_model, 
prediction_column=\"xgb_model_prediction\",probability_column='xgb_model_probabilities')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Running tests\n", - "\n", - "This is where it all comes together — we'll use our previously initialized datasets as inputs to run tests, then log the results to the ValidMind Platform.\n", - "\n", - "We'll run some tests that are defined out-of-the-box by the template we previewed earlier in this notebook, as well as some additional tests for more evidence. For the example in this section, we've selected and defined the tests for you.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about navigating ValidMind tests?</b></span>\n", - "<br></br>\n", - "Refer to our notebook outlining the utilities available for viewing and understanding available ValidMind tests: <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" style=\"color: #DE257E;\"><b>Explore tests</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Enable custom context for test descriptions\n", - "\n", - "When you run ValidMind tests, test descriptions are automatically generated with LLM using the test results, the test name, and the static test definitions provided in the test’s docstring. 
While this metadata offers valuable high-level overviews of tests, insights produced by the LLM-based descriptions may not always align with your specific use cases or incorporate organizational policy requirements.\n", - "\n", - "Before we run our tests, we'll include some custom use case context to improve the clarity, structure, and interpretability of the test descriptions returned. By default, custom context for LLM-generated descriptions is disabled, meaning that the output will not include any additional context. To enable custom use case context, set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1`.\n", - "\n", - "This is a global setting that will affect all tests for your linked model:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", - "\n", - "context = \"\"\"\n", - "FORMAT FOR THE LLM DESCRIPTIONS: \n", - " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", - " extracted from the test description>.\n", - "\n", - " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", - " Include any relevant formulas or methodologies mentioned in the test description.>\n", - "\n", - " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", - " highlighting what makes it particularly useful for specific scenarios.>\n", - "\n", - " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", - " Include both technical limitations and interpretation challenges. 
\n", - " If the test description includes specific signs of high risk, incorporate these here.>\n", - "\n", - " **Key Insights:**\n", - "\n", - " The test results reveal:\n", - "\n", - " - **<insight title>**: <comprehensive description of one aspect of the results>\n", - " - **<insight title>**: <comprehensive description of another aspect>\n", - " ...\n", - "\n", - " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", - " purpose and provides any final recommendations or considerations.>\n", - "\n", - "ADDITIONAL INSTRUCTIONS:\n", - " Present insights in order from general to specific, with each insight as a single bullet point with bold title.\n", - "\n", - " For each metric in the test results, include in the test overview:\n", - " - The metric's purpose and what it measures\n", - " - Its mathematical formula\n", - " - The range of possible values\n", - " - What constitutes good/bad performance\n", - " - How to interpret different values\n", - "\n", - " Each insight should progressively cover:\n", - " 1. Overall scope and distribution\n", - " 2. Complete breakdown of all elements with specific values\n", - " 3. Natural groupings and patterns\n", - " 4. Comparative analysis between datasets/categories\n", - " 5. Stability and variations\n", - " 6. 
Notable relationships or dependencies\n", - "\n", - " Remember:\n", - " - Keep all insights at the same level (no sub-bullets or nested structures)\n", - " - Make each insight complete and self-contained\n", - " - Include specific numerical values and ranges\n", - " - Cover all elements in the results comprehensively\n", - " - Maintain clear, concise language\n", - " - Use only \"- **Title**: Description\" format for insights\n", - " - Progress naturally from general to specific observations\n", - "\n", - "\"\"\".strip()\n", - "\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Define tests to run\n", - "\n", - "First, we'll specify all the tests we'd like to independently run in a dictionary called `test_config`, including information about the `params` and `inputs` that each test requires.\n", - "\n", - "- Note here that `inputs` and `input_grid` expect the `input_id` of the dataset or model as the value rather than the variable name we specified**.\n", - "- When running individual tests, you can use a custom `result_id` to tag the individual result with a unique identifier by appending this `result_id` to the `test_id` with a `:` separator. 
(Example: `:raw_data` for tests run with our raw dataset.)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test_config = {\n", - "\n", - " # Data validation tests run with raw dataset\n", - " 'validmind.data_validation.DatasetDescription:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'}\n", - " },\n", - " 'validmind.data_validation.DescriptiveStatistics:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'}\n", - " },\n", - " 'validmind.data_validation.MissingValues:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'min_percentage_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.ClassImbalance:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'min_percent_threshold': 10}\n", - " },\n", - " 'validmind.data_validation.Duplicates:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'min_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.HighCardinality:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {\n", - " 'num_threshold': 100,\n", - " 'percent_threshold': 0.1,\n", - " 'threshold_type': 'percent'\n", - " }\n", - " },\n", - " 'validmind.data_validation.Skewness:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'max_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.UniqueRows:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'min_percent_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.TooManyZeroValues:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'max_percent_threshold': 0.03}\n", - " },\n", - " 'validmind.data_validation.IQROutliersTable:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'threshold': 5}\n", - " },\n", - "\n", - " # Data validation tests run with preprocessed dataset\n", - " 
'validmind.data_validation.DescriptiveStatistics:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'}\n", - " },\n", - " 'validmind.data_validation.TabularDescriptionTables:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'}\n", - " },\n", - " 'validmind.data_validation.MissingValues:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'},\n", - " 'params': {'min_percentage_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.TabularNumericalHistograms:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'}\n", - " },\n", - " 'validmind.data_validation.TabularCategoricalBarPlots:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'}\n", - " },\n", - " 'validmind.data_validation.TargetRateBarPlots:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'},\n", - " 'params': {'default_column': 'loan_status'}\n", - " },\n", - "\n", - " 'validmind.data_validation.WOEBinTable': {\n", - " 'input_grid': {'dataset': ['preprocess_dataset']},\n", - " 'params': {\n", - " 'breaks_adj': {\n", - " 'loan_amnt': [5000, 10000, 15000, 20000, 25000],\n", - " 'int_rate': [10, 15, 20],\n", - " 'annual_inc': [50000, 100000, 150000]\n", - " }\n", - " }\n", - " },\n", - " 'validmind.data_validation.WOEBinPlots': {\n", - " 'input_grid': {'dataset': ['preprocess_dataset']},\n", - " 'params': {\n", - " 'breaks_adj': {\n", - " 'loan_amnt': [5000, 10000, 15000, 20000, 25000],\n", - " 'int_rate': [10, 15, 20],\n", - " 'annual_inc': [50000, 100000, 150000]\n", - " }\n", - " }\n", - " },\n", - "\n", - " # Data validation tests run with training & testing datasets\n", - " 'validmind.data_validation.DescriptiveStatistics:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", - " },\n", - " 'validmind.data_validation.TabularDescriptionTables:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", - " 
},\n", - " 'validmind.data_validation.ClassImbalance:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'min_percent_threshold': 10}\n", - " },\n", - " 'validmind.data_validation.UniqueRows:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'min_percent_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.TabularNumericalHistograms:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", - " },\n", - " 'validmind.data_validation.MutualInformation:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'min_threshold': 0.01}\n", - " },\n", - " 'validmind.data_validation.PearsonCorrelationMatrix:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", - " },\n", - " 'validmind.data_validation.HighPearsonCorrelation:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'max_threshold': 0.3, 'top_n_correlations': 10}\n", - " },\n", - " 'validmind.data_validation.ScoreBandDefaultRates:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset'], 'model': ['xgb_model']},\n", - " 'params': {'score_column': 'xgb_scores', 'score_bands': [504, 537, 570]}\n", - " },\n", - " 'validmind.data_validation.DatasetSplit:development_data': {\n", - " 'inputs': {'datasets': ['train_dataset', 'test_dataset']}\n", - " },\n", - "\n", - " # Model validation tests\n", - " 'validmind.model_validation.statsmodels.GINITable': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", - " },\n", - " 'validmind.model_validation.sklearn.ClassifierPerformance': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", - " },\n", - " 'validmind.model_validation.sklearn.TrainingTestDegradation:XGBoost': 
{\n", - " 'inputs': {\n", - " 'datasets': ['train_dataset', 'test_dataset'],\n", - " 'model': 'xgb_model'\n", - " },\n", - " 'params': {'max_threshold': 0.1}\n", - " },\n", - " 'validmind.model_validation.sklearn.ROCCurve': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", - " },\n", - " 'validmind.model_validation.sklearn.MinimumROCAUCScore': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']},\n", - " 'params': {'min_threshold': 0.5}\n", - " },\n", - " 'validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", - " },\n", - " 'validmind.model_validation.statsmodels.CumulativePredictionProbabilities': {\n", - " 'input_grid': {'model': ['xgb_model'], 'dataset': ['train_dataset', 'test_dataset']}\n", - " },\n", - " 'validmind.model_validation.sklearn.PopulationStabilityIndex': {\n", - " 'inputs': {\n", - " 'datasets': ['train_dataset', 'test_dataset'],\n", - " 'model': 'xgb_model'\n", - " },\n", - " 'params': {'num_bins': 10, 'mode': 'fixed'}\n", - " },\n", - " 'validmind.model_validation.sklearn.ConfusionMatrix': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", - " },\n", - " 'validmind.model_validation.sklearn.MinimumAccuracy': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']},\n", - " 'params': {'min_threshold': 0.7}\n", - " },\n", - " 'validmind.model_validation.sklearn.MinimumF1Score': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']},\n", - " 'params': {'min_threshold': 0.5}\n", - " },\n", - " 'validmind.model_validation.sklearn.PrecisionRecallCurve': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", - " },\n", - " 'validmind.model_validation.sklearn.CalibrationCurve': 
{\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", - " },\n", - " 'validmind.model_validation.sklearn.ClassifierThresholdOptimization': {\n", - " 'inputs': {'dataset': 'train_dataset', 'model': 'xgb_model'},\n", - " 'params': {'target_recall': 0.8}\n", - " },\n", - " 'validmind.model_validation.statsmodels.ScorecardHistogram': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'score_column': 'xgb_scores'}\n", - " },\n", - " 'validmind.model_validation.sklearn.ScoreProbabilityAlignment': {\n", - " 'input_grid': {'dataset': ['train_dataset'], 'model': ['xgb_model']},\n", - " 'params': {'score_column': 'xgb_scores'}\n", - " },\n", - " 'validmind.model_validation.sklearn.WeakspotsDiagnosis': {\n", - " 'inputs': {'datasets': ['train_dataset', 'test_dataset'], 'model': 'xgb_model'}\n", - " },\n", - " 'validmind.model_validation.sklearn.OverfitDiagnosis': {\n", - " 'inputs': {'model': 'xgb_model', 'datasets': ['train_dataset', 'test_dataset']},\n", - " 'params': {'cut_off_threshold': 0.04}\n", - " },\n", - " 'validmind.model_validation.sklearn.RobustnessDiagnosis': {\n", - " 'inputs': {'datasets': ['train_dataset', 'test_dataset'], 'model': 'xgb_model'},\n", - " 'params': {\n", - " 'scaling_factor_std_dev_list': [0.1, 0.2, 0.3, 0.4, 0.5],\n", - " 'performance_decay_threshold': 0.05\n", - " }\n", - " },\n", - " 'validmind.model_validation.FeaturesAUC': {\n", - " 'input_grid': {'model': ['xgb_model'], 'dataset': ['train_dataset', 'test_dataset']}\n", - " }\n", - "}" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_3__'></a>\n", - "\n", - "### Run defined tests\n", - "\n", - "Then, we'll define a utility wrapper around [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module in a function called `run_doc_tests`.\n", - "\n", - "- Every test result returned by the `run_test()` 
function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform.\n", - "- Our function requires information about the inputs to use on every test — which is why we specified these inputs above in `test_config`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def run_doc_tests(test_config):\n", - " for test_name, test_cfg in test_config.items():\n", - " print(test_name)\n", - " try:\n", - " # Collect available keyword arguments\n", - " kwargs = {\n", - " key: test_cfg[key]\n", - " for key in (\"params\", \"input_grid\", \"inputs\")\n", - " if key in test_cfg\n", - " }\n", - " kwargs[\"show\"] = False\n", - "\n", - " # Execute the test and log the results\n", - " vm.tests.run_test(test_name, **kwargs).log()\n", - "\n", - " except Exception as e:\n", - " print(f\"Error running test {test_name}: {e}\")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Finally, we can pass the input configuration to `run_doc_tests` and run the full suite of tests!\n", - "\n", - "The variable `full_suite` then holds the result of these tests:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = run_doc_tests(test_config)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Note the outputs returned indicating that certain test-driven blocks don't currently exist in your documentation for this particular test ID. 
</b></span>\n", - "<br></br>\n", - "That's expected, as when we run individual tests not defined by the documentation template out-of-the-box, the results logged need to be manually added to your documentation within the ValidMind Platform.</div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way, use the ValidMind Platform to work with your documentation.\n", - "\n", - "<a id='toc7_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - " What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready.\n", - "\n", - "3. Expand the following section to review tests automatically inserted into your documentation template: **2.3. Feature Selection and Engineering**\n", - "\n", - "<a id='toc7_1_1__'></a>\n", - "\n", - "#### Add individual test results to documentation\n", - "\n", - "Let's also add our additional test results into the documentation. These were results sent by individual tests not defined out-of-the-box by our template. For example (**Learn more:** [Work with test results](https://docs.validmind.ai/guide/documentation/work-with-test-results.html)):\n", - "\n", - "1. Locate the Data Preparation section of your documentation and click on **2.2. 
Correlations and Interactions** to expand that section.\n", - "\n", - "4. Hover under the Pearson Correlation Matrix content block until a horizontal dashed line with a **+** button appears, indicating that you can insert a new block.\n", - "\n", - " <img src= \"../../tutorials/development/add-content-block.gif\" alt=\"Screenshot showing insert block button in model documentation\" style=\"border: 2px solid #083E44; border-radius: 8px; border-right-width: 2px; border-bottom-width: 3px;\">\n", - " <br><br>\n", - "\n", - "5. Click **+** and then select **Test-Driven Block** under FROM LIBRARY:\n", - "\n", - " - Click on **VM Library** under TEST-DRIVEN in the left sidebar.\n", - " - In the search bar, type in `HighPearsonCorrelation`.\n", - " - Select `HighPearsonCorrelation:development_data` as the test.\n", - "\n", - "6. Finally, click **Insert 1 Test Result to Document** to add the test result to the documentation.\n", - "\n", - " Confirm that the individual results for the high correlation test has been correctly inserted into section **2.3. Correlations and Interactions** of the documentation.\n", - "\n", - "<a id='toc7_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-9a4dd2ee254f496292698e9be3d8f799", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 4 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document an Excel-based application scorecard model\n", + "\n", + "Build and document an Excel-based application scorecard model with the ValidMind Library. Learn how to load an Excel-based model, prepare your datasets and model for testing, run tests and log those test results to the ValidMind Platform.\n", + "\n", + "An *application scorecard model* is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant such as credit history, income, employment status, and other relevant financial data.\n", + "\n", + " - This score assists lenders in making informed decisions about whether to approve or reject loan applications, as well as in determining the terms of the loan, including interest rates and credit limits.\n", + " - Effective validation of application scorecard models ensures that lenders can manage risk efficiently while maintaining a fast and transparent loan application process for applicants." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + " - [Preview the documentation template](#toc2_4__) \n", + "- [Loading the sample datasets](#toc3__) \n", + " - [Load the raw dataset](#toc3_1__) \n", + " - [Load the preprocessed dataset](#toc3_2__) \n", + " - [Load the training and test datasets](#toc3_3__) \n", + "- [Initialize the ValidMind datasets](#toc4__) \n", + "- [Initialize the ValidMind model](#toc5__) \n", + " - [Link predictions](#toc5_1__) \n", + "- [Running tests](#toc6__) \n", + " - [Enable custom context for test descriptions](#toc6_1__) \n", + " - [Define tests to run](#toc6_2__) \n", + " - [Run defined tests](#toc6_3__) \n", + "- [Next steps](#toc7__) \n", + " - [Work with your documentation](#toc7_1__) \n", + " - [Add individual test results to documentation](#toc7_1_1__) \n", + " - [Discover more learning resources](#toc7_2__) \n", + "- [Upgrade ValidMind](#toc8__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. 
Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. 
Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. 
(Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2); border-radius: 5px;\">\n", + " <span style=\"color: #083E44;\"><b>Recommended Python versions</b></span><br />\n", + " Python 3.8 ≤ x ≤ 3.11\n", + "</div>\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. 
Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Credit Risk Scorecard`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. 
Click **Copy snippet to clipboard**.\n", + "\n", + "3. Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Then, let's import the necessary libraries and set up your Python environment for data analysis:\n", + "\n", + "- Install **OpenPyXL** (`openpyxl`), which allows us to read and write `.xlsx` files.\n", + "- Import `pandas`, a Python library for data manipulation and analysis, under the alias `pd`.\n", + "- Enable `matplotlib`, a plotting library used for visualizing data, in inline mode. This ensures that any plots you generate render inline in the notebook output rather than opening in a separate window."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install openpyxl\n", + "\n", + "import pandas as pd\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Loading the sample datasets\n", + "\n", + "Let's import our sample dataset in the form of an Excel workbook ([CreditRiskData.xlsx](CreditRiskData.xlsx)) with four sheets indexed 0 to 3, each representing a different stage of data preparation:\n", + "\n", + "0. **Raw Data** – The original unprocessed dataset.\n", + "1. **Preprocessed Data** – A cleaned and prepared version of the raw data.\n", + "2. **Train Data** – A training subset used to fit your model.\n", + "3. **Test Data** – A testing subset used to evaluate model performance."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Load the raw dataset\n", + "\n", + "We'll start by loading the **Raw Data** sheet (index `0`) into a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "df = pd.read_excel('CreditRiskData.xlsx', sheet_name=0,engine='openpyxl')\n", + "\n", + "df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Load the preprocessed dataset\n", + "\n", + "Next, load the **Preprocessed Data** sheet (index `1`), containing cleaned inputs ready for scoring:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "preprocess_df = pd.read_excel('CreditRiskData.xlsx', sheet_name=1,engine='openpyxl')\n", + "preprocess_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Load the training and test datasets\n", + "\n", + "Finally, load the split training (**Train Data**, index `2`) and testing (**Test Data**, index `3`) sets:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_df = pd.read_excel('CreditRiskData.xlsx', sheet_name=2,engine='openpyxl')\n", + "test_df = pd.read_excel('CreditRiskData.xlsx', sheet_name=3,engine='openpyxl')" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests with your loaded datasets, you must first initialize a ValidMind `Dataset` object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module. 
**This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", + "\n", + "For this example, we'll pass in the following arguments:\n", + "\n", + "- **`dataset`:** The input DataFrame to test.\n", + "- **`input_id`:** A unique identifier for tracking test inputs.\n", + "- **`target_column`:** Required for tests that compare predictions to actual outcomes; specify the name of the column with the true values." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the raw dataset\n", + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=df,\n", + " input_id=\"raw_dataset\",\n", + " target_column='loan_status',\n", + ")\n", + "\n", + "# Initialize the preprocessed dataset\n", + "vm_preprocess_dataset = vm.init_dataset(\n", + " dataset=preprocess_df,\n", + " input_id=\"preprocess_dataset\",\n", + " target_column='loan_status',\n", + ")\n", + "\n", + "# Initialize the training dataset\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column='loan_status',\n", + ")\n", + "\n", + "# Initialize the testing dataset\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " target_column='loan_status',\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Initialize the ValidMind model\n", + "\n", + "In this Excel-based use case, predictions are precomputed and included in the Excel file. 
While there's no model logic to run, a ValidMind model object (`xgb_model`) is still required for passing to other functions for analysis and tests on the data.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Prediction logic placeholder\n", + "def dummy(X, **kwargs):\n", + " return None\n", + "\n", + "xgb_model = vm.init_model(\n", + " input_id=\"xgb_model\",\n", + " predict_fn=dummy\n", + " )" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Link predictions\n", + "\n", + "Once the model has been initialized, you can assign model predictions to the training and testing datasets.\n", + "\n", + "Use the [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#assign_predictions) from the `Dataset` object to link the prediction values and probabilities from the relevant columns in our Excel spreadsheet to the training and testing datasets:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(model=xgb_model, prediction_column=\"xgb_model_prediction\",probability_column='xgb_model_probabilities')\n", + "vm_test_ds.assign_predictions(model=xgb_model, 
prediction_column=\"xgb_model_prediction\",probability_column='xgb_model_probabilities')" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Running tests\n", + "\n", + "This is where it all comes together — we'll use our previously initialized datasets as inputs to run tests, then log the results to the ValidMind Platform.\n", + "\n", + "We'll run some tests that are defined out-of-the-box by the template we previewed earlier in this notebook, as well as some additional tests for more evidence. For the example in this section, we've selected and defined the tests for you.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about navigating ValidMind tests?</b></span>\n", + "<br></br>\n", + "Refer to our notebook outlining the utilities available for viewing and understanding available ValidMind tests: <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" style=\"color: #DE257E;\"><b>Explore tests</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### Enable custom context for test descriptions\n", + "\n", + "When you run ValidMind tests, test descriptions are automatically generated with an LLM using the test results, the test name, and the static test definitions provided in the test’s docstring. 
While this metadata offers valuable high-level overviews of tests, insights produced by the LLM-based descriptions may not always align with your specific use cases or incorporate organizational policy requirements.\n", + "\n", + "Before we run our tests, we'll include some custom use case context to improve the clarity, structure, and interpretability of the test descriptions returned. By default, custom context for LLM-generated descriptions is disabled, meaning that the output will not include any additional context. To enable custom use case context, set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1`.\n", + "\n", + "This is a global setting that will affect all tests for your linked model:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import os\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", + "\n", + "context = \"\"\"\n", + "FORMAT FOR THE LLM DESCRIPTIONS: \n", + " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", + " extracted from the test description>.\n", + "\n", + " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", + " Include any relevant formulas or methodologies mentioned in the test description.>\n", + "\n", + " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", + " highlighting what makes it particularly useful for specific scenarios.>\n", + "\n", + " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", + " Include both technical limitations and interpretation challenges. 
\n", + " If the test description includes specific signs of high risk, incorporate these here.>\n", + "\n", + " **Key Insights:**\n", + "\n", + " The test results reveal:\n", + "\n", + " - **<insight title>**: <comprehensive description of one aspect of the results>\n", + " - **<insight title>**: <comprehensive description of another aspect>\n", + " ...\n", + "\n", + " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", + " purpose and provides any final recommendations or considerations.>\n", + "\n", + "ADDITIONAL INSTRUCTIONS:\n", + " Present insights in order from general to specific, with each insight as a single bullet point with bold title.\n", + "\n", + " For each metric in the test results, include in the test overview:\n", + " - The metric's purpose and what it measures\n", + " - Its mathematical formula\n", + " - The range of possible values\n", + " - What constitutes good/bad performance\n", + " - How to interpret different values\n", + "\n", + " Each insight should progressively cover:\n", + " 1. Overall scope and distribution\n", + " 2. Complete breakdown of all elements with specific values\n", + " 3. Natural groupings and patterns\n", + " 4. Comparative analysis between datasets/categories\n", + " 5. Stability and variations\n", + " 6. 
Notable relationships or dependencies\n", + "\n", + " Remember:\n", + " - Keep all insights at the same level (no sub-bullets or nested structures)\n", + " - Make each insight complete and self-contained\n", + " - Include specific numerical values and ranges\n", + " - Cover all elements in the results comprehensively\n", + " - Maintain clear, concise language\n", + " - Use only \"- **Title**: Description\" format for insights\n", + " - Progress naturally from general to specific observations\n", + "\n", + "\"\"\".strip()\n", + "\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Define tests to run\n", + "\n", + "First, we'll specify all the tests we'd like to independently run in a dictionary called `test_config`, including information about the `params` and `inputs` that each test requires.\n", + "\n", + "- Note here that `inputs` and `input_grid` expect the `input_id` of the dataset or model as the value rather than the variable name we specified.\n", + "- When running individual tests, you can use a custom `result_id` to tag the individual result with a unique identifier by appending this `result_id` to the `test_id` with a `:` separator. 
(Example: `:raw_data` for tests run with our raw dataset.)" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test_config = {\n", + "\n", + " # Data validation tests run with raw dataset\n", + " 'validmind.data_validation.DatasetDescription:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'}\n", + " },\n", + " 'validmind.data_validation.DescriptiveStatistics:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'}\n", + " },\n", + " 'validmind.data_validation.MissingValues:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'min_percentage_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.ClassImbalance:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'min_percent_threshold': 10}\n", + " },\n", + " 'validmind.data_validation.Duplicates:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'min_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.HighCardinality:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {\n", + " 'num_threshold': 100,\n", + " 'percent_threshold': 0.1,\n", + " 'threshold_type': 'percent'\n", + " }\n", + " },\n", + " 'validmind.data_validation.Skewness:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'max_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.UniqueRows:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'min_percent_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.TooManyZeroValues:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'max_percent_threshold': 0.03}\n", + " },\n", + " 'validmind.data_validation.IQROutliersTable:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'threshold': 5}\n", + " },\n", + "\n", + " # Data validation tests run with preprocessed dataset\n", + " 'validmind.data_validation.DescriptiveStatistics:preprocessed_data': {\n", 
+ " 'inputs': {'dataset': 'preprocess_dataset'}\n", + " },\n", + " 'validmind.data_validation.TabularDescriptionTables:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'}\n", + " },\n", + " 'validmind.data_validation.MissingValues:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'},\n", + " 'params': {'min_percentage_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.TabularNumericalHistograms:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'}\n", + " },\n", + " 'validmind.data_validation.TabularCategoricalBarPlots:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'}\n", + " },\n", + " 'validmind.data_validation.TargetRateBarPlots:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'},\n", + " 'params': {'default_column': 'loan_status'}\n", + " },\n", + "\n", + " 'validmind.data_validation.WOEBinTable': {\n", + " 'input_grid': {'dataset': ['preprocess_dataset']},\n", + " 'params': {\n", + " 'breaks_adj': {\n", + " 'loan_amnt': [5000, 10000, 15000, 20000, 25000],\n", + " 'int_rate': [10, 15, 20],\n", + " 'annual_inc': [50000, 100000, 150000]\n", + " }\n", + " }\n", + " },\n", + " 'validmind.data_validation.WOEBinPlots': {\n", + " 'input_grid': {'dataset': ['preprocess_dataset']},\n", + " 'params': {\n", + " 'breaks_adj': {\n", + " 'loan_amnt': [5000, 10000, 15000, 20000, 25000],\n", + " 'int_rate': [10, 15, 20],\n", + " 'annual_inc': [50000, 100000, 150000]\n", + " }\n", + " }\n", + " },\n", + "\n", + " # Data validation tests run with training & testing datasets\n", + " 'validmind.data_validation.DescriptiveStatistics:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.data_validation.TabularDescriptionTables:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.data_validation.ClassImbalance:development_data': {\n", + 
" 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'min_percent_threshold': 10}\n", + " },\n", + " 'validmind.data_validation.UniqueRows:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'min_percent_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.TabularNumericalHistograms:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.data_validation.MutualInformation:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'min_threshold': 0.01}\n", + " },\n", + " 'validmind.data_validation.PearsonCorrelationMatrix:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.data_validation.HighPearsonCorrelation:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'max_threshold': 0.3, 'top_n_correlations': 10}\n", + " },\n", + " 'validmind.data_validation.ScoreBandDefaultRates:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset'], 'model': ['xgb_model']},\n", + " 'params': {'score_column': 'xgb_scores', 'score_bands': [504, 537, 570]}\n", + " },\n", + " 'validmind.data_validation.DatasetSplit:development_data': {\n", + " 'inputs': {'datasets': ['train_dataset', 'test_dataset']}\n", + " },\n", + "\n", + " # Model validation tests\n", + " 'validmind.model_validation.statsmodels.GINITable': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", + " },\n", + " 'validmind.model_validation.sklearn.ClassifierPerformance': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", + " },\n", + " 'validmind.model_validation.sklearn.TrainingTestDegradation:XGBoost': {\n", + " 'inputs': {\n", + " 'datasets': ['train_dataset', 'test_dataset'],\n", 
+ " 'model': 'xgb_model'\n", + " },\n", + " 'params': {'max_threshold': 0.1}\n", + " },\n", + " 'validmind.model_validation.sklearn.ROCCurve': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", + " },\n", + " 'validmind.model_validation.sklearn.MinimumROCAUCScore': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']},\n", + " 'params': {'min_threshold': 0.5}\n", + " },\n", + " 'validmind.model_validation.statsmodels.PredictionProbabilitiesHistogram': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", + " },\n", + " 'validmind.model_validation.statsmodels.CumulativePredictionProbabilities': {\n", + " 'input_grid': {'model': ['xgb_model'], 'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.model_validation.sklearn.PopulationStabilityIndex': {\n", + " 'inputs': {\n", + " 'datasets': ['train_dataset', 'test_dataset'],\n", + " 'model': 'xgb_model'\n", + " },\n", + " 'params': {'num_bins': 10, 'mode': 'fixed'}\n", + " },\n", + " 'validmind.model_validation.sklearn.ConfusionMatrix': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", + " },\n", + " 'validmind.model_validation.sklearn.MinimumAccuracy': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']},\n", + " 'params': {'min_threshold': 0.7}\n", + " },\n", + " 'validmind.model_validation.sklearn.MinimumF1Score': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']},\n", + " 'params': {'min_threshold': 0.5}\n", + " },\n", + " 'validmind.model_validation.sklearn.PrecisionRecallCurve': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model']}\n", + " },\n", + " 'validmind.model_validation.sklearn.CalibrationCurve': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': 
['xgb_model']}\n", + " },\n", + " 'validmind.model_validation.sklearn.ClassifierThresholdOptimization': {\n", + " 'inputs': {'dataset': 'train_dataset', 'model': 'xgb_model'},\n", + " 'params': {'target_recall': 0.8}\n", + " },\n", + " 'validmind.model_validation.statsmodels.ScorecardHistogram': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'score_column': 'xgb_scores'}\n", + " },\n", + " 'validmind.model_validation.sklearn.ScoreProbabilityAlignment': {\n", + " 'input_grid': {'dataset': ['train_dataset'], 'model': ['xgb_model']},\n", + " 'params': {'score_column': 'xgb_scores'}\n", + " },\n", + " 'validmind.model_validation.sklearn.WeakspotsDiagnosis': {\n", + " 'inputs': {'datasets': ['train_dataset', 'test_dataset'], 'model': 'xgb_model'}\n", + " },\n", + " 'validmind.model_validation.sklearn.OverfitDiagnosis': {\n", + " 'inputs': {'model': 'xgb_model', 'datasets': ['train_dataset', 'test_dataset']},\n", + " 'params': {'cut_off_threshold': 0.04}\n", + " },\n", + " 'validmind.model_validation.sklearn.RobustnessDiagnosis': {\n", + " 'inputs': {'datasets': ['train_dataset', 'test_dataset'], 'model': 'xgb_model'},\n", + " 'params': {\n", + " 'scaling_factor_std_dev_list': [0.1, 0.2, 0.3, 0.4, 0.5],\n", + " 'performance_decay_threshold': 0.05\n", + " }\n", + " },\n", + " 'validmind.model_validation.FeaturesAUC': {\n", + " 'input_grid': {'model': ['xgb_model'], 'dataset': ['train_dataset', 'test_dataset']}\n", + " }\n", + "}" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3__'></a>\n", + "\n", + "### Run defined tests\n", + "\n", + "Then, we'll define a utility wrapper around [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module in a function called `run_doc_tests`.\n", + "\n", + "- Every test result returned by the `run_test()` function has a [`.log()` 
method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform.\n", + "- Our function requires information about the inputs to use on every test, which is why we specified these inputs above in `test_config`." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def run_doc_tests(test_config):\n", + "    results = {}\n", + "    for test_name, test_cfg in test_config.items():\n", + "        print(test_name)\n", + "        try:\n", + "            # Collect available keyword arguments\n", + "            kwargs = {\n", + "                key: test_cfg[key]\n", + "                for key in (\"params\", \"input_grid\", \"inputs\")\n", + "                if key in test_cfg\n", + "            }\n", + "            kwargs[\"show\"] = False\n", + "\n", + "            # Execute the test, log the result, and keep it for later inspection\n", + "            result = vm.tests.run_test(test_name, **kwargs)\n", + "            result.log()\n", + "            results[test_name] = result\n", + "\n", + "        except Exception as e:\n", + "            print(f\"Error running test {test_name}: {e}\")\n", + "\n", + "    return results" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, we can pass the input configuration to `run_doc_tests` and run the full suite of tests!\n", + "\n", + "The variable `full_suite` then holds the results of these tests, keyed by test ID:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = run_doc_tests(test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Note the outputs returned indicating that certain test-driven blocks don't currently exist in your documentation for this particular test ID. 
</b></span>\n", + "<br></br>\n", + "That's expected: when we run individual tests that the documentation template does not define out of the box, the logged results need to be added manually to your documentation within the ValidMind Platform.</div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way: use the ValidMind Platform to work with your documentation.\n", + "\n", + "<a id='toc7_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (**Learn more:** [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "    What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready.\n", + "\n", + "3. Expand the following section to review tests automatically inserted into your documentation template: **2.3. Feature Selection and Engineering**\n", + "\n", + "<a id='toc7_1_1__'></a>\n", + "\n", + "#### Add individual test results to documentation\n", + "\n", + "Let's also add our additional test results to the documentation. These results were sent by individual tests that our template does not define out of the box. For example (**Learn more:** [Work with test results](https://docs.validmind.ai/guide/documentation/work-with-test-results.html)):\n", + "\n", + "1. Locate the Data Preparation section of your documentation and click on **2.2. 
Correlations and Interactions** to expand that section.\n", + "\n", + "2. Hover under the Pearson Correlation Matrix content block until a horizontal dashed line with a **+** button appears, indicating that you can insert a new block.\n", + "\n", + "    <img src= \"../../tutorials/development/add-content-block.gif\" alt=\"Screenshot showing insert block button in model documentation\" style=\"border: 2px solid #083E44; border-radius: 8px; border-right-width: 2px; border-bottom-width: 3px;\">\n", + "    <br><br>\n", + "\n", + "3. Click **+** and then select **Test-Driven Block** under FROM LIBRARY:\n", + "\n", + "    - Click on **VM Library** under TEST-DRIVEN in the left sidebar.\n", + "    - In the search bar, type in `HighPearsonCorrelation`.\n", + "    - Select `HighPearsonCorrelation:development_data` as the test.\n", + "\n", + "4. Finally, click **Insert 1 Test Result to Document** to add the test result to the documentation.\n", + "\n", + "    Confirm that the individual results for the high correlation test have been correctly inserted into section **2.3. Correlations and Interactions** of the documentation.\n", + "\n", + "<a id='toc7_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after upgrading the package for the changes to take effect." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-9a4dd2ee254f496292698e9be3d8f799" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 4 } diff --git a/site/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb b/site/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb index 7e2fe07417..deeb8293e8 100644 --- a/site/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb +++ b/site/notebooks/use_cases/nlp_and_llm/prompt_validation_demo.ipynb @@ -1,560 +1,566 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Prompt validation for large language models (LLMs)\n", - "\n", - "Run and document prompt validation tests for a large language model (LLM) specialized in sentiment analysis for financial news. \n", - "\n", - "This interactive notebook shows you how to set up the ValidMind Library, initialize the library, and use a specific prompt template for analyzing the sentiment of given sentences. Prompt validation covers the initialization of a test dataset and the creation of a foundational model using the ValidMind Library, followed by the execution of a test suite specifically designed for prompt validation. The notebook also includes example data to test the model's ability to correctly identify sentiment as positive, negative, or neutral." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the documentation template](#toc2_3__) \n", - "- [Get ready to run the analysis](#toc3__) \n", - "- [Get your sample dataset ready for analysis](#toc4__) \n", - "- [Perform the prompt validation](#toc5__) \n", - "- [Next steps](#toc6__) \n", - " - [Work with your model documentation](#toc6_1__) \n", - " - [Discover more learning resources](#toc6_2__) \n", - "- [Upgrade ValidMind](#toc7__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. 
These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `LLM-based Text Classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Get ready to run the analysis\n", - "\n", - "Import the ValidMind `FoundationModel` and `Prompt` classes needed for the sentiment analysis later on:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.models import FoundationModel, Prompt" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Check your access to the OpenAI API:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "\n", - "import dotenv\n", - "\n", - "dotenv.load_dotenv()\n", - "\n", - "if os.getenv(\"OPENAI_API_KEY\") is None:\n", - " raise Exception(\"OPENAI_API_KEY not found\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from openai import OpenAI\n", - "\n", - "model = OpenAI()\n", - "\n", - "\n", - "def call_model(prompt):\n", - " return (\n", - " model.chat.completions.create(\n", - " model=\"gpt-3.5-turbo\",\n", - " messages=[\n", - " {\"role\": \"user\", \"content\": prompt},\n", - " ],\n", - " )\n", - " .choices[0]\n", - " .message.content\n", - " )" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Set the prompt guidelines for the sentiment analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "prompt_template = \"\"\"\n", - "You are an AI with expertise in 
sentiment analysis, particularly in the context of financial news.\n", - "Your task is to analyze the sentiment of a specific sentence provided below.\n", - "Before proceeding, take a moment to understand the context and nuances of the financial terminology used in the sentence.\n", - "\n", - "Sentence to Analyze:\n", - "```\n", - "{Sentence}\n", - "```\n", - "\n", - "Please respond with the sentiment of the sentence denoted by one of either 'positive', 'negative', or 'neutral'.\n", - "Please respond only with the sentiment enum value. Do not include any other text in your response.\n", - "\n", - "Note: Ensure that your analysis is based on the content of the sentence and not on external information or assumptions.\n", - "\"\"\".strip()\n", - "\n", - "prompt_variables = [\"Sentence\"]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Get your sample dataset ready for analysis\n", - "\n", - "To perform the sentiment analysis for financial news we're going to load a local copy of this dataset: https://www.kaggle.com/datasets/ankurzing/sentiment-analysis-for-financial-news.\n", - "\n", - "This dataset contains two columns, `Sentiment` and `Sentence`. The sentiment can be `negative`, `neutral` or `positive`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "\n", - "df = pd.read_csv(\"./datasets/sentiments.csv\")\n", - "\n", - "df_test = df[:10].reset_index(drop=True)\n", - "df_test" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Perform the prompt validation\n", - "\n", - "First, use the ValidMind Library to initialize the dataset and model objects necessary for documentation. 
The ValidMind `predict_fn` function allows the model to be tested and evaluated in a standardized manner:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_test_ds = vm.init_dataset(\n", - " dataset=df_test,\n", - " input_id=\"test_dataset\",\n", - " text_column=\"Sentence\",\n", - " target_column=\"Sentiment\",\n", - ")\n", - "\n", - "vm_model = vm.init_model(\n", - " model=FoundationModel(\n", - " predict_fn=call_model,\n", - " prompt=Prompt(\n", - " template=prompt_template,\n", - " variables=prompt_variables,\n", - " ),\n", - " ),\n", - " input_id=\"gpt_35_model\",\n", - ")\n", - "\n", - "# Assign model predictions to the test dataset\n", - "vm_test_ds.assign_predictions(vm_model)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, use the ValidMind Library to run validation tests on the model. These tests evaluate various aspects of the prompts, including bias, clarity, conciseness, delimitation, negative instruction, and specificity.\n", - "\n", - "Each test is explained in detail, highlighting its purpose, test mechanism, and the importance of the specific aspect being evaluated. The tests are graded on a scale from 1 to 10, with a predetermined threshold, and the explanations for each test include a score, threshold, and a pass/fail determination." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test_suite_results = vm.run_test_suite(\n", - " \"prompt_validation\",\n", - " inputs={\n", - " \"dataset\": vm_test_ds,\n", - " \"model\": vm_model,\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Here, most of the tests pass but the test for _conciseness_ needs further attention, as it fails the threshold. 
This test is designed to evaluate the brevity and succinctness of prompts provided to a large language model (LLM).\n", - "\n", - "The test matters, because a concise prompt strikes a balance between offering clear instructions and eliminating redundant or unnecessary information, ensuring that the LLM receives relevant input without being overwhelmed." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc6_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (**Learn more:** [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. Click and expand the **Model Development** section.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc6_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade 
package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-da0317263ddc4a119cb7b306ac1b39c1", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": ".venv", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Prompt validation for large language models (LLMs)\n", + "\n", + "Run and document prompt validation tests for a large language model (LLM) specialized in sentiment analysis for financial news. \n", + "\n", + "This interactive notebook shows you how to set up the ValidMind Library, initialize the library, and use a specific prompt template for analyzing the sentiment of given sentences. Prompt validation covers the initialization of a test dataset and the creation of a foundational model using the ValidMind Library, followed by the execution of a test suite specifically designed for prompt validation. The notebook also includes example data to test the model's ability to correctly identify sentiment as positive, negative, or neutral." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the documentation template](#toc2_3__) \n", + "- [Get ready to run the analysis](#toc3__) \n", + "- [Get your sample dataset ready for analysis](#toc4__) \n", + "- [Perform the prompt validation](#toc5__) \n", + "- [Next steps](#toc6__) \n", + " - [Work with your model documentation](#toc6_1__) \n", + " - [Discover more learning resources](#toc6_2__) \n", + "- [Upgrade ValidMind](#toc7__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `LLM-based Text Classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Get ready to run the analysis\n", + "\n", + "Import the ValidMind `FoundationModel` and `Prompt` classes needed for the sentiment analysis later on:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.models import FoundationModel, Prompt" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Check your access to the OpenAI API:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import os\n", + "\n", + "import dotenv\n", + "\n", + "dotenv.load_dotenv()\n", + "\n", + "if os.getenv(\"OPENAI_API_KEY\") is None:\n", + " raise Exception(\"OPENAI_API_KEY not found\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from openai import OpenAI\n", + "\n", + "model = OpenAI()\n", + "\n", + "\n", + "def call_model(prompt):\n", + " return (\n", + " model.chat.completions.create(\n", + " model=\"gpt-3.5-turbo\",\n", + " messages=[\n", + " {\"role\": \"user\", \"content\": prompt},\n", + " ],\n", + " )\n", + " .choices[0]\n", + " .message.content\n", + " )" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Set the prompt guidelines for the sentiment analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "prompt_template = \"\"\"\n", + "You are an AI with expertise in sentiment analysis, particularly in the 
context of financial news.\n", + "Your task is to analyze the sentiment of a specific sentence provided below.\n", + "Before proceeding, take a moment to understand the context and nuances of the financial terminology used in the sentence.\n", + "\n", + "Sentence to Analyze:\n", + "```\n", + "{Sentence}\n", + "```\n", + "\n", + "Please respond with the sentiment of the sentence denoted by one of either 'positive', 'negative', or 'neutral'.\n", + "Please respond only with the sentiment enum value. Do not include any other text in your response.\n", + "\n", + "Note: Ensure that your analysis is based on the content of the sentence and not on external information or assumptions.\n", + "\"\"\".strip()\n", + "\n", + "prompt_variables = [\"Sentence\"]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Get your sample dataset ready for analysis\n", + "\n", + "To perform the sentiment analysis for financial news we're going to load a local copy of this dataset: https://www.kaggle.com/datasets/ankurzing/sentiment-analysis-for-financial-news.\n", + "\n", + "This dataset contains two columns, `Sentiment` and `Sentence`. The sentiment can be `negative`, `neutral` or `positive`." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import pandas as pd\n", + "\n", + "df = pd.read_csv(\"./datasets/sentiments.csv\")\n", + "\n", + "df_test = df[:10].reset_index(drop=True)\n", + "df_test" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Perform the prompt validation\n", + "\n", + "First, use the ValidMind Library to initialize the dataset and model objects necessary for documentation. 
The ValidMind `predict_fn` function allows the model to be tested and evaluated in a standardized manner:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_test_ds = vm.init_dataset(\n", + " dataset=df_test,\n", + " input_id=\"test_dataset\",\n", + " text_column=\"Sentence\",\n", + " target_column=\"Sentiment\",\n", + ")\n", + "\n", + "vm_model = vm.init_model(\n", + " model=FoundationModel(\n", + " predict_fn=call_model,\n", + " prompt=Prompt(\n", + " template=prompt_template,\n", + " variables=prompt_variables,\n", + " ),\n", + " ),\n", + " input_id=\"gpt_35_model\",\n", + ")\n", + "\n", + "# Assign model predictions to the test dataset\n", + "vm_test_ds.assign_predictions(vm_model)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, use the ValidMind Library to run validation tests on the model. These tests evaluate various aspects of the prompts, including bias, clarity, conciseness, delimitation, negative instruction, and specificity.\n", + "\n", + "Each test is explained in detail, highlighting its purpose, test mechanism, and the importance of the specific aspect being evaluated. The tests are graded on a scale from 1 to 10, with a predetermined threshold, and the explanations for each test include a score, threshold, and a pass/fail determination." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test_suite_results = vm.run_test_suite(\n", + " \"prompt_validation\",\n", + " inputs={\n", + " \"dataset\": vm_test_ds,\n", + " \"model\": vm_model,\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here, most of the tests pass but the test for _conciseness_ needs further attention, as it fails the threshold. 
This test is designed to evaluate the brevity and succinctness of prompts provided to a large language model (LLM).\n", + "\n", + "The test matters, because a concise prompt strikes a balance between offering clear instructions and eliminating redundant or unnecessary information, ensuring that the LLM receives relevant input without being overwhelmed." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc6_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (**Learn more:** [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. Click and expand the **Model Development** section.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. 
(**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc6_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade 
package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-da0317263ddc4a119cb7b306ac1b39c1" + } + ], + "metadata": { + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/site/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb b/site/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb index a420430b1b..847417cf02 100644 --- a/site/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb +++ b/site/notebooks/use_cases/ongoing_monitoring/application_scorecard_ongoing_monitoring.ipynb @@ -1,1393 +1,1399 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Ongoing Monitoring for Application Scorecard\n", - "\n", - "In this notebook, you'll learn how to seamlessly monitor your production models using the ValidMind Platform.\n", - "\n", - "We'll walk you through the process of initializing the ValidMind Library, loading a sample dataset and model, and running a monitoring test suite to quickly generate documentation about your new data and model." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply monitoring report template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Preview the monitoring report template](#toc2_3__) \n", - " - [Initialize the Python environment](#toc2_4__) \n", - " - [Preview the monitoring template](#toc2_5__) \n", - "- [Load the reference and monitoring datasets](#toc3__) \n", - "- [Train the model](#toc4__) \n", - " - [Initialize the ValidMind datasets](#toc4_1__) \n", - " - [Initialize the ValidMind model](#toc4_2__) \n", - " - [Assign prediction values and probabilities to the datasets](#toc4_3__) \n", - " - [Compute credit risk scores](#toc4_4__) \n", - " - [Adding custom context to the LLM descriptions](#toc4_5__) \n", - " - [Monitoring data description](#toc4_6__) \n", - " - [Target and feature drift](#toc4_7__) \n", - " - [Classification accuracy](#toc4_8__) \n", - " - [Class discrimination](#toc4_9__) \n", - " - [Scoring](#toc4_10__) \n", - " - [Model insights](#toc4_11__) \n", - " - [Diagnostic monitoring](#toc4_12__) \n", - " - [Robustness monitoring](#toc4_13__) \n", - " - [Performance history](#toc4_14__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation, validation, and monitoring tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Model monitoring report**: A comprehensive and structured record of a production model, including key elements such as data sources, inputs, performance metrics, and periodic evaluations. 
This documentation ensures transparency and visibility of the model's performance in the production environment.\n", - "\n", - "**Monitoring report template**: Similar to documentation template, The monitoring report template functions as a test suite and lays out the structure of model monitoring, segmented into various sections and sub-sections. Monitoring report templates define the structure of your model monitoring report, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. 
See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. 
Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply monitoring report template\n", - "\n", - "Once you've registered your model, let's select a monitoring report template. A template predefines sections for your monitoring report and provides a general outline to follow, making the monitoring process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Monitoring**.\n", - "\n", - " If you cannot locate your Monitoring document, make sure Monitoring type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Ongoing Monitoring for Classification Models`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Monitoring` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"monitoring\",\n", - " monitoring = True,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Preview the monitoring report template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "import numpy as np\n", - "\n", - "from datetime import datetime, timedelta\n", - "\n", - "from validmind.tests import run_test\n", - "from validmind.datasets.credit_risk import lending_club\n", - "from validmind.unit_metrics import list_metrics\n", - "from validmind.unit_metrics import describe_metric\n", - "from validmind.unit_metrics import run_metric\n", - "from validmind.api_client import log_metric\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_5__'></a>\n", - "\n", - "### Preview the monitoring template\n", - "\n", - "A template predefines sections for your monitoring documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "You will upload documentation and test results into this template later on. 
For now, take a look at the structure that the template provides with the `vm.preview_template()` function from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the reference and monitoring datasets\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. For demonstration purposes we'll use the training, test dataset splits as `reference` and `monitoring` datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "df = lending_club.load_data(source=\"offline\")\n", - "df.head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "preprocess_df = lending_club.preprocess(df)\n", - "preprocess_df.head()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fe_df = lending_club.feature_engineering(preprocess_df)\n", - "fe_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Train the model\n", - "\n", - "In this section, we focus on constructing and refining our predictive model. \n", - "- We begin by dividing our data, which is based on Weight of Evidence (WoE) features, into training and testing sets (`train_df`, `test_df`). \n", - "- With `lending_club.split`, we employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Split the data\n", - "train_df, test_df = lending_club.split(fe_df, test_size=0.2)\n", - "\n", - "x_train = train_df.drop(lending_club.target_column, axis=1)\n", - "y_train = train_df[lending_club.target_column]\n", - "\n", - "x_test = test_df.drop(lending_club.target_column, axis=1)\n", - "y_test = test_df[lending_club.target_column]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Define the XGBoost model\n", - "xgb_model = xgb.XGBClassifier(\n", - " n_estimators=50, \n", - " random_state=42, \n", - " early_stopping_rounds=10\n", - ")\n", - "xgb_model.set_params(\n", - " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", - ")\n", - "\n", - "# Fit the model\n", - "xgb_model.fit(\n", - " x_train, \n", - " y_train,\n", - " eval_set=[(x_test, y_test)],\n", - " verbose=False\n", - ")\n", - "\n", - "# Compute probabilities\n", - "train_xgb_prob = xgb_model.predict_proba(x_train)[:, 1]\n", - "test_xgb_prob = xgb_model.predict_proba(x_test)[:, 1]\n", - "\n", - "# Compute binary predictions\n", - "cut_off_threshold = 0.3\n", - "train_xgb_binary_predictions = (train_xgb_prob > cut_off_threshold).astype(int)\n", - "test_xgb_binary_predictions = (test_xgb_prob > cut_off_threshold).astype(int)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset` — The raw dataset that you want to provide as input to tests.\n", - "- `input_id` - A unique identifier that allows tracking what inputs are used 
when running each individual test.\n", - "- `target_column` — A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", - "\n", - "With all datasets ready, you can now initialize training, reference(test) and monitor datasets (`reference_df` and `monitor_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_reference_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"reference_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "vm_monitoring_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"monitoring_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Initialize the ValidMind model\n", - "\n", - "You will also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_xgb_model = vm.init_model(\n", - 
" xgb_model,\n", - " input_id=\"xgb_model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_3__'></a>\n", - "\n", - "### Assign prediction values and probabilities to the datasets\n", - "\n", - "With our model now trained, we'll move on to assigning both the predictive probabilities coming directly from the model's predictions, and the binary prediction after applying the cutoff threshold described in the previous steps. \n", - "- These tasks are achieved through the use of the `assign_predictions()` method associated with the VM `dataset` object.\n", - "- This method links the model's class prediction values and probabilities to our VM train and test datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_reference_ds.assign_predictions(\n", - " model=vm_xgb_model,\n", - " prediction_values=train_xgb_binary_predictions,\n", - " prediction_probabilities=train_xgb_prob,\n", - ")\n", - "\n", - "vm_monitoring_ds.assign_predictions(\n", - " model=vm_xgb_model,\n", - " prediction_values=test_xgb_binary_predictions,\n", - " prediction_probabilities=test_xgb_prob,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_4__'></a>\n", - "\n", - "### Compute credit risk scores\n", - "\n", - "In this phase, we translate model predictions into actionable scores using probability estimates generated by our trained model." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_xgb_scores = lending_club.compute_scores(train_xgb_prob)\n", - "test_xgb_scores = lending_club.compute_scores(test_xgb_prob)\n", - "\n", - "# Assign scores to the datasets\n", - "vm_reference_ds.add_extra_column(\"xgb_scores\", train_xgb_scores)\n", - "vm_monitoring_ds.add_extra_column(\"xgb_scores\", test_xgb_scores)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_5__'></a>\n", - "\n", - "### Adding custom context to the LLM descriptions\n", - "\n", - "To enable the LLM descriptions context, you need to set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1`. This will enable the LLM descriptions context, which will be used to provide additional context to the LLM descriptions. This is a global setting that will affect all tests." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", - "\n", - "context = \"\"\"\n", - "FORMAT FOR THE LLM DESCRIPTIONS: \n", - " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", - " extracted from the test description>.\n", - "\n", - " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", - " Include any relevant formulas or methodologies mentioned in the test description.>\n", - "\n", - " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", - " highlighting what makes it particularly useful for specific scenarios.>\n", - "\n", - " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", - " Include both technical limitations and interpretation challenges. 
\n", - " If the test description includes specific signs of high risk, incorporate these here.>\n", - "\n", - " **Key Insights:**\n", - "\n", - " The test results reveal:\n", - "\n", - " - **<insight title>**: <comprehensive description of one aspect of the results>\n", - " - **<insight title>**: <comprehensive description of another aspect>\n", - " ...\n", - "\n", - " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", - " purpose and provides any final recommendations or considerations.>\n", - "\n", - "ADDITIONAL INSTRUCTIONS:\n", - " Present insights in order from general to specific, with each insight as a single bullet point with bold title.\n", - "\n", - " For each metric in the test results, include in the test overview:\n", - " - The metric's purpose and what it measures\n", - " - Its mathematical formula\n", - " - The range of possible values\n", - " - What constitutes good/bad performance\n", - " - How to interpret different values\n", - "\n", - " Each insight should progressively cover:\n", - " 1. Overall scope and distribution\n", - " 2. Complete breakdown of all elements with specific values\n", - " 3. Natural groupings and patterns\n", - " 4. Comparative analysis between datasets/categories\n", - " 5. Stability and variations\n", - " 6. 
Notable relationships or dependencies\n", - "\n", - " Remember:\n", - " - Keep all insights at the same level (no sub-bullets or nested structures)\n", - " - Make each insight complete and self-contained\n", - " - Include specific numerical values and ranges\n", - " - Cover all elements in the results comprehensively\n", - " - Maintain clear, concise language\n", - " - Use only \"- **Title**: Description\" format for insights\n", - " - Progress naturally from general to specific observations\n", - "\n", - "\"\"\".strip()\n", - "\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_6__'></a>\n", - "\n", - "### Monitoring data description\n", - "\n", - "The Monitoring Data Description tests aim to provide a comprehensive statistical analysis of the monitoring dataset's characteristics. These tests examine the basic statistical properties, identify any missing data patterns, assess data uniqueness, visualize numerical feature distributions, and evaluate feature relationships through correlation analysis.\n", - "\n", - "The primary objective is to establish a baseline understanding of the monitoring data's structure and quality, enabling the detection of any significant deviations from expected patterns that could impact model performance. Each test is designed to capture different aspects of the data, from univariate statistics to multivariate relationships, providing a foundation for ongoing data quality assessment in the production environment." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.DescriptiveStatistics:monitoring_data\",\n", - " inputs={\n", - " \"dataset\": vm_monitoring_ds,\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.MissingValues:monitoring_data\",\n", - " inputs={\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - " params={\n", - " \"min_percentage_threshold\": 1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.UniqueRows:monitoring_data\",\n", - " inputs={\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - " params={\n", - " \"min_percent_threshold\": 1\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.TabularNumericalHistograms:monitoring_data\",\n", - " inputs={\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.PearsonCorrelationMatrix:monitoring_data\",\n", - " inputs={\n", - " \"dataset\": vm_monitoring_ds,\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.data_validation.HighPearsonCorrelation:monitoring_data\",\n", - " inputs={\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - " params={\n", - " \"feature_columns\": vm_monitoring_ds.feature_columns,\n", - " \"max_threshold\": 0.5,\n", - " \"top_n_correlations\": 10\n", - " }\n", - ").log()" - ] - }, - { 
- "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.ClassImbalanceDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " },\n", - " params={\n", - " \"drift_pct_threshold\": 1\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_7__'></a>\n", - "\n", - "### Target and feature drift\n", - "\n", - "Next, the goal is to investigate the distributional characteristics of predictions and features to determine if the underlying data has changed. These tests are crucial for assessing the expected accuracy of the model.\n", - "\n", - "1. **Target drift:** We compare the dataset used for testing (reference data) with the monitoring data. This helps to identify any shifts in the target variable distribution.\n", - "2. **Feature drift:** We compare the training dataset with the monitoring data. Since features were used to train the model, any drift in these features could indicate potential issues, as the underlying patterns that the model was trained on may have changed." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, we can examine the correlation between features and predictions. Significant changes in these correlations may trigger a deeper assessment." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.PopulationStabilityIndex\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.TargetPredictionDistributionPlot\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"drift_pct_threshold\": 5\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now we want see difference in correlation pairs between model prediction and features." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.PredictionCorrelation\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"drift_pct_threshold\": 5\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Finally for target drift, let's plot each prediction value and feature grid side by side." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.PredictionQuantilesAcrossFeatures\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, let's add run a test to investigate how or if the features have drifted. 
In this instance we want to compare the training data with prediction data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.FeatureDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"psi_threshold\": 0.2,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_8__'></a>\n", - "\n", - "### Classification accuracy\n", - "\n", - "We now evaluate the model's predictive performance by comparing its behavior between reference and monitoring datasets. These tests analyze shifts in overall accuracy metrics, examine changes in the confusion matrix to identify specific classification pattern changes, and assess the model's probability calibration across different prediction thresholds. \n", - "\n", - "The primary objective is to detect any degradation in the model's classification performance that might indicate reliability issues in production. The tests provide both aggregate performance metrics and detailed breakdowns of prediction patterns, enabling the identification of specific areas where the model's accuracy might be deteriorating." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.ClassificationAccuracyDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"drift_pct_threshold\": 5,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.ConfusionMatrixDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"drift_pct_threshold\": 5,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.CalibrationCurveDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"n_bins\": 10,\n", - " \"drift_pct_threshold\": 10,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_9__'></a>\n", - "\n", - "### Class discrimination\n", - "\n", - "The following tests assess the model's ability to effectively separate different classes in both reference and monitoring datasets. These tests analyze the model's discriminative power by examining the separation between class distributions, evaluating changes in the ROC curve characteristics, comparing probability distribution patterns, and assessing cumulative prediction trends. \n", - "\n", - "The primary objective is to identify any deterioration in the model's ability to distinguish between classes, which could indicate a decline in model effectiveness. 
The tests examine both the overall discriminative capability and the granular patterns in prediction distributions, providing insights into whether the model maintains its ability to effectively differentiate between classes in the production environment." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.ClassDiscriminationDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"drift_pct_threshold\": 5,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.ROCCurveDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.PredictionProbabilitiesHistogramDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"drift_pct_threshold\": 10,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.CumulativePredictionProbabilitiesDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_10__'></a>\n", - "\n", - "### Scoring\n", - "\n", - "Next we analyze the distribution and stability of credit scores across reference and monitoring datasets. 
These tests evaluate shifts in score distributions, examine changes in score band populations, and assess the relationship between scores and default rates. \n", - "\n", - "The primary objective is to identify any significant changes in how the model assigns credit scores, which could indicate drift in risk assessment capabilities. The tests examine both the overall score distribution patterns and the specific performance within defined score bands, providing insights into whether the model maintains consistent and reliable risk segmentation." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.ScorecardHistogramDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " },\n", - " params={\n", - " \"score_column\": \"xgb_scores\",\n", - " \"drift_pct_threshold\": 20,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.ongoing_monitoring.ScoreBandsDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " params={\n", - " \"score_column\": \"xgb_scores\",\n", - " \"score_bands\": [500, 540, 570],\n", - " \"drift_pct_threshold\": 20,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_11__'></a>\n", - "\n", - "### Model insights" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.PermutationFeatureImportance\",\n", - " input_grid={\n", - " \"dataset\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": [vm_xgb_model]\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - 
"source": [ - "run_test(\n", - " \"validmind.model_validation.FeaturesAUC\",\n", - " input_grid={\n", - " \"model\": [vm_xgb_model],\n", - " \"dataset\": [vm_reference_ds, vm_monitoring_ds],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.SHAPGlobalImportance\",\n", - " input_grid={\n", - " \"model\": [vm_xgb_model],\n", - " \"dataset\": [vm_reference_ds, vm_monitoring_ds],\n", - " },\n", - " params={\n", - " \"kernel_explainer_samples\": 10,\n", - " \"tree_or_linear_explainer_samples\": 200,\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_12__'></a>\n", - "\n", - "### Diagnostic monitoring" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.WeakspotsDiagnosis\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.OverfitDiagnosis\",\n", - " inputs={\n", - " \"model\": vm_xgb_model,\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " },\n", - " params={\n", - " \"cut_off_threshold\": 0.04\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_13__'></a>\n", - "\n", - "### Robustness monitoring" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "run_test(\n", - " \"validmind.model_validation.sklearn.RobustnessDiagnosis\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", - " \"model\": vm_xgb_model,\n", - " },\n", - " 
params={\n", - " \"scaling_factor_std_dev_list\": [\n", - " 0.1,\n", - " 0.2,\n", - " 0.3,\n", - " 0.4,\n", - " 0.5\n", - " ],\n", - " \"performance_decay_threshold\": 0.05\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_14__'></a>\n", - "\n", - "### Performance history\n", - "\n", - "In this section we showcase how to track and visualize the temporal evolution of key model performance metrics, including AUC, F1 score, precision, recall, and accuracy. For demonstration purposes, the section simulates historical performance data by introducing a gradual downward trend and random noise to these metrics over a specified time period. These tests are useful for analyzing the stability and trends in model performance indicators, helping to identify potential degradation or unexpected fluctuations in model behavior over time. \n", - "\n", - "The main goal is to maintain a continuous record of model performance that can be used to detect gradual drift, sudden changes, or cyclical patterns in model effectiveness. This temporal monitoring approach provides early warning signals of potential issues and helps establish whether the model maintains consistent performance within acceptable boundaries throughout its deployment period." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "metrics = [metric for metric in list_metrics() if \"classification\" in metric]\n", - "\n", - "for metric_id in metrics:\n", - " describe_metric(metric_id)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = run_metric(\n", - " \"validmind.unit_metrics.classification.ROC_AUC\",\n", - " inputs={\n", - " \"model\": vm_xgb_model,\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - ")\n", - "auc = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = run_metric(\n", - " \"validmind.unit_metrics.classification.Accuracy\",\n", - " inputs={\n", - " \"model\": vm_xgb_model,\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - ")\n", - "accuracy = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = run_metric(\n", - " \"validmind.unit_metrics.classification.Recall\",\n", - " inputs={\n", - " \"model\": vm_xgb_model,\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - ")\n", - "recall = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "f1 = run_metric(\n", - " \"validmind.unit_metrics.classification.F1\",\n", - " inputs={\n", - " \"model\": vm_xgb_model,\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - ")\n", - "f1 = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "precision = run_metric(\n", - " \"validmind.unit_metrics.classification.Precision\",\n", - " inputs={\n", - " \"model\": vm_xgb_model,\n", - " \"dataset\": vm_monitoring_ds,\n", - " },\n", - ")\n", - "precision = result.metric" - ] - }, - { - "cell_type": "code", - "execution_count": 
null, - "metadata": {}, - "outputs": [], - "source": [ - "NUM_DAYS = 10\n", - "REFERENCE_DATE = datetime(2024, 1, 1) # Fixed date: January 1st, 2024\n", - "base_date = REFERENCE_DATE - timedelta(days=NUM_DAYS)\n", - "\n", - "\n", - "# Initial values\n", - "performance_metrics = {\n", - " \"AUC Score\": auc,\n", - " \"F1 Score\": f1,\n", - " \"Precision Score\": precision,\n", - " \"Recall Score\": recall,\n", - " \"Accuracy Score\": accuracy\n", - "}\n", - "\n", - "# Trend parameters\n", - "trend_factor = 0.98 # Slight downward trend (multiply by 0.98 each step)\n", - "noise_scale = 0.02 # Random fluctuation of ±2%\n", - "\n", - "\n", - "for i in range(NUM_DAYS):\n", - " recorded_at = base_date + timedelta(days=i)\n", - " print(f\"\\nrecorded_at: {recorded_at}\")\n", - "\n", - " # Log each metric with trend and noise\n", - " for metric_name, base_value in performance_metrics.items():\n", - " # Apply trend and add random noise\n", - " trend = base_value * (trend_factor ** i)\n", - " noise = np.random.normal(0, noise_scale * base_value)\n", - " value = max(0, min(1, trend + noise)) # Ensure value stays between 0 and 1\n", - " \n", - " log_metric(\n", - " key=metric_name,\n", - " value=value,\n", - " recorded_at=recorded_at.isoformat()\n", - " )\n", - " \n", - " print(f\"{metric_name:<15}: {value:.4f}\")\n" - ] - }, - { - "cell_type": "markdown", - "id": "copyright-a1aa6fcedbed410099c3b537625ad59b", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-eEL8LtKG-py3.10", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Ongoing Monitoring for Application Scorecard\n", + "\n", + "In this notebook, you'll learn how to seamlessly monitor your production models using the ValidMind Platform.\n", + "\n", + "We'll walk you through the process of initializing the ValidMind Library, loading a sample dataset and model, and running a monitoring test suite to quickly generate documentation about your new data and model." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply monitoring report template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Preview the monitoring report template](#toc2_3__) \n", + " - [Initialize the Python environment](#toc2_4__) \n", + " - [Preview the monitoring template](#toc2_5__) \n", + "- [Load the reference and monitoring datasets](#toc3__) \n", + "- [Train the model](#toc4__) \n", + " - [Initialize the ValidMind datasets](#toc4_1__) \n", + " - [Initialize the ValidMind model](#toc4_2__) \n", + " - [Assign prediction values and probabilities to the datasets](#toc4_3__) \n", + " - [Compute credit risk scores](#toc4_4__) \n", + " - [Adding custom context to the LLM descriptions](#toc4_5__) \n", + " - [Monitoring data description](#toc4_6__) \n", + " - [Target and feature drift](#toc4_7__) \n", + " - [Classification accuracy](#toc4_8__) \n", + " - [Class discrimination](#toc4_9__) \n", + " - [Scoring](#toc4_10__) \n", + " - [Model insights](#toc4_11__) \n", + " - [Diagnostic monitoring](#toc4_12__) \n", + " - [Robustness monitoring](#toc4_13__) \n", + " - [Performance history](#toc4_14__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation, validation, and monitoring tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record's performance and compliance over time, ensuring it remains valid under changing conditions. 
Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply monitoring report template\n", + "\n", + "Once you've registered your model, let's select a monitoring report template. A template predefines sections for your monitoring report and provides a general outline to follow, making the monitoring process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Monitoring**.\n", + "\n", + " If you cannot locate your Monitoring document, make sure Monitoring type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Ongoing Monitoring for Classification Models`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Monitoring` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"monitoring\",\n", + " monitoring = True,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Preview the monitoring report template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "import numpy as np\n", + "\n", + "from datetime import datetime, timedelta\n", + "\n", + "from validmind.tests import run_test\n", + "from validmind.datasets.credit_risk import lending_club\n", + "from validmind.unit_metrics import list_metrics\n", + "from validmind.unit_metrics import describe_metric\n", + "from validmind.unit_metrics import run_metric\n", + "from validmind.api_client import log_metric\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_5__'></a>\n", + "\n", + "### Preview the monitoring template\n", + "\n", + "A template predefines sections for your monitoring documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "You will upload documentation and test results into this template later on. 
For now, take a look at the structure that the template provides with the `vm.preview_template()` function from the ValidMind Library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the reference and monitoring datasets\n", + "\n", + "The sample dataset used here is provided by the ValidMind Library. For demonstration purposes, we'll use the training and test dataset splits as the `reference` and `monitoring` datasets, respectively." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "df = lending_club.load_data(source=\"offline\")\n", + "df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "preprocess_df = lending_club.preprocess(df)\n", + "preprocess_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "fe_df = lending_club.feature_engineering(preprocess_df)\n", + "fe_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Train the model\n", + "\n", + "In this section, we focus on constructing and refining our predictive model. \n", + "- We begin by dividing our data, which is based on Weight of Evidence (WoE) features, into training and testing sets (`train_df`, `test_df`). \n", + "- With `lending_club.split`, we employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Split the data\n", + "train_df, test_df = lending_club.split(fe_df, test_size=0.2)\n", + "\n", + "x_train = train_df.drop(lending_club.target_column, axis=1)\n", + "y_train = train_df[lending_club.target_column]\n", + "\n", + "x_test = test_df.drop(lending_club.target_column, axis=1)\n", + "y_test = test_df[lending_club.target_column]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Define the XGBoost model\n", + "xgb_model = xgb.XGBClassifier(\n", + " n_estimators=50, \n", + " random_state=42, \n", + " early_stopping_rounds=10\n", + ")\n", + "xgb_model.set_params(\n", + " eval_metric=[\"error\", \"logloss\", \"auc\"],\n", + ")\n", + "\n", + "# Fit the model\n", + "xgb_model.fit(\n", + " x_train, \n", + " y_train,\n", + " eval_set=[(x_test, y_test)],\n", + " verbose=False\n", + ")\n", + "\n", + "# Compute probabilities\n", + "train_xgb_prob = xgb_model.predict_proba(x_train)[:, 1]\n", + "test_xgb_prob = xgb_model.predict_proba(x_test)[:, 1]\n", + "\n", + "# Compute binary predictions\n", + "cut_off_threshold = 0.3\n", + "train_xgb_binary_predictions = (train_xgb_prob > cut_off_threshold).astype(int)\n", + "test_xgb_binary_predictions = (test_xgb_prob > cut_off_threshold).astype(int)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset` — The raw dataset that you want to provide as input to tests.\n", + "- `input_id` - A unique identifier that allows tracking what inputs are used 
when running each individual test.\n", + "- `target_column` - A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", + "\n", + "With the data split, you can now initialize the reference and monitoring datasets (`train_df` and `test_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_reference_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"reference_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "vm_monitoring_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"monitoring_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Initialize the ValidMind model\n", + "\n", + "You will also need to initialize a ValidMind model object (`vm_xgb_model`) that can be passed to other functions for analysis and tests on the data.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_xgb_model = vm.init_model(\n", + " xgb_model,\n", + " 
input_id=\"xgb_model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3__'></a>\n", + "\n", + "### Assign prediction values and probabilities to the datasets\n", + "\n", + "With our model now trained, we'll move on to assigning both the predictive probabilities coming directly from the model's predictions, and the binary prediction after applying the cutoff threshold described in the previous steps. \n", + "- These tasks are achieved through the use of the `assign_predictions()` method associated with the VM `dataset` object.\n", + "- This method links the model's class prediction values and probabilities to our VM train and test datasets." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_reference_ds.assign_predictions(\n", + " model=vm_xgb_model,\n", + " prediction_values=train_xgb_binary_predictions,\n", + " prediction_probabilities=train_xgb_prob,\n", + ")\n", + "\n", + "vm_monitoring_ds.assign_predictions(\n", + " model=vm_xgb_model,\n", + " prediction_values=test_xgb_binary_predictions,\n", + " prediction_probabilities=test_xgb_prob,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_4__'></a>\n", + "\n", + "### Compute credit risk scores\n", + "\n", + "In this phase, we translate model predictions into actionable scores using probability estimates generated by our trained model." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "train_xgb_scores = lending_club.compute_scores(train_xgb_prob)\n", + "test_xgb_scores = lending_club.compute_scores(test_xgb_prob)\n", + "\n", + "# Assign scores to the datasets\n", + "vm_reference_ds.add_extra_column(\"xgb_scores\", train_xgb_scores)\n", + "vm_monitoring_ds.add_extra_column(\"xgb_scores\", test_xgb_scores)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_5__'></a>\n", + "\n", + "### Adding custom context to the LLM descriptions\n", + "\n", + "To add custom context to the LLM-generated test descriptions, set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1`, then provide the context itself in the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT` environment variable. This is a global setting that affects all tests." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import os\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", + "\n", + "context = \"\"\"\n", + "FORMAT FOR THE LLM DESCRIPTIONS: \n", + " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", + " extracted from the test description>.\n", + "\n", + " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. \n", + " Include any relevant formulas or methodologies mentioned in the test description.>\n", + "\n", + " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", + " highlighting what makes it particularly useful for specific scenarios.>\n", + "\n", + " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", + " Include both technical limitations and interpretation challenges. 
\n", + " If the test description includes specific signs of high risk, incorporate these here.>\n", + "\n", + " **Key Insights:**\n", + "\n", + " The test results reveal:\n", + "\n", + " - **<insight title>**: <comprehensive description of one aspect of the results>\n", + " - **<insight title>**: <comprehensive description of another aspect>\n", + " ...\n", + "\n", + " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", + " purpose and provides any final recommendations or considerations.>\n", + "\n", + "ADDITIONAL INSTRUCTIONS:\n", + " Present insights in order from general to specific, with each insight as a single bullet point with bold title.\n", + "\n", + " For each metric in the test results, include in the test overview:\n", + " - The metric's purpose and what it measures\n", + " - Its mathematical formula\n", + " - The range of possible values\n", + " - What constitutes good/bad performance\n", + " - How to interpret different values\n", + "\n", + " Each insight should progressively cover:\n", + " 1. Overall scope and distribution\n", + " 2. Complete breakdown of all elements with specific values\n", + " 3. Natural groupings and patterns\n", + " 4. Comparative analysis between datasets/categories\n", + " 5. Stability and variations\n", + " 6. 
Notable relationships or dependencies\n", + "\n", + " Remember:\n", + " - Keep all insights at the same level (no sub-bullets or nested structures)\n", + " - Make each insight complete and self-contained\n", + " - Include specific numerical values and ranges\n", + " - Cover all elements in the results comprehensively\n", + " - Maintain clear, concise language\n", + " - Use only \"- **Title**: Description\" format for insights\n", + " - Progress naturally from general to specific observations\n", + "\n", + "\"\"\".strip()\n", + "\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_6__'></a>\n", + "\n", + "### Monitoring data description\n", + "\n", + "The Monitoring Data Description tests aim to provide a comprehensive statistical analysis of the monitoring dataset's characteristics. These tests examine the basic statistical properties, identify any missing data patterns, assess data uniqueness, visualize numerical feature distributions, and evaluate feature relationships through correlation analysis.\n", + "\n", + "The primary objective is to establish a baseline understanding of the monitoring data's structure and quality, enabling the detection of any significant deviations from expected patterns that could impact model performance. Each test is designed to capture different aspects of the data, from univariate statistics to multivariate relationships, providing a foundation for ongoing data quality assessment in the production environment." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.DescriptiveStatistics:monitoring_data\",\n", + " inputs={\n", + " \"dataset\": vm_monitoring_ds,\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.MissingValues:monitoring_data\",\n", + " inputs={\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + " params={\n", + " \"min_percentage_threshold\": 1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.UniqueRows:monitoring_data\",\n", + " inputs={\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + " params={\n", + " \"min_percent_threshold\": 1\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.TabularNumericalHistograms:monitoring_data\",\n", + " inputs={\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.PearsonCorrelationMatrix:monitoring_data\",\n", + " inputs={\n", + " \"dataset\": vm_monitoring_ds,\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.data_validation.HighPearsonCorrelation:monitoring_data\",\n", + " inputs={\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + " params={\n", + " \"feature_columns\": vm_monitoring_ds.feature_columns,\n", + " \"max_threshold\": 0.5,\n", + " \"top_n_correlations\": 10\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { 
+ "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.ClassImbalanceDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " },\n", + " params={\n", + " \"drift_pct_threshold\": 1\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_7__'></a>\n", + "\n", + "### Target and feature drift\n", + "\n", + "Next, the goal is to investigate the distributional characteristics of predictions and features to determine if the underlying data has changed. These tests are crucial for assessing the expected accuracy of the model.\n", + "\n", + "1. **Target drift:** We compare the dataset used for testing (reference data) with the monitoring data. This helps to identify any shifts in the target variable distribution.\n", + "2. **Feature drift:** We compare the training dataset with the monitoring data. Since features were used to train the model, any drift in these features could indicate potential issues, as the underlying patterns that the model was trained on may have changed." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we can examine the correlation between features and predictions. Significant changes in these correlations may trigger a deeper assessment." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.PopulationStabilityIndex\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.TargetPredictionDistributionPlot\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"drift_pct_threshold\": 5\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we want to see the difference in correlation pairs between model predictions and features." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.PredictionCorrelation\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"drift_pct_threshold\": 5\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, for target drift, let's plot each prediction value and feature grid side by side." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.PredictionQuantilesAcrossFeatures\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, let's run a test to investigate whether and how the features have drifted. 
In this instance we want to compare the training data with prediction data." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.FeatureDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"psi_threshold\": 0.2,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_8__'></a>\n", + "\n", + "### Classification accuracy\n", + "\n", + "We now evaluate the model's predictive performance by comparing its behavior between reference and monitoring datasets. These tests analyze shifts in overall accuracy metrics, examine changes in the confusion matrix to identify specific classification pattern changes, and assess the model's probability calibration across different prediction thresholds. \n", + "\n", + "The primary objective is to detect any degradation in the model's classification performance that might indicate reliability issues in production. The tests provide both aggregate performance metrics and detailed breakdowns of prediction patterns, enabling the identification of specific areas where the model's accuracy might be deteriorating." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.ClassificationAccuracyDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"drift_pct_threshold\": 5,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.ConfusionMatrixDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"drift_pct_threshold\": 5,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.CalibrationCurveDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"n_bins\": 10,\n", + " \"drift_pct_threshold\": 10,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_9__'></a>\n", + "\n", + "### Class discrimination\n", + "\n", + "The following tests assess the model's ability to effectively separate different classes in both reference and monitoring datasets. These tests analyze the model's discriminative power by examining the separation between class distributions, evaluating changes in the ROC curve characteristics, comparing probability distribution patterns, and assessing cumulative prediction trends. \n", + "\n", + "The primary objective is to identify any deterioration in the model's ability to distinguish between classes, which could indicate a decline in model effectiveness. 
The tests examine both the overall discriminative capability and the granular patterns in prediction distributions, providing insights into whether the model maintains its ability to effectively differentiate between classes in the production environment." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.ClassDiscriminationDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"drift_pct_threshold\": 5,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.ROCCurveDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.PredictionProbabilitiesHistogramDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"drift_pct_threshold\": 10,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.CumulativePredictionProbabilitiesDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_10__'></a>\n", + "\n", + "### Scoring\n", + "\n", + "Next we analyze the distribution and stability of credit scores across reference and monitoring datasets. 
These tests evaluate shifts in score distributions, examine changes in score band populations, and assess the relationship between scores and default rates. \n", + "\n", + "The primary objective is to identify any significant changes in how the model assigns credit scores, which could indicate drift in risk assessment capabilities. The tests examine both the overall score distribution patterns and the specific performance within defined score bands, providing insights into whether the model maintains consistent and reliable risk segmentation." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.ScorecardHistogramDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " },\n", + " params={\n", + " \"score_column\": \"xgb_scores\",\n", + " \"drift_pct_threshold\": 20,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.ongoing_monitoring.ScoreBandsDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " \"score_column\": \"xgb_scores\",\n", + " \"score_bands\": [500, 540, 570],\n", + " \"drift_pct_threshold\": 20,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_11__'></a>\n", + "\n", + "### Model insights" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.PermutationFeatureImportance\",\n", + " input_grid={\n", + " \"dataset\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": [vm_xgb_model]\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " 
\"validmind.model_validation.FeaturesAUC\",\n", + " input_grid={\n", + " \"model\": [vm_xgb_model],\n", + " \"dataset\": [vm_reference_ds, vm_monitoring_ds],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.SHAPGlobalImportance\",\n", + " input_grid={\n", + " \"model\": [vm_xgb_model],\n", + " \"dataset\": [vm_reference_ds, vm_monitoring_ds],\n", + " },\n", + " params={\n", + " \"kernel_explainer_samples\": 10,\n", + " \"tree_or_linear_explainer_samples\": 200,\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_12__'></a>\n", + "\n", + "### Diagnostic monitoring" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.WeakspotsDiagnosis\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.OverfitDiagnosis\",\n", + " inputs={\n", + " \"model\": vm_xgb_model,\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " },\n", + " params={\n", + " \"cut_off_threshold\": 0.04\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_13__'></a>\n", + "\n", + "### Robustness monitoring" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "run_test(\n", + " \"validmind.model_validation.sklearn.RobustnessDiagnosis\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitoring_ds],\n", + " \"model\": vm_xgb_model,\n", + " },\n", + " params={\n", + " 
\"scaling_factor_std_dev_list\": [\n", + " 0.1,\n", + " 0.2,\n", + " 0.3,\n", + " 0.4,\n", + " 0.5\n", + " ],\n", + " \"performance_decay_threshold\": 0.05\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_14__'></a>\n", + "\n", + "### Performance history\n", + "\n", + "In this section we showcase how to track and visualize the temporal evolution of key model performance metrics, including AUC, F1 score, precision, recall, and accuracy. For demonstration purposes, the section simulates historical performance data by introducing a gradual downward trend and random noise to these metrics over a specified time period. These tests are useful for analyzing the stability and trends in model performance indicators, helping to identify potential degradation or unexpected fluctuations in model behavior over time. \n", + "\n", + "The main goal is to maintain a continuous record of model performance that can be used to detect gradual drift, sudden changes, or cyclical patterns in model effectiveness. This temporal monitoring approach provides early warning signals of potential issues and helps establish whether the model maintains consistent performance within acceptable boundaries throughout its deployment period." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "metrics = [metric for metric in list_metrics() if \"classification\" in metric]\n", + "\n", + "for metric_id in metrics:\n", + " describe_metric(metric_id)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.ROC_AUC\",\n", + " inputs={\n", + " \"model\": vm_xgb_model,\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + ")\n", + "auc = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.Accuracy\",\n", + " inputs={\n", + " \"model\": vm_xgb_model,\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + ")\n", + "accuracy = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.Recall\",\n", + " inputs={\n", + " \"model\": vm_xgb_model,\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + ")\n", + "recall = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.F1\",\n", + " inputs={\n", + " \"model\": vm_xgb_model,\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + ")\n", + "f1 = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = run_metric(\n", + " \"validmind.unit_metrics.classification.Precision\",\n", + " inputs={\n", + " \"model\": vm_xgb_model,\n", + " \"dataset\": vm_monitoring_ds,\n", + " },\n", + ")\n", + "precision = result.metric" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + 
"source": [ + "NUM_DAYS = 10\n", + "REFERENCE_DATE = datetime(2024, 1, 1) # Fixed date: January 1st, 2024\n", + "base_date = REFERENCE_DATE - timedelta(days=NUM_DAYS)\n", + "\n", + "\n", + "# Initial values\n", + "performance_metrics = {\n", + " \"AUC Score\": auc,\n", + " \"F1 Score\": f1,\n", + " \"Precision Score\": precision,\n", + " \"Recall Score\": recall,\n", + " \"Accuracy Score\": accuracy\n", + "}\n", + "\n", + "# Trend parameters\n", + "trend_factor = 0.98 # Slight downward trend (multiply by 0.98 each step)\n", + "noise_scale = 0.02 # Random fluctuation of ±2%\n", + "\n", + "\n", + "for i in range(NUM_DAYS):\n", + " recorded_at = base_date + timedelta(days=i)\n", + " print(f\"\\nrecorded_at: {recorded_at}\")\n", + "\n", + " # Log each metric with trend and noise\n", + " for metric_name, base_value in performance_metrics.items():\n", + " # Apply trend and add random noise\n", + " trend = base_value * (trend_factor ** i)\n", + " noise = np.random.normal(0, noise_scale * base_value)\n", + " value = max(0, min(1, trend + noise)) # Ensure value stays between 0 and 1\n", + " \n", + " log_metric(\n", + " key=metric_name,\n", + " value=value,\n", + " recorded_at=recorded_at.isoformat()\n", + " )\n", + " \n", + " print(f\"{metric_name:<15}: {value:.4f}\")\n" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-a1aa6fcedbed410099c3b537625ad59b" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-eEL8LtKG-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/site/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb b/site/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb index 839da8e5f6..9be6aa92fe 100644 --- a/site/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb +++ b/site/notebooks/use_cases/ongoing_monitoring/quickstart_customer_churn_ongoing_monitoring.ipynb @@ -1,908 +1,914 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Quickstart for ongoing monitoring of models with ValidMind\n", - "\n", - "Welcome! In this quickstart guide, you'll learn how to seamlessly monitor your production models using the ValidMind Platform.\n", - "\n", - "We'll walk you through the process of initializing the ValidMind Library, loading a sample dataset and model, and running a monitoring test suite to quickly generate documentation about your new data and model.\n", - "\n", - "This notebook utilizes the [Bank Customer Churn Prediction](https://www.kaggle.com/code/kmalit/bank-customer-churn-prediction/data) dataset from Kaggle to train a simple classification model for demonstration purposes." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply monitoring report template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - " - [Preview the monitoring report template](#toc2_4__) \n", - "- [Load the reference and monitoring datasets](#toc3__) \n", - " - [Load the production model](#toc3_1__) \n", - " - [Initialize the ValidMind datasets](#toc3_2__) \n", - " - [Initialize the ValidMind model](#toc3_3__) \n", - " - [Assign predictions to the datasets](#toc3_4__) \n", - " - [Run the ongoing monitoring tests](#toc3_5__) \n", - " - [Conduct target and feature drift testing](#toc3_6__) \n", - " - [Feature drift tests](#toc3_6_1__) \n", - " - [Model performance monitoring tests](#toc3_7__) \n", - "- [Next steps](#toc4__) \n", - " - [Work with your monitoring report](#toc4_1__) \n", - " - [Discover more learning resources](#toc4_2__) \n", - "- [Upgrade ValidMind](#toc5__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation, validation, and monitoring tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Model monitoring report**: A comprehensive and structured record of a production model, including key elements such as data sources, inputs, performance metrics, and periodic evaluations. 
This documentation ensures transparency and visibility of the model's performance in the production environment.\n", - "\n", - "**Monitoring report template**: Similar to a documentation template, the monitoring report template functions as a test suite and lays out the structure of model monitoring, segmented into various sections and sub-sections. Monitoring report templates define the structure of your model monitoring report, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Custom tests**: Custom tests are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: A single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom test.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom test. 
See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. 
Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply monitoring report template\n", - "\n", - "Once you've registered your model, let's select a monitoring report template. A template predefines sections for your monitoring report and provides a general outline to follow, making the monitoring process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Monitoring**.\n", - "\n", - " If you cannot locate your Monitoring document, make sure Monitoring type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Ongoing Monitoring for Classification Models`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Monitoring` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"monitoring\",\n", - " monitoring = True,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "import validmind as vm\n", - "import pandas as pd\n", - "import numpy as np\n", - "import seaborn as sns\n", - "import matplotlib.pyplot as plt\n", - "from validmind.tests import run_test\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Preview the monitoring report template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the reference and monitoring datasets\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. For demonstration purposes we'll use the training, test and validation dataset splits as `training`, `reference` and `monitoring` datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.classification import customer_churn\n", - "\n", - "raw_df = customer_churn.load_data()\n", - "\n", - "train_df, reference_df, monitor_df = customer_churn.preprocess(raw_df)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Load the production model\n", - "\n", - "We will also load a pre-trained model for demonstration purposes. This is a simple XGBoost model trained on the Bank Customer Churn Prediction dataset." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "\n", - "# Load the saved model\n", - "model = xgb.XGBClassifier()\n", - "model.load_model(\"xgboost_model.model\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset` — The raw dataset that you want to provide as input to tests.\n", - "- `input_id` - A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- `target_column` — A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", - "- `class_labels` — An optional value to map predicted classes to class labels.\n", - "\n", - "With all datasets ready, you can now initialize training, reference(test) and monitor datasets (`train_df`, `reference_df` and `monitor_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_df\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "\n", - "vm_reference_ds = vm.init_dataset(\n", - " dataset=reference_df,\n", - " input_id=\"reference_df\",\n", - " target_column=customer_churn.target_column,\n", - ")\n", - "\n", - "vm_monitor_ds = vm.init_dataset(\n", - " dataset=monitor_df,\n", - " input_id=\"monitor_dataset\",\n", - " 
target_column=customer_churn.target_column,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_3__'></a>\n", - "\n", - "### Initialize the ValidMind model\n", - "\n", - "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_model = vm.init_model(\n", - " model,\n", - " input_id=\"model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_4__'></a>\n", - "\n", - "### Assign predictions to the datasets\n", - "\n", - "We can now use the `assign_predictions()` method from the Dataset object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model,\n", - ")\n", - "\n", - "vm_reference_ds.assign_predictions(\n", - " model=vm_model,\n", - ")\n", - "\n", - "vm_monitor_ds.assign_predictions(\n", - " model=vm_model,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_5__'></a>\n", - "\n", - "### Run the ongoing monitoring tests\n", - "\n", - "Before we start the testing procedure, let's take a look at the expected tests that are pre-configured:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test_list = vm.get_test_suite().get_default_config()\n", - "for l in test_list:\n", - " print(l)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's run the first test in the list. Note that you can use `vm.tests.describe_test()` to get information about the inputs required for the test:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.describe_test(\"validmind.model_validation.ModelMetadata\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "As you can see, the `ModelMetadata` only requires a model input. 
Let's run the test and log the results into the monitoring document with the `.log()` method:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test_result = vm.tests.run_test(\n", - " \"validmind.model_validation.ModelMetadata\",\n", - " model=vm_model,\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's run the tests needed to determine data quality of the monitoring dataset:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "data_qual = vm.get_test_suite(\n", - " section=\"prediction_data_description\"\n", - ").get_default_config()\n", - "\n", - "# Run all of the necessary data quality checks where the monitoring dataset is the basis\n", - "for l in data_qual:\n", - " vm.tests.run_test(\n", - " l,\n", - " inputs={\"dataset\": vm_monitor_ds},\n", - " show=False,\n", - " ).log()\n", - " print(\"Completed test: {0}\".format(l))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To view the results of the model metadata and data quality tests, select **Monitoring** under Documents in the left sidebar of the model in the ValidMind Platform and click on the following sections:\n", - "\n", - "- 1. Model Monitoring Overview > **1.2. Model Details**\n", - "- 2. Data Quality & Drift Assessment > **2.1. Prediction Data Description**" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, let's run *comparison tests*, which will allow comparing differences between the training dataset and monitoring datasets. To run a test in comparison mode, you only need to pass an `input_grid` parameter to the `run_test()` method instead of `inputs`.\n", - "\n", - "For more information about comparison tests, see this [notebook](../../how_to/tests/run_tests/2-run_comparison_tests.ipynb)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "correlation_tests = [\n", - " \"validmind.data_validation.PearsonCorrelationMatrix:train_vs_test\",\n", - " \"validmind.data_validation.HighPearsonCorrelation:train_vs_test\",\n", - "]\n", - "\n", - "for test in correlation_tests:\n", - " vm.tests.run_test(\n", - " test,\n", - " input_grid={\n", - " \"dataset\": [vm_train_ds, vm_monitor_ds],\n", - " \"model\": [vm_model],\n", - " },\n", - " show=False,\n", - " ).log()\n", - " print(\"Completed test {0}\".format(test))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can view these results in the ValidMind Platform in **Ongoing Monitoring** within Documents under the following section:\n", - "\n", - "- 2. Data Quality & Drift Assessment > **2.2. Prediction Data Correlations and Interactions**" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_6__'></a>\n", - "\n", - "### Conduct target and feature drift testing\n", - "\n", - "Next, the goal is to investigate the distributional characteristics of predictions and features to determine if the underlying data has changed. These tests are crucial for assessing the expected accuracy of the model.\n", - "\n", - "1. **Target drift:** We compare the dataset used for testing (reference data) with the monitoring data. This helps to identify any shifts in the target variable distribution.\n", - "2. **Feature drift:** We compare the training dataset with the monitoring data. Since features were used to train the model, any drift in these features could indicate potential issues, as the underlying patterns that the model was trained on may have changed." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In the 2. 
Data Quality & Drift Assessment > **2.3 Target Drift** section, we can confirm that there is only one pre-configured test:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for l in vm.get_test_suite(section=\"comparison_data_target\").get_default_config():\n", - " print(l)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "As part of running the rest of the tests, we will directly log the results to a section when calling the `.log()` method." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "First, let's run the *Population Stability Index (PSI)* for predictions. In this case, we want to compare the test data with the monitoring data. (Note: For predictions, the training data is irrelevant.)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.PopulationStabilityIndex\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", - " \"model\": vm_model,\n", - " },\n", - " show=False,\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, we can examine the correlation between features and predictions. Significant changes in these correlations may trigger a deeper assessment." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.ongoing_monitoring.TargetPredictionDistributionPlot\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", - " \"model\": vm_model,\n", - " },\n", - " show=False,\n", - ").log(section_id=\"comparison_data_target\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now we want to see the difference in correlation pairs between model predictions and features." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.ongoing_monitoring.PredictionCorrelation\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", - " \"model\": vm_model,\n", - " },\n", - " show=False,\n", - ").log(section_id=\"comparison_data_target\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Finally, for target drift, let's plot each prediction value and feature grid side by side." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.ongoing_monitoring.PredictionAcrossEachFeature\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", - " \"model\": vm_model,\n", - " },\n", - " show=False,\n", - ").log(section_id=\"comparison_data_target\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_6_1__'></a>\n", - "\n", - "#### Feature drift tests\n", - "\n", - "Next, let's run a test to investigate whether and how the features have drifted. In this instance we want to compare the training data with prediction data. These results will be logged in the 2. Data Quality & Drift Assessment > **2.4. Feature Drift** section." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.ongoing_monitoring.FeatureDrift\",\n", - " inputs={\n", - " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", - " \"model\": vm_model,\n", - " },\n", - " show=False,\n", - ").log(section_id=\"comparison_data_feature\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_7__'></a>\n", - "\n", - "### Model performance monitoring tests\n", - "\n", - "Let's wrap up by monitoring the model's performance. 
Keep in mind that in some cases, it may not be possible to determine accuracy if the ground truth is unavailable. If this is the case, you can skip this test and instead focus on target and feature drift to inform the model owners.\n", - "\n", - "The pre-configured tests for model performance are:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for l in vm.get_test_suite(section=\"model_performance_monitoring\").get_default_config():\n", - " print(l)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The code below will run the tests and log the results into the monitoring document for each of the tests. Note the use of `input_grid` again, which is required for comparison tests:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Use the reference dataset vs monitoring dataset - the true comparison of accuracy\n", - "for test in vm.get_test_suite(\n", - " section=\"model_performance_monitoring\"\n", - ").get_default_config():\n", - " if test == \"validmind.model_validation.statsmodels.GINITable\":\n", - " vm.tests.run_test(\n", - " \"validmind.model_validation.statsmodels.GINITable\",\n", - " input_grid={\n", - " \"dataset\": [vm_reference_ds, vm_monitor_ds],\n", - " \"model\": [vm_model],\n", - " },\n", - " show=False,\n", - " ).log()\n", - " else:\n", - " vm.tests.run_test(\n", - " test,\n", - " input_grid={\n", - " \"dataset\": [vm_reference_ds, vm_monitor_ds],\n", - " \"model\": [vm_model],\n", - " },\n", - " show=False,\n", - " ).log()\n", - " print(\"Completed test: {0}\".format(test))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. 
But there is a better way — use the ValidMind Platform to work with your monitoring report.\n", - "\n", - "<a id='toc4_1__'></a>\n", - "\n", - "### Work with your monitoring report\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Monitoring** under Documents.\n", - "\n", - "What you see is the full draft of your monitoring report in a more easily consumable version. From here, you can make qualitative edits to monitoring reports, view guidelines, review monitoring results, and submit your monitoring report for approval when it's ready. (**Learn more:** [Ongoing monitoring](https://docs.validmind.ai/guide/monitoring/ongoing-monitoring.html))\n", - "\n", - "<a id='toc4_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-06926ffb7c9846eca24d1130049d6316", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "colab": { - "provenance": [] - }, - "gpuClass": "standard", - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 4 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quickstart for ongoing monitoring of models with ValidMind\n", + "\n", + "Welcome! In this quickstart guide, you'll learn how to seamlessly monitor your production models using the ValidMind Platform.\n", + "\n", + "We'll walk you through the process of initializing the ValidMind Library, loading a sample dataset and model, and running a monitoring test suite to quickly generate documentation about your new data and model.\n", + "\n", + "This notebook utilizes the [Bank Customer Churn Prediction](https://www.kaggle.com/code/kmalit/bank-customer-churn-prediction/data) dataset from Kaggle to train a simple classification model for demonstration purposes." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply monitoring report template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + " - [Preview the monitoring report template](#toc2_4__) \n", + "- [Load the reference and monitoring datasets](#toc3__) \n", + " - [Load the production model](#toc3_1__) \n", + " - [Initialize the ValidMind datasets](#toc3_2__) \n", + " - [Initialize the ValidMind model](#toc3_3__) \n", + " - [Assign predictions to the datasets](#toc3_4__) \n", + " - [Run the ongoing monitoring tests](#toc3_5__) \n", + " - [Conduct target and feature drift testing](#toc3_6__) \n", + " - [Feature drift tests](#toc3_6_1__) \n", + " - [Model performance monitoring tests](#toc3_7__) \n", + "- [Next steps](#toc4__) \n", + " - [Work with your monitoring report](#toc4_1__) \n", + " - [Discover more learning resources](#toc4_2__) \n", + "- [Upgrade ValidMind](#toc5__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. 
DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation, validation, and monitoring tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**ongoing monitoring report**: A comprehensive and structured periodic report assessing the record's performance and compliance over time, ensuring it remains valid under changing conditions. 
Monitoring includes key elements such as data sources, inputs, performance metrics, and periodic evaluations, ensuring transparency and visibility of the record's performance in the production environment.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**monitoring template, monitoring report template**: A default ValidMind document template that serves as a standardized framework for ongoing monitoring, including sections designated for test results, performance metrics, and drift analyses. By outlining required monitoring checks and expected routine tests, monitoring templates ensure consistency and completeness across monitoring reports and help guide owners through a systematic monitoring process while promoting early detection of performance degradation.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply monitoring report template\n", + "\n", + "Once you've registered your model, let's select a monitoring report template. A template predefines sections for your monitoring report and provides a general outline to follow, making the monitoring process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Monitoring**.\n", + "\n", + " If you cannot locate your Monitoring document, make sure Monitoring type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Ongoing Monitoring for Classification Models`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Monitoring` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"monitoring\",\n", + " monitoring = True,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "import validmind as vm\n", + "import pandas as pd\n", + "import numpy as np\n", + "import seaborn as sns\n", + "import matplotlib.pyplot as plt\n", + "from validmind.tests import run_test\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Preview the monitoring report template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the reference and monitoring datasets\n", + "\n", + "The sample dataset used here is provided by the ValidMind library. For demonstration purposes we'll use the training, test and validation dataset splits as `training`, `reference` and `monitoring` datasets." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.classification import customer_churn\n", + "\n", + "raw_df = customer_churn.load_data()\n", + "\n", + "train_df, reference_df, monitor_df = customer_churn.preprocess(raw_df)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Load the production model\n", + "\n", + "We will also load a pre-trained model for demonstration purposes. This is a simple XGBoost model trained on the Bank Customer Churn Prediction dataset." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "\n", + "# Load the saved model\n", + "model = xgb.XGBClassifier()\n", + "model.load_model(\"xgboost_model.model\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset` — The raw dataset that you want to provide as input to tests.\n", + "- `input_id` - A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- `target_column` — A required argument if tests require access to true values. This is the name of the target column in the dataset.\n", + "- `class_labels` — An optional value to map predicted classes to class labels.\n", + "\n", + "With all datasets ready, you can now initialize training, reference(test) and monitor datasets (`train_df`, `reference_df` and `monitor_df`) created earlier into their own dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_df\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "\n", + "vm_reference_ds = vm.init_dataset(\n", + " dataset=reference_df,\n", + " input_id=\"reference_df\",\n", + " target_column=customer_churn.target_column,\n", + ")\n", + "\n", + "vm_monitor_ds = vm.init_dataset(\n", + " dataset=monitor_df,\n", + " input_id=\"monitor_dataset\",\n", + " target_column=customer_churn.target_column,\n", + 
")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_3__'></a>\n", + "\n", + "### Initialize the ValidMind model\n", + "\n", + "You'll also need to initialize a ValidMind model object (`vm_model`) that can be passed to other functions for analysis and tests on the data for our model.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_model = vm.init_model(\n", + " model,\n", + " input_id=\"model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_4__'></a>\n", + "\n", + "### Assign predictions to the datasets\n", + "\n", + "We can now use the `assign_predictions()` method from the Dataset object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model,\n", + ")\n", + "\n", + "vm_reference_ds.assign_predictions(\n", + " model=vm_model,\n", + ")\n", + "\n", + "vm_monitor_ds.assign_predictions(\n", + " model=vm_model,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_5__'></a>\n", + "\n", + "### Run the ongoing monitoring tests\n", + "\n", + "Before we start the testing procedure, let's take a look at the expected tests that are pre-configured:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test_list = vm.get_test_suite().get_default_config()\n", + "for l in test_list:\n", + " print(l)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's run the first test in the list. Note that you can use `vm.tests.describe_test()` to get information about the inputs required for the test:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.describe_test(\"validmind.model_validation.ModelMetadata\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As you can see, the `ModelMetadata` only requires a model input. 
Let's run the test and log the results into the monitoring document with the `.log()` method:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test_result = vm.tests.run_test(\n", + " \"validmind.model_validation.ModelMetadata\",\n", + " model=vm_model,\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's run the tests needed to determine data quality of the monitoring dataset:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "data_qual = vm.get_test_suite(\n", + " section=\"prediction_data_description\"\n", + ").get_default_config()\n", + "\n", + "# Run all of the necessary data quality checks where the monitoring dataset is the basis\n", + "for l in data_qual:\n", + " vm.tests.run_test(\n", + " l,\n", + " inputs={\"dataset\": vm_monitor_ds},\n", + " show=False,\n", + " ).log()\n", + " print(\"Completed test: {0}\".format(l))" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To view the results of the model metadata and data quality tests, select **Monitoring** under Documents in the left sidebar of the model in the ValidMind Platform and click on the following sections:\n", + "\n", + "- 1. Model Monitoring Overview > **1.2. Model Details**\n", + "- 2. Data Quality & Drift Assessment > **2.1. Prediction Data Description**" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, let's run *comparison tests*, which will allow comparing differences between the training dataset and monitoring datasets. To run a test in comparison mode, you only need to pass an `input_grid` parameter to the `run_test()` method instead of `inputs`.\n", + "\n", + "For more information about comparison tests, see this [notebook](../../how_to/tests/run_tests/2-run_comparison_tests.ipynb)." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "correlation_tests = [\n", + " \"validmind.data_validation.PearsonCorrelationMatrix:train_vs_test\",\n", + " \"validmind.data_validation.HighPearsonCorrelation:train_vs_test\",\n", + "]\n", + "\n", + "for test in correlation_tests:\n", + " vm.tests.run_test(\n", + " test,\n", + " input_grid={\n", + " \"dataset\": [vm_train_ds, vm_monitor_ds],\n", + " \"model\": [vm_model],\n", + " },\n", + " show=False,\n", + " ).log()\n", + " print(\"Completed test {0}\".format(test))" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can view these results in the ValidMind Platform in **Ongoing Monitoring** within Documents under the following section:\n", + "\n", + "- 2. Data Quality & Drift Assessment > **2.2. Prediction Data Correlations and Interactions**" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_6__'></a>\n", + "\n", + "### Conduct target and feature drift testing\n", + "\n", + "Next, the goal is to investigate the distributional characteristics of predictions and features to determine if the underlying data has changed. These tests are crucial for assessing the expected accuracy of the model.\n", + "\n", + "1. **Target drift:** We compare the dataset used for testing (reference data) with the monitoring data. This helps to identify any shifts in the target variable distribution.\n", + "2. **Feature drift:** We compare the training dataset with the monitoring data. Since features were used to train the model, any drift in these features could indicate potential issues, as the underlying patterns that the model was trained on may have changed." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the 2. 
Data Quality & Drift Assessment > **2.3 Target Drift** section, we can confirm there is only one pre-configured test:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for l in vm.get_test_suite(section=\"comparison_data_target\").get_default_config():\n", + " print(l)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As part of running the rest of the tests, we will directly log the results to a section when calling the `.log()` method." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "First, let's run the *Population Stability Index (PSI)* for predictions. In this case, we want to compare the test (reference) data with the monitoring data. (Note: For predictions, the training data is irrelevant.)" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.PopulationStabilityIndex\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", + " \"model\": vm_model,\n", + " },\n", + " show=False,\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we can examine the correlation between features and predictions. Significant changes in these correlations may trigger a deeper assessment." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.ongoing_monitoring.TargetPredictionDistributionPlot\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", + " \"model\": vm_model,\n", + " },\n", + " show=False,\n", + ").log(section_id=\"comparison_data_target\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we want to see the difference in correlation pairs between model predictions and features."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.ongoing_monitoring.PredictionCorrelation\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", + " \"model\": vm_model,\n", + " },\n", + " show=False,\n", + ").log(section_id=\"comparison_data_target\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, for target drift, let's plot each prediction value and feature grid side by side." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.ongoing_monitoring.PredictionAcrossEachFeature\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", + " \"model\": vm_model,\n", + " },\n", + " show=False,\n", + ").log(section_id=\"comparison_data_target\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_6_1__'></a>\n", + "\n", + "#### Feature drift tests\n", + "\n", + "Next, let's run a test to investigate whether and how the features have drifted. In this instance, we want to compare the training data with the prediction data. These results will be logged in the 2. Data Quality & Drift Assessment > **2.4. Feature Drift** section." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.ongoing_monitoring.FeatureDrift\",\n", + " inputs={\n", + " \"datasets\": [vm_reference_ds, vm_monitor_ds],\n", + " \"model\": vm_model,\n", + " },\n", + " show=False,\n", + ").log(section_id=\"comparison_data_feature\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_7__'></a>\n", + "\n", + "### Model performance monitoring tests\n", + "\n", + "Let's wrap up by monitoring the model's performance. 
Keep in mind that in some cases, it may not be possible to determine accuracy if the ground truth is unavailable. If this is the case, you can skip this test and instead focus on target and feature drift to inform the model owners.\n", + "\n", + "The pre-configured tests for model performance are:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for l in vm.get_test_suite(section=\"model_performance_monitoring\").get_default_config():\n", + " print(l)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The code below will run the tests and log the results into the monitoring document for each of the tests. Note the use of `input_grid` again, which is required for comparison tests:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Use the reference dataset vs monitoring dataset - the true comparison of accuracy\n", + "for test in vm.get_test_suite(\n", + " section=\"model_performance_monitoring\"\n", + ").get_default_config():\n", + " if test == \"validmind.model_validation.statsmodels.GINITable\":\n", + " vm.tests.run_test(\n", + " \"validmind.model_validation.statsmodels.GINITable\",\n", + " input_grid={\n", + " \"dataset\": [vm_reference_ds, vm_monitor_ds],\n", + " \"model\": [vm_model],\n", + " },\n", + " show=False,\n", + " ).log()\n", + " else:\n", + " vm.tests.run_test(\n", + " test,\n", + " input_grid={\n", + " \"dataset\": [vm_reference_ds, vm_monitor_ds],\n", + " \"model\": [vm_model],\n", + " },\n", + " show=False,\n", + " ).log()\n", + " print(\"Completed test: {0}\".format(test))" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the output produced by the ValidMind Library right in the notebook where you ran the code, as you would expect. 
But there is a better way — use the ValidMind Platform to work with your monitoring report.\n", + "\n", + "<a id='toc4_1__'></a>\n", + "\n", + "### Work with your monitoring report\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Monitoring** under Documents.\n", + "\n", + "What you see is the full draft of your monitoring report in a more easily consumable version. From here, you can make qualitative edits to monitoring reports, view guidelines, review monitoring results, and submit your monitoring report for approval when it's ready. (**Learn more:** [Ongoing monitoring](https://docs.validmind.ai/guide/monitoring/ongoing-monitoring.html))\n", + "\n", + "<a id='toc4_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-06926ffb7c9846eca24d1130049d6316" + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "gpuClass": "standard", + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 4 } diff --git a/site/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb b/site/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb index 29e07e6e94..300dbfeb09 100644 --- a/site/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb +++ b/site/notebooks/use_cases/time_series/quickstart_time_series_full_suite.ipynb @@ -1,761 +1,765 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Document a time series forecasting model\n", - "\n", - "Use the [FRED](https://fred.stlouisfed.org/) sample dataset to train a simple time series model and document that model with the ValidMind Library.\n", - "\n", - "As part of the notebook, you will learn how to train a simple model while exploring how the documentation process works:\n", - "\n", - "- Initializing the ValidMind Library\n", - "- Loading a sample dataset provided by the library to train a simple time series model\n", - "- Running a ValidMind test suite to quickly generate documentation about the data and model" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) 
\n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - " - [Preview the documentation template](#toc2_4__) \n", - "- [Load the sample dataset](#toc3__) \n", - "- [Document the model](#toc4__) \n", - " - [Prepocess the raw dataset](#toc4_1__) \n", - " - [Train random forests and gradient boosting regressor models](#toc4_2__) \n", - " - [Initialize the ValidMind datasets](#toc4_3__) \n", - " - [Initialize the ValidMind models](#toc4_4__) \n", - " - [Assign predictions to the datasets](#toc4_5__) \n", - " - [Run the full suite of tests](#toc4_6__) \n", - "- [Next steps](#toc5__) \n", - " - [Work with your documentation](#toc5_1__) \n", - " - [Discover more learning resources](#toc5_2__) \n", - "- [Upgrade ValidMind](#toc6__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Binary classification`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.ensemble import RandomForestRegressor\n", - "from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor\n", - "from sklearn.metrics import mean_squared_error, r2_score\n", - "from sklearn.model_selection import train_test_split\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.regression import fred_timeseries \n", - "\n", - "target_column = fred_timeseries.target_column\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{target_column}'\"\n", - ")\n", - "\n", - "raw_df = fred_timeseries.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Document the model\n", - "\n", - "As part of documenting the model with the ValidMind Library, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Prepocess the raw dataset\n", - "\n", - "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", - "- **Split the dataset**: Divide the original dataset into training and test sets for the primary model with an 80/20 split, without shuffling.\n", - "- **Difference the data**: Calculate the first difference of the train and test datasets to remove trends and seasonality, then drop any resulting NaN values.\n", - "- **Extract features and target variables**: Separate the feature columns (predictors) and the target variable from the differenced train and test datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Split the raw dataset into training and test sets \n", - "train_df, test_df = train_test_split(raw_df, test_size=0.2, shuffle=False)\n", - "\n", - "# Take the first difference of the training and test sets\n", - "train_diff_df = train_df.diff().dropna()\n", - "test_diff_df = test_df.diff().dropna()\n", - "\n", - "# Extract the features and target variable from the training set\n", - "X_diff_train = train_diff_df.drop(target_column, axis=1)\n", - "y_diff_train = train_diff_df[target_column]\n", - "\n", - "# Extract the features and target variable from the test set\n", - "X_diff_test = test_diff_df.drop(target_column, axis=1)\n", - "y_diff_test = test_diff_df[target_column]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Train random forests and gradient boosting regressor models\n", - "\n", - "This section trains random forest and gradient boosting models on differenced data, transforms predictions back to the original scale, and evaluates model performance using Mean Squared Error (MSE) and R-squared (R²) scores. 
\n", - "\n", - "The following helper functions are used to post-process predictions and evaluate model performance:\n", - "\n", - "- `transform_to_levels`: Reconstructs the original values from differenced predictions by cumulatively summing them, starting from a given initial value.\n", - "- `evaluate_model`: Calculates the Mean Squared Error (MSE) and R-squared (R²) score to evaluate the accuracy of the predictions against the true values." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def transform_to_levels(y_diff_pred, first_value=0): \n", - " y_pred = [first_value]\n", - " for pred in y_diff_pred:\n", - " y_pred.append(y_pred[-1] + pred)\n", - " return y_pred\n", - "\n", - "def evaluate_model(y_true, y_pred):\n", - " mse = mean_squared_error(y_true, y_pred)\n", - " r2 = r2_score(y_true, y_pred)\n", - " return mse, r2" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Fit the random forest model\n", - "model_rf = RandomForestRegressor(n_estimators=1500, random_state=0)\n", - "model_rf.fit(X_diff_train, y_diff_train)\n", - "\n", - "# Make predictions on the training and test sets\n", - "y_diff_train_pred = model_rf.predict(X_diff_train)\n", - "y_diff_test_pred = model_rf.predict(X_diff_test)\n", - "\n", - "# Transform the predictions back to the original scale\n", - "y_train_rf_pred = transform_to_levels(y_diff_train_pred, first_value=train_df[target_column].iloc[0])\n", - "y_test_rf_pred = transform_to_levels(y_diff_test_pred, first_value=test_df[target_column].iloc[0])\n", - "\n", - "# Evaluate the model's performance on the training and test sets\n", - "mse_train, r2_train = evaluate_model(train_df[target_column], y_train_rf_pred)\n", - "mse_test, r2_test = evaluate_model(test_df[target_column], y_test_rf_pred)\n", - "\n", - "print(f\"Train Mean Squared Error: {mse_train}\")\n", - "print(f\"Train R-Squared: {r2_train}\")\n", 
- "print(f\"Test Mean Squared Error: {mse_test}\")\n", - "print(f\"Test R-Squared: {r2_test}\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Fit the gradient boost model\n", - "model_gb = GradientBoostingRegressor(n_estimators=1500, random_state=0)\n", - "model_gb.fit(X_diff_train, y_diff_train)\n", - "\n", - "# Make predictions on the training and test sets\n", - "y_diff_train_pred = model_gb.predict(X_diff_train)\n", - "y_diff_test_pred = model_gb.predict(X_diff_test)\n", - "\n", - "# Transform the predictions back to the original scale\n", - "y_train_gb_pred = transform_to_levels(y_diff_train_pred, first_value=train_df[target_column].iloc[0])\n", - "y_test_gb_pred = transform_to_levels(y_diff_test_pred, first_value=test_df[target_column].iloc[0])\n", - "\n", - "# Evaluate the model's performance on the training and test sets\n", - "mse_train, r2_train = evaluate_model(train_df[target_column], y_train_gb_pred)\n", - "mse_test, r2_test = evaluate_model(test_df[target_column], y_test_gb_pred)\n", - "\n", - "print(f\"Train Mean Squared Error: {mse_train}\")\n", - "print(f\"Train R-Squared: {r2_train}\")\n", - "print(f\"Test Mean Squared Error: {mse_test}\")\n", - "print(f\"Test R-Squared: {r2_test}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_3__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset` — the raw dataset that you want to provide as input to tests\n", - "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "- `target_column` — a required argument if tests 
require access to true values. This is the name of the target column in the dataset\n", - "\n", - "With all dataframes ready, you can now initialize the ValidMind datasets objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):\n", - "\n", - "- `vm_raw_ds`: contains the raw, unprocessed data with the specified target column.\n", - "- `vm_train_diff_ds`: contains the training data with the differenced target column, excluding the first row to remove NaN values caused by differencing.\n", - "- `vm_test_diff_ds`: contains the test data with the differenced target column, excluding the first row to remove NaN values caused by differencing.\n", - "- `vm_train_ds`: contains the training data, excluding the first row to align with the differenced data.\n", - "- `vm_test_ds`: includes the test data split from the raw dataset." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_ds = vm.init_dataset(\n", - " input_id=\"raw_ds\",\n", - " dataset=raw_df,\n", - " target_column=target_column,\n", - ")\n", - "\n", - "vm_train_diff_ds = vm.init_dataset(\n", - " input_id=\"train_diff_ds\",\n", - " dataset=train_diff_df,\n", - " target_column=target_column,\n", - ")\n", - "\n", - "vm_test_diff_ds = vm.init_dataset(\n", - " input_id=\"test_diff_ds\",\n", - " dataset=test_diff_df,\n", - " target_column=target_column,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " input_id=\"train_ds\",\n", - " dataset=train_df,\n", - " target_column=target_column,\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " input_id=\"test_ds\",\n", - " dataset=test_df,\n", - " target_column=target_column,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_4__'></a>\n", - "\n", - "### Initialize the ValidMind models\n", - "\n", - "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other 
functions for analysis and tests on the data for our models.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_model_rf = vm.init_model(\n", - " model_rf,\n", - " input_id=\"random_forests_model\",\n", - ")\n", - "\n", - "vm_model_gb = vm.init_model(\n", - " model_gb,\n", - " input_id=\"gradient_boosting_model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_5__'></a>\n", - "\n", - "### Assign predictions to the datasets\n", - "\n", - "We can now use the assign_predictions() method from the Dataset object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model_rf,\n", - " prediction_values=y_train_rf_pred,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model_rf,\n", - " prediction_values=y_test_rf_pred,\n", - ")\n", - "\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_model_gb,\n", - " prediction_values=y_train_gb_pred,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model_gb,\n", - " prediction_values=y_test_gb_pred,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_6__'></a>\n", - "\n", - "### Run the full suite of tests\n", - "\n", - "This is where it all comes together: you are now ready to run the documentation tests for the model as defined by the documentation template you looked at earlier.\n", - "\n", - "The [`vm.run_documentation_tests`](https://docs.validmind.ai/validmind/validmind.html#run_documentation_tests) function finds and runs every test specified in the template and then uploads all the documentation and test artifacts that get generated to the ValidMind Platform.\n", - "\n", - "The function requires information about the inputs to use on every test. These inputs can be passed as an `inputs` argument if we want to use the same inputs for all tests. It's also possible to pass a `config` argument that has information about the `params` and `inputs` that each test requires. 
The `config` parameter is a dictionary with the following structure:\n", - "\n", - "```python\n", - "config = {\n", - " \"<test-id>\": {\n", - " \"params\": {\n", - " \"param1\": \"value1\",\n", - " \"param2\": \"value2\",\n", - " ...\n", - " },\n", - " \"inputs\": {\n", - " \"input1\": \"value1\",\n", - " \"input2\": \"value2\",\n", - " ...\n", - " }\n", - " },\n", - " ...\n", - "}\n", - "```\n", - "\n", - "Each `<test-id>` above corresponds to the test driven block identifiers shown by `vm.preview_template()`. For this model, we will use the default parameters for all tests, but we'll need to specify the input configuration for each one. The method `get_demo_test_config()` below constructs the default input configuration for our demo." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.utils import preview_test_config\n", - "\n", - "test_config = fred_timeseries.get_demo_test_config()\n", - "preview_test_config(test_config)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now we can pass the input configuration to `vm.run_documentation_tests()` and run the full suite of tests. The variable `full_suite` then holds the result of these tests." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "full_suite = vm.run_documentation_tests(config=test_config)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc5_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. 
(**Learn more:** [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. Click and expand the **Model Development** section.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc5_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-ab56373aa7ee4e15909017ab135ceaae", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "validmind-eEL8LtKG-py3.10", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document a time series forecasting model\n", + "\n", + "Use the [FRED](https://fred.stlouisfed.org/) sample dataset to train a simple time series model and document that model with the ValidMind Library.\n", + "\n", + "As part of the notebook, you will learn how to train a simple model while exploring how the documentation process works:\n", + "\n", + "- Initializing the ValidMind Library\n", + "- Loading a sample dataset provided by the library to train a simple time series model\n", + "- Running a ValidMind test suite to quickly generate documentation about the data and model" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + " - [Preview the 
documentation template](#toc2_4__) \n", + "- [Load the sample dataset](#toc3__) \n", + "- [Document the model](#toc4__) \n", + " - [Preprocess the raw dataset](#toc4_1__) \n", + " - [Train random forests and gradient boosting regressor models](#toc4_2__) \n", + " - [Initialize the ValidMind datasets](#toc4_3__) \n", + " - [Initialize the ValidMind models](#toc4_4__) \n", + " - [Assign predictions to the datasets](#toc4_5__) \n", + " - [Run the full suite of tests](#toc4_6__) \n", + "- [Next steps](#toc5__) \n", + " - [Work with your documentation](#toc5_1__) \n", + " - [Discover more learning resources](#toc5_2__) \n", + "- [Upgrade ValidMind](#toc6__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between you and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook, but we recommend further familiarizing yourself with the language. 
\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Binary classification`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from sklearn.ensemble import RandomForestRegressor\n", + "from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor\n", + "from sklearn.metrics import mean_squared_error, r2_score\n", + "from sklearn.model_selection import train_test_split\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind library. To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.regression import fred_timeseries \n", + "\n", + "target_column = fred_timeseries.target_column\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{target_column}'\"\n", + ")\n", + "\n", + "raw_df = fred_timeseries.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Document the model\n", + "\n", + "As part of documenting the model with the ValidMind Library, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Preprocess the raw dataset\n", + "\n", + "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", + "- **Split the dataset**: Divide the original dataset into training and test sets for the primary model with an 80/20 split, without shuffling.\n", + "- **Difference the data**: Calculate the first difference of the train and test datasets to remove trends and seasonality, then drop any resulting NaN values.\n", + "- **Extract features and target variables**: Separate the feature columns (predictors) and the target variable from the differenced train and test datasets." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Split the raw dataset into training and test sets \n", + "train_df, test_df = train_test_split(raw_df, test_size=0.2, shuffle=False)\n", + "\n", + "# Take the first difference of the training and test sets\n", + "train_diff_df = train_df.diff().dropna()\n", + "test_diff_df = test_df.diff().dropna()\n", + "\n", + "# Extract the features and target variable from the training set\n", + "X_diff_train = train_diff_df.drop(target_column, axis=1)\n", + "y_diff_train = train_diff_df[target_column]\n", + "\n", + "# Extract the features and target variable from the test set\n", + "X_diff_test = test_diff_df.drop(target_column, axis=1)\n", + "y_diff_test = test_diff_df[target_column]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Train random forests and gradient boosting regressor models\n", + "\n", + "This section trains random forest and gradient boosting models on differenced data, transforms predictions back to the original scale, and evaluates model performance using Mean Squared Error (MSE) and R-squared (R²) scores. 
\n", + "\n", + "The following helper functions are used to post-process predictions and evaluate model performance:\n", + "\n", + "- `transform_to_levels`: Reconstructs the original values from differenced predictions by cumulatively summing them, starting from a given initial value.\n", + "- `evaluate_model`: Calculates the Mean Squared Error (MSE) and R-squared (R²) score to evaluate the accuracy of the predictions against the true values." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def transform_to_levels(y_diff_pred, first_value=0): \n", + " y_pred = [first_value]\n", + " for pred in y_diff_pred:\n", + " y_pred.append(y_pred[-1] + pred)\n", + " return y_pred\n", + "\n", + "def evaluate_model(y_true, y_pred):\n", + " mse = mean_squared_error(y_true, y_pred)\n", + " r2 = r2_score(y_true, y_pred)\n", + " return mse, r2" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Fit the random forest model\n", + "model_rf = RandomForestRegressor(n_estimators=1500, random_state=0)\n", + "model_rf.fit(X_diff_train, y_diff_train)\n", + "\n", + "# Make predictions on the training and test sets\n", + "y_diff_train_pred = model_rf.predict(X_diff_train)\n", + "y_diff_test_pred = model_rf.predict(X_diff_test)\n", + "\n", + "# Transform the predictions back to the original scale\n", + "y_train_rf_pred = transform_to_levels(y_diff_train_pred, first_value=train_df[target_column].iloc[0])\n", + "y_test_rf_pred = transform_to_levels(y_diff_test_pred, first_value=test_df[target_column].iloc[0])\n", + "\n", + "# Evaluate the model's performance on the training and test sets\n", + "mse_train, r2_train = evaluate_model(train_df[target_column], y_train_rf_pred)\n", + "mse_test, r2_test = evaluate_model(test_df[target_column], y_test_rf_pred)\n", + "\n", + "print(f\"Train Mean Squared Error: {mse_train}\")\n", + "print(f\"Train R-Squared: {r2_train}\")\n", + "print(f\"Test Mean Squared Error: 
{mse_test}\")\n", + "print(f\"Test R-Squared: {r2_test}\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Fit the gradient boost model\n", + "model_gb = GradientBoostingRegressor(n_estimators=1500, random_state=0)\n", + "model_gb.fit(X_diff_train, y_diff_train)\n", + "\n", + "# Make predictions on the training and test sets\n", + "y_diff_train_pred = model_gb.predict(X_diff_train)\n", + "y_diff_test_pred = model_gb.predict(X_diff_test)\n", + "\n", + "# Transform the predictions back to the original scale\n", + "y_train_gb_pred = transform_to_levels(y_diff_train_pred, first_value=train_df[target_column].iloc[0])\n", + "y_test_gb_pred = transform_to_levels(y_diff_test_pred, first_value=test_df[target_column].iloc[0])\n", + "\n", + "# Evaluate the model's performance on the training and test sets\n", + "mse_train, r2_train = evaluate_model(train_df[target_column], y_train_gb_pred)\n", + "mse_test, r2_test = evaluate_model(test_df[target_column], y_test_gb_pred)\n", + "\n", + "print(f\"Train Mean Squared Error: {mse_train}\")\n", + "print(f\"Train R-Squared: {r2_train}\")\n", + "print(f\"Test Mean Squared Error: {mse_test}\")\n", + "print(f\"Test R-Squared: {r2_test}\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset` — the raw dataset that you want to provide as input to tests\n", + "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", + "- `target_column` — a required argument if tests 
require access to true values. This is the name of the target column in the dataset\n", + "\n", + "With all dataframes ready, you can now initialize the ValidMind dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):\n", + "\n", + "- `vm_raw_ds`: contains the raw, unprocessed data with the specified target column.\n", + "- `vm_train_diff_ds`: contains the training data with the differenced target column, excluding the first row to remove NaN values caused by differencing.\n", + "- `vm_test_diff_ds`: contains the test data with the differenced target column, excluding the first row to remove NaN values caused by differencing.\n", + "- `vm_train_ds`: contains the training data split from the raw dataset.\n", + "- `vm_test_ds`: contains the test data split from the raw dataset." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_ds = vm.init_dataset(\n", + " input_id=\"raw_ds\",\n", + " dataset=raw_df,\n", + " target_column=target_column,\n", + ")\n", + "\n", + "vm_train_diff_ds = vm.init_dataset(\n", + " input_id=\"train_diff_ds\",\n", + " dataset=train_diff_df,\n", + " target_column=target_column,\n", + ")\n", + "\n", + "vm_test_diff_ds = vm.init_dataset(\n", + " input_id=\"test_diff_ds\",\n", + " dataset=test_diff_df,\n", + " target_column=target_column,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " input_id=\"train_ds\",\n", + " dataset=train_df,\n", + " target_column=target_column,\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " input_id=\"test_ds\",\n", + " dataset=test_df,\n", + " target_column=target_column,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_4__'></a>\n", + "\n", + "### Initialize the ValidMind models\n", + "\n", + "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other 
functions for analysis and tests on the data for our models.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_model_rf = vm.init_model(\n", + " model_rf,\n", + " input_id=\"random_forests_model\",\n", + ")\n", + "\n", + "vm_model_gb = vm.init_model(\n", + " model_gb,\n", + " input_id=\"gradient_boosting_model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_5__'></a>\n", + "\n", + "### Assign predictions to the datasets\n", + "\n", + "We can now use the `assign_predictions()` method from the Dataset object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model_rf,\n", + " prediction_values=y_train_rf_pred,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model_rf,\n", + " prediction_values=y_test_rf_pred,\n", + ")\n", + "\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_model_gb,\n", + " prediction_values=y_train_gb_pred,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model_gb,\n", + " prediction_values=y_test_gb_pred,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_6__'></a>\n", + "\n", + "### Run the full suite of tests\n", + "\n", + "This is where it all comes together: you are now ready to run the documentation tests for the model as defined by the documentation template you looked at earlier.\n", + "\n", + "The [`vm.run_documentation_tests`](https://docs.validmind.ai/validmind/validmind.html#run_documentation_tests) function finds and runs every test specified in the template and then uploads all the documentation and test artifacts that get generated to the ValidMind Platform.\n", + "\n", + "The function requires information about the inputs to use on every test. These inputs can be passed as an `inputs` argument if we want to use the same inputs for all tests. It's also possible to pass a `config` argument that has information about the `params` and `inputs` that each test requires. 
The `config` parameter is a dictionary with the following structure:\n", + "\n", + "```python\n", + "config = {\n", + " \"<test-id>\": {\n", + " \"params\": {\n", + " \"param1\": \"value1\",\n", + " \"param2\": \"value2\",\n", + " ...\n", + " },\n", + " \"inputs\": {\n", + " \"input1\": \"value1\",\n", + " \"input2\": \"value2\",\n", + " ...\n", + " }\n", + " },\n", + " ...\n", + "}\n", + "```\n", + "\n", + "Each `<test-id>` above corresponds to one of the test-driven block identifiers shown by `vm.preview_template()`. For this model, we will use the default parameters for all tests, but we'll need to specify the input configuration for each one. The `get_demo_test_config()` function below constructs the default input configuration for our demo." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.utils import preview_test_config\n", + "\n", + "test_config = fred_timeseries.get_demo_test_config()\n", + "preview_test_config(test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now we can pass the input configuration to `vm.run_documentation_tests()` and run the full suite of tests. The variable `full_suite` then holds the result of these tests." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "full_suite = vm.run_documentation_tests(config=test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc5_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. 
(**Learn more:** [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. Click and expand the **Model Development** section.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc5_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after upgrading the package for the changes to take effect." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-ab56373aa7ee4e15909017ab135ceaae" + } + ], + "metadata": { + "kernelspec": { + "display_name": "validmind-eEL8LtKG-py3.10", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/site/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb b/site/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb index fec02e5874..1dfae1e06e 100644 --- a/site/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb +++ b/site/notebooks/use_cases/time_series/quickstart_time_series_high_code.ipynb @@ -1,1019 +1,1023 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Document a time series forecasting model\n", - "\n", - "Use the [FRED](https://fred.stlouisfed.org/) sample dataset to train a simple time series model and document that model with the ValidMind Library.\n", - "\n", - "As part of the notebook, you will learn how to train a simple model while exploring how the documentation process works:\n", - "\n", - "- Initializing the ValidMind Library\n", - "- Loading a sample dataset provided by the library to train a simple time series model\n", - "- Running a ValidMind test suite to quickly generate documentation about the data and model" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to 
ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Install the ValidMind Library](#toc2_1__) \n", - " - [Initialize the ValidMind Library](#toc2_2__) \n", - " - [Register sample model](#toc2_2_1__) \n", - " - [Apply documentation template](#toc2_2_2__) \n", - " - [Get your code snippet](#toc2_2_3__) \n", - " - [Initialize the Python environment](#toc2_3__) \n", - " - [Preview the documentation template](#toc2_4__) \n", - "- [Load the sample dataset](#toc3__) \n", - "- [Document the model](#toc4__) \n", - " - [Prepocess the raw dataset](#toc4_1__) \n", - " - [Train random forests and gradient boosting regressor models](#toc4_2__) \n", - " - [Initialize the ValidMind datasets](#toc4_3__) \n", - " - [Initialize the ValidMind models](#toc4_4__) \n", - " - [Assign predictions to the datasets](#toc4_5__) \n", - " - [Run data validation tests](#toc4_6__) \n", - " - [Run model validation tests](#toc4_7__) \n", - "- [Next steps](#toc5__) \n", - " - [Work with your documentation](#toc5_1__) \n", - " - [Discover more learning resources](#toc5_2__) \n", - "- [Upgrade ValidMind](#toc6__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. 
Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and validators.\n", - "\n", - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", - "\n", - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. 
There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", - "\n", - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Model documentation**: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.\n", - "\n", - "**Documentation template**: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. 
In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. See this [example](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html) for more information.\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. 
Plots may be matplotlib or plotly figures.\n", - "\n", - "**Test suites**: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.\n", - "\n", - "Example: the [`classifier_full_suite`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html#ClassifierFullSuite) test suite runs tests from the [`tabular_dataset`](https://docs.validmind.ai/validmind/validmind/test_suites/tabular_datasets.html) and [`classifier`](https://docs.validmind.ai/validmind/validmind/test_suites/classifier.html) test suites to fully document the data and model sections for binary classification model use-cases." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_1__'></a>\n", - "\n", - "#### Register sample model\n", - "\n", - "Let's first register a sample record (model) for use with this notebook:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. 
Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_2__'></a>\n", - "\n", - "#### Apply documentation template\n", - "\n", - "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", - "\n", - " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Time Series Forecasting with ML`.\n", - "\n", - "3. Click **Use Template** to apply the template." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2_3__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"documentation\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the Python environment\n", - "\n", - "Next, let's import the necessary libraries and set up your Python environment for data analysis:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor\n", - "from sklearn.metrics import mean_squared_error, r2_score\n", - "from sklearn.model_selection import train_test_split\n", - "\n", - "%matplotlib inline" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Preview the documentation template\n", - "\n", - "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", - "\n", - "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.preview_template()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Load the sample dataset\n", - "\n", - "The sample dataset used here is provided by the ValidMind library. To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.regression import fred_timeseries \n", - "\n", - "target_column = fred_timeseries.target_column\n", - "\n", - "print(\n", - " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{target_column}'\"\n", - ")\n", - "\n", - "raw_df = fred_timeseries.load_data()\n", - "raw_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Document the model\n", - "\n", - "As part of documenting the model with the ValidMind Library, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Prepocess the raw dataset\n", - "\n", - "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", - "- **Split the dataset**: Divide the original dataset into training and test sets for the primary model with an 80/20 split, without shuffling.\n", - "- **Difference the data**: Calculate the first difference of the train and test datasets to remove trends and seasonality, then drop any resulting NaN values.\n", - "- **Extract features and target variables**: Separate the feature columns (predictors) and the target variable from the differenced train and test datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Split the raw dataset into training and test sets \n", - "train_df, test_df = train_test_split(raw_df, test_size=0.2, shuffle=False)\n", - "\n", - "# Take the first difference of the training and test sets\n", - "train_diff_df = train_df.diff().dropna()\n", - "test_diff_df = test_df.diff().dropna()\n", - "\n", - "# Extract the features and target variable from the training set\n", - "X_diff_train = train_diff_df.drop(target_column, axis=1)\n", - "y_diff_train = train_diff_df[target_column]\n", - "\n", - "# Extract the features and target variable from the test set\n", - "X_diff_test = test_diff_df.drop(target_column, axis=1)\n", - "y_diff_test = test_diff_df[target_column]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Train random forests and gradient boosting regressor models\n", - "\n", - "This section trains random forest and gradient boosting models on differenced data, transforms predictions back to the original scale, and evaluates model performance using Mean Squared Error (MSE) and R-squared (R²) scores. 
\n", - "\n", - "The following helper functions are used to post-process predictions and evaluate model performance:\n", - "\n", - "- `transform_to_levels`: Reconstructs the original values from differenced predictions by cumulatively summing them, starting from a given initial value.\n", - "- `evaluate_model`: Calculates the Mean Squared Error (MSE) and R-squared (R²) score to evaluate the accuracy of the predictions against the true values." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def transform_to_levels(y_diff_pred, first_value=0): \n", - " y_pred = [first_value]\n", - " for pred in y_diff_pred:\n", - " y_pred.append(y_pred[-1] + pred)\n", - " return y_pred\n", - "\n", - "def evaluate_model(y_true, y_pred):\n", - " mse = mean_squared_error(y_true, y_pred)\n", - " r2 = r2_score(y_true, y_pred)\n", - " return mse, r2" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Fit the random forest model\n", - "model_rf = RandomForestRegressor(n_estimators=1500, random_state=0)\n", - "model_rf.fit(X_diff_train, y_diff_train)\n", - "\n", - "# Make predictions on the training and test sets\n", - "y_diff_train_pred = model_rf.predict(X_diff_train)\n", - "y_diff_test_pred = model_rf.predict(X_diff_test)\n", - "\n", - "# Transform the predictions back to the original scale\n", - "y_train_rf_pred = transform_to_levels(y_diff_train_pred, first_value=train_df[target_column].iloc[0])\n", - "y_test_rf_pred = transform_to_levels(y_diff_test_pred, first_value=test_df[target_column].iloc[0])\n", - "\n", - "# Evaluate the model's performance on the training and test sets\n", - "mse_train, r2_train = evaluate_model(train_df[target_column], y_train_rf_pred)\n", - "mse_test, r2_test = evaluate_model(test_df[target_column], y_test_rf_pred)\n", - "\n", - "print(f\"Train Mean Squared Error: {mse_train}\")\n", - "print(f\"Train R-Squared: {r2_train}\")\n", 
- "print(f\"Test Mean Squared Error: {mse_test}\")\n", - "print(f\"Test R-Squared: {r2_test}\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Fit the gradient boost model\n", - "model_gb = GradientBoostingRegressor(n_estimators=1500, random_state=0)\n", - "model_gb.fit(X_diff_train, y_diff_train)\n", - "\n", - "# Make predictions on the training and test sets\n", - "y_diff_train_pred = model_gb.predict(X_diff_train)\n", - "y_diff_test_pred = model_gb.predict(X_diff_test)\n", - "\n", - "# Transform the predictions back to the original scale\n", - "y_train_gb_pred = transform_to_levels(y_diff_train_pred, first_value=train_df[target_column].iloc[0])\n", - "y_test_gb_pred = transform_to_levels(y_diff_test_pred, first_value=test_df[target_column].iloc[0])\n", - "\n", - "# Evaluate the model's performance on the training and test sets\n", - "mse_train, r2_train = evaluate_model(train_df[target_column], y_train_gb_pred)\n", - "mse_test, r2_test = evaluate_model(test_df[target_column], y_test_gb_pred)\n", - "\n", - "print(f\"Train Mean Squared Error: {mse_train}\")\n", - "print(f\"Train R-Squared: {r2_train}\")\n", - "print(f\"Test Mean Squared Error: {mse_test}\")\n", - "print(f\"Test R-Squared: {r2_test}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_3__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", - "\n", - "This function takes a number of arguments:\n", - "\n", - "- `dataset` — the raw dataset that you want to provide as input to tests\n", - "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", - "- `target_column` — a required argument if tests 
require access to true values. This is the name of the target column in the dataset\n", - "\n", - "With all dataframes ready, you can now initialize the ValidMind datasets objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):\n", - "\n", - "- `vm_raw_ds`: contains the raw, unprocessed data with the specified target column.\n", - "- `vm_train_diff_ds`: contains the training data with the differenced target column, excluding the first row to remove NaN values caused by differencing.\n", - "- `vm_test_diff_ds`: contains the test data with the differenced target column, excluding the first row to remove NaN values caused by differencing.\n", - "- `vm_train_ds`: contains the training data, excluding the first row to align with the differenced data.\n", - "- `vm_test_ds`: includes the test data split from the raw dataset." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_raw_ds = vm.init_dataset(\n", - " input_id=\"raw_ds\",\n", - " dataset=raw_df,\n", - " target_column=target_column,\n", - ")\n", - "\n", - "vm_train_diff_ds = vm.init_dataset(\n", - " input_id=\"train_diff_ds\",\n", - " dataset=train_diff_df,\n", - " target_column=target_column,\n", - ")\n", - "\n", - "vm_test_diff_ds = vm.init_dataset(\n", - " input_id=\"test_diff_ds\",\n", - " dataset=test_diff_df,\n", - " target_column=target_column,\n", - ")\n", - "\n", - "vm_train_ds = vm.init_dataset(\n", - " input_id=\"train_ds\",\n", - " dataset=train_df,\n", - " target_column=target_column,\n", - ")\n", - "\n", - "vm_test_ds = vm.init_dataset(\n", - " input_id=\"test_ds\",\n", - " dataset=test_df,\n", - " target_column=target_column,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_4__'></a>\n", - "\n", - "### Initialize the ValidMind models\n", - "\n", - "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other 
functions for analysis and tests on the data for our models.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_model_rf = vm.init_model(\n", - " model_rf,\n", - " input_id=\"random_forests_model\",\n", - ")\n", - "\n", - "vm_model_gb = vm.init_model(\n", - " model_gb,\n", - " input_id=\"gradient_boosting_model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_5__'></a>\n", - "\n", - "### Assign predictions to the datasets\n", - "\n", - "We can now use the assign_predictions() method from the Dataset object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm_train_ds.assign_predictions(\n", - " model=vm_model_rf,\n", - " prediction_values=y_train_rf_pred,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model_rf,\n", - " prediction_values=y_test_rf_pred,\n", - ")\n", - "\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_model_gb,\n", - " prediction_values=y_train_gb_pred,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_model_gb,\n", - " prediction_values=y_test_gb_pred,\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.utils import preview_test_config\n", - "\n", - "test_config = fred_timeseries.get_demo_test_config()\n", - "preview_test_config(test_config)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_6__'></a>\n", - "\n", - "### Run data validation tests" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TimeSeriesDescription\",\n", - " input_grid={\n", - " \"dataset\": [\"raw_ds\", \"train_diff_ds\", \"test_diff_ds\", \"train_ds\", \"test_ds\"],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TimeSeriesLinePlot\",\n", - " input_grid={\n", - " \"dataset\": [\"raw_ds\"],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TimeSeriesMissingValues\",\n", - " input_grid={\n", - " 
\"dataset\": [\"raw_ds\", \"train_diff_ds\", \"test_diff_ds\", \"train_ds\", \"test_ds\"],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.SeasonalDecompose\",\n", - " input_grid={\n", - " \"dataset\": [\"raw_ds\"],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TimeSeriesDescriptiveStatistics\",\n", - " input_grid={\n", - " \"dataset\": [\"train_diff_ds\", \"test_diff_ds\"],\n", - " },\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TimeSeriesOutliers\",\n", - " input_grid={\n", - " \"dataset\": [\"train_diff_ds\", \"test_diff_ds\"],\n", - " },\n", - " params={\n", - " \"zscore_threshold\": 4\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.TimeSeriesHistogram\",\n", - " input_grid={\n", - " \"dataset\": [ \"train_diff_ds\", \"test_diff_ds\"],\n", - " },\n", - " params={\n", - " \"nbins\": 100\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.data_validation.DatasetSplit\",\n", - " inputs={\n", - " \"datasets\": [\"train_diff_ds\", \"test_diff_ds\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_7__'></a>\n", - "\n", - "### Run model validation tests" - ] - }, - { - "cell_type": "code", - 
"execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.ModelMetadata\",\n", - " input_grid={\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.RegressionErrors\",\n", - " input_grid={\n", - " \"dataset\": [\"train_ds\", \"test_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.RegressionR2Square\",\n", - " input_grid={\n", - " \"dataset\": [\"train_ds\", \"test_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.TimeSeriesR2SquareBySegments:train_data\",\n", - " input_grid={\n", - " \"dataset\": [\"train_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.TimeSeriesR2SquareBySegments:test_data\",\n", - " input_grid={\n", - " \"dataset\": [\"test_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " },\n", - " params={\n", - " \"segments\":{\n", - " \"start_date\": [\"2012-11-01\",\"2018-02-01\"],\n", - " \"end_date\": [\"2018-01-01\",\"2023-03-01\"]\n", - " }\n", - 
" }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.TimeSeriesPredictionsPlot\",\n", - " input_grid={\n", - " \"dataset\": [\"train_ds\", \"test_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.TimeSeriesPredictionWithCI\",\n", - " input_grid={\n", - " \"dataset\": [\"train_ds\", \"test_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.ModelPredictionResiduals\",\n", - " input_grid={\n", - " \"dataset\": [\"train_ds\", \"test_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.FeatureImportance\",\n", - " input_grid={\n", - " \"dataset\": [\"train_ds\", \"test_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - "test.log()" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.PermutationFeatureImportance\",\n", - " input_grid={\n", - " \"dataset\": [\"train_ds\", \"test_ds\"],\n", - " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", - " }\n", - ")\n", - 
"test.log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Next steps\n", - "\n", - "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", - "\n", - "<a id='toc5_1__'></a>\n", - "\n", - "### Work with your documentation\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", - "\n", - "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", - "\n", - "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", - "\n", - "<a id='toc5_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-d549f9055f374ee392fb42facfd75cb9", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Document a time series forecasting model\n", + "\n", + "Use the [FRED](https://fred.stlouisfed.org/) sample dataset to train a simple time series model and document that model with the ValidMind Library.\n", + "\n", + "As part of the notebook, you will learn how to train a simple model while exploring how the documentation process works:\n", + "\n", + "- Initializing the ValidMind Library\n", + "- Loading a sample dataset provided by the library to train a simple time series model\n", + "- Running a ValidMind test suite to quickly generate documentation about the data and model" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Install the ValidMind Library](#toc2_1__) \n", + " - [Initialize the ValidMind Library](#toc2_2__) \n", + " - [Register sample model](#toc2_2_1__) \n", + " - [Apply documentation template](#toc2_2_2__) \n", + " - [Get your code snippet](#toc2_2_3__) \n", + " - [Initialize the Python environment](#toc2_3__) \n", + " - [Preview the documentation 
template](#toc2_4__) \n", + "- [Load the sample dataset](#toc3__) \n", + "- [Document the model](#toc4__) \n", + " - [Preprocess the raw dataset](#toc4_1__) \n", + " - [Train random forests and gradient boosting regressor models](#toc4_2__) \n", + " - [Initialize the ValidMind datasets](#toc4_3__) \n", + " - [Initialize the ValidMind models](#toc4_4__) \n", + " - [Assign predictions to the datasets](#toc4_5__) \n", + " - [Run data validation tests](#toc4_6__) \n", + " - [Run model validation tests](#toc4_7__) \n", + "- [Next steps](#toc5__) \n", + " - [Work with your documentation](#toc5_1__) \n", + " - [Discover more learning resources](#toc5_2__) \n", + "- [Upgrade ValidMind](#toc6__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on documentation. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between you and validators.\n", + "\n", + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook, but we recommend further familiarizing yourself with the language. 
\n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html).\n", + "\n", + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>\n", + "\n", + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**documentation, model documentation**: A structured and detailed document pertaining to a record, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. Within the realm of risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the record's application.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**documentation template**: A default ValidMind document type that serves as a standardized framework for developing and documenting records, including sections designated for record details, data descriptions, test results, and performance metrics. By outlining required documentation and recommended analyses, document templates ensure consistency and completeness across documentation and help guide developers through a systematic development process while promoting comparability and traceability of development outcomes.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). 
Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_1__'></a>\n", + "\n", + "#### Register sample model\n", + "\n", + "Let's first register a sample record (model) for use with this notebook:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_2__'></a>\n", + "\n", + "#### Apply documentation template\n", + "\n", + "Once you've registered your model, let's select a documentation template. A template predefines sections for your documentation and provides a general outline to follow, making the documentation process much easier.\n", + "\n", + "1. 
In the left sidebar that appears for your model, click **Documents** and select **Development**.\n", + "\n", + " If you cannot locate your Development document, make sure Development type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Time Series Forecasting with ML`.\n", + "\n", + "3. Click **Use Template** to apply the template." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2_3__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Development` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"documentation\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the Python environment\n", + "\n", + "Next, let's import the necessary libraries and set up your Python environment for data analysis:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor\n", + "from sklearn.metrics import mean_squared_error, r2_score\n", + "from sklearn.model_selection import train_test_split\n", + "\n", + "%matplotlib inline" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Preview the documentation template\n", + "\n", + "Let's verify that you have connected the ValidMind Library to the ValidMind Platform and that the appropriate *template* is selected for your model.\n", + "\n", + "You will upload documentation and test results unique to your model based on this template later on. 
For now, **take a look at the default structure that the template provides with [the `vm.preview_template()` function](https://docs.validmind.ai/validmind/validmind.html#preview_template)** from the ValidMind library and note the empty sections:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.preview_template()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Load the sample dataset\n", + "\n", + "The sample dataset used here is provided by the ValidMind library. To be able to use it, you need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.regression import fred_timeseries \n", + "\n", + "target_column = fred_timeseries.target_column\n", + "\n", + "print(\n", + " f\"Loaded demo dataset with: \\n\\n\\t• Target column: '{target_column}'\"\n", + ")\n", + "\n", + "raw_df = fred_timeseries.load_data()\n", + "raw_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Document the model\n", + "\n", + "As part of documenting the model with the ValidMind Library, you need to preprocess the raw dataset, initialize some training and test datasets, initialize a model object you can use for testing, and then run the full suite of tests." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Preprocess the raw dataset\n", + "\n", + "Preprocessing performs a number of operations to get ready for the subsequent steps:\n", + "- **Split the dataset**: Divide the original dataset into training and test sets for the primary model with an 80/20 split, without shuffling.\n", + "- **Difference the data**: Calculate the first difference of the train and test datasets to remove trends and seasonality, then drop any resulting NaN values.\n", + "- **Extract features and target variables**: Separate the feature columns (predictors) and the target variable from the differenced train and test datasets." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Split the raw dataset into training and test sets \n", + "train_df, test_df = train_test_split(raw_df, test_size=0.2, shuffle=False)\n", + "\n", + "# Take the first difference of the training and test sets\n", + "train_diff_df = train_df.diff().dropna()\n", + "test_diff_df = test_df.diff().dropna()\n", + "\n", + "# Extract the features and target variable from the training set\n", + "X_diff_train = train_diff_df.drop(target_column, axis=1)\n", + "y_diff_train = train_diff_df[target_column]\n", + "\n", + "# Extract the features and target variable from the test set\n", + "X_diff_test = test_diff_df.drop(target_column, axis=1)\n", + "y_diff_test = test_diff_df[target_column]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Train random forests and gradient boosting regressor models\n", + "\n", + "This section trains random forest and gradient boosting models on differenced data, transforms predictions back to the original scale, and evaluates model performance using Mean Squared Error (MSE) and R-squared (R²) scores. 
\n", + "\n", + "The following helper functions are used to post-process predictions and evaluate model performance:\n", + "\n", + "- `transform_to_levels`: Reconstructs the original values from differenced predictions by cumulatively summing them, starting from a given initial value.\n", + "- `evaluate_model`: Calculates the Mean Squared Error (MSE) and R-squared (R²) score to evaluate the accuracy of the predictions against the true values." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "def transform_to_levels(y_diff_pred, first_value=0): \n", + " y_pred = [first_value]\n", + " for pred in y_diff_pred:\n", + " y_pred.append(y_pred[-1] + pred)\n", + " return y_pred\n", + "\n", + "def evaluate_model(y_true, y_pred):\n", + " mse = mean_squared_error(y_true, y_pred)\n", + " r2 = r2_score(y_true, y_pred)\n", + " return mse, r2" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Fit the random forest model\n", + "model_rf = RandomForestRegressor(n_estimators=1500, random_state=0)\n", + "model_rf.fit(X_diff_train, y_diff_train)\n", + "\n", + "# Make predictions on the training and test sets\n", + "y_diff_train_pred = model_rf.predict(X_diff_train)\n", + "y_diff_test_pred = model_rf.predict(X_diff_test)\n", + "\n", + "# Transform the predictions back to the original scale\n", + "y_train_rf_pred = transform_to_levels(y_diff_train_pred, first_value=train_df[target_column].iloc[0])\n", + "y_test_rf_pred = transform_to_levels(y_diff_test_pred, first_value=test_df[target_column].iloc[0])\n", + "\n", + "# Evaluate the model's performance on the training and test sets\n", + "mse_train, r2_train = evaluate_model(train_df[target_column], y_train_rf_pred)\n", + "mse_test, r2_test = evaluate_model(test_df[target_column], y_test_rf_pred)\n", + "\n", + "print(f\"Train Mean Squared Error: {mse_train}\")\n", + "print(f\"Train R-Squared: {r2_train}\")\n", + "print(f\"Test Mean Squared Error: 
{mse_test}\")\n", + "print(f\"Test R-Squared: {r2_test}\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Fit the gradient boost model\n", + "model_gb = GradientBoostingRegressor(n_estimators=1500, random_state=0)\n", + "model_gb.fit(X_diff_train, y_diff_train)\n", + "\n", + "# Make predictions on the training and test sets\n", + "y_diff_train_pred = model_gb.predict(X_diff_train)\n", + "y_diff_test_pred = model_gb.predict(X_diff_test)\n", + "\n", + "# Transform the predictions back to the original scale\n", + "y_train_gb_pred = transform_to_levels(y_diff_train_pred, first_value=train_df[target_column].iloc[0])\n", + "y_test_gb_pred = transform_to_levels(y_diff_test_pred, first_value=test_df[target_column].iloc[0])\n", + "\n", + "# Evaluate the model's performance on the training and test sets\n", + "mse_train, r2_train = evaluate_model(train_df[target_column], y_train_gb_pred)\n", + "mse_test, r2_test = evaluate_model(test_df[target_column], y_test_gb_pred)\n", + "\n", + "print(f\"Train Mean Squared Error: {mse_train}\")\n", + "print(f\"Train R-Squared: {r2_train}\")\n", + "print(f\"Test Mean Squared Error: {mse_test}\")\n", + "print(f\"Test R-Squared: {r2_test}\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you must first initialize a ValidMind dataset object using the [`init_dataset`](https://docs.validmind.ai/validmind/validmind.html#init_dataset) function from the ValidMind (`vm`) module.\n", + "\n", + "This function takes a number of arguments:\n", + "\n", + "- `dataset` — the raw dataset that you want to provide as input to tests\n", + "- `input_id` - a unique identifier that allows tracking what inputs are used when running each individual test\n", + "- `target_column` — a required argument if tests 
require access to true values. This is the name of the target column in the dataset\n", + "\n", + "With all DataFrames ready, you can now initialize the ValidMind dataset objects using [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset):\n", + "\n", + "- `vm_raw_ds`: contains the raw, unprocessed data with the specified target column.\n", + "- `vm_train_diff_ds`: contains the training data with the differenced target column, excluding the first row to remove NaN values caused by differencing.\n", + "- `vm_test_diff_ds`: contains the test data with the differenced target column, excluding the first row to remove NaN values caused by differencing.\n", + "- `vm_train_ds`: contains the training data split from the raw dataset.\n", + "- `vm_test_ds`: contains the test data split from the raw dataset." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_raw_ds = vm.init_dataset(\n", + " input_id=\"raw_ds\",\n", + " dataset=raw_df,\n", + " target_column=target_column,\n", + ")\n", + "\n", + "vm_train_diff_ds = vm.init_dataset(\n", + " input_id=\"train_diff_ds\",\n", + " dataset=train_diff_df,\n", + " target_column=target_column,\n", + ")\n", + "\n", + "vm_test_diff_ds = vm.init_dataset(\n", + " input_id=\"test_diff_ds\",\n", + " dataset=test_diff_df,\n", + " target_column=target_column,\n", + ")\n", + "\n", + "vm_train_ds = vm.init_dataset(\n", + " input_id=\"train_ds\",\n", + " dataset=train_df,\n", + " target_column=target_column,\n", + ")\n", + "\n", + "vm_test_ds = vm.init_dataset(\n", + " input_id=\"test_ds\",\n", + " dataset=test_df,\n", + " target_column=target_column,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_4__'></a>\n", + "\n", + "### Initialize the ValidMind models\n", + "\n", + "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other 
functions for analysis and tests on the data for our models.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model object with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_model_rf = vm.init_model(\n", + " model_rf,\n", + " input_id=\"random_forests_model\",\n", + ")\n", + "\n", + "vm_model_gb = vm.init_model(\n", + " model_gb,\n", + " input_id=\"gradient_boosting_model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_5__'></a>\n", + "\n", + "### Assign predictions to the datasets\n", + "\n", + "We can now use the `assign_predictions()` method of the dataset object to link existing predictions to any model. 
If no prediction values are passed, the method will compute predictions automatically:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm_train_ds.assign_predictions(\n", + " model=vm_model_rf,\n", + " prediction_values=y_train_rf_pred,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model_rf,\n", + " prediction_values=y_test_rf_pred,\n", + ")\n", + "\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_model_gb,\n", + " prediction_values=y_train_gb_pred,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_model_gb,\n", + " prediction_values=y_test_gb_pred,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.utils import preview_test_config\n", + "\n", + "test_config = fred_timeseries.get_demo_test_config()\n", + "preview_test_config(test_config)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_6__'></a>\n", + "\n", + "### Run data validation tests" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TimeSeriesDescription\",\n", + " input_grid={\n", + " \"dataset\": [\"raw_ds\", \"train_diff_ds\", \"test_diff_ds\", \"train_ds\", \"test_ds\"],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TimeSeriesLinePlot\",\n", + " input_grid={\n", + " \"dataset\": [\"raw_ds\"],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TimeSeriesMissingValues\",\n", + " input_grid={\n", + " \"dataset\": [\"raw_ds\", \"train_diff_ds\", 
\"test_diff_ds\", \"train_ds\", \"test_ds\"],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.SeasonalDecompose\",\n", + " input_grid={\n", + " \"dataset\": [\"raw_ds\"],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TimeSeriesDescriptiveStatistics\",\n", + " input_grid={\n", + " \"dataset\": [\"train_diff_ds\", \"test_diff_ds\"],\n", + " },\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TimeSeriesOutliers\",\n", + " input_grid={\n", + " \"dataset\": [\"train_diff_ds\", \"test_diff_ds\"],\n", + " },\n", + " params={\n", + " \"zscore_threshold\": 4\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.TimeSeriesHistogram\",\n", + " input_grid={\n", + " \"dataset\": [ \"train_diff_ds\", \"test_diff_ds\"],\n", + " },\n", + " params={\n", + " \"nbins\": 100\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.data_validation.DatasetSplit\",\n", + " inputs={\n", + " \"datasets\": [\"train_diff_ds\", \"test_diff_ds\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_7__'></a>\n", + "\n", + "### Run model validation tests" + ] + }, + { + "cell_type": "code", + 
"metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.ModelMetadata\",\n", + " input_grid={\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.RegressionErrors\",\n", + " input_grid={\n", + " \"dataset\": [\"train_ds\", \"test_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.RegressionR2Square\",\n", + " input_grid={\n", + " \"dataset\": [\"train_ds\", \"test_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.TimeSeriesR2SquareBySegments:train_data\",\n", + " input_grid={\n", + " \"dataset\": [\"train_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.TimeSeriesR2SquareBySegments:test_data\",\n", + " input_grid={\n", + " \"dataset\": [\"test_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " },\n", + " params={\n", + " \"segments\":{\n", + " \"start_date\": [\"2012-11-01\",\"2018-02-01\"],\n", + " \"end_date\": [\"2018-01-01\",\"2023-03-01\"]\n", + " }\n", + " }\n", + ")\n", + "test.log()" + ], + 
"execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.TimeSeriesPredictionsPlot\",\n", + " input_grid={\n", + " \"dataset\": [\"train_ds\", \"test_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.TimeSeriesPredictionWithCI\",\n", + " input_grid={\n", + " \"dataset\": [\"train_ds\", \"test_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.ModelPredictionResiduals\",\n", + " input_grid={\n", + " \"dataset\": [\"train_ds\", \"test_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.FeatureImportance\",\n", + " input_grid={\n", + " \"dataset\": [\"train_ds\", \"test_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.PermutationFeatureImportance\",\n", + " input_grid={\n", + " \"dataset\": [\"train_ds\", \"test_ds\"],\n", + " \"model\": [\"random_forests_model\", \"gradient_boosting_model\"],\n", + " }\n", + ")\n", + "test.log()" + ], + "execution_count": null, + 
"outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Next steps\n", + "\n", + "You can look at the results of this test suite right in the notebook where you ran the code, as you would expect. But there is a better way — use the ValidMind Platform to work with your model documentation.\n", + "\n", + "<a id='toc5_1__'></a>\n", + "\n", + "### Work with your documentation\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you registered earlier. (Learn more: [Working with the inventory](https://docs.validmind.ai/guide/inventory/working-with-the-inventory.html))\n", + "\n", + "2. In the left sidebar that appears for your model, click **Development** under Documents.\n", + "\n", + "What you see is the full draft of your documentation in a more easily consumable version. From here, you can make qualitative edits to documentation, view guidelines, collaborate with validators, and submit your documentation for approval when it's ready. (**Learn more:** [Working with documentation](https://docs.validmind.ai/guide/documentation/working-with-documentation.html))\n", + "\n", + "<a id='toc5_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. 
All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-d549f9055f374ee392fb42facfd75cb9" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/site/notebooks/use_cases/validation/validate_application_scorecard.ipynb b/site/notebooks/use_cases/validation/validate_application_scorecard.ipynb index c47c6645c3..563c622a21 100644 --- a/site/notebooks/use_cases/validation/validate_application_scorecard.ipynb +++ b/site/notebooks/use_cases/validation/validate_application_scorecard.ipynb @@ -1,1883 +1,1893 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Validate an application scorecard model\n", - "\n", - "Learn how to independently assess an application scorecard model developed using the ValidMind Library as a validator. 
You'll evaluate the development of the model by conducting thorough testing and analysis, including the use of challenger models to benchmark performance.\n", - "\n", - "An *application scorecard model* is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant such as credit history, income, employment status, and other relevant financial data.\n", - "\n", - " - This score assists lenders in making informed decisions about whether to approve or reject loan applications, as well as in determining the terms of the loan, including interest rates and credit limits.\n", - " - Effective validation of application scorecard models ensures that lenders can manage risk efficiently while maintaining a fast and transparent loan application process for applicants.\n", - "\n", - "This interactive notebook provides a step-by-step guide for:\n", - "\n", - "- Verifying the data quality steps performed by the development team\n", - "- Independently replicating the champion's results and conducting additional tests to assess performance, stability, and robustness\n", - "- Setting up test inputs and challenger models for comparative analysis\n", - "- Running validation tests, analyzing results, and logging artifacts (findings) to ValidMind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.content-hidden when-format=\"html\"}\n", - "## Contents \n", - "- [About ValidMind](#toc1__) \n", - " - [Before you begin](#toc1_1__) \n", - " - [New to ValidMind?](#toc1_2__) \n", - " - [Key concepts](#toc1_3__) \n", - "- [Setting up](#toc2__) \n", - " - [Register a sample model](#toc2_1__) \n", - " - [Assign validator credentials](#toc2_1_1__) \n", - " - [Apply validation report template](#toc2_1_2__) \n", - " - [Install the ValidMind Library](#toc2_2__) \n", - " - [Initialize the ValidMind Library](#toc2_3__) \n", - " - [Get your code 
snippet](#toc2_3_1__) \n", - " - [Importing the champion model](#toc2_4__) \n", - " - [Load the sample dataset](#toc2_5__) \n", - " - [Preprocess the dataset](#toc2_5_1__) \n", - " - [Apply feature engineering to the dataset](#toc2_5_2__) \n", - " - [Split the feature engineered dataset](#toc2_6__) \n", - "- [Developing potential challenger models](#toc3__) \n", - " - [Train potential challenger models](#toc3_1__) \n", - " - [Random forest classification model](#toc3_1_1__) \n", - " - [Logistic regression model](#toc3_1_2__) \n", - " - [Extract predicted probabilities](#toc3_2__) \n", - " - [Compute binary predictions](#toc3_2_1__) \n", - "- [Initializing the ValidMind objects](#toc4__) \n", - " - [Initialize the ValidMind datasets](#toc4_1__) \n", - " - [Initialize the ValidMind models](#toc4_2__) \n", - " - [Assign predictions](#toc4_3__) \n", - " - [Compute credit risk scores](#toc4_4__) \n", - "- [Running data quality tests](#toc5__) \n", - " - [Identify relevant data quality tests](#toc5_1__) \n", - " - [Run and log an individual data quality test](#toc5_2__) \n", - " - [Log multiple data quality tests](#toc5_3__) \n", - " - [Run data quality comparison tests](#toc5_4__) \n", - "- [Running performance tests](#toc6__) \n", - " - [Identify relevant performance tests](#toc6_1__) \n", - " - [Run and log an individual performance test](#toc6_2__) \n", - " - [Log multiple performance tests](#toc6_3__) \n", - " - [Evaluate performance of the champion model](#toc6_4__) \n", - " - [Evaluate performance of challenger models](#toc6_5__) \n", - " - [Enable custom context for test descriptions](#toc6_5_1__) \n", - " - [Run performance comparison tests](#toc6_5_2__) \n", - "- [Adjust a ValidMind test](#toc7__) \n", - "- [Run diagnostic tests](#toc8__) \n", - "- [Run feature importance tests](#toc9__) \n", - "- [Implement a custom test](#toc10__) \n", - "- [Verify test runs](#toc11__) \n", - "- [Next steps](#toc12__) \n", - " - [Work with your validation report](#toc12_1__) 
\n", - " - [Discover more learning resources](#toc12_2__) \n", - "- [Upgrade ValidMind](#toc13__) \n", - "\n", - ":::\n", - "<!-- jn-toc-notebook-config\n", - "\tnumbering=false\n", - "\tanchor=true\n", - "\tflat=false\n", - "\tminLevel=2\n", - "\tmaxLevel=4\n", - "\t/jn-toc-notebook-config -->\n", - "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1__'></a>\n", - "\n", - "## About ValidMind\n", - "\n", - "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", - "\n", - "You use the ValidMind Library to automate comparison and other validation tests, and then use the ValidMind Platform to submit compliance assessments of champions via comprehensive validation reports. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and developers." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1_1__'></a>\n", - "\n", - "### Before you begin\n", - "\n", - "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", - "\n", - "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1_2__'></a>\n", - "\n", - "### New to ValidMind?\n", - "\n", - "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about validating records such as models and running tests, as well as find code samples and our Python Library API reference.\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", - "<br></br>\n", - "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc1_3__'></a>\n", - "\n", - "### Key concepts\n", - "\n", - "**Validation report**: A comprehensive and structured assessment of a model’s development and performance, focusing on verifying its integrity, appropriateness, and alignment with its intended use. It includes analyses of model assumptions, data quality, performance metrics, outcomes of testing procedures, and risk considerations. The validation report supports transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", - "\n", - "**Validation report template**: Serves as a standardized framework for conducting and documenting model validation activities. It outlines the required sections, recommended analyses, and expected validation tests, ensuring consistency and completeness across validation reports. 
The template helps guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", - "\n", - "**Tests**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets.\n", - "\n", - "**Metrics**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", - "\n", - "**Custom metrics**: Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", - "\n", - "**Inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", - "\n", - " - **model**: A single model that has been initialized in ValidMind with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model).\n", - " - **dataset**: Single dataset that has been initialized in ValidMind with [`vm.init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", - " - **models**: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.\n", - " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", - "\n", - "**Parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.\n", - "\n", - "**Outputs**: Custom metrics can return elements like tables or plots. 
Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2__'></a>\n", - "\n", - "## Setting up" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1__'></a>\n", - "\n", - "### Register a sample model\n", - "\n", - "In a usual lifecycle, a champion will have been independently registered in your inventory and submitted to you for validation by your development team as part of the effective challenge process. (**Learn more:** [Submit documents](https://docs.validmind.ai/guide/documentation/submit-documents.html))\n", - "\n", - "For this notebook, we'll have you register a dummy record (model) in the ValidMind Platform inventory and assign yourself as the validator to familiarize you with the ValidMind interface and circumvent the need for an existing model:\n", - "\n", - "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", - "\n", - "2. In the left sidebar, select **Inventory**.\n", - "\n", - "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", - "\n", - "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", - "\n", - "5. Select your own name under the **RECORD OWNER** drop-down — don’t worry, we’ll adjust these permissions next for validation.\n", - "\n", - "6. Click **Register Model** to add the model to your inventory." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1_1__'></a>\n", - "\n", - "#### Assign validator credentials\n", - "\n", - "In order to log tests as a validator instead of as a developer, on the details page that appears after you've successfully registered your sample model:\n", - "\n", - "1. Remove yourself as an owner:\n", - "\n", - " - Click on the **OWNERS** tile.\n", - " - Click the **x** next to your name to remove yourself from that model's role.\n", - " - Click **Save** to apply your changes to that role.\n", - "\n", - "2. Remove yourself as a developer:\n", - "\n", - " - Click on the **DEVELOPERS** tile.\n", - " - Click the **x** next to your name to remove yourself from that model's role.\n", - " - Click **Save** to apply your changes to that role.\n", - "\n", - "3. Add yourself as a validator:\n", - "\n", - " - Click on the **VALIDATORS** tile.\n", - " - Select your name from the drop-down menu.\n", - " - Click **Save** to apply your changes to that role." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_1_2__'></a>\n", - "\n", - "#### Apply validation report template\n", - "\n", - "Next, let's select a validation report template. A template predefines sections for your report and provides a general outline to follow, making the validation process much easier.\n", - "\n", - "1. In the left sidebar that appears for your model, click **Documents** and select **Validation**.\n", - "\n", - " If you cannot locate your Validation document, make sure Validation type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", - "\n", - "2. Under **TEMPLATE**, select `Generic Validation Report`.\n", - "\n", - "3. Click **Use Template** to apply the template." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_2__'></a>\n", - "\n", - "### Install the ValidMind Library\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", - "<br></br>\n", - "Python 3.8 <= x <= 3.14</div>\n", - "\n", - "To install the library:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip install -q validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3__'></a>\n", - "\n", - "### Initialize the ValidMind Library" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_3_1__'></a>\n", - "\n", - "#### Get your code snippet\n", - "\n", - "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", - "\n", - "1. On the left sidebar that appears for your model, select **Getting Started** and select `Validation` from the **DOCUMENT** drop-down menu.\n", - "\n", - "2. Click **Copy snippet to clipboard**.\n", - "\n", - "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Load your model identifier credentials from an `.env` file\n", - "\n", - "%load_ext dotenv\n", - "%dotenv .env\n", - "\n", - "# Or replace with your code snippet\n", - "\n", - "import validmind as vm\n", - "\n", - "vm.init(\n", - " # api_host=\"...\",\n", - " # api_key=\"...\",\n", - " # api_secret=\"...\",\n", - " # model=\"...\",\n", - " document=\"validation-report\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_4__'></a>\n", - "\n", - "### Importing the champion model\n", - "\n", - "With the ValidMind Library set up and ready to go, let's go ahead and import the champion submitted by the development team in the format of a `.pkl` file: **[xgb_model_champion.pkl](xgb_model_champion.pkl)**" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import xgboost as xgb\n", - "\n", - "# Load the saved model\n", - "xgb_model = xgb.XGBClassifier()\n", - "xgb_model.load_model(\"xgb_model_champion.pkl\")\n", - "xgb_model" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Ensure that the feature names from the champion model and the dataset are in the same order\n", - "cols_when_model_builds = xgb_model.get_booster().feature_names" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_5__'></a>\n", - "\n", - "### Load the sample dataset\n", - "\n", - "Let's next import the public [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) dataset from Kaggle, which was used to develop the dummy champion model.\n", - "\n", - "- 
We'll use this dataset to review steps that should have been conducted during the initial development and documentation of the model to ensure that the model was built correctly.\n", - "- By independently performing steps such as preprocessing and feature engineering, we can confirm whether the model was built using appropriate and properly processed data.\n", - "\n", - "To be able to use the dataset, you'll need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from validmind.datasets.credit_risk import lending_club\n", - "\n", - "df = lending_club.load_data(source=\"offline\")\n", - "df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_5_1__'></a>\n", - "\n", - "#### Preprocess the dataset\n", - "\n", - "We'll first quickly preprocess the dataset for data quality testing purposes using `lending_club.preprocess`. 
This function performs the following operations:\n", - "\n", - "- Filters the dataset to include only loans for debt consolidation or credit card purposes\n", - "- Removes loans classified under the riskier grades \"F\" and \"G\"\n", - "- Excludes uncommon home ownership types and standardizes employment length and loan terms into numerical formats\n", - "- Discards unnecessary fields and any entries with missing information to maintain a clean and robust dataset for modeling" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "preprocess_df = lending_club.preprocess(df)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_5_2__'></a>\n", - "\n", - "#### Apply feature engineering to the dataset\n", - "\n", - "Feature engineering improves the dataset's structure to better match what our model expects, and ensures that the model performs optimally by leveraging additional insights from raw data.\n", - "\n", - "We'll apply the following transformations using the `lending_club.feature_engineering()` function to optimize the dataset for predictive modeling in our application scorecard:\n", - "\n", - "- **WoE encoding**: Converts both numerical and categorical features into Weight of Evidence (WoE) values. WoE is a statistical measure used in scorecard modeling that quantifies the relationship between a predictor variable and the binary target variable. It is calculated as the natural logarithm of the ratio of the distribution of good outcomes to the distribution of bad outcomes for each category or bin of a feature. This transformation helps to ensure that the features are predictive and consistent in their contribution to the model.\n", - "- **Integration of WoE bins**: Ensures that the WoE transformed values are integrated throughout the dataset, replacing the original feature values while excluding the target variable from this transformation. 
This transformation is used to maintain a consistent scale and impact of each variable within the model, which helps make the predictions more stable and accurate." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "fe_df = lending_club.feature_engineering(preprocess_df)\n", - "fe_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc2_6__'></a>\n", - "\n", - "### Split the feature engineered dataset\n", - "\n", - "With our dummy model imported and our independently preprocessed and feature engineered dataset ready to go, let's now **split our dataset into train and test** to start the validation testing process.\n", - "\n", - "Splitting our dataset into training and testing is essential for proper validation testing, as this helps assess how well the model generalizes to unseen data:\n", - "\n", - "- We begin by dividing our data, which is based on Weight of Evidence (WoE) features, into training and testing sets (`train_df`, `test_df`).\n", - "- With `lending_club.split`, we employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Split the data\n", - "train_df, test_df = lending_club.split(fe_df, test_size=0.2)\n", - "\n", - "x_train = train_df.drop(lending_club.target_column, axis=1)\n", - "y_train = train_df[lending_club.target_column]\n", - "\n", - "x_test = test_df.drop(lending_club.target_column, axis=1)\n", - "y_test = test_df[lending_club.target_column]\n", - "\n", - "# Now let's apply the order of features from the champion model construction\n", - "x_train = x_train[cols_when_model_builds]\n", - "x_test = x_test[cols_when_model_builds]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "cols_use = ['annual_inc_woe',\n", - " 'verification_status_woe',\n", - " 'emp_length_woe',\n", - " 'installment_woe',\n", - " 'term_woe',\n", - " 'home_ownership_woe',\n", - " 'purpose_woe',\n", - " 'open_acc_woe',\n", - " 'total_acc_woe',\n", - " 'int_rate_woe',\n", - " 'sub_grade_woe',\n", - " 'grade_woe','loan_status']\n", - "\n", - "\n", - "train_df = train_df[cols_use]\n", - "test_df = test_df[cols_use]\n", - "test_df.head()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3__'></a>\n", - "\n", - "## Developing potential challenger models" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1__'></a>\n", - "\n", - "### Train potential challenger models\n", - "\n", - "We're curious how alternate models compare to our champion model, so let's train two challenger models as a basis for our testing.\n", - "\n", - "Our selected options below offer decreased complexity in terms of implementation — such as lessened manual preprocessing — which can reduce the amount of risk for implementation. 
However, model risk is not calculated in isolation from a single factor, but rather in consideration with trade-offs in predictive performance, ease of interpretability, and overall alignment with business objectives." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1_1__'></a>\n", - "\n", - "#### Random forest classification model\n", - "\n", - "A *random forest classification model* is an ensemble machine learning algorithm that uses multiple decision trees to classify data. In ensemble learning, multiple models are combined to improve prediction accuracy and robustness.\n", - "\n", - "Random forest classification models generally have higher accuracy because they capture complex, non-linear relationships, but as a result they lack transparency in their predictions." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Import the Random Forest Classification model\n", - "from sklearn.ensemble import RandomForestClassifier\n", - "\n", - "# Create the model instance with 50 decision trees\n", - "rf_model = RandomForestClassifier(\n", - " n_estimators=50,\n", - " random_state=42,\n", - ")\n", - "\n", - "# Train the model\n", - "rf_model.fit(x_train, y_train)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_1_2__'></a>\n", - "\n", - "#### Logistic regression model\n", - "\n", - "A *logistic regression model* is a statistical machine learning algorithm that uses a linear equation (straight-line relationship between variables) and the logistic function (or sigmoid function, which maps any real-valued number to a range between `0` and `1`) to classify data. 
In statistical modeling, a single equation is used to estimate the probability of an outcome based on input features.\n", - "\n", - "Logistic regression models are simple and interpretable because they provide clear probability estimates and feature coefficients (numerical value that represents the influence of a particular input feature on the model's prediction), but they may struggle with capturing complex, non-linear relationships in the data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Import the Logistic Regression model\n", - "from sklearn.linear_model import LogisticRegression\n", - "\n", - "# Logistic Regression grid params\n", - "log_reg_params = {\n", - " \"penalty\": [\"l1\", \"l2\"],\n", - " \"C\": [0.001, 0.01, 0.1, 1, 10, 100, 1000],\n", - " \"solver\": [\"liblinear\"],\n", - "}\n", - "\n", - "# Grid search for Logistic Regression\n", - "from sklearn.model_selection import GridSearchCV\n", - "\n", - "grid_log_reg = GridSearchCV(LogisticRegression(), log_reg_params)\n", - "grid_log_reg.fit(x_train, y_train)\n", - "\n", - "# Logistic Regression best estimator\n", - "log_reg = grid_log_reg.best_estimator_\n", - "log_reg" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2__'></a>\n", - "\n", - "### Extract predicted probabilities\n", - "\n", - "With our challenger models trained, let's extract the predicted probabilities from our three models:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Champion — Application scorecard model\n", - "train_xgb_prob = xgb_model.predict_proba(x_train)[:, 1]\n", - "test_xgb_prob = xgb_model.predict_proba(x_test)[:, 1]\n", - "\n", - "# Challenger — Random forest classification model\n", - "train_rf_prob = rf_model.predict_proba(x_train)[:, 1]\n", - "test_rf_prob = rf_model.predict_proba(x_test)[:, 1]\n", - "\n", - "# Challenger — Logistic 
regression model\n", - "train_log_prob = log_reg.predict_proba(x_train)[:, 1]\n", - "test_log_prob = log_reg.predict_proba(x_test)[:, 1]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc3_2_1__'></a>\n", - "\n", - "#### Compute binary predictions\n", - "\n", - "Next, we'll convert the probability predictions from our three models into a binary, based on a threshold of `0.3`:\n", - "\n", - "- If the probability is greater than `0.3`, the prediction becomes `1` (positive).\n", - "- Otherwise, it becomes `0` (negative)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "cut_off_threshold = 0.3\n", - "\n", - "# Champion — Application scorecard model\n", - "train_xgb_binary_predictions = (train_xgb_prob > cut_off_threshold).astype(int)\n", - "test_xgb_binary_predictions = (test_xgb_prob > cut_off_threshold).astype(int)\n", - "\n", - "# Challenger — Random forest classification model\n", - "train_rf_binary_predictions = (train_rf_prob > cut_off_threshold).astype(int)\n", - "test_rf_binary_predictions = (test_rf_prob > cut_off_threshold).astype(int)\n", - "\n", - "# Challenger — Logistic regression model\n", - "train_log_binary_predictions = (train_log_prob > cut_off_threshold).astype(int)\n", - "test_log_binary_predictions = (test_log_prob > cut_off_threshold).astype(int)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4__'></a>\n", - "\n", - "## Initializing the ValidMind objects" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_1__'></a>\n", - "\n", - "### Initialize the ValidMind datasets\n", - "\n", - "Before you can run tests, you'll need to connect your data with a ValidMind `Dataset` object. 
**This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", - "\n", - "Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. For this example, we'll pass in the following arguments:\n", - "\n", - "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", - "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", - "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the raw dataset\n", - "vm_raw_dataset = vm.init_dataset(\n", - " dataset=df,\n", - " input_id=\"raw_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "# Initialize the preprocessed dataset\n", - "vm_preprocess_dataset = vm.init_dataset(\n", - " dataset=preprocess_df,\n", - " input_id=\"preprocess_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "# Initialize the feature engineered dataset\n", - "vm_fe_dataset = vm.init_dataset(\n", - " dataset=fe_df,\n", - " input_id=\"fe_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "# Initialize the training dataset\n", - "vm_train_ds = vm.init_dataset(\n", - " dataset=train_df,\n", - " input_id=\"train_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")\n", - "\n", - "# Initialize the test dataset\n", - "vm_test_ds = vm.init_dataset(\n", - " dataset=test_df,\n", - " input_id=\"test_dataset\",\n", - " target_column=lending_club.target_column,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, 
- "source": [ - "After initialization, you can pass the ValidMind `Dataset` objects `vm_raw_dataset`, `vm_preprocess_dataset`, `vm_fe_dataset`, `vm_train_ds`, and `vm_test_ds` into any ValidMind tests." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_2__'></a>\n", - "\n", - "### Initialize the ValidMind models\n", - "\n", - "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other functions for analysis and tests on the data for each of our three models.\n", - "\n", - "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", - "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", - "\n", - "Initialize your model objects with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Initialize the champion application scorecard model\n", - "vm_xgb_model = vm.init_model(\n", - " xgb_model,\n", - " input_id=\"xgb_model_developer_champion\",\n", - ")\n", - "\n", - "# Initialize the challenger random forest classification model\n", - "vm_rf_model = vm.init_model(\n", - " rf_model,\n", - " input_id=\"rf_model\",\n", - ")\n", - "\n", - "# Initialize the challenger logistic regression model\n", - "vm_log_model = vm.init_model(\n", - " log_reg,\n", - " input_id=\"log_model\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc4_3__'></a>\n", - "\n", - "### Assign predictions\n", - "\n", - "With our models registered, we'll move on to assigning both 
the predictive probabilities coming directly from each model's predictions, and the binary prediction after applying the cutoff threshold described in the Compute binary predictions step above.\n", - "\n", - "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMDataset.assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", - "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Champion — Application scorecard model\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_xgb_model,\n", - " prediction_values=train_xgb_binary_predictions,\n", - " prediction_probabilities=train_xgb_prob,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_xgb_model,\n", - " prediction_values=test_xgb_binary_predictions,\n", - " prediction_probabilities=test_xgb_prob,\n", - ")\n", - "\n", - "# Challenger — Random forest classification model\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_rf_model,\n", - " prediction_values=train_rf_binary_predictions,\n", - " prediction_probabilities=train_rf_prob,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_rf_model,\n", - " prediction_values=test_rf_binary_predictions,\n", - " prediction_probabilities=test_rf_prob,\n", - ")\n", - "\n", - "\n", - "# Challenger — Logistic regression model\n", - "vm_train_ds.assign_predictions(\n", - " model=vm_log_model,\n", - " prediction_values=train_log_binary_predictions,\n", - " prediction_probabilities=train_log_prob,\n", - ")\n", - "\n", - "vm_test_ds.assign_predictions(\n", - " model=vm_log_model,\n", - " prediction_values=test_log_binary_predictions,\n", - " prediction_probabilities=test_log_prob,\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": 
[ - "<a id='toc4_4__'></a>\n", - "\n", - "### Compute credit risk scores\n", - "\n", - "Finally, we'll translate model predictions into actionable scores using probability estimates generated by our trained model:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Compute the scores\n", - "train_xgb_scores = lending_club.compute_scores(train_xgb_prob)\n", - "test_xgb_scores = lending_club.compute_scores(test_xgb_prob)\n", - "train_rf_scores = lending_club.compute_scores(train_rf_prob)\n", - "test_rf_scores = lending_club.compute_scores(test_rf_prob)\n", - "train_log_scores = lending_club.compute_scores(train_log_prob)\n", - "test_log_scores = lending_club.compute_scores(test_log_prob)\n", - "\n", - "# Assign scores to the datasets\n", - "vm_train_ds.add_extra_column(\"xgb_scores\", train_xgb_scores)\n", - "vm_test_ds.add_extra_column(\"xgb_scores\", test_xgb_scores)\n", - "vm_train_ds.add_extra_column(\"rf_scores\", train_rf_scores)\n", - "vm_test_ds.add_extra_column(\"rf_scores\", test_rf_scores)\n", - "vm_train_ds.add_extra_column(\"log_scores\", train_log_scores)\n", - "vm_test_ds.add_extra_column(\"log_scores\", test_log_scores)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5__'></a>\n", - "\n", - "## Running data quality tests\n", - "\n", - "With everything ready to go, let's explore some of ValidMind's available tests. Using ValidMind’s repository of tests streamlines your validation testing, and helps you ensure that your records are being validated appropriately." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_1__'></a>\n", - "\n", - "### Identify relevant data quality tests\n", - "\n", - "We want to narrow down the tests we want to run from the selection provided by ValidMind, so we'll use the [`vm.tests.list_tasks_and_tags()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks_and_tags) to list which `tags` are associated with each `task` type:\n", - "\n", - "- **`tasks`** represent the kind of modeling task associated with a test. Here we'll focus on `classification` tasks.\n", - "- **`tags`** are free-form descriptions providing more details about the test, for example, what category the test falls into. Here we'll focus on the `data_quality` tag." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tasks_and_tags()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Then we'll call [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to list all the data quality tests for classification:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(\n", - " tags=[\"data_quality\"], task=\"classification\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about navigating ValidMind tests?</b></span>\n", - "<br></br>\n", - "Refer to our notebook outlining the utilities available for viewing and understanding available ValidMind tests: <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" 
style=\"color: #DE257E;\"><b>Explore tests</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_2__'></a>\n", - "\n", - "### Run and log an individual data quality test\n", - "\n", - "Next, we'll use our previously initialized preprocessed dataset (`vm_preprocess_dataset`) as input to run an individual test, then log the result to the ValidMind Platform.\n", - "\n", - "- You run validation tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module.\n", - "- Every test result returned by the `run_test()` function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform.\n", - "\n", - "Here, we'll use the `HighPearsonCorrelation` test as an example:\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " test_id=\"validmind.data_validation.HighPearsonCorrelation\",\n", - " inputs={\n", - " \"dataset\": vm_preprocess_dataset\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Note the output returned indicating that a test-driven block doesn't currently exist in your documentation for some test IDs. </b></span>\n", - "<br></br>\n", - "That's expected: when we run validation tests, the logged results need to be manually added to your report as part of your compliance assessment process within the ValidMind Platform. 
You'll continue to see this message throughout this notebook as we run and log more tests.</div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_3__'></a>\n", - "\n", - "### Log multiple data quality tests\n", - "\n", - "Now that we understand how to run a test with ValidMind, we want to run all the tests that were returned for our `classification` tasks focusing on `data_quality`.\n", - "\n", - "We'll store the identified tests in `dq` in preparation for batch running these tests and logging their results to the ValidMind Platform:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "dq = vm.tests.list_tests(tags=[\"data_quality\"], task=\"classification\",pretty=False)\n", - "dq" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "With our data quality tests stored, let's run our first batch of tests using the same preprocessed dataset (`vm_preprocess_dataset`) and log their results." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for test in dq:\n", - " vm.tests.run_test(\n", - " test,\n", - " inputs={\n", - " \"dataset\": vm_preprocess_dataset\n", - " }\n", - " ).log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc5_4__'></a>\n", - "\n", - "### Run data quality comparison tests\n", - "\n", - "Next, let's reuse the tests in `dq` to perform comparison tests between the raw (`vm_raw_dataset`) and preprocessed (`vm_preprocess_dataset`) datasets, again logging the results to the ValidMind Platform:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for test in dq:\n", - " vm.tests.run_test(\n", - " test,\n", - " input_grid={\n", - " \"dataset\": [vm_raw_dataset, vm_preprocess_dataset]\n", - " }\n", - " ).log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6__'></a>\n", - "\n", - "## Running performance tests\n", - "\n", - "We'll also run some performance tests, beginning with independent testing of our champion application scorecard model, then moving on to our potential challenger models." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_1__'></a>\n", - "\n", - "### Identify relevant performance tests\n", - "\n", - "This time, use `vm.tests.list_tests()` to identify all the model performance tests for classification:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(tags=[\"model_performance\"], task=\"classification\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_2__'></a>\n", - "\n", - "### Run and log an individual performance test\n", - "\n", - "Before we run our batch of performance tests, we'll use our previously initialized testing dataset (`vm_test_ds`) as input to run an individual test, then log the result to the ValidMind Platform.\n", - "\n", - "When running individual tests, you can use a custom `result_id` to tag the individual result with a unique identifier by appending this `result_id` to the `test_id` with a `:` separator. 
We'll append an identifier for our champion model here (`xgboost_champion`):\n", - "\n", - "Here, we'll use the `ClassifierPerformance` test as an example:\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " test_id=\"validmind.model_validation.sklearn.ClassifierPerformance:xgboost_champion\",\n", - " inputs={\n", - " \"dataset\": vm_test_ds, \"model\" : vm_xgb_model\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_3__'></a>\n", - "\n", - "### Log multiple performance tests\n", - "\n", - "We only want to run a few other tests that were returned for our `classification` tasks focusing on `model_performance`, so we'll isolate the specific tests we want to batch run in `mpt`:\n", - "\n", - "- `ClassifierPerformance`\n", - "- `ConfusionMatrix`\n", - "- `MinimumAccuracy`\n", - "- `MinimumF1Score`\n", - "- `ROCCurve`\n", - "\n", - "Note the custom `result_id`s appended to the `test_id`s for our champion model (`xgboost_champion`):\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "mpt = [\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:xgboost_champion\",\n", - " \"validmind.model_validation.sklearn.ConfusionMatrix:xgboost_champion\",\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy:xgboost_champion\",\n", - " \"validmind.model_validation.sklearn.MinimumF1Score:xgboost_champion\",\n", - " \"validmind.model_validation.sklearn.ROCCurve:xgboost_champion\"\n", - "]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_4__'></a>\n", - "\n", - "### Evaluate performance of the champion model\n", - "\n", - "Now, let's run and log our batch of model performance tests using our testing dataset (`vm_test_ds`) for our champion model:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - 
"metadata": {}, - "outputs": [], - "source": [ - "for test in mpt:\n", - " vm.tests.run_test(\n", - " test,\n", - " inputs={\n", - " \"dataset\": vm_test_ds, \"model\" : vm_xgb_model\n", - " },\n", - " ).log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_5__'></a>\n", - "\n", - "### Evaluate performance of challenger models\n", - "\n", - "We've now conducted similar tests as the development team for our champion, with the aim of verifying their test results.\n", - "\n", - "Next, let's see how our challenger models compare. We'll use the same batch of tests here as we did in `mpt`, but append a different `result_id` to indicate that these results should be associated with our challenger models:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "mpt_chall = [\n", - " \"validmind.model_validation.sklearn.ClassifierPerformance:xgboost_champion_vs_challengers\",\n", - " \"validmind.model_validation.sklearn.ConfusionMatrix:xgboost_champion_vs_challengers\",\n", - " \"validmind.model_validation.sklearn.MinimumAccuracy:xgboost_champion_vs_challengers\",\n", - " \"validmind.model_validation.sklearn.MinimumF1Score:xgboost_champion_vs_challengers\",\n", - " \"validmind.model_validation.sklearn.ROCCurve:xgboost_champion_vs_challengers\"\n", - "]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_5_1__'></a>\n", - "\n", - "#### Enable custom context for test descriptions" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "When you run ValidMind tests, test descriptions are automatically generated with LLM using the test results, the test name, and the static test definitions provided in the test’s docstring. 
While this metadata offers valuable high-level overviews of tests, insights produced by the LLM-based descriptions may not always align with your specific use cases or incorporate organizational policy requirements.\n", - "\n", - "Before we run our next batch of tests, we'll include some custom use case context to focus on comparison testing going forward, improving the relevancy, insight, and format of the test descriptions returned. By default, custom context for LLM-generated descriptions is disabled, meaning that the output will not include any additional context. To enable custom use case context, set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1`.\n", - "\n", - "This is a global setting that will affect all tests for your linked model:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Enabling use case context allows you to pass in additional context to the LLM-generated text descriptions within `context`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", - "\n", - "context = \"\"\"\n", - "FORMAT FOR THE LLM DESCRIPTIONS: \n", - " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", - " extracted from the test description>.\n", - "\n", - " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. 
\n", - " Include any relevant formulas or methodologies mentioned in the test description.>\n", - "\n", - " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", - " highlighting what makes it particularly useful for specific scenarios.>\n", - "\n", - " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", - " Include both technical limitations and interpretation challenges. \n", - " If the test description includes specific signs of high risk, incorporate these here.>\n", - "\n", - " **Key Insights:**\n", - "\n", - " The test results reveal:\n", - "\n", - " - **<insight title>**: <comprehensive description of one aspect of the results>\n", - " - **<insight title>**: <comprehensive description of another aspect>\n", - " ...\n", - "\n", - " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", - " purpose and provides any final recommendations or considerations.>\n", - "\n", - "ADDITIONAL INSTRUCTIONS:\n", - "\n", - " The champion model as the basis for comparison is called \"xgb_model_developer_champion\" and emphasis should be on the following:\n", - " - The metrics for the champion model compared against the challenger models\n", - " - Which model potentially outperforms the champion model based on the metrics, this should be highlighted and emphasized\n", - "\n", - "\n", - " For each metric in the test results, include in the test overview:\n", - " - The metric's purpose and what it measures\n", - " - Its mathematical formula\n", - " - The range of possible values\n", - " - What constitutes good/bad performance\n", - " - How to interpret different values\n", - "\n", - " Each insight should progressively cover:\n", - " 1. Overall scope and distribution\n", - " 2. Complete breakdown of all elements with specific values\n", - " 3. Natural groupings and patterns\n", - " 4. 
Comparative analysis between datasets/categories\n", - " 5. Stability and variations\n", - " 6. Notable relationships or dependencies\n", - "\n", - " Remember:\n", - " - Champion model (xgb_model_developer_champion) is the selection and challenger models are used to challenge the selection\n", - " - Keep all insights at the same level (no sub-bullets or nested structures)\n", - " - Make each insight complete and self-contained\n", - " - Include specific numerical values and ranges\n", - " - Cover all elements in the results comprehensively\n", - " - Maintain clear, concise language\n", - " - Use only \"- **Title**: Description\" format for insights\n", - " - Progress naturally from general to specific observations\n", - "\n", - "\"\"\".strip()\n", - "\n", - "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about setting custom context for LLM-generated test descriptions?</b></span>\n", - "<br></br>\n", - "Refer to our extended walkthrough notebook: <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/customize_test_result_descriptions.html\" style=\"color: #DE257E;\"><b>Add context to LLM-generated test descriptions\n", - "</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc6_5_2__'></a>\n", - "\n", - "#### Run performance comparison tests\n", - "\n", - "With the use case context set, we'll run each test in `mpt_chall` once for each model with the same `vm_test_ds` dataset to compare them:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for test in mpt_chall:\n", - " 
vm.tests.run_test(\n", - " test,\n", - " input_grid={\n", - " \"dataset\": [vm_test_ds], \"model\": [vm_xgb_model, vm_log_model, vm_rf_model]\n", - " }\n", - " ).log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Based on the performance metrics, we can conclude that the random forest classification model is not a viable candidate for our use case and can be disregarded in our tests going forward.</b></span>\n", - "<br></br>\n", - "In the next section, we'll dive a bit deeper into some tests comparing our champion application scorecard model and our remaining challenger logistic regression model, including tests that will allow us to customize parameters and thresholds for performance standards.</div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc7__'></a>\n", - "\n", - "## Adjust a ValidMind test\n", - "\n", - "Let's dig deeper into the `MinimumF1Score` test we ran previously in Running performance tests to ensure that the models maintain a minimum acceptable balance between *precision* and *recall*. 
Precision is the share of the model's positive predictions that were actually correct, and recall is the share of actual positive cases that the model correctly identified.\n", - "\n", - "Use `run_test()` with our testing dataset (`vm_test_ds`) to run the test in isolation again for our two remaining models, without logging the result, so that we have output to compare with a subsequent iteration:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumF1Score:xgboost_champion_vs_challengers\",\n", - " input_grid={\n", - " \"dataset\": [vm_test_ds],\n", - " \"model\": [vm_xgb_model, vm_log_model]\n", - " },\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "As `MinimumF1Score` allows us to customize parameters and thresholds for performance standards, let's adjust the minimum threshold to see how the results change:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.tests.run_test(\n", - " \"validmind.model_validation.sklearn.MinimumF1Score:AdjThreshold\",\n", - " input_grid={\n", - " \"dataset\": [vm_test_ds],\n", - " \"model\": [vm_xgb_model, vm_log_model],\n", - " },\n", - " params={\"min_threshold\": 0.35},\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc8__'></a>\n", - "\n", - "## Run diagnostic tests\n", - "\n", - "Next, we want to compare the robustness and stability of our champion and challenger models.\n", - "\n", - "Use `list_tests()` to list all available diagnosis tests applicable to classification tasks:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.list_tests(tags=[\"model_diagnosis\"], task=\"classification\")" - ] - }, - { - "cell_type": 
"markdown", - "metadata": {}, - "source": [ - "Let's see if models suffer from any *overfit* potentials and also where there are potential sub-segments of issues with the `OverfitDiagnosis` test. \n", - "\n", - "Overfitting occurs when a model learns the training data too well, capturing not only the true pattern but noise and random fluctuations resulting in excellent performance on the training dataset but poor generalization to new, unseen data.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " test_id=\"validmind.model_validation.sklearn.OverfitDiagnosis:Champion_vs_LogRegression\",\n", - " input_grid={\n", - " \"datasets\": [[vm_train_ds,vm_test_ds]],\n", - " \"model\" : [vm_xgb_model,vm_log_model]\n", - " }\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's also conduct *robustness* and *stability* testing of the two models with the `RobustnessDiagnosis` test.\n", - "\n", - "Robustness refers to a model's ability to maintain consistent performance, and stability refers to a model's ability to produce consistent outputs over time across different data subsets.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vm.tests.run_test(\n", - " test_id=\"validmind.model_validation.sklearn.RobustnessDiagnosis:Champion_vs_LogRegression\",\n", - " input_grid={\n", - " \"datasets\": [[vm_train_ds,vm_test_ds]],\n", - " \"model\" : [vm_xgb_model,vm_log_model]\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc9__'></a>\n", - "\n", - "## Run feature importance tests\n", - "\n", - "We also want to verify the relative influence of different input features on our models' predictions, as well as inspect the differences between our champion and challenger model to see if a certain model offers more understandable or 
logical importance scores for features.\n", - "\n", - "Use `list_tests()` to identify all the feature importance tests for classification:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Store the feature importance tests\n", - "FI = vm.tests.list_tests(tags=[\"feature_importance\"], task=\"classification\",pretty=False)\n", - "FI" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Run and log our feature importance tests for both models for the testing dataset\n", - "for test in FI:\n", - " vm.tests.run_test(\n", - " \"\".join((test,':Champion_vs_LogisticRegression')),\n", - " input_grid={\n", - " \"dataset\": [vm_test_ds], \"model\" : [vm_xgb_model,vm_log_model]\n", - " },\n", - " ).log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc10__'></a>\n", - "\n", - "## Implement a custom test\n", - "\n", - "Let's finish up testing by implementing a custom *inline test* that outputs a FICO score-type score. 
An inline test refers to a test written and executed within the same environment as the code being tested — in this case, right in this Jupyter Notebook — without requiring a separate test file or framework.\n", - "\n", - "The [`@vm.test` wrapper](https://docs.validmind.ai/validmind/validmind.html#test) allows you to create a reusable test:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import pandas as pd\n", - "import plotly.graph_objects as go\n", - "\n", - "@vm.test(\"my_custom_tests.ScoreToOdds\")\n", - "def score_to_odds_analysis(dataset, score_column='score', score_bands=[410, 440, 470]):\n", - " \"\"\"\n", - " Analyzes the relationship between score bands and odds (good:bad ratio).\n", - " Good odds = (1 - default_rate) / default_rate\n", - " \n", - " Higher scores should correspond to higher odds of being good.\n", - "\n", - " If there are multiple scores provided through score_column, this means that there are two different models and the scores reflect each model\n", - "\n", - " If there are more scores provided in the score_column then focus the assessment on the differences between the two scores and indicate through evidence which one is preferred.\n", - " \"\"\"\n", - " df = dataset.df\n", - " \n", - " # Create score bands\n", - " df['score_band'] = pd.cut(\n", - " df[score_column],\n", - " bins=[-np.inf] + score_bands + [np.inf],\n", - " labels=[f'<{score_bands[0]}'] + \n", - " [f'{score_bands[i]}-{score_bands[i+1]}' for i in range(len(score_bands)-1)] +\n", - " [f'>{score_bands[-1]}']\n", - " )\n", - " \n", - " # Calculate metrics per band\n", - " results = df.groupby('score_band').agg({\n", - " dataset.target_column: ['mean', 'count']\n", - " })\n", - " \n", - " results.columns = ['Default Rate', 'Total']\n", - " results['Good Count'] = results['Total'] - (results['Default Rate'] * results['Total'])\n", - " results['Bad Count'] = results['Default Rate'] 
* results['Total']\n", - " results['Odds'] = results['Good Count'] / results['Bad Count']\n", - " \n", - " # Create visualization\n", - " fig = go.Figure()\n", - " \n", - " # Add odds bars\n", - " fig.add_trace(go.Bar(\n", - " name='Odds (Good:Bad)',\n", - " x=results.index,\n", - " y=results['Odds'],\n", - " marker_color='blue'\n", - " ))\n", - " \n", - " fig.update_layout(\n", - " title='Score-to-Odds Analysis',\n", - " yaxis=dict(title='Odds Ratio (Good:Bad)'),\n", - " showlegend=False\n", - " )\n", - " \n", - " return fig" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "With the custom test available, run and log the test for our champion and challenger models with our testing dataset (`vm_test_ds`):" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "result = vm.tests.run_test(\n", - " \"my_custom_tests.ScoreToOdds:Champion_vs_Challenger\",\n", - " inputs={\n", - " \"dataset\": vm_test_ds,\n", - " },\n", - " param_grid={\n", - " \"score_column\": [\"xgb_scores\",\"log_scores\"],\n", - " \"score_bands\": [[500, 540, 570]],\n", - " },\n", - ").log()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about custom tests?</b></span>\n", - "<br></br>\n", - "Refer to our in-depth introduction to custom tests: <a href=\"../../how_to/tests/custom_tests/implement_custom_tests.ipynb\" style=\"color: #DE257E;\"><b>Implement custom tests</b></a></div>" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc11__'></a>\n", - "\n", - "## Verify test runs\n", - "\n", - "Our final task is to verify that all the tests provided by the development team were run and reported 
accurately. Note the appended `result_ids` to delineate which dataset we ran the test with for the relevant tests.\n", - "\n", - "Here, we'll specify all the tests we'd like to independently rerun in a dictionary called `test_config`. **Note here that `inputs` and `input_grid` expect the `input_id` of the dataset or model as the value rather than the variable name we specified**:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test_config = {\n", - " # Run with the raw dataset\n", - " 'validmind.data_validation.DatasetDescription:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'}\n", - " },\n", - " 'validmind.data_validation.DescriptiveStatistics:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'}\n", - " },\n", - " 'validmind.data_validation.MissingValues:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'min_percentage_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.ClassImbalance:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'min_percent_threshold': 10}\n", - " },\n", - " 'validmind.data_validation.Duplicates:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'min_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.HighCardinality:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {\n", - " 'num_threshold': 100,\n", - " 'percent_threshold': 0.1,\n", - " 'threshold_type': 'percent'\n", - " }\n", - " },\n", - " 'validmind.data_validation.Skewness:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'max_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.UniqueRows:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'min_percent_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.TooManyZeroValues:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': 
{'max_percent_threshold': 0.03}\n", - " },\n", - " 'validmind.data_validation.IQROutliersTable:raw_data': {\n", - " 'inputs': {'dataset': 'raw_dataset'},\n", - " 'params': {'threshold': 5}\n", - " },\n", - " # Run with the preprocessed dataset\n", - " 'validmind.data_validation.DescriptiveStatistics:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'}\n", - " },\n", - " 'validmind.data_validation.TabularDescriptionTables:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'}\n", - " },\n", - " 'validmind.data_validation.MissingValues:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'},\n", - " 'params': {'min_percentage_threshold': 1}\n", - " },\n", - " 'validmind.data_validation.TabularNumericalHistograms:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'}\n", - " },\n", - " 'validmind.data_validation.TabularCategoricalBarPlots:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'}\n", - " },\n", - " 'validmind.data_validation.TargetRateBarPlots:preprocessed_data': {\n", - " 'inputs': {'dataset': 'preprocess_dataset'},\n", - " 'params': {'default_column': 'loan_status'}\n", - " },\n", - " # Run with the training and test datasets\n", - " 'validmind.data_validation.DescriptiveStatistics:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", - " },\n", - " 'validmind.data_validation.TabularDescriptionTables:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", - " },\n", - " 'validmind.data_validation.ClassImbalance:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'min_percent_threshold': 10}\n", - " },\n", - " 'validmind.data_validation.UniqueRows:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'min_percent_threshold': 1}\n", - " },\n", - " 
'validmind.data_validation.TabularNumericalHistograms:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", - " },\n", - " 'validmind.data_validation.MutualInformation:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'min_threshold': 0.01}\n", - " },\n", - " 'validmind.data_validation.PearsonCorrelationMatrix:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", - " },\n", - " 'validmind.data_validation.HighPearsonCorrelation:development_data': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", - " 'params': {'max_threshold': 0.3, 'top_n_correlations': 10}\n", - " },\n", - " 'validmind.model_validation.ModelMetadata': {\n", - " 'input_grid': {'model': ['xgb_model_developer_champion', 'rf_model']}\n", - " },\n", - " 'validmind.model_validation.sklearn.ModelParameters': {\n", - " 'input_grid': {'model': ['xgb_model_developer_champion', 'rf_model']}\n", - " },\n", - " 'validmind.model_validation.sklearn.ROCCurve': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model_developer_champion']}\n", - " },\n", - " 'validmind.model_validation.sklearn.MinimumROCAUCScore': {\n", - " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model_developer_champion']},\n", - " 'params': {'min_threshold': 0.5}\n", - " }\n", - "}" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Then batch run and log our tests in `test_config`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for t in test_config:\n", - " print(t)\n", - " try:\n", - " # Check if test has input_grid\n", - " if 'input_grid' in test_config[t]:\n", - " # For tests with input_grid, pass the input_grid configuration\n", - " if 'params' in test_config[t]:\n", - " vm.tests.run_test(t, 
input_grid=test_config[t]['input_grid'], params=test_config[t]['params']).log()\n", - " else:\n", - " vm.tests.run_test(t, input_grid=test_config[t]['input_grid']).log()\n", - " else:\n", - " # Original logic for regular inputs\n", - " if 'params' in test_config[t]:\n", - " vm.tests.run_test(t, inputs=test_config[t]['inputs'], params=test_config[t]['params']).log()\n", - " else:\n", - " vm.tests.run_test(t, inputs=test_config[t]['inputs']).log()\n", - " except Exception as e:\n", - " print(f\"Error running test {t}: {str(e)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc12__'></a>\n", - "\n", - "## Next steps" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc12_1__'></a>\n", - "\n", - "### Work with your validation report\n", - "\n", - "Now that you've logged all your test results and verified the work done by the development team, head to the ValidMind Platform to wrap up your validation report:\n", - "\n", - "1. From the **Inventory** in the ValidMind Platform, go to the model you connected to earlier.\n", - "\n", - "2. In the left sidebar that appears for your model, click **Validation** under Documents.\n", - "\n", - "Include your logged test results as evidence, create risk assessment notes, add artifacts, and assess compliance, then submit your report for review when it's ready. 
(**Learn more:** [Preparing validation reports](https://docs.validmind.ai/guide/validation/preparing-validation-reports.html))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc12_2__'></a>\n", - "\n", - "### Discover more learning resources\n", - "\n", - "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", - "\n", - "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", - "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", - "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", - "\n", - "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "<a id='toc13__'></a>\n", - "\n", - "## Upgrade ValidMind\n", - "\n", - "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", - "\n", - "Retrieve the information for the currently installed version of ValidMind:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%pip show validmind" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", - "\n", - "```bash\n", - "%pip install --upgrade validmind\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - 
"You may need to restart your kernel after running the upgrade package for changes to be applied." - ] - }, - { - "cell_type": "markdown", - "id": "copyright-7c52ad62bcf7411eaaa00aefbac6c756", - "metadata": {}, - "source": [ - "<!-- VALIDMIND COPYRIGHT -->\n", - "\n", - "<small>\n", - "\n", - "***\n", - "\n", - "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", - "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", - "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "ValidMind Library", - "language": "python", - "name": "validmind" - }, - "language_info": { - "name": "python", - "version": "3.10.13" - } - }, - "nbformat": 4, - "nbformat_minor": 2 + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Validate an application scorecard model\n", + "\n", + "Learn how to independently assess an application scorecard model developed using the ValidMind Library as a validator. 
You'll evaluate the development of the model by conducting thorough testing and analysis, including the use of challenger models to benchmark performance.\n", + "\n", + "An *application scorecard model* is a type of statistical model used in credit scoring to evaluate the creditworthiness of potential borrowers by generating a score based on various characteristics of an applicant such as credit history, income, employment status, and other relevant financial data.\n", + "\n", + " - This score assists lenders in making informed decisions about whether to approve or reject loan applications, as well as in determining the terms of the loan, including interest rates and credit limits.\n", + " - Effective validation of application scorecard models ensures that lenders can manage risk efficiently while maintaining a fast and transparent loan application process for applicants.\n", + "\n", + "This interactive notebook provides a step-by-step guide for:\n", + "\n", + "- Verifying the data quality steps performed by the development team\n", + "- Independently replicating the champion's results and conducting additional tests to assess performance, stability, and robustness\n", + "- Setting up test inputs and challenger models for comparative analysis\n", + "- Running validation tests, analyzing results, and logging artifacts (findings) to ValidMind" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "::: {.content-hidden when-format=\"html\"}\n", + "## Contents \n", + "- [About ValidMind](#toc1__) \n", + " - [Before you begin](#toc1_1__) \n", + " - [New to ValidMind?](#toc1_2__) \n", + " - [Key concepts](#toc1_3__) \n", + "- [Setting up](#toc2__) \n", + " - [Register a sample model](#toc2_1__) \n", + " - [Assign validator credentials](#toc2_1_1__) \n", + " - [Apply validation report template](#toc2_1_2__) \n", + " - [Install the ValidMind Library](#toc2_2__) \n", + " - [Initialize the ValidMind Library](#toc2_3__) \n", + " - [Get your code 
snippet](#toc2_3_1__) \n", + " - [Importing the champion model](#toc2_4__) \n", + " - [Load the sample dataset](#toc2_5__) \n", + " - [Preprocess the dataset](#toc2_5_1__) \n", + " - [Apply feature engineering to the dataset](#toc2_5_2__) \n", + " - [Split the feature engineered dataset](#toc2_6__) \n", + "- [Developing potential challenger models](#toc3__) \n", + " - [Train potential challenger models](#toc3_1__) \n", + " - [Random forest classification model](#toc3_1_1__) \n", + " - [Logistic regression model](#toc3_1_2__) \n", + " - [Extract predicted probabilities](#toc3_2__) \n", + " - [Compute binary predictions](#toc3_2_1__) \n", + "- [Initializing the ValidMind objects](#toc4__) \n", + " - [Initialize the ValidMind datasets](#toc4_1__) \n", + " - [Initialize the ValidMind models](#toc4_2__) \n", + " - [Assign predictions](#toc4_3__) \n", + " - [Compute credit risk scores](#toc4_4__) \n", + "- [Running data quality tests](#toc5__) \n", + " - [Identify relevant data quality tests](#toc5_1__) \n", + " - [Run and log an individual data quality test](#toc5_2__) \n", + " - [Log multiple data quality tests](#toc5_3__) \n", + " - [Run data quality comparison tests](#toc5_4__) \n", + "- [Running performance tests](#toc6__) \n", + " - [Identify relevant performance tests](#toc6_1__) \n", + " - [Run and log an individual performance test](#toc6_2__) \n", + " - [Log multiple performance tests](#toc6_3__) \n", + " - [Evaluate performance of the champion model](#toc6_4__) \n", + " - [Evaluate performance of challenger models](#toc6_5__) \n", + " - [Enable custom context for test descriptions](#toc6_5_1__) \n", + " - [Run performance comparison tests](#toc6_5_2__) \n", + "- [Adjust a ValidMind test](#toc7__) \n", + "- [Run diagnostic tests](#toc8__) \n", + "- [Run feature importance tests](#toc9__) \n", + "- [Implement a custom test](#toc10__) \n", + "- [Verify test runs](#toc11__) \n", + "- [Next steps](#toc12__) \n", + " - [Work with your validation report](#toc12_1__) 
\n", + " - [Discover more learning resources](#toc12_2__) \n", + "- [Upgrade ValidMind](#toc13__) \n", + "\n", + ":::\n", + "<!-- jn-toc-notebook-config\n", + "\tnumbering=false\n", + "\tanchor=true\n", + "\tflat=false\n", + "\tminLevel=2\n", + "\tmaxLevel=4\n", + "\t/jn-toc-notebook-config -->\n", + "<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1__'></a>\n", + "\n", + "## About ValidMind\n", + "\n", + "ValidMind is a suite of tools for managing risk, including risk associated with AI and statistical models.\n", + "\n", + "You use the ValidMind Library to automate comparison and other validation tests, and then use the ValidMind Platform to submit compliance assessments of champions via comprehensive validation reports. Together, these products simplify risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and developers." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_1__'></a>\n", + "\n", + "### Before you begin\n", + "\n", + "This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language. \n", + "\n", + "If you encounter errors due to missing modules in your Python environment, install the modules with `pip install`, and then re-run the notebook. For more help, refer to [Installing Python Modules](https://docs.python.org/3/installing/index.html)." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_2__'></a>\n", + "\n", + "### New to ValidMind?\n", + "\n", + "If you haven't already seen our documentation on the [ValidMind Library](https://docs.validmind.ai/developer/validmind-library.html), we recommend you begin by exploring the available resources in this section. There, you can learn more about validating records such as models and running tests, as well as find code samples and our Python Library API reference.\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>For access to all features available in this notebook, you'll need access to a ValidMind account.</b></span>\n", + "<br></br>\n", + "<a href=\"https://docs.validmind.ai/guide/access/register-with-validmind.html\" style=\"color: #DE257E;\"><b>Register with ValidMind</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc1_3__'></a>\n", + "\n", + "### Key concepts\n", + "\n", + "**record**: A tool tracked in the ValidMind inventory, such as a model. 
Records include traditional statistical models, legacy systems, artificial intelligence/machine learning models, large language models (LLMs), agentic AI systems, and other documentable items that benefit from oversight, testing, and lifecycle management.\n", + "\n", + "**model**: [SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm) (which supersedes SR 11-7) defines a model as a \"complex quantitative method, system, or approach that applies statistical, economic, or financial theories to process input data into quantitative estimates.\" Simple arithmetic, deterministic rule-based processes, or software without statistical, economic, or financial theories underpinning their design or use are generally outside SR 26-2’s definition of a model. Within ValidMind, a model is a type of record tracked in the inventory.\n", + "\n", + "**validation report:** A validation report is a comprehensive and structured review evaluating a record's accuracy, performance, and suitability for its intended purpose. A report follows established validation guidelines to ensure consistency and adherence to internal and regulatory standards — encompassing the process of risk assessment, identifying areas of potential error or risk within the record's components, supporting transparency, regulatory compliance, and informed decision-making by documenting the validator’s independent review and conclusions.\n", + "\n", + "**document template**: Lays out the structure of documents, segmented into various sections and sub-sections, and functions as a test suite specifying the tests that should be run, and how the results should be displayed. Document templates help automate your development, validation, monitoring, and other risk management processes. 
Document templates are available for default ValidMind document types as well as custom document types.\n", + "\n", + "**validation report template**: A default ValidMind document template that serves as a standardized framework for conducting and documenting validation, including sections designated for attaching test results, evidence, or artifacts (findings). By outlining required documentation, recommended analyses, and expected validation tests, validation report templates ensure consistency and completeness across validation reports and help guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.\n", + "\n", + "**artifacts (findings)**: Observations or issues identified during validation, including any deviations from expected performance or standards. Artifacts are organized by type — default types provided by ValidMind include Validation Issue, Policy Exception, and Limitation. Custom artifact types can be created to track other categories relevant to your organization.\n", + "\n", + "**test**: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or record. Test results are logged to the ValidMind Platform, where they are attached to documents. Tests are the building blocks of ValidMind, used to evaluate and document records and datasets, and can be run individually or as part of a suite defined by your templates.\n", + "\n", + "**test suite**: A collection of tests designed to run together to automate and generate documentation end-to-end for specific use cases. (Learn more: [`test_suites`](https://docs.validmind.ai/validmind/validmind/test_suites.html))\n", + "\n", + "**metric**: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.\n", + "\n", + "**custom test**: Functions that you define to evaluate your record or dataset. 
These functions can be registered with the ValidMind Library to be used in the ValidMind Platform.\n", + "\n", + "**inputs**: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:\n", + "\n", + " - **model**: A single record that has been initialized in ValidMind with [`init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model). Despite the naming convention, model objects can be any type of record you want to test, document, validate, or monitor with ValidMind.\n", + " - **dataset**: A single dataset that has been initialized in ValidMind with [`init_dataset()`](https://docs.validmind.ai/validmind/validmind.html#init_dataset).\n", + " - **models**: A list of ValidMind records - usually this is used when you want to compare multiple records in your custom tests.\n", + " - **datasets**: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom tests. (Learn more: [Run tests with multiple datasets](https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/run_tests_that_require_multiple_datasets.html))\n", + "\n", + "**parameters**: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a test, customize its behavior, or provide additional context.\n", + "\n", + "**outputs**: Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2__'></a>\n", + "\n", + "## Setting up" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1__'></a>\n", + "\n", + "### Register a sample model\n", + "\n", + "In a usual lifecycle, a champion will have been independently registered in your inventory and submitted to you for validation by your development team as part of the effective challenge process. (**Learn more:** [Submit documents](https://docs.validmind.ai/guide/documentation/submit-documents.html))\n", + "\n", + "For this notebook, we'll have you register a dummy record (model) in the ValidMind Platform inventory and assign yourself as the validator to familiarize you with the ValidMind interface and circumvent the need for an existing model:\n", + "\n", + "1. In a browser, [log in to ValidMind](https://docs.validmind.ai/guide/access/log-in-to-validmind.html).\n", + "\n", + "2. In the left sidebar, select **Inventory**.\n", + "\n", + "3. Under the **RECORD TYPE** drop-down, select `Model` and click **+ Register Model**. (Learn more: [Register records in the inventory](https://docs.validmind.ai/guide/inventory/register-records-in-inventory.html))\n", + "\n", + "4. Enter the model details and click **Next >** to continue to assignment of inventory record stakeholders.\n", + "\n", + "5. Select your own name under the **RECORD OWNER** drop-down — don’t worry, we’ll adjust these permissions next for validation.\n", + "\n", + "6. Click **Register Model** to add the model to your inventory." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1_1__'></a>\n", + "\n", + "#### Assign validator credentials\n", + "\n", + "In order to log tests as a validator instead of as a developer, on the details page that appears after you've successfully registered your sample model:\n", + "\n", + "1. 
Remove yourself as an owner:\n", + "\n", + " - Click on the **OWNERS** tile.\n", + " - Click the **x** next to your name to remove yourself from that model's role.\n", + " - Click **Save** to apply your changes to that role.\n", + "\n", + "2. Remove yourself as a developer:\n", + "\n", + " - Click on the **DEVELOPERS** tile.\n", + " - Click the **x** next to your name to remove yourself from that model's role.\n", + " - Click **Save** to apply your changes to that role.\n", + "\n", + "3. Add yourself as a validator:\n", + "\n", + " - Click on the **VALIDATORS** tile.\n", + " - Select your name from the drop-down menu.\n", + " - Click **Save** to apply your changes to that role." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_1_2__'></a>\n", + "\n", + "#### Apply validation report template\n", + "\n", + "Next, let's select a validation report template. A template predefines sections for your report and provides a general outline to follow, making the validation process much easier.\n", + "\n", + "1. In the left sidebar that appears for your model, click **Documents** and select **Validation**.\n", + "\n", + " If you cannot locate your Validation document, make sure Validation type documents are enabled for model records and create a new document. (**Learn more:** [Manage documents](https://docs.validmind.ai/guide/templates/manage-documents.html#add-record-documents))\n", + "\n", + "2. Under **TEMPLATE**, select `Generic Validation Report`.\n", + "\n", + "3. Click **Use Template** to apply the template." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_2__'></a>\n", + "\n", + "### Install the ValidMind Library\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Recommended Python versions</b></span>\n", + "<br></br>\n", + "Python 3.8 <= x <= 3.14</div>\n", + "\n", + "To install the library:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip install -q validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3__'></a>\n", + "\n", + "### Initialize the ValidMind Library" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_3_1__'></a>\n", + "\n", + "#### Get your code snippet\n", + "\n", + "Initialize the ValidMind Library with the *code snippet* unique to each record per document, ensuring your test results are uploaded to the correct record and automatically populated in the right document in the ValidMind Platform when you run the Library.\n", + "\n", + "1. On the left sidebar that appears for your model, select **Getting Started** and select `Validation` from the **DOCUMENT** drop-down menu.\n", + "\n", + "2. Click **Copy snippet to clipboard**.\n", + "\n", + "3. 
Next, [load your model identifier credentials from an `.env` file](https://docs.validmind.ai/developer/quickstart/store-credentials-in-env-file.html) or replace the placeholder with your own code snippet:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Load your model identifier credentials from an `.env` file\n", + "\n", + "%load_ext dotenv\n", + "%dotenv .env\n", + "\n", + "# Or replace with your code snippet\n", + "\n", + "import validmind as vm\n", + "\n", + "vm.init(\n", + " # api_host=\"...\",\n", + " # api_key=\"...\",\n", + " # api_secret=\"...\",\n", + " # model=\"...\",\n", + " document=\"validation-report\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_4__'></a>\n", + "\n", + "### Importing the champion model\n", + "\n", + "With the ValidMind Library set up and ready to go, let's go ahead and import the champion submitted by the development team in the format of a `.pkl` file: **[xgb_model_champion.pkl](xgb_model_champion.pkl)**" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import xgboost as xgb\n", + "\n", + "# Load the saved model\n", + "xgb_model = xgb.XGBClassifier()\n", + "xgb_model.load_model(\"xgb_model_champion.pkl\")\n", + "xgb_model" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Capture the champion model's feature order so the dataset columns can be aligned to it\n", + "cols_when_model_builds = xgb_model.get_booster().feature_names" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_5__'></a>\n", + "\n", + "### Load the sample dataset\n", + "\n", + "Let's next import the public [Lending Club](https://www.kaggle.com/datasets/devanshi23/loan-data-2007-2014/data) dataset from Kaggle, which was used to develop the dummy champion model.\n", + "\n", + "-
We'll use this dataset to review steps that should have been conducted during the initial development and documentation of the model to ensure that the model was built correctly.\n", + "- By independently performing steps such as preprocessing and feature engineering, we can confirm whether the model was built using appropriate and properly processed data.\n", + "\n", + "To be able to use the dataset, you'll need to import the dataset and load it into a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), a two-dimensional tabular data structure that makes use of rows and columns:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "from validmind.datasets.credit_risk import lending_club\n", + "\n", + "df = lending_club.load_data(source=\"offline\")\n", + "df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_5_1__'></a>\n", + "\n", + "#### Preprocess the dataset\n", + "\n", + "We'll first quickly preprocess the dataset for data quality testing purposes using `lending_club.preprocess`. 
This function performs the following operations:\n", + "\n", + "- Filters the dataset to include only loans for debt consolidation or credit card purposes\n", + "- Removes loans classified under the riskier grades \"F\" and \"G\"\n", + "- Excludes uncommon home ownership types and standardizes employment length and loan terms into numerical formats\n", + "- Discards unnecessary fields and any entries with missing information to maintain a clean and robust dataset for modeling" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "preprocess_df = lending_club.preprocess(df)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_5_2__'></a>\n", + "\n", + "#### Apply feature engineering to the dataset\n", + "\n", + "Feature engineering improves the dataset's structure to better match what our model expects, and ensures that the model performs optimally by leveraging additional insights from raw data.\n", + "\n", + "We'll apply the following transformations using the `lending_club.feature_engineering()` function to optimize the dataset for predictive modeling in our application scorecard:\n", + "\n", + "- **WoE encoding**: Converts both numerical and categorical features into Weight of Evidence (WoE) values. WoE is a statistical measure used in scorecard modeling that quantifies the relationship between a predictor variable and the binary target variable. It calculates the ratio of the distribution of good outcomes to the distribution of bad outcomes for each category or bin of a feature. This transformation helps to ensure that the features are predictive and consistent in their contribution to the model.\n", + "- **Integration of WoE bins**: Ensures that the WoE transformed values are integrated throughout the dataset, replacing the original feature values while excluding the target variable from this transformation.
This transformation is used to maintain a consistent scale and impact of each variable within the model, which helps make the predictions more stable and accurate." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "fe_df = lending_club.feature_engineering(preprocess_df)\n", + "fe_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc2_6__'></a>\n", + "\n", + "### Split the feature engineered dataset\n", + "\n", + "With our dummy model imported and our independently preprocessed and feature engineered dataset ready to go, let's now **split our dataset into training and test sets** to start the validation testing process.\n", + "\n", + "Splitting our dataset into training and testing is essential for proper validation testing, as this helps assess how well the model generalizes to unseen data:\n", + "\n", + "- We begin by dividing our data, which is based on Weight of Evidence (WoE) features, into training and testing sets (`train_df`, `test_df`).\n", + "- With `lending_club.split`, we employ a simple random split, randomly allocating data points to each set to ensure a mix of examples in both."
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Split the data\n", + "train_df, test_df = lending_club.split(fe_df, test_size=0.2)\n", + "\n", + "x_train = train_df.drop(lending_club.target_column, axis=1)\n", + "y_train = train_df[lending_club.target_column]\n", + "\n", + "x_test = test_df.drop(lending_club.target_column, axis=1)\n", + "y_test = test_df[lending_club.target_column]\n", + "\n", + "# Now let's apply the order of features from the champion model construction\n", + "x_train = x_train[cols_when_model_builds]\n", + "x_test = x_test[cols_when_model_builds]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "cols_use = ['annual_inc_woe',\n", + " 'verification_status_woe',\n", + " 'emp_length_woe',\n", + " 'installment_woe',\n", + " 'term_woe',\n", + " 'home_ownership_woe',\n", + " 'purpose_woe',\n", + " 'open_acc_woe',\n", + " 'total_acc_woe',\n", + " 'int_rate_woe',\n", + " 'sub_grade_woe',\n", + " 'grade_woe','loan_status']\n", + "\n", + "\n", + "train_df = train_df[cols_use]\n", + "test_df = test_df[cols_use]\n", + "test_df.head()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3__'></a>\n", + "\n", + "## Developing potential challenger models" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1__'></a>\n", + "\n", + "### Train potential challenger models\n", + "\n", + "We're curious how alternate models compare to our champion model, so let's train two challenger models as a basis for our testing.\n", + "\n", + "Our selected options below offer decreased complexity in terms of implementation (such as less manual preprocessing), which can reduce the amount of risk for implementation.
However, model risk is not calculated in isolation from a single factor, but rather in consideration with trade-offs in predictive performance, ease of interpretability, and overall alignment with business objectives." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1_1__'></a>\n", + "\n", + "#### Random forest classification model\n", + "\n", + "A *random forest classification model* is an ensemble machine learning algorithm that uses multiple decision trees to classify data. In ensemble learning, multiple models are combined to improve prediction accuracy and robustness.\n", + "\n", + "Random forest classification models generally have higher accuracy because they capture complex, non-linear relationships, but as a result they lack transparency in their predictions." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the Random Forest Classification model\n", + "from sklearn.ensemble import RandomForestClassifier\n", + "\n", + "# Create the model instance with 50 decision trees\n", + "rf_model = RandomForestClassifier(\n", + " n_estimators=50,\n", + " random_state=42,\n", + ")\n", + "\n", + "# Train the model\n", + "rf_model.fit(x_train, y_train)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_1_2__'></a>\n", + "\n", + "#### Logistic regression model\n", + "\n", + "A *logistic regression model* is a statistical machine learning algorithm that uses a linear equation (straight-line relationship between variables) and the logistic function (or sigmoid function, which maps any real-valued number to a range between `0` and `1`) to classify data. 
In statistical modeling, a single equation is used to estimate the probability of an outcome based on input features.\n", + "\n", + "Logistic regression models are simple and interpretable because they provide clear probability estimates and feature coefficients (numerical value that represents the influence of a particular input feature on the model's prediction), but they may struggle with capturing complex, non-linear relationships in the data." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Import the Logistic Regression model\n", + "from sklearn.linear_model import LogisticRegression\n", + "\n", + "# Logistic Regression grid params\n", + "log_reg_params = {\n", + " \"penalty\": [\"l1\", \"l2\"],\n", + " \"C\": [0.001, 0.01, 0.1, 1, 10, 100, 1000],\n", + " \"solver\": [\"liblinear\"],\n", + "}\n", + "\n", + "# Grid search for Logistic Regression\n", + "from sklearn.model_selection import GridSearchCV\n", + "\n", + "grid_log_reg = GridSearchCV(LogisticRegression(), log_reg_params)\n", + "grid_log_reg.fit(x_train, y_train)\n", + "\n", + "# Logistic Regression best estimator\n", + "log_reg = grid_log_reg.best_estimator_\n", + "log_reg" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2__'></a>\n", + "\n", + "### Extract predicted probabilities\n", + "\n", + "With our challenger models trained, let's extract the predicted probabilities from our three models:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Champion — Application scorecard model\n", + "train_xgb_prob = xgb_model.predict_proba(x_train)[:, 1]\n", + "test_xgb_prob = xgb_model.predict_proba(x_test)[:, 1]\n", + "\n", + "# Challenger — Random forest classification model\n", + "train_rf_prob = rf_model.predict_proba(x_train)[:, 1]\n", + "test_rf_prob = rf_model.predict_proba(x_test)[:, 1]\n", + "\n", + "# Challenger — Logistic regression model\n", + "train_log_prob = 
log_reg.predict_proba(x_train)[:, 1]\n", + "test_log_prob = log_reg.predict_proba(x_test)[:, 1]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc3_2_1__'></a>\n", + "\n", + "#### Compute binary predictions\n", + "\n", + "Next, we'll convert the probability predictions from our three models into a binary, based on a threshold of `0.3`:\n", + "\n", + "- If the probability is greater than `0.3`, the prediction becomes `1` (positive).\n", + "- Otherwise, it becomes `0` (negative)." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "cut_off_threshold = 0.3\n", + "\n", + "# Champion — Application scorecard model\n", + "train_xgb_binary_predictions = (train_xgb_prob > cut_off_threshold).astype(int)\n", + "test_xgb_binary_predictions = (test_xgb_prob > cut_off_threshold).astype(int)\n", + "\n", + "# Challenger — Random forest classification model\n", + "train_rf_binary_predictions = (train_rf_prob > cut_off_threshold).astype(int)\n", + "test_rf_binary_predictions = (test_rf_prob > cut_off_threshold).astype(int)\n", + "\n", + "# Challenger — Logistic regression model\n", + "train_log_binary_predictions = (train_log_prob > cut_off_threshold).astype(int)\n", + "test_log_binary_predictions = (test_log_prob > cut_off_threshold).astype(int)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4__'></a>\n", + "\n", + "## Initializing the ValidMind objects" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_1__'></a>\n", + "\n", + "### Initialize the ValidMind datasets\n", + "\n", + "Before you can run tests, you'll need to connect your data with a ValidMind `Dataset` object. 
**This step is always necessary every time you want to connect a dataset to documentation and produce test results through ValidMind,** but you only need to do it once per dataset.\n", + "\n", + "Initialize a ValidMind dataset object using the [`init_dataset` function](https://docs.validmind.ai/validmind/validmind.html#init_dataset) from the ValidMind (`vm`) module. For this example, we'll pass in the following arguments:\n", + "\n", + "- **`dataset`** — The raw dataset that you want to provide as input to tests.\n", + "- **`input_id`** — A unique identifier that allows tracking what inputs are used when running each individual test.\n", + "- **`target_column`** — A required argument if tests require access to true values. This is the name of the target column in the dataset." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the raw dataset\n", + "vm_raw_dataset = vm.init_dataset(\n", + " dataset=df,\n", + " input_id=\"raw_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "# Initialize the preprocessed dataset\n", + "vm_preprocess_dataset = vm.init_dataset(\n", + " dataset=preprocess_df,\n", + " input_id=\"preprocess_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "# Initialize the feature engineered dataset\n", + "vm_fe_dataset = vm.init_dataset(\n", + " dataset=fe_df,\n", + " input_id=\"fe_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "# Initialize the training dataset\n", + "vm_train_ds = vm.init_dataset(\n", + " dataset=train_df,\n", + " input_id=\"train_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")\n", + "\n", + "# Initialize the test dataset\n", + "vm_test_ds = vm.init_dataset(\n", + " dataset=test_df,\n", + " input_id=\"test_dataset\",\n", + " target_column=lending_club.target_column,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, 
+ "source": [ + "After initialization, you can pass the ValidMind `Dataset` objects `vm_raw_dataset`, `vm_preprocess_dataset`, `vm_fe_dataset`, `vm_train_ds`, and `vm_test_ds` into any ValidMind tests." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_2__'></a>\n", + "\n", + "### Initialize the ValidMind models\n", + "\n", + "You'll also need to initialize ValidMind model objects (`vm_model`) that can be passed to other functions for analysis and tests on the data for each of our three models.\n", + "\n", + "- Despite the naming convention, ValidMind model objects can be any type of record you want to test, document, validate, or monitor with the ValidMind Library.\n", + "- From classical statistical and machine learning models, to generative and agentic AI systems and more, the ValidMind model object provides a consistent wrapper around your record so it can be passed as a unified input to any ValidMind test or test suite, with results sent directly to the ValidMind Platform.\n", + "\n", + "Initialize your model objects with [`vm.init_model()`](https://docs.validmind.ai/validmind/validmind.html#init_model):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Initialize the champion application scorecard model\n", + "vm_xgb_model = vm.init_model(\n", + " xgb_model,\n", + " input_id=\"xgb_model_developer_champion\",\n", + ")\n", + "\n", + "# Initialize the challenger random forest classification model\n", + "vm_rf_model = vm.init_model(\n", + " rf_model,\n", + " input_id=\"rf_model\",\n", + ")\n", + "\n", + "# Initialize the challenger logistic regression model\n", + "vm_log_model = vm.init_model(\n", + " log_reg,\n", + " input_id=\"log_model\",\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc4_3__'></a>\n", + "\n", + "### Assign predictions\n", + "\n", + "With our models registered, we'll move on to assigning both 
the predictive probabilities coming directly from each model's predictions, and the binary prediction after applying the cutoff threshold described in the Compute binary predictions step above.\n", + "\n", + "- The [`assign_predictions()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#VMDataset.assign_predictions) from the `Dataset` object can link existing predictions to any number of models.\n", + "- This method links the model's class prediction values and probabilities to our `vm_train_ds` and `vm_test_ds` datasets." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Champion — Application scorecard model\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_xgb_model,\n", + " prediction_values=train_xgb_binary_predictions,\n", + " prediction_probabilities=train_xgb_prob,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_xgb_model,\n", + " prediction_values=test_xgb_binary_predictions,\n", + " prediction_probabilities=test_xgb_prob,\n", + ")\n", + "\n", + "# Challenger — Random forest classification model\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_rf_model,\n", + " prediction_values=train_rf_binary_predictions,\n", + " prediction_probabilities=train_rf_prob,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_rf_model,\n", + " prediction_values=test_rf_binary_predictions,\n", + " prediction_probabilities=test_rf_prob,\n", + ")\n", + "\n", + "\n", + "# Challenger — Logistic regression model\n", + "vm_train_ds.assign_predictions(\n", + " model=vm_log_model,\n", + " prediction_values=train_log_binary_predictions,\n", + " prediction_probabilities=train_log_prob,\n", + ")\n", + "\n", + "vm_test_ds.assign_predictions(\n", + " model=vm_log_model,\n", + " prediction_values=test_log_binary_predictions,\n", + " prediction_probabilities=test_log_prob,\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": 
[ + "<a id='toc4_4__'></a>\n", + "\n", + "### Compute credit risk scores\n", + "\n", + "Finally, we'll translate model predictions into actionable scores using probability estimates generated by our trained model:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Compute the scores\n", + "train_xgb_scores = lending_club.compute_scores(train_xgb_prob)\n", + "test_xgb_scores = lending_club.compute_scores(test_xgb_prob)\n", + "train_rf_scores = lending_club.compute_scores(train_rf_prob)\n", + "test_rf_scores = lending_club.compute_scores(test_rf_prob)\n", + "train_log_scores = lending_club.compute_scores(train_log_prob)\n", + "test_log_scores = lending_club.compute_scores(test_log_prob)\n", + "\n", + "# Assign scores to the datasets\n", + "vm_train_ds.add_extra_column(\"xgb_scores\", train_xgb_scores)\n", + "vm_test_ds.add_extra_column(\"xgb_scores\", test_xgb_scores)\n", + "vm_train_ds.add_extra_column(\"rf_scores\", train_rf_scores)\n", + "vm_test_ds.add_extra_column(\"rf_scores\", test_rf_scores)\n", + "vm_train_ds.add_extra_column(\"log_scores\", train_log_scores)\n", + "vm_test_ds.add_extra_column(\"log_scores\", test_log_scores)" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5__'></a>\n", + "\n", + "## Running data quality tests\n", + "\n", + "With everything ready to go, let's explore some of ValidMind's available tests. Using ValidMind’s repository of tests streamlines your validation testing, and helps you ensure that your records are being validated appropriately." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_1__'></a>\n", + "\n", + "### Identify relevant data quality tests\n", + "\n", + "We want to narrow down the tests we want to run from the selection provided by ValidMind, so we'll use the [`vm.tests.list_tasks_and_tags()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tasks_and_tags) to list which `tags` are associated with each `task` type:\n", + "\n", + "- **`tasks`** represent the kind of modeling task associated with a test. Here we'll focus on `classification` tasks.\n", + "- **`tags`** are free-form descriptions providing more details about the test, for example, what category the test falls into. Here we'll focus on the `data_quality` tag." + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tasks_and_tags()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then we'll call [the `vm.tests.list_tests()` function](https://docs.validmind.ai/validmind/validmind/tests.html#list_tests) to list all the data quality tests for classification:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(\n", + " tags=[\"data_quality\"], task=\"classification\"\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about navigating ValidMind tests?</b></span>\n", + "<br></br>\n", + "Refer to our notebook outlining the utilities available for viewing and understanding available ValidMind tests: <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/explore_tests/explore_tests.html\" 
style=\"color: #DE257E;\"><b>Explore tests</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_2__'></a>\n", + "\n", + "### Run and log an individual data quality test\n", + "\n", + "Next, we'll use our previously initialized preprocessed dataset (`vm_preprocess_dataset`) as input to run an individual test, then log the result to the ValidMind Platform.\n", + "\n", + "- You run validation tests by calling [the `run_test` function](https://docs.validmind.ai/validmind/validmind/tests.html#run_test) provided by the `validmind.tests` module.\n", + "- Every test result returned by the `run_test()` function has a [`.log()` method](https://docs.validmind.ai/validmind/validmind/vm_models.html#TestResult.log) that can be used to send the test results to the ValidMind Platform.\n", + "\n", + "Here, we'll use the `data_validation.HighPearsonCorrelation` test as an example:\n" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " test_id=\"validmind.data_validation.HighPearsonCorrelation\",\n", + " inputs={\n", + " \"dataset\": vm_preprocess_dataset\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Note the output returned indicating that a test-driven block doesn't currently exist in your documentation for some test IDs. </b></span>\n", + "<br></br>\n", + "That's expected: when we run validation tests, the logged results need to be manually added to your report as part of your compliance assessment process within the ValidMind Platform. 
You'll continue to see this message throughout this notebook as we run and log more tests.</div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_3__'></a>\n", + "\n", + "### Log multiple data quality tests\n", + "\n", + "Now that we understand how to run a test with ValidMind, we want to run all the tests that were returned for our `classification` tasks focusing on `data_quality`.\n", + "\n", + "We'll store the identified tests in `dq` in preparation for batch running these tests and logging their results to the ValidMind Platform:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "dq = vm.tests.list_tests(tags=[\"data_quality\"], task=\"classification\",pretty=False)\n", + "dq" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With our data quality tests stored, let's run our first batch of tests using the same preprocessed dataset (`vm_preprocess_dataset`) and log their results." 
+ ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for test in dq:\n", + " vm.tests.run_test(\n", + " test,\n", + " inputs={\n", + " \"dataset\": vm_preprocess_dataset\n", + " }\n", + " ).log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc5_4__'></a>\n", + "\n", + "### Run data quality comparison tests\n", + "\n", + "Next, let's reuse the tests in `dq` to perform comparison tests between the raw (`vm_raw_dataset`) and preprocessed (`vm_preprocess_dataset`) dataset, again logging the results to the ValidMind Platform:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for test in dq:\n", + " vm.tests.run_test(\n", + " test,\n", + " input_grid={\n", + " \"dataset\": [vm_raw_dataset,vm_preprocess_dataset]\n", + " }\n", + " ).log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6__'></a>\n", + "\n", + "## Running performance tests\n", + "\n", + "We'll also run some performance tests, beginning with independent testing of our champion application scorecard model, then moving on to our potential challenger models." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_1__'></a>\n", + "\n", + "### Identify relevant performance tests\n", + "\n", + "This time, use `vm.tests.list_tests()` to identify all the model performance tests for classification:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "\n", + "vm.tests.list_tests(tags=[\"model_performance\"], task=\"classification\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_2__'></a>\n", + "\n", + "### Run and log an individual performance test\n", + "\n", + "Before we run our batch of performance tests, we'll use our previously initialized testing dataset (`vm_test_ds`) as input to run an individual test, then log the result to the ValidMind Platform.\n", + "\n", + "When running individual tests, you can use a custom `result_id` to tag the individual result with a unique identifier by appending this `result_id` to the `test_id` with a `:` separator. 
We'll append an identifier for our champion model here (`xgboost_champion`).\n", + "\n", + "Here, we'll use the `model_validation.sklearn.ClassifierPerformance` test as an example:\n" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " test_id=\"validmind.model_validation.sklearn.ClassifierPerformance:xgboost_champion\",\n", + " inputs={\n", + " \"dataset\": vm_test_ds, \"model\" : vm_xgb_model\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_3__'></a>\n", + "\n", + "### Log multiple performance tests\n", + "\n", + "We only want to run a few other tests that were returned for our `classification` tasks focusing on `model_performance`, so we'll isolate the specific tests we want to batch run in `mpt`:\n", + "\n", + "- `model_validation.sklearn.ClassifierPerformance`\n", + "- `model_validation.sklearn.ConfusionMatrix`\n", + "- `model_validation.sklearn.MinimumAccuracy`\n", + "- `model_validation.sklearn.MinimumF1Score`\n", + "- `model_validation.sklearn.ROCCurve`\n", + "\n", + "Note the custom `result_id`s appended to the `test_id`s for our champion model (`xgboost_champion`):\n" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "mpt = [\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:xgboost_champion\",\n", + " \"validmind.model_validation.sklearn.ConfusionMatrix:xgboost_champion\",\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy:xgboost_champion\",\n", + " \"validmind.model_validation.sklearn.MinimumF1Score:xgboost_champion\",\n", + " \"validmind.model_validation.sklearn.ROCCurve:xgboost_champion\"\n", + "]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_4__'></a>\n", + "\n", + "### Evaluate performance of the champion model\n", + "\n", + "Now, let's run and log our batch of model 
performance tests using our testing dataset (`vm_test_ds`) for our champion model:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for test in mpt:\n", + " vm.tests.run_test(\n", + " test,\n", + " inputs={\n", + " \"dataset\": vm_test_ds, \"model\" : vm_xgb_model\n", + " },\n", + " ).log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_5__'></a>\n", + "\n", + "### Evaluate performance of challenger models\n", + "\n", + "We've now run tests on our champion model similar to those conducted by the development team, with the aim of verifying their test results.\n", + "\n", + "Next, let's see how our challenger models compare. We'll use the same batch of tests here as we did in `mpt`, but append a different `result_id` to indicate that these results should be associated with our challenger models:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "mpt_chall = [\n", + " \"validmind.model_validation.sklearn.ClassifierPerformance:xgboost_champion_vs_challengers\",\n", + " \"validmind.model_validation.sklearn.ConfusionMatrix:xgboost_champion_vs_challengers\",\n", + " \"validmind.model_validation.sklearn.MinimumAccuracy:xgboost_champion_vs_challengers\",\n", + " \"validmind.model_validation.sklearn.MinimumF1Score:xgboost_champion_vs_challengers\",\n", + " \"validmind.model_validation.sklearn.ROCCurve:xgboost_champion_vs_challengers\"\n", + "]" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_5_1__'></a>\n", + "\n", + "#### Enable custom context for test descriptions" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "When you run ValidMind tests, test descriptions are automatically generated by an LLM using the test results, the test name, and the static test definitions provided in the test’s docstring. 
While this metadata offers valuable high-level overviews of tests, insights produced by the LLM-based descriptions may not always align with your specific use cases or incorporate organizational policy requirements.\n", + "\n", + "Before we run our next batch of tests, we'll include some custom use case context to focus on comparison testing going forward, improving the relevancy, insight, and format of the test descriptions returned. By default, custom context for LLM-generated descriptions is disabled, meaning that the output will not include any additional context. To enable custom use case context, set the `VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED` environment variable to `1`.\n", + "\n", + "This is a global setting that will affect all tests for your linked model:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import os\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Enabling use case context allows you to pass in additional context to the LLM-generated text descriptions within `context`:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import os\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED\"] = \"1\"\n", + "\n", + "context = \"\"\"\n", + "FORMAT FOR THE LLM DESCRIPTIONS: \n", + " **<Test Name>** is designed to <begin with a concise overview of what the test does and its primary purpose, \n", + " extracted from the test description>.\n", + "\n", + " The test operates by <write a paragraph about the test mechanism, explaining how it works and what it measures. 
\n", + " Include any relevant formulas or methodologies mentioned in the test description.>\n", + "\n", + " The primary advantages of this test include <write a paragraph about the test's strengths and capabilities, \n", + " highlighting what makes it particularly useful for specific scenarios.>\n", + "\n", + " Users should be aware that <write a paragraph about the test's limitations and potential risks. \n", + " Include both technical limitations and interpretation challenges. \n", + " If the test description includes specific signs of high risk, incorporate these here.>\n", + "\n", + " **Key Insights:**\n", + "\n", + " The test results reveal:\n", + "\n", + " - **<insight title>**: <comprehensive description of one aspect of the results>\n", + " - **<insight title>**: <comprehensive description of another aspect>\n", + " ...\n", + "\n", + " Based on these results, <conclude with a brief paragraph that ties together the test results with the test's \n", + " purpose and provides any final recommendations or considerations.>\n", + "\n", + "ADDITIONAL INSTRUCTIONS:\n", + "\n", + " The champion model as the basis for comparison is called \"xgb_model_developer_champion\" and emphasis should be on the following:\n", + " - The metrics for the champion model compared against the challenger models\n", + " - Which model potentially outperforms the champion model based on the metrics, this should be highlighted and emphasized\n", + "\n", + "\n", + " For each metric in the test results, include in the test overview:\n", + " - The metric's purpose and what it measures\n", + " - Its mathematical formula\n", + " - The range of possible values\n", + " - What constitutes good/bad performance\n", + " - How to interpret different values\n", + "\n", + " Each insight should progressively cover:\n", + " 1. Overall scope and distribution\n", + " 2. Complete breakdown of all elements with specific values\n", + " 3. Natural groupings and patterns\n", + " 4. 
Comparative analysis between datasets/categories\n", + " 5. Stability and variations\n", + " 6. Notable relationships or dependencies\n", + "\n", + " Remember:\n", + " - Champion model (xgb_model_developer_champion) is the selection and challenger models are used to challenge the selection\n", + " - Keep all insights at the same level (no sub-bullets or nested structures)\n", + " - Make each insight complete and self-contained\n", + " - Include specific numerical values and ranges\n", + " - Cover all elements in the results comprehensively\n", + " - Maintain clear, concise language\n", + " - Use only \"- **Title**: Description\" format for insights\n", + " - Progress naturally from general to specific observations\n", + "\n", + "\"\"\".strip()\n", + "\n", + "os.environ[\"VALIDMIND_LLM_DESCRIPTIONS_CONTEXT\"] = context" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about setting custom context for LLM-generated test descriptions?</b></span>\n", + "<br></br>\n", + "Refer to our extended walkthrough notebook: <a href=\"https://docs.validmind.ai/notebooks/how_to/tests/run_tests/configure_tests/customize_test_result_descriptions.html\" style=\"color: #DE257E;\"><b>Add context to LLM-generated test descriptions\n", + "</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc6_5_2__'></a>\n", + "\n", + "#### Run performance comparison tests\n", + "\n", + "With the use case context set, we'll run each test in `mpt_chall` once for each model with the same `vm_test_ds` dataset to compare them:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for test in mpt_chall:\n", + " 
vm.tests.run_test(\n", + " test,\n", + " input_grid={\n", + " \"dataset\": [vm_test_ds], \"model\" : [vm_xgb_model,vm_log_model,vm_rf_model]\n", + " }\n", + " ).log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Based on the performance metrics, we can conclude that the random forest classification model is not a viable candidate for our use case and can be disregarded in our tests going forward.</b></span>\n", + "<br></br>\n", + "In the next section, we'll dive a bit deeper into some tests comparing our champion application scorecard model and our remaining challenger logistic regression model, including tests that will allow us to customize parameters and thresholds for performance standards.</div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc7__'></a>\n", + "\n", + "## Adjust a ValidMind test\n", + "\n", + "Let's dig deeper into the `model_validation.sklearn.MinimumF1Score` test we ran previously in Run performance tests to ensure that the models maintain a minimum acceptable balance between *precision* and *recall*. 
Precision refers to how many of the model's positive predictions were actually correct, and recall refers to how many of the actual positive cases the model correctly identified.\n", + "\n", + "Use `run_test()` with our testing dataset (`vm_test_ds`) to run the test in isolation again for our two remaining models, this time without logging the result, so that we have output to compare against a subsequent iteration:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumF1Score:xgboost_champion_vs_challengers\",\n", + " input_grid={\n", + " \"dataset\": [vm_test_ds],\n", + " \"model\": [vm_xgb_model, vm_log_model]\n", + " },\n", + ")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As `model_validation.sklearn.MinimumF1Score` allows us to customize parameters and thresholds for performance standards, let's adjust the threshold to see if it improves metrics:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.tests.run_test(\n", + " \"validmind.model_validation.sklearn.MinimumF1Score:AdjThreshold\",\n", + " input_grid={\n", + " \"dataset\": [vm_test_ds],\n", + " \"model\": [vm_xgb_model, vm_log_model],\n", + " \"params\": {\"min_threshold\": 0.35}\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc8__'></a>\n", + "\n", + "## Run diagnostic tests\n", + "\n", + "Next, we want to compare the robustness and stability of our champion and challenger models.\n", + "\n", + "Use `list_tests()` to list all available diagnosis tests applicable to classification tasks:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.list_tests(tags=[\"model_diagnosis\"], task=\"classification\")" + ], + "execution_count": null, + "outputs": [] + }, 
+ { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's use the `model_validation.sklearn.OverfitDiagnosis` test to check whether our models show signs of *overfitting* and to identify any sub-segments where issues are concentrated.\n", + "\n", + "Overfitting occurs when a model learns the training data too well, capturing not only the true pattern but also noise and random fluctuations, resulting in excellent performance on the training dataset but poor generalization to new, unseen data.\n" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " test_id=\"validmind.model_validation.sklearn.OverfitDiagnosis:Champion_vs_LogRegression\",\n", + " input_grid={\n", + " \"datasets\": [[vm_train_ds,vm_test_ds]],\n", + " \"model\" : [vm_xgb_model,vm_log_model]\n", + " }\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's also conduct *robustness* and *stability* testing of the two models with the `model_validation.sklearn.RobustnessDiagnosis` test.\n", + "\n", + "Robustness refers to a model's ability to maintain consistent performance, and stability refers to a model's ability to produce consistent outputs over time across different data subsets.\n" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "vm.tests.run_test(\n", + " test_id=\"validmind.model_validation.sklearn.RobustnessDiagnosis:Champion_vs_LogRegression\",\n", + " input_grid={\n", + " \"datasets\": [[vm_train_ds,vm_test_ds]],\n", + " \"model\" : [vm_xgb_model,vm_log_model]\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc9__'></a>\n", + "\n", + "## Run feature importance tests\n", + "\n", + "We also want to verify the relative influence of different input features on our models' predictions, as well as inspect the differences between our champion and
challenger model to see if a certain model offers more understandable or logical importance scores for features.\n", + "\n", + "Use `list_tests()` to identify all the feature importance tests for classification:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Store the feature importance tests\n", + "FI = vm.tests.list_tests(tags=[\"feature_importance\"], task=\"classification\",pretty=False)\n", + "FI" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "# Run and log our feature importance tests for both models for the testing dataset\n", + "for test in FI:\n", + " vm.tests.run_test(\n", + " \"\".join((test,':Champion_vs_LogisticRegression')),\n", + " input_grid={\n", + " \"dataset\": [vm_test_ds], \"model\" : [vm_xgb_model,vm_log_model]\n", + " },\n", + " ).log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc10__'></a>\n", + "\n", + "## Implement a custom test\n", + "\n", + "Let's finish up testing by implementing a custom *inline test* that outputs a FICO score-type score. 
An inline test refers to a test written and executed within the same environment as the code being tested — in this case, right in this Jupyter Notebook — without requiring a separate test file or framework.\n", + "\n", + "The [`@vm.test` wrapper](https://docs.validmind.ai/validmind/validmind.html#test) allows you to create a reusable test:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "import numpy as np\n", + "import pandas as pd\n", + "import plotly.graph_objects as go\n", + "\n", + "@vm.test(\"my_custom_tests.ScoreToOdds\")\n", + "def score_to_odds_analysis(dataset, score_column='score', score_bands=[410, 440, 470]):\n", + " \"\"\"\n", + " Analyzes the relationship between score bands and odds (good:bad ratio).\n", + " Good odds = (1 - default_rate) / default_rate\n", + " \n", + " Higher scores should correspond to higher odds of being good.\n", + "\n", + " If there are multiple scores provided through score_column, this means that there are two different models and the scores reflect each model\n", + "\n", + " If there are more scores provided in the score_column then focus the assessment on the differences between the two scores and indicate through evidence which one is preferred.\n", + " \"\"\"\n", + " df = dataset.df\n", + " \n", + " # Create score bands\n", + " df['score_band'] = pd.cut(\n", + " df[score_column],\n", + " bins=[-np.inf] + score_bands + [np.inf],\n", + " labels=[f'<{score_bands[0]}'] + \n", + " [f'{score_bands[i]}-{score_bands[i+1]}' for i in range(len(score_bands)-1)] +\n", + " [f'>{score_bands[-1]}']\n", + " )\n", + " \n", + " # Calculate metrics per band\n", + " results = df.groupby('score_band').agg({\n", + " dataset.target_column: ['mean', 'count']\n", + " })\n", + " \n", + " results.columns = ['Default Rate', 'Total']\n", + " results['Good Count'] = results['Total'] - (results['Default Rate'] * results['Total'])\n", + " results['Bad Count'] = results['Default Rate'] * results['Total']\n", + " results['Odds'] = 
results['Good Count'] / results['Bad Count']\n", + " \n", + " # Create visualization\n", + " fig = go.Figure()\n", + " \n", + " # Add odds bars\n", + " fig.add_trace(go.Bar(\n", + " name='Odds (Good:Bad)',\n", + " x=results.index,\n", + " y=results['Odds'],\n", + " marker_color='blue'\n", + " ))\n", + " \n", + " fig.update_layout(\n", + " title='Score-to-Odds Analysis',\n", + " yaxis=dict(title='Odds Ratio (Good:Bad)'),\n", + " showlegend=False\n", + " )\n", + " \n", + " return fig" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "With the custom test available, run and log the test for our champion and challenger models with our testing dataset (`vm_test_ds`):" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "result = vm.tests.run_test(\n", + " \"my_custom_tests.ScoreToOdds:Champion_vs_Challenger\",\n", + " inputs={\n", + " \"dataset\": vm_test_ds,\n", + " },\n", + " param_grid={\n", + " \"score_column\": [\"xgb_scores\",\"log_scores\"],\n", + " \"score_bands\": [[500, 540, 570]],\n", + " },\n", + ").log()" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\"><span style=\"color: #083E44;\"><b>Want to learn more about custom tests?</b></span>\n", + "<br></br>\n", + "Refer to our in-depth introduction to custom tests: <a href=\"../../how_to/tests/custom_tests/implement_custom_tests.ipynb\" style=\"color: #DE257E;\"><b>Implement custom tests</b></a></div>" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc11__'></a>\n", + "\n", + "## Verify test runs\n", + "\n", + "Our final task is to verify that all the tests provided by the development team were run and reported 
accurately. Note the appended `result_ids` to delineate which dataset we ran the test with for the relevant tests.\n", + "\n", + "Here, we'll specify all the tests we'd like to independently rerun in a dictionary called `test_config`. **Note here that `inputs` and `input_grid` expect the `input_id` of the dataset or model as the value rather than the variable name we specified**:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "test_config = {\n", + " # Run with the raw dataset\n", + " 'validmind.data_validation.DatasetDescription:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'}\n", + " },\n", + " 'validmind.data_validation.DescriptiveStatistics:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'}\n", + " },\n", + " 'validmind.data_validation.MissingValues:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'min_percentage_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.ClassImbalance:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'min_percent_threshold': 10}\n", + " },\n", + " 'validmind.data_validation.Duplicates:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'min_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.HighCardinality:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {\n", + " 'num_threshold': 100,\n", + " 'percent_threshold': 0.1,\n", + " 'threshold_type': 'percent'\n", + " }\n", + " },\n", + " 'validmind.data_validation.Skewness:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'max_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.UniqueRows:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'min_percent_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.TooManyZeroValues:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'max_percent_threshold': 0.03}\n", + " },\n", + " 
'validmind.data_validation.IQROutliersTable:raw_data': {\n", + " 'inputs': {'dataset': 'raw_dataset'},\n", + " 'params': {'threshold': 5}\n", + " },\n", + " # Run with the preprocessed dataset\n", + " 'validmind.data_validation.DescriptiveStatistics:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'}\n", + " },\n", + " 'validmind.data_validation.TabularDescriptionTables:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'}\n", + " },\n", + " 'validmind.data_validation.MissingValues:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'},\n", + " 'params': {'min_percentage_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.TabularNumericalHistograms:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'}\n", + " },\n", + " 'validmind.data_validation.TabularCategoricalBarPlots:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'}\n", + " },\n", + " 'validmind.data_validation.TargetRateBarPlots:preprocessed_data': {\n", + " 'inputs': {'dataset': 'preprocess_dataset'},\n", + " 'params': {'default_column': 'loan_status'}\n", + " },\n", + " # Run with the training and test datasets\n", + " 'validmind.data_validation.DescriptiveStatistics:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.data_validation.TabularDescriptionTables:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.data_validation.ClassImbalance:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'min_percent_threshold': 10}\n", + " },\n", + " 'validmind.data_validation.UniqueRows:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'min_percent_threshold': 1}\n", + " },\n", + " 'validmind.data_validation.TabularNumericalHistograms:development_data': 
{\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.data_validation.MutualInformation:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'min_threshold': 0.01}\n", + " },\n", + " 'validmind.data_validation.PearsonCorrelationMatrix:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']}\n", + " },\n", + " 'validmind.data_validation.HighPearsonCorrelation:development_data': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset']},\n", + " 'params': {'max_threshold': 0.3, 'top_n_correlations': 10}\n", + " },\n", + " 'validmind.model_validation.ModelMetadata': {\n", + " 'input_grid': {'model': ['xgb_model_developer_champion', 'rf_model']}\n", + " },\n", + " 'validmind.model_validation.sklearn.ModelParameters': {\n", + " 'input_grid': {'model': ['xgb_model_developer_champion', 'rf_model']}\n", + " },\n", + " 'validmind.model_validation.sklearn.ROCCurve': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model_developer_champion']}\n", + " },\n", + " 'validmind.model_validation.sklearn.MinimumROCAUCScore': {\n", + " 'input_grid': {'dataset': ['train_dataset', 'test_dataset'], 'model': ['xgb_model_developer_champion']},\n", + " 'params': {'min_threshold': 0.5}\n", + " }\n", + "}" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Then batch run and log our tests in `test_config`:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "for t in test_config:\n", + " print(t)\n", + " try:\n", + " # Check if test has input_grid\n", + " if 'input_grid' in test_config[t]:\n", + " # For tests with input_grid, pass the input_grid configuration\n", + " if 'params' in test_config[t]:\n", + " vm.tests.run_test(t, input_grid=test_config[t]['input_grid'], params=test_config[t]['params']).log()\n", + " else:\n", + 
" vm.tests.run_test(t, input_grid=test_config[t]['input_grid']).log()\n", + " else:\n", + " # Original logic for regular inputs\n", + " if 'params' in test_config[t]:\n", + " vm.tests.run_test(t, inputs=test_config[t]['inputs'], params=test_config[t]['params']).log()\n", + " else:\n", + " vm.tests.run_test(t, inputs=test_config[t]['inputs']).log()\n", + " except Exception as e:\n", + " print(f\"Error running test {t}: {str(e)}\")" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc12__'></a>\n", + "\n", + "## Next steps" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc12_1__'></a>\n", + "\n", + "### Work with your validation report\n", + "\n", + "Now that you've logged all your test results and verified the work done by the development team, head to the ValidMind Platform to wrap up your validation report:\n", + "\n", + "1. From the **Inventory** in the ValidMind Platform, go to the model you connected to earlier.\n", + "\n", + "2. In the left sidebar that appears for your model, click **Validation** under Documents.\n", + "\n", + "Include your logged test results as evidence, create risk assessment notes, add artifacts, and assess compliance, then submit your report for review when it's ready. 
(**Learn more:** [Preparing validation reports](https://docs.validmind.ai/guide/validation/preparing-validation-reports.html))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc12_2__'></a>\n", + "\n", + "### Discover more learning resources\n", + "\n", + "We also offer many interactive notebooks to help you use the ValidMind Library to streamline your work:\n", + "\n", + "- [Run tests & test suites](https://docs.validmind.ai/developer/how-to/testing-overview.html)\n", + "- [Use ValidMind Library features](https://docs.validmind.ai/developer/how-to/feature-overview.html)\n", + "- [Code samples by use case](https://docs.validmind.ai/developer/samples-jupyter-notebooks.html)\n", + "\n", + "Or, visit our [documentation](https://docs.validmind.ai/) to learn more about ValidMind." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<a id='toc13__'></a>\n", + "\n", + "## Upgrade ValidMind\n", + "\n", + "<div class=\"alert alert-block alert-info\" style=\"background-color: #B5B5B510; color: black; border: 1px solid #083E44; border-left-width: 5px; box-shadow: 2px 2px 4px rgba(0, 0, 0, 0.2);border-radius: 5px;\">After installing ValidMind, you’ll want to periodically make sure you are on the latest version to access any new features and other enhancements.</div>\n", + "\n", + "Retrieve the information for the currently installed version of ValidMind:" + ] + }, + { + "cell_type": "code", + "metadata": {}, + "source": [ + "%pip show validmind" + ], + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If the version returned is lower than the version indicated in our [production open-source code](https://github.com/validmind/validmind-library/blob/prod/validmind/__version__.py), restart your notebook and run:\n", + "\n", + "```bash\n", + "%pip install --upgrade validmind\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + 
"You may need to restart your kernel after running the upgrade package for changes to be applied." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "<!-- VALIDMIND COPYRIGHT -->\n", + "\n", + "<small>\n", + "\n", + "***\n", + "\n", + "Copyright © 2023-2026 ValidMind Inc. All rights reserved.<br>\n", + "Refer to [LICENSE](https://github.com/validmind/validmind-library/blob/main/LICENSE) for details.<br>\n", + "SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial</small>" + ], + "id": "copyright-7c52ad62bcf7411eaaa00aefbac6c756" + } + ], + "metadata": { + "kernelspec": { + "display_name": "ValidMind Library", + "language": "python", + "name": "validmind" + }, + "language_info": { + "name": "python", + "version": "3.10.13" + } + }, + "nbformat": 4, + "nbformat_minor": 2 } diff --git a/site/releases/2024/2024-aug-13/release-notes.qmd b/site/releases/2024/2024-aug-13/release-notes.qmd index a1d4760ee0..3302c709e4 100644 --- a/site/releases/2024/2024-aug-13/release-notes.qmd +++ b/site/releases/2024/2024-aug-13/release-notes.qmd @@ -192,7 +192,7 @@ Manage both upstream and downstream model interdependencies: ::: {.w-50-ns} -![Model dependency management](manage-model-interdependencies.png){fig-alt="An screenshot showcasing the Manage Model Interdependences screen" .screenshot group="interdependencies"} +![Model dependency management](manage-model-interdependencies.png){fig-alt="A screenshot showcasing the Manage Model Interdependencies screen" .screenshot group="interdependencies"} ::: diff --git a/site/releases/2024/2024-dec-06/release-notes.qmd b/site/releases/2024/2024-dec-06/release-notes.qmd index ca302bc3f1..653ce046b3 100644 --- a/site/releases/2024/2024-dec-06/release-notes.qmd +++ b/site/releases/2024/2024-dec-06/release-notes.qmd @@ -13,7 +13,7 @@ listing: max-description-length: 250 # image-height: 100% contents: - - path: ../../../about/overview-model-documentation.qmd + - path: ../../../about/overview-documentation.qmd title: 
"Learn more — {{< var validmind.developer >}}" description: "The {{< var validmind.developer >}} is a Python library and documentation engine designed to streamline the process of documenting various types of models." fields: [title, description] @@ -216,7 +216,7 @@ You can now add, edit, or remove custom data to your analytics within the {{< va ::: ::: {.w-50-ns .tc} -![Example setup for a custom stacked bar chart](/guide/reporting/custom-visualization-setup.png){width=85% fig-alt="An screenshot of an example setup for a custom stacked bar chart" .screenshot} +![Example setup for a custom stacked bar chart](/guide/reporting/custom-visualization-setup.png){width=85% fig-alt="A screenshot of an example setup for a custom stacked bar chart" .screenshot} ::: @@ -713,7 +713,7 @@ If more than one set of test results has been logged with the {{< var validmind. :::: -![Historical test result filters](test-result-filters.png){ fig-alt="An screenshot of the historical test result filters" .screenshot width=90%} +![Historical test result filters](test-result-filters.png){ fig-alt="A screenshot of the historical test result filters" .screenshot width=90%} <!--- diff --git a/site/releases/2024/2024-feb-14/highlights.qmd b/site/releases/2024/2024-feb-14/highlights.qmd index 9fcf3ac718..ca223cc2e9 100644 --- a/site/releases/2024/2024-feb-14/highlights.qmd +++ b/site/releases/2024/2024-feb-14/highlights.qmd @@ -173,7 +173,7 @@ To enable model developers to know what task types and tags are available to fil [init_dataset()](/validmind/validmind.qmd#init_dataset){.button target="_blank" .button-green} -[init_model()](/validmind/validmind.qmd#init_model){.button target="_blank".button-green} +[init_model()](/validmind/validmind.qmd#init_model){.button target="_blank" .button-green} ::: @@ -252,7 +252,7 @@ You can now narrow down models in your **{{< fa cubes >}} Inventory** with our a ::: ::: {.w-30-ns .tc} -[Search, filter, and sort 
models](/guide/inventory/working-with-the-inventory.qmd#search-filter-and-sort-models){.button} +[Search, filter, and sort records](/guide/inventory/working-with-the-inventory.qmd#search-filter-and-sort-records){.button} ::: diff --git a/site/releases/2024/2024-may-22/release-notes.qmd b/site/releases/2024/2024-may-22/release-notes.qmd index 531c279db2..6a7e6319af 100644 --- a/site/releases/2024/2024-may-22/release-notes.qmd +++ b/site/releases/2024/2024-may-22/release-notes.qmd @@ -820,7 +820,7 @@ Also available is an improved look and functionality for filtering the **{{< fa ::: {.w-40-ns} -[Search, filter, and sort models](/guide/inventory/working-with-the-inventory.qmd#search-filter-and-sort-models){.button} +[Search, filter, and sort records](/guide/inventory/working-with-the-inventory.qmd#search-filter-and-sort-records){.button} ::: diff --git a/site/releases/2024/2024-oct-22/release-notes.qmd b/site/releases/2024/2024-oct-22/release-notes.qmd index ea144ea8d2..4bcff311ff 100644 --- a/site/releases/2024/2024-oct-22/release-notes.qmd +++ b/site/releases/2024/2024-oct-22/release-notes.qmd @@ -174,7 +174,7 @@ When you need to decommission models that you no longer need, you can now archiv :::: {.flex .flex-wrap .justify-around} ::: {.w-70-ns} -You now have new stages for inventory models, including `ACTIVE`, `ARCHIVED`, and `DELETED`, which are shown as a new column in the model inventory and as field in the model overview. +You now have new stages for inventory models, including `ACTIVE`, `ARCHIVED`, and `DELETED`, which are shown as a new column in the model inventory and as a field in the model overview. ::: diff --git a/site/releases/2025/2025-jan-31/release-notes.qmd b/site/releases/2025/2025-jan-31/release-notes.qmd index 4367dc1e49..eecdd7e308 100644 --- a/site/releases/2025/2025-jan-31/release-notes.qmd +++ b/site/releases/2025/2025-jan-31/release-notes.qmd @@ -540,7 +540,7 @@ We replaced the plugin for the editor of mathematical equations and formulas. 
Th The new editor also includes a real-time preview and common mathematical symbols for easier equation creation. ::: {.tc} -[Add mathematical formulas](/guide/documentation/work-with-content-blocks.html#add-mathematical-formulas.qmd){.button} +[Add mathematical formulas](/guide/documentation/work-with-content-blocks.qmd#insert-mathematical-formulas){.button} ::: ::: diff --git a/site/releases/_metadata.yml b/site/releases/_metadata.yml index 623ad3459e..e521a2dd9a 100644 --- a/site/releases/_metadata.yml +++ b/site/releases/_metadata.yml @@ -2,8 +2,6 @@ # Refer to the LICENSE file in the root of this repository for details. # SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial -search: false - filters: - category-filter diff --git a/site/support/_submit-feedback.qmd b/site/support/_submit-feedback.qmd index 605b5f2ba8..8c8122e005 100644 --- a/site/support/_submit-feedback.qmd +++ b/site/support/_submit-feedback.qmd @@ -8,7 +8,7 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> Did you know you can submit feedback without leaving the {{< var validmind.platform >}}? -1. [Log in to ValidMind](/guide/access/log-in-to-validmind.qmd). +1. [Log in to {{< var vm.product >}}](/guide/access/log-in-to-validmind.qmd). 2. On any page within the {{< var vm.platform >}}, click on **Talk to us.** diff --git a/site/support/support.qmd b/site/support/support.qmd index aec916f877..ae75dc12e8 100644 --- a/site/support/support.qmd +++ b/site/support/support.qmd @@ -23,7 +23,7 @@ listing: grid-columns: 1 contents: - path: https://support.validmind.com - title: "{{< fa cricle-question >}} {{< var vm.product >}} Help Center" + title: "{{< fa circle-question >}} {{< var vm.product >}} Help Center" subtitle: "https://{{< var support.center >}} {{< fa angle-right >}}" description: "Sign in with your {{< var vm.product >}} account and then click on **Submit a request**." 
fields: [title, subtitle, description] diff --git a/site/support/troubleshooting.qmd b/site/support/troubleshooting.qmd index 1a057a71f3..443618c45e 100644 --- a/site/support/troubleshooting.qmd +++ b/site/support/troubleshooting.qmd @@ -64,7 +64,7 @@ or ### Fix -Make sure that you are using the correct initialization credentials for the model you are trying to connect to. +Make sure that you are using the correct initialization credentials for the record (model) you are trying to connect to. Follow the steps in [Install and initialize the {{< var validmind.developer >}}](/developer/quickstart/install-and-initialize-validmind-library.qmd) for detailed instructions on how to integrate the {{< var vm.developer >}} and upload to the {{< var vm.platform >}}. diff --git a/site/training/administrator-fundamentals/_invite-new-user.qmd b/site/training/administrator-fundamentals/_invite-new-user.qmd index 104a05816f..6c999f1ed5 100644 --- a/site/training/administrator-fundamentals/_invite-new-user.qmd +++ b/site/training/administrator-fundamentals/_invite-new-user.qmd @@ -34,7 +34,7 @@ b. Then, confirm that the invitation has disappeared from Pending Invites and th :::: {.content-hidden unless-format="revealjs"} **Invite a new user** -1. Enter in the details under Invite by Email: +1. Enter the details under Invite by Email: - **Group** — The group you created earlier. - **Role** — The role you created earlier. 2. Click **{{< fa envelope >}} Send Invite**. diff --git a/site/training/administrator-fundamentals/onboarding-your-organization.qmd b/site/training/administrator-fundamentals/onboarding-your-organization.qmd index be2041460c..140a167272 100644 --- a/site/training/administrator-fundamentals/onboarding-your-organization.qmd +++ b/site/training/administrator-fundamentals/onboarding-your-organization.qmd @@ -165,7 +165,7 @@ Get your organization ready for use by first defining business units and use cas ::: 1. 
Click **{{< fa plus >}} Add Business Unit** under Business Units. -2. Enter in your **[business unit name]{.smallcaps}**. +2. Enter your **[business unit name]{.smallcaps}**. 3. Click **Add Business Unit** to save your changes. When you're done, click [{{< fa chevron-right >}}]() to continue. diff --git a/site/training/administrator-fundamentals/organizational-oversight-reporting.qmd b/site/training/administrator-fundamentals/organizational-oversight-reporting.qmd index ecfb3bf891..728fc2488f 100644 --- a/site/training/administrator-fundamentals/organizational-oversight-reporting.qmd +++ b/site/training/administrator-fundamentals/organizational-oversight-reporting.qmd @@ -307,7 +307,7 @@ Manage custom reports ::: 1. Click **{{< fa plus >}} Add Page**. -2. On the Add New Page module, enter in the **[page name]{.smallcaps}** and the **[description]{.smallcaps}**. +2. On the Add New Page module, enter the **[page name]{.smallcaps}** and the **[description]{.smallcaps}**. 3. Click **Add New Page** to create your custom analytics page. When you're done, click [{{< fa chevron-right >}}]() to continue. @@ -323,7 +323,7 @@ When you're done, click [{{< fa chevron-right >}}]() to continue. 1. Click on the tab for the custom page you added previously. 1. Click **{{< fa pencil >}} Edit Dashboard** and select **{{< fa pencil >}} Add Widget** then **{{< fa pencil >}} Add Visualization**. -3. On the Add Visualization panel, enter in your **[title]{.smallcaps}**. +3. On the Add Visualization panel, enter your **[title]{.smallcaps}**. 4. Select a **[visualization type]{.smallcaps}**. 5. Select a **[dataset]{.smallcaps}**. 6. Select the visualization configuration options to apply to the dataset. 
diff --git a/site/training/administrator-fundamentals/using-validmind-for-risk-management.qmd b/site/training/administrator-fundamentals/using-validmind-for-risk-management.qmd index 759a7387a2..e1de109313 100644 --- a/site/training/administrator-fundamentals/using-validmind-for-risk-management.qmd +++ b/site/training/administrator-fundamentals/using-validmind-for-risk-management.qmd @@ -278,7 +278,7 @@ To set up a new custom workflow, you'll need to complete these four steps in seq 1. Click **{{< fa plus >}} Add Workflow**. 2. Select **Inventory Record** under [workflow target]{.smallcaps}. -3. Enter in a **[title]{.smallcaps}** and a **[description]{.smallcaps}** the workflow. +3. Enter a **[title]{.smallcaps}** and a **[description]{.smallcaps}** for the workflow. 4. Under [record type]{.smallcaps}, select **Model**. 5. Under **[workflow start]{.smallcaps}**, select **Manually**. 6. Under **[workflow expected duration]{.smallcaps}**, define the SLA for the workflow. @@ -548,7 +548,7 @@ Add assessment questions 2. Select the regulation or policy you added previously by clicking on it. 3. Select the assessment you added previously by clicking on it. 4. Click **{{< fa plus >}} Add Question** to create a new question. -5. Enter in the **[questions]{.smallcaps}**. +5. Enter the **[questions]{.smallcaps}**. 6. Click **Add # Question(s)** to insert questions into the assessment. When you're done, click [{{< fa chevron-right >}}]() to continue. @@ -664,7 +664,7 @@ When you're done, click [{{< fa chevron-right >}}]() to continue. Click on the name of the attestation you added previously to configure it: 1. Click **{{< fa plus >}} Add Attestation Period** and add a period. - - Enter in the **[name]{.smallcaps}**, **[start date]{.smallcaps}**, and **[end date]{.smallcaps}**. + - Enter the **[name]{.smallcaps}**, **[start date]{.smallcaps}**, and **[end date]{.smallcaps}**. - Click **Add Period**. 2. 
Under Relevant Attestation Fields, drag fields into the **Relevant Attestation Fields** column to display in model snapshots. 3. Under Questionnaire Template, click the template area to edit, then click **Save** to apply your changes. diff --git a/site/training/common-slides/_register-sample-model.qmd b/site/training/common-slides/_register-sample-model.qmd index 0fa40653b8..7a7c5349e1 100644 --- a/site/training/common-slides/_register-sample-model.qmd +++ b/site/training/common-slides/_register-sample-model.qmd @@ -9,7 +9,7 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> ::: {.slideover--r .three-quarters .auto-collapse-10} **Register a sample model** -1. Enter in some details for your sample model, then click **Next {{< fa angle-right >}}**. +1. Enter some details for your sample model, then click **Next {{< fa angle-right >}}**. 2. Select your own name under the **[record owner]{.smallcaps}** drop-down. 3. Click **Register Model** to add the model to your inventory. @@ -28,7 +28,7 @@ When you're done, click [{{< fa chevron-right >}}]() to continue. ::: {.slideover--r .three-quarters .auto-collapse-10} **Register a sample model** -1. Enter in some details for your sample model, then click **Next {{< fa angle-right >}}**. +1. Enter some details for your sample model, then click **Next {{< fa angle-right >}}**. 2. Select your own name under the **[record owner]{.smallcaps}** drop-down — don't worry, we'll adjust these permissions next for validation. 3. Click **Register Model** to add the model to your inventory. 
diff --git a/site/training/common-slides/_validmind-test-repository.qmd b/site/training/common-slides/_validmind-test-repository.qmd index ce418f48aa..e2ab46b5d8 100644 --- a/site/training/common-slides/_validmind-test-repository.qmd +++ b/site/training/common-slides/_validmind-test-repository.qmd @@ -7,7 +7,7 @@ SPDX-License-Identifier: AGPL-3.0 AND ValidMind Commercial --> :::: {.slideover--l .three-quarters .auto-collapse-5} **{{< var vm.product >}} test repository** -{{< var vm.product >}} provides a wealth out-of-the-box of tests to help you ensure that your record (model) is being built appropriately. +{{< var vm.product >}} provides a wealth of out-of-the-box tests to help you ensure that your record (model) is being built appropriately. In this module, you'll become familiar with the individual tests available in {{< var vm.product >}}, as well as how to run them and change parameters as necessary. diff --git a/site/training/developer-fundamentals/developer-fundamentals-register.qmd b/site/training/developer-fundamentals/developer-fundamentals-register.qmd index 103a41084e..ca0a84e2f3 100644 --- a/site/training/developer-fundamentals/developer-fundamentals-register.qmd +++ b/site/training/developer-fundamentals/developer-fundamentals-register.qmd @@ -44,7 +44,7 @@ listing: fields: [title, subtitle, description, reading-time] --- -Learn how to use {{< var vm.product >}} as a **developer** to generate documentation, automate testing, and track your record (model)'s progress throughout its entire lifecycle. +Learn how to use {{< var vm.product >}} as a **developer** to generate documentation, automate testing, and track your record's progress throughout its entire lifecycle. 
::: {.column-margin} {{< include /training/_compatibility.qmd >}} diff --git a/site/training/program/learning-paths.qmd b/site/training/program/learning-paths.qmd index 8fc4cac2f2..7d469a3128 100644 --- a/site/training/program/learning-paths.qmd +++ b/site/training/program/learning-paths.qmd @@ -80,7 +80,7 @@ Learn how to use {{< var vm.product >}} as an **administrator** to onboard your :::: {.flex .flex-wrap .justify-around} ::: {.w-80-ns} -Learn how to use {{< var vm.product >}} as a **developer** to generate documentation, automate testing, and track your record (model)'s progress throughout its entire lifecycle. +Learn how to use {{< var vm.product >}} as a **developer** to generate documentation, automate testing, and track your record's progress throughout its entire lifecycle. ::: @@ -182,7 +182,7 @@ Learn how to use {{< var vm.product >}} as a **validator** to generate validatio #### <sup>Module 3</sup><br> Developing Potential Challengers - Initialize records (models) for use with the {{< var validmind.developer >}} -- Run and log out-of-the box and custom tests +- Run and log out-of-the-box and custom tests - Use the results of tests to log artifacts (findings) ::: @@ -314,7 +314,7 @@ As a solutions architect who is new to {{< var vm.product >}}, learn how to set #### Module 1: TBD - Notebooks that show running tests, test suites, single-function documentation -- LaTex formulas in JSON templates +- LaTeX formulas in JSON templates ::: --> diff --git a/site/training/validator-fundamentals/developing-potential-challengers.qmd b/site/training/validator-fundamentals/developing-potential-challengers.qmd index 9e378ca934..7625b6b481 100644 --- a/site/training/validator-fundamentals/developing-potential-challengers.qmd +++ b/site/training/validator-fundamentals/developing-potential-challengers.qmd @@ -366,7 +366,7 @@ As we can observe from the output in our notebook, our champion model doesn't pa 3. Click on **2.2.2. Model Performance** to expand that section. 
4. Under the Model Performance Metrics guideline, click to expand the **Artifacts** panel. 5. Click **{{< fa link >}} Link Artifact** and select **Validation Issue** as the type of artifact. -6. Click **{{< fa plus >}} Add Validation Issue** and enter in the details for your validation issue. +6. Click **{{< fa plus >}} Add Validation Issue** and enter the details for your validation issue. 7. Click **Add Validation Issue** to submit the validation issue. 8. Select the validation issue you just added to link to your validation report. 9. Click **Update Linked Artifacts** to insert your validation issue. diff --git a/site/training/validator-fundamentals/validator-fundamentals-register.qmd b/site/training/validator-fundamentals/validator-fundamentals-register.qmd index f87346a32b..81f500c9e4 100644 --- a/site/training/validator-fundamentals/validator-fundamentals-register.qmd +++ b/site/training/validator-fundamentals/validator-fundamentals-register.qmd @@ -32,7 +32,7 @@ listing: - path: developing-potential-challengers.html title: "Developing Potential Challengers" subtitle: "Module 3" - description: "{{< fa check >}} Initialize records (models) for use with the {{< var validmind.developer >}} <br> {{< fa check >}} Run and log out-of-the box and custom tests <br> {{< fa check >}} Use the results of tests to log artifacts (findings)" + description: "{{< fa check >}} Initialize records (models) for use with the {{< var validmind.developer >}} <br> {{< fa check >}} Run and log out-of-the-box and custom tests <br> {{< fa check >}} Use the results of tests to log artifacts (findings)" reading-time: "75" author: "{{< var vm.product >}}" - path: finalizing-validation-reports.html