In [None]:
# !pip install trubrics rich

**In this tutorial on the [Titanic Use Case](https://www.kaggle.com/c/titanic), you will:**
- Initialise a `DataContext` with ML datasets and metadata from the Titanic use case
- Build some out-of-the-box validations on a trained model and the `DataContext` with the `ModelValidator`
- Build a custom validation
- Save validations to a `Trubric`

## Load data & model

In [None]:
import rich  # for pretty cell outputs

from trubrics.example import get_titanic_data_and_model
train_df, test_df, model = get_titanic_data_and_model()

## Init DataContext

*The `DataContext` allows you to wrap all ML data assets into a single object that can be used in the `ModelValidator` and `FeedbackCollector`. Read more [here](https://trubrics.github.io/trubrics-sdk/data_context/).*

In [None]:
from trubrics.context import DataContext

In [None]:
data_context = DataContext(
    name="my_first_dataset",
    version=0.1,
    testing_data=test_df,
    target="Survived",
    training_data=train_df,
    minimum_functionality_data=test_df.head(),
)

## Init ModelValidator

*The `ModelValidator` allows you to build out-of-the-box and custom validations around your model. Read more [here](https://trubrics.github.io/trubrics-sdk/models/).*

In [None]:
from trubrics.validations import ModelValidator

In [None]:
model_validator = ModelValidator(
    data=data_context,
    model=model,
)

## Use the ModelValidator to build out-of-the-box validations

*Out-of-the-box validations allow you to start validating your model with a couple of lines of code. See all validations [here](https://trubrics.github.io/trubrics-sdk/validations/).*

In [None]:
performance = [
    model_validator.validate_performance_against_threshold(metric="precision", threshold=0.7),
    model_validator.validate_performance_against_threshold(metric="f1", threshold=0.75),
]
rich.print(performance[0].dict())
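
Conceptually, a threshold validation computes a metric on the testing data and passes if the score clears the threshold. The plain-Python sketch below only illustrates that idea: `precision` and `passes_threshold` are hypothetical helpers, not trubrics functions, and the strict "greater than" comparison is an assumption.

```python
def precision(y_true, y_pred):
    """Fraction of predicted positives that are truly positive."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    pred_pos = sum(1 for p in y_pred if p == 1)
    return true_pos / pred_pos if pred_pos else 0.0


def passes_threshold(score, threshold):
    """Assumed rule: the validation passes when the score exceeds the threshold."""
    return score > threshold


# Toy labels and predictions (not Titanic data): 3 true positives, 1 false positive.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]

score = precision(y_true, y_pred)
outcome = passes_threshold(score, threshold=0.7)
print(f"precision={score:.2f}, outcome={'pass' if outcome else 'fail'}")
```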

#### Other features of [performance validations](https://trubrics.github.io/trubrics-sdk/validations/#performance):
- Integrate [custom metrics](https://trubrics.github.io/trubrics-sdk/metrics/#2-custom-scoring-functions) with any python function
- Validate [performance on any split](https://trubrics.github.io/trubrics-sdk/metrics/#3-data-slicing-functions) of the data with any python slicing function, or validate two splits have the same performance
- Validate performance between training & testing datasets (e.g. to avoid overfitting / underfitting)
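
The last point, comparing training and testing performance, can be pictured as a simple gap check. This is a minimal sketch, assuming accuracy as the metric and a hypothetical `max_gap` tolerance; none of these names come from trubrics.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def train_test_gap_ok(train_score, test_score, max_gap=0.1):
    """Return False when the training score beats the testing score by more than max_gap."""
    return (train_score - test_score) <= max_gap


# Toy data: perfect on the training set, weaker on the testing set.
train_score = accuracy([1, 0, 1, 1, 0], [1, 0, 1, 1, 0])
test_score = accuracy([1, 0, 1, 1, 0], [1, 1, 1, 0, 0])

gap_ok = train_test_gap_ok(train_score, test_score)
print(f"train={train_score:.2f}, test={test_score:.2f}, gap_ok={gap_ok}")
```

A large gap in favour of the training set is a common symptom of overfitting, which is why a check like this is useful alongside a fixed threshold.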

In [None]:
import pandas as pd
# metrics are computed once, and stored in the .performances attribute of the `ModelValidator` 
pd.DataFrame(model_validator.performances)

#### Other out-of-the-box validations
- [Minimum functionality](https://trubrics.github.io/trubrics-sdk/validations/#minimum-functionality) validations
- [Feature importance](https://trubrics.github.io/trubrics-sdk/validations/#feature-importance) validations
- [Inference time](https://trubrics.github.io/trubrics-sdk/validations/#inference-time) validations

## Build custom validations

*Custom validations allow you to build validations around feedback, or for specific needs of your use case (e.g. data validations). Read more [here](https://trubrics.github.io/trubrics-sdk/custom_validations/).*

In [None]:
#%%writefile custom_validator.py
from trubrics.context import DataContext
from trubrics.validations import ModelValidator
from trubrics.validations.validation_output import (
    validation_output,
    validation_output_type,
)


class CustomValidator(ModelValidator):
    def __init__(self, data: DataContext, model, custom_scorers=None, slicing_functions=None):
        super().__init__(data, model, custom_scorers, slicing_functions)

    def _validate_master_age(self, age_limit_master) -> validation_output_type:
        """
        Write your custom validation function here.

        Notes
        -----
            This method is kept separate from validate_master_age so that
            @validation_output can wrap the public method and this logic can be unit tested.

            The @validation_output decorator allows you to generate a Validation object,
            and must be used to be able to save your validation as part of a Trubric.
            This decorator requires you to return values with the same type as validation_output_type.
        """
        master_df = self.tm.data.testing_data.loc[lambda df: df["Title"] == "Master"]
        errors_df = master_df.loc[lambda df: df["Age"] >= age_limit_master]
        return len(errors_df) == 0, {"errors_df": errors_df.to_dict()}

    @validation_output
    def validate_master_age(self, age_limit_master: int, severity=None):
        """Validate that passengers with the title "Master" are younger than a given age.

        Args:
            age_limit_master: exclusive upper age limit for passengers titled "Master"

        Returns:
            True for success, False otherwise, with a results dictionary of the offending rows.
        """
        return self._validate_master_age(age_limit_master)


In [None]:
model_custom_validator = CustomValidator(data=data_context, model=model)

In [None]:
custom = [model_custom_validator.validate_master_age(age_limit_master=13, severity="warning")]
rich.print(custom[0].dict())
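
Because the rule lives in `_validate_master_age`, it can be tested without a model. The sketch below mirrors the same filter on plain dictionaries instead of a DataFrame; `check_master_age` and the sample records are hypothetical stand-ins, not part of the tutorial's data.

```python
def check_master_age(passengers, age_limit_master):
    """Return (outcome, result) in the same shape as _validate_master_age."""
    errors = [
        p for p in passengers
        if p["Title"] == "Master" and p["Age"] >= age_limit_master
    ]
    return len(errors) == 0, {"errors": errors}


passengers = [
    {"Title": "Master", "Age": 4},
    {"Title": "Master", "Age": 15},  # violates a limit of 13
    {"Title": "Mr", "Age": 40},      # ignored: title is not "Master"
]

outcome, result = check_master_age(passengers, age_limit_master=13)
print(outcome, result["errors"])
```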

## Save validations as a Trubric

*A `Trubric` is a checklist of validations, and represents the gold standard that your ML system must conform to. Read more about saving your validations as a `Trubric` [here](https://trubrics.github.io/trubrics-sdk/save_trubric/).*

In [None]:
from trubrics.validations import Trubric

validations = performance + custom

trubric = Trubric(
    name="my_first_trubric",
    model_name="my_model",
    data_context_name=data_context.name,
    data_context_version=data_context.version,
    metadata={"tag": "master"},
    validations=validations,
)

In [None]:
# save trubric to a local .json
trubric.save_local(path=".")

In [None]:
!cat my_first_trubric.json
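
Since the saved file is plain JSON, it can also be inspected programmatically. The sketch below writes a small stand-in file and lists the validations that did not pass. The keys `validations`, `validation_type` and `outcome` are assumptions about the schema, so check them against your own `my_first_trubric.json` before relying on them.

```python
import json
import tempfile
from pathlib import Path

# Stand-in content; the real file's schema may differ from these assumed keys.
example = {
    "name": "my_first_trubric",
    "validations": [
        {"validation_type": "validate_performance_against_threshold", "outcome": "pass"},
        {"validation_type": "validate_master_age", "outcome": "fail"},
    ],
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "my_first_trubric.json"
    path.write_text(json.dumps(example))
    loaded = json.loads(path.read_text())

# Collect the names of validations whose outcome is anything other than "pass".
failed = [v["validation_type"] for v in loaded["validations"] if v["outcome"] != "pass"]
print(f"{len(failed)} of {len(loaded['validations'])} validations failed: {failed}")
```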

## Execute the trubric from the CLI tool

*Once you have saved a `Trubric`, run these validations on a new model & data context in your CI/CD/CT pipelines with the CLI tool. Read more about setting up the CLI [here](https://trubrics.github.io/trubrics-sdk/trubrics_cli/).*

Example code snippet:

```bash
(venv)$ trubrics run \
        --trubric-config-path "." \
        --trubric-output-file-path "." \
        --trubric-output-file-name "my_new_trubric.json"
```