# Example usage of MLTest

This notebook shows you how to get started with MLTest. 

You'll see how to evaluate a state-of-the-art YOLO model on a COCO dataset using MLTest. 

We will also launch the MLTest Dashboard and introduce MLTest's programmatic API, which makes it easy to customize tests for specific use cases.

In [None]:
import logging
import warnings

from rich.console import Console

from evaluator import Evaluator, start_dashboard, stop_dashboard
from lakera.config import Config
from lakera.test_pipeline import TestPipeline
import lakera.util as util

console = Console()
# Silence all Python warnings to keep the notebook output readable.
warnings.filterwarnings('ignore')

In [None]:
PATH_TO_OUTPUT = "./mltest_results"

## Run MLTest

In this section, you evaluate a YOLO model on a COCO dataset using MLTest.

Let's start by setting up the MLTest `Evaluator` object with a default configuration. The `Evaluator` and the MLTest Dashboard communicate through a shared folder, which is specified with `PATH_TO_OUTPUT`. Running the next cell executes MLTest's default test suite (performance, robustness, and data tests). It may take a few seconds to complete, so please be patient 🙏.

In [None]:
evaluator = Evaluator(PATH_TO_OUTPUT)

evaluator.load_dataset("coco")
evaluator.load_model("yolo")
evaluator.load_default_config()

# You're all set up; simply execute the evaluation.
evaluator.run()

## Start the MLTest Dashboard

Once MLTest has completed its evaluation, you can view the results in the MLTest Dashboard.

Make sure that Docker is installed and running, then execute the following cell to start the dashboard with the results from the previous step.

In [None]:
d_id = start_dashboard(PATH_TO_OUTPUT)
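
If the dashboard does not come up, it can help to confirm that Docker is reachable before digging further. The following is a minimal sketch and not part of MLTest: `docker_is_running` is a hypothetical helper that relies only on the standard library and on the `docker info` CLI command, which exits with a non-zero status when the daemon cannot be reached.

```python
import shutil
import subprocess


def docker_is_running(which=shutil.which, run=subprocess.run):
    """Return True if the docker CLI is on PATH and the daemon answers.

    Hypothetical helper, not part of MLTest. `which` and `run` are
    injectable so the check can be exercised without Docker installed.
    """
    if which("docker") is None:
        return False  # CLI not installed or not on PATH
    try:
        # `docker info` fails with a non-zero exit code if the daemon is down.
        result = run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=20,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


# Usage: call docker_is_running() before start_dashboard(PATH_TO_OUTPUT).
```

A `False` result points to either a missing `docker` executable or a daemon that is not running; it says nothing about MLTest itself.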

## Add additional tests

MLTest's programmatic Python API makes it simple to add tests to your evaluator and re-run it.

In this example, we add robustness tests that check the model against realistic environmental changes that can severely degrade performance in production.

Once the run has finished, it should appear in the Dashboard after about a minute. Be sure to use the comparison feature to compare it with the previous run!

In [None]:
from lakera.transforms import Brightness, PackageLoss, LocalContrastEnhancement

evaluator.add_robustness_test(Brightness())
evaluator.add_robustness_test(PackageLoss())
evaluator.add_robustness_test(LocalContrastEnhancement())

# Re-run MLTest with the additional robustness tests
evaluator.run()

In [None]:
# Restart the dashboard so that it loads the results of the latest run.
stop_dashboard(d_id)
d_id = start_dashboard(PATH_TO_OUTPUT)
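
The stop-then-start pattern above recurs after every run in this notebook, so it could be wrapped in a small helper. This is only a sketch: `restart_dashboard` is a hypothetical name, and it assumes that `stop_dashboard` takes the id returned by `start_dashboard`, as the cells in this notebook suggest. The two functions are passed in as arguments so that the helper stays self-contained.

```python
def restart_dashboard(dashboard_id, output_path, start, stop):
    """Stop the running dashboard and start a new one on the same folder.

    Hypothetical convenience wrapper, not part of MLTest.
    `start` and `stop` are expected to behave like the notebook's
    start_dashboard(path) -> id and stop_dashboard(id).
    """
    stop(dashboard_id)
    return start(output_path)


# Usage with the names from this notebook:
# d_id = restart_dashboard(d_id, PATH_TO_OUTPUT, start_dashboard, stop_dashboard)
```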

## Add regression sets on the fly

We often want to keep track of regression datasets to make sure that our models don't fail where it matters most.

With MLTest, you can add regression sets on the fly and automate testing on them as well. The following cell adds a regression dataset inline and re-runs MLTest.

In [None]:
evaluator.add_dataset(
    name_tag="first_regression_set",
    compute_invariance=True,
    batch_size=5,
    path_to_dataset="data/coco/coco_B"
)
evaluator.run()
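
If you maintain several regression sets, the same `add_dataset` call can be repeated in a loop. The sketch below is a convenience built on assumptions and is not part of MLTest: `add_regression_sets` is a hypothetical helper, it reuses only the keyword arguments shown in the cell above, and it assumes that `add_dataset` may be called more than once on the same evaluator.

```python
def add_regression_sets(evaluator, sets, batch_size=5, compute_invariance=True):
    """Register several regression datasets on an evaluator.

    Hypothetical helper. `sets` maps a name tag to a dataset path, and
    `evaluator` is any object exposing add_dataset(...) with the keyword
    arguments used earlier in this notebook.
    """
    for name_tag, path in sets.items():
        evaluator.add_dataset(
            name_tag=name_tag,
            compute_invariance=compute_invariance,
            batch_size=batch_size,
            path_to_dataset=path,
        )
    # Return the registered tags so the caller can log or inspect them.
    return list(sets)


# Usage with the dataset from the cell above:
# add_regression_sets(evaluator, {"first_regression_set": "data/coco/coco_B"})
# evaluator.run()
```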

Don't forget to check out the results on the dashboard.

This is the end of this quick tutorial, but stay tuned: we are already working on extending it.

In [None]:
stop_dashboard(d_id)
d_id = start_dashboard(PATH_TO_OUTPUT)

## Stop the Dashboard

Once you are done exploring the dashboard, make sure to stop it from running in the background.

In [None]:
stop_dashboard(d_id)
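
To reduce the risk of leaving the dashboard running when a cell fails midway, the start and stop calls could be paired in a context manager. This is a sketch under the same assumptions as the cells above (`start_dashboard` returns an id that `stop_dashboard` accepts); `running_dashboard` is a hypothetical name and not part of MLTest.

```python
from contextlib import contextmanager


@contextmanager
def running_dashboard(output_path, start, stop):
    """Start a dashboard and guarantee that it is stopped on exit.

    Hypothetical wrapper. `start` and `stop` are expected to behave like
    the notebook's start_dashboard(path) -> id and stop_dashboard(id).
    """
    dashboard_id = start(output_path)
    try:
        yield dashboard_id
    finally:
        # Runs on normal exit and when the body raises an exception.
        stop(dashboard_id)


# Usage with the names from this notebook:
# with running_dashboard(PATH_TO_OUTPUT, start_dashboard, stop_dashboard) as d_id:
#     evaluator.run()
```

In an interactive notebook, the explicit `stop_dashboard(d_id)` cell remains the simpler choice; the context manager is better suited to scripts that run unattended.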