## 1. Collect Evidence

In the second phase of SDMT, we collect _evidence_ to attest that the model realizes the properties specified in the previous phase.

We define and instantiate `Measurement`s to generate this evidence. Each individual piece of evidence is a `Value`. Once `Value`s are produced, we can persist them to an _artifact store_ to maintain our evidence across sessions. 
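
To make the flow from measurement to value to artifact store concrete, here is a minimal, self-contained sketch. It does not use the MLTE API: `SketchMeasurement`, `SketchValue`, and `fraction_correct` are hypothetical names invented for illustration, and the JSON-file layout is an assumption, not how MLTE stores artifacts.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class SketchValue:
    """Hypothetical stand-in for a piece of evidence (not the MLTE `Value` class)."""

    identifier: str
    data: float

    def save(self, store_dir: Path) -> Path:
        """Persist the value as JSON so it survives across sessions."""
        path = store_dir / f"{self.identifier}.json"
        path.write_text(json.dumps(asdict(self)))
        return path

    @classmethod
    def load(cls, store_dir: Path, identifier: str) -> "SketchValue":
        """Read a previously saved value back from the store directory."""
        raw = json.loads((store_dir / f"{identifier}.json").read_text())
        return cls(**raw)


class SketchMeasurement:
    """Hypothetical measurement: runs a function and wraps its output in a value."""

    def __init__(self, identifier, fn):
        self.identifier = identifier
        self.fn = fn

    def evaluate(self, *args) -> SketchValue:
        return SketchValue(self.identifier, self.fn(*args))


def fraction_correct(predictions, labels):
    """Toy metric: share of predictions that equal their label."""
    matches = sum(p == y for p, y in zip(predictions, labels))
    return matches / len(labels)


# A temporary directory plays the role of the artifact store here.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    measurement = SketchMeasurement("accuracy", fraction_correct)
    value = measurement.evaluate([1, 0, 1, 1], [1, 0, 0, 1])
    value.save(store)
    restored = SketchValue.load(store, "accuracy")
```

The point of the round trip is that `restored` carries the same identifier and data as `value`, which is what lets evidence outlive the session that produced it.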

#### Preliminaries

In [5]:
# Preliminaries for loading the package locally
import os
import sys

def package_root() -> str:
    """Resolve the path to the project root."""
    return os.path.abspath(os.path.join(os.getcwd(), "..", "src/"))

sys.path.append(package_root())
sys.path.append(os.getcwd())

#### Initialize MLTE Context

MLTE contains a global context that manages the currently active _session_. Initializing the context tells MLTE how to store all of the artifacts that it produces.

In [7]:
import mlte

store_path = os.path.join(os.getcwd(), "store")
os.makedirs(store_path, exist_ok=True)

mlte.set_model("OxfordFlower", "0.0.1")
mlte.set_artifact_store_uri(f"local://{store_path}")
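
The store location is just a directory path with a `local://` prefix. As a rough sketch of that convention using `pathlib`, the helpers below build such a URI and recover the directory from it. Both `make_local_store_uri` and `path_from_local_uri` are hypothetical helpers written for this example; how MLTE itself parses the URI is not shown in this notebook, so treat the parsing step as an assumption.

```python
import tempfile
from pathlib import Path

LOCAL_PREFIX = "local://"  # same prefix as in the cell above


def make_local_store_uri(base_dir: Path, name: str = "store") -> str:
    """Create the store directory if needed and return a local:// URI for it."""
    store_dir = (base_dir / name).resolve()
    store_dir.mkdir(parents=True, exist_ok=True)
    return f"{LOCAL_PREFIX}{store_dir}"


def path_from_local_uri(uri: str) -> Path:
    """Recover the directory path from a local:// URI (assumed format)."""
    if not uri.startswith(LOCAL_PREFIX):
        raise ValueError(f"Not a local store URI: {uri}")
    return Path(uri[len(LOCAL_PREFIX):])


# Use a temporary directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    uri = make_local_store_uri(Path(tmp))
    store_exists = path_from_local_uri(uri).is_dir()
```

Creating the directory before handing the URI to the context mirrors the `os.makedirs(store_path, exist_ok=True)` call above.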

#### Fairness Measurements

The evidence collected in this section checks the model for fairness, here measured as accuracy across gardens.

In [8]:
import garden
from pathlib import Path

# The path at which datasets are stored
DATASETS_DIR = Path(os.getcwd()) / "data"

# Prepare the data.
data = garden.load_data(DATASETS_DIR)
split_data = garden.split_data(data[0], data[1])

102 102 102


In this first example, we wrap the output of `accuracy_score` in a custom `Result` type, because the output of this third-party library is not supported by an MLTE built-in.

In [10]:
from multiple_accuracy import MultipleAccuracy
from mlte.measurement import ExternalMeasurement

# Evaluate accuracy. The identifier must match the one defined in the Spec.
accuracy_measurement = ExternalMeasurement("accuracy across gardens", MultipleAccuracy)
accuracy = accuracy_measurement.evaluate(
    garden.calculate_model_performance_acc(split_data[0], split_data[1])
)

# Inspect value
print(accuracy)

# Save to artifact store
accuracy.save()

0 0.927 0.008 0.903 0.946
1 0.911 0.008 0.884 0.937
2 0.902 0.009 0.872 0.932
[0.927, 0.911, 0.902]
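
The `multiple_accuracy` module is not shown in this notebook. To illustrate the idea of wrapping a list of per-group accuracies in a custom type, here is a plain-Python sketch. `GroupAccuracies` and its methods are hypothetical and are not the actual `MultipleAccuracy` class, and the `0.9` threshold is an arbitrary value chosen for the example, not one taken from the Spec.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GroupAccuracies:
    """Hypothetical container holding one accuracy score per group."""

    scores: List[float]

    def __post_init__(self):
        # An empty list would make the checks below meaningless.
        if not self.scores:
            raise ValueError("At least one accuracy score is required.")

    def all_at_least(self, threshold: float) -> bool:
        """Return True when every group reaches the threshold."""
        return all(score >= threshold for score in self.scores)

    def max_gap(self) -> float:
        """Return the difference between the best and worst group."""
        return max(self.scores) - min(self.scores)


# The three scores printed by the cell above.
accuracies = GroupAccuracies([0.927, 0.911, 0.902])
meets_threshold = accuracies.all_at_least(0.9)  # illustrative threshold
gap = accuracies.max_gap()
```

Keeping the scores per group, rather than averaging them, is what makes a fairness check possible: a single mean could hide one group that performs noticeably worse than the others.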
