# Use Model Metrics to Monitor Drift for a Deployed Model

This notebook walks through a complete example of using Model Metrics to monitor prediction quality and data drift for the **deployed** model from the previous lab. It tracks inputs, predictions, and a delayed ground truth, which is the basis for model performance drift monitoring workflows.

- ✅ Create synthetic data and labels (ground truth) based on the `Iris` dataset.
- ✅ Track and correlate the ground truth with predictions via the `track_delayed_metrics` function.
- ✅ Test the metrics store in "production" mode.

In [12]:
import time
from sklearn import datasets
import numpy as np

import cml.metrics_v1 as metrics
from cml.models_v1 import call_model

# Configure the deployed model to be used from the previous lab
# Navigate to the Deployment UI to retrieve model Access Key and CRN
MODEL_ACCESS_KEY="m27jyo2p6hxxwjaleddmo35jne4rtfx9"
MODEL_CRN = "crn:cdp:ml:us-west-1:558bc1d2-8867-4357-8524-311d51259233:workspace:d09086fa-a7fe-40bd-b52c-7d99da43255f/75603914-34fb-495e-b183-cc06e70f38c7"
N_SAMPLES = 100

# Load Iris dataset
iris = datasets.load_iris()
feature_min = iris.data.min(axis=0)
feature_max = iris.data.max(axis=0)

# Generate a random synthetic sample
synthetic_sample = np.random.uniform(feature_min, feature_max, size=(N_SAMPLES, iris.data.shape[1])) # e.g. (100, 4)
synthetic_labels = np.random.choice(iris.target, size=N_SAMPLES) # e.g. (100, )
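
Because the synthetic inputs are drawn uniformly between each feature's minimum and maximum, their distribution will generally differ from that of the original Iris data. One simple way to quantify such a shift for a single feature is sketched below. This is not part of the CML metrics API: the helper `mean_shift_score` and the toy values are hypothetical and for illustration only, and it uses only the Python standard library.

```python
import statistics

def mean_shift_score(reference, live):
    """Absolute difference of the means, in units of the reference standard deviation.

    Hypothetical helper: a crude per-feature drift score, not a statistical test.
    """
    ref_std = statistics.pstdev(reference)
    if ref_std == 0:
        raise ValueError("reference values have zero variance")
    return abs(statistics.fmean(live) - statistics.fmean(reference)) / ref_std

# Toy values for illustration only (not taken from the Iris dataset).
reference_feature = [1.0, 2.0, 3.0]
live_feature = [2.0, 3.0, 4.0]

score = mean_shift_score(reference_feature, live_feature)
print(f"mean shift score: {score:.3f}")
```

In the notebook, one could pass a column of `iris.data` as `reference` and the matching column of `synthetic_sample` as `live`. A larger score suggests a larger shift; any alerting threshold is a judgment call.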

In [13]:
# Record the current time so we can retrieve the metrics
# tracked for these calls later on.
start_timestamp_ms=int(round(time.time() * 1000))

# Simulate batch inference
uuids = []

# Iterate over each row in the synthetic_sample array and call the model
for sample in synthetic_sample:
    sample_input = {"inputs": [sample.tolist()]}
    output = call_model(MODEL_ACCESS_KEY, ipt=sample_input)
    # Record the UUID of each prediction for correlation with ground truth.
    uuids.append(output["response"]["uuid"])

# Record the current time to mark the end of the time window.
end_timestamp_ms=int(round(time.time() * 1000))

In [14]:
# We can now use the read_metrics function to read the metrics we just
# generated into the current session, by querying by time window.
data = metrics.read_metrics(start_timestamp_ms=start_timestamp_ms,
                            end_timestamp_ms=end_timestamp_ms,
                            model_crn=MODEL_CRN)["metrics"]

# Show a single logged prediction with metrics
data[0]

{'modelDeploymentCrn': 'crn:cdp:ml:us-west-1:558bc1d2-8867-4357-8524-311d51259233:workspace:d09086fa-a7fe-40bd-b52c-7d99da43255f/ee50a723-aed8-46ee-a96f-c97d8e211428',
 'modelBuildCrn': 'crn:cdp:ml:us-west-1:558bc1d2-8867-4357-8524-311d51259233:workspace:d09086fa-a7fe-40bd-b52c-7d99da43255f/3aff08df-7d58-4919-8521-8ea1f9fddae9',
 'modelCrn': 'crn:cdp:ml:us-west-1:558bc1d2-8867-4357-8524-311d51259233:workspace:d09086fa-a7fe-40bd-b52c-7d99da43255f/75603914-34fb-495e-b183-cc06e70f38c7',
 'startTimeStampMs': 1746629493901,
 'endTimeStampMs': 1746629493936,
 'predictionUuid': 'e58969ea-73b2-4e37-953d-1b7e36a6a43b',
 'metrics': {'prediction': [2]}}
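
Before any ground truth is available, the logged predictions can already be summarized, for example as the share of each predicted class. The following is a minimal sketch that assumes records shaped like the one shown above (a `metrics` dict holding a one-element `prediction` list). The helper `prediction_distribution` and the example records are hypothetical.

```python
from collections import Counter

def prediction_distribution(records):
    """Return {predicted_class: share} over records that carry a prediction."""
    counts = Counter(
        int(record["metrics"]["prediction"][0])
        for record in records
        if "prediction" in record.get("metrics", {})
    )
    total = sum(counts.values())
    return {label: count / total for label, count in sorted(counts.items())}

# Hypothetical records mimicking the structure returned by read_metrics.
example_records = [
    {"metrics": {"prediction": [2]}},
    {"metrics": {"prediction": [2]}},
    {"metrics": {"prediction": [0]}},
]

distribution = prediction_distribution(example_records)
print(distribution)
```

In the notebook, the `data` list could be passed in place of `example_records`. Comparing this distribution across time windows is one way to notice a shift in the model's outputs.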

In [15]:
# Assume the ground truth has now become available. Here it is simulated
# with random labels, so it is unrelated to the model's predictions.
synthetic_labels = np.random.choice(iris.target, size=N_SAMPLES)

# Attach each true value to its prediction via the prediction UUID
# using the track_delayed_metrics function.
for prediction_uuid, ground_truth in zip(uuids, synthetic_labels):
    metrics.track_delayed_metrics(
        {"actual_result": str(ground_truth)},
        prediction_uuid
    )

In [16]:
# Read the metrics again.
data = metrics.read_metrics(start_timestamp_ms=start_timestamp_ms,
                            end_timestamp_ms=end_timestamp_ms,
                            model_crn=MODEL_CRN)["metrics"]

# Show a single logged prediction with metrics, now along with the ground truth.
data[0]

{'modelDeploymentCrn': 'crn:cdp:ml:us-west-1:558bc1d2-8867-4357-8524-311d51259233:workspace:d09086fa-a7fe-40bd-b52c-7d99da43255f/ee50a723-aed8-46ee-a96f-c97d8e211428',
 'modelBuildCrn': 'crn:cdp:ml:us-west-1:558bc1d2-8867-4357-8524-311d51259233:workspace:d09086fa-a7fe-40bd-b52c-7d99da43255f/3aff08df-7d58-4919-8521-8ea1f9fddae9',
 'modelCrn': 'crn:cdp:ml:us-west-1:558bc1d2-8867-4357-8524-311d51259233:workspace:d09086fa-a7fe-40bd-b52c-7d99da43255f/75603914-34fb-495e-b183-cc06e70f38c7',
 'startTimeStampMs': 1746629493901,
 'endTimeStampMs': 1746629493936,
 'predictionUuid': 'e58969ea-73b2-4e37-953d-1b7e36a6a43b',
 'metrics': {'actual_result': '1', 'prediction': [2]}}
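
With predictions and ground truth stored in the same record, a quality metric such as accuracy can be computed from the returned list. The following is a minimal sketch that assumes records shaped like the one above, where `actual_result` is a string and `prediction` is a one-element list. The helper `accuracy_from_records` and the example records are hypothetical.

```python
def accuracy_from_records(records):
    """Return (accuracy, n_scored) over records that carry both a prediction and a ground truth."""
    correct = 0
    scored = 0
    for record in records:
        logged = record.get("metrics", {})
        if "prediction" not in logged or "actual_result" not in logged:
            continue  # ground truth has not been tracked for this prediction yet
        predicted = int(logged["prediction"][0])
        actual = int(logged["actual_result"])  # stored as a string above
        scored += 1
        correct += int(predicted == actual)
    accuracy = correct / scored if scored else float("nan")
    return accuracy, scored

# Hypothetical records mimicking the structure returned by read_metrics.
example_records = [
    {"metrics": {"actual_result": "1", "prediction": [2]}},
    {"metrics": {"actual_result": "2", "prediction": [2]}},
    {"metrics": {"prediction": [0]}},  # no ground truth yet, so it is skipped
]

accuracy, n_scored = accuracy_from_records(example_records)
print(f"accuracy={accuracy:.2f} over {n_scored} scored predictions")
```

In the notebook, `data` could be passed in place of `example_records`. Since the ground truth here is drawn at random, the resulting accuracy should be expected to sit near chance level rather than reflect the model's real quality.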