# Monitoring

This notebook takes care of scheduling the monitoring jobs.

This notebook is part of the [Machine Learning School](https://www.ml.school) program.

In [28]:
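import boto3
import json
import os
import random
import sagemaker
import time

import pandas as pd

from datetime import datetime
from pathlib import Path
from threading import Event, Thread
from time import sleep
from IPython.display import JSON
from sagemaker.s3 import S3Downloader, S3Uploader
from sagemaker.model_monitor import CronExpressionGenerator, DefaultModelMonitor, MonitoringExecution
from sagemaker.model_monitor import EndpointInput, ModelQualityMonitor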

BUCKET = "mlschool"
S3_FILEPATH = f"s3://{BUCKET}/penguins"
CODE_FOLDER = Path("code")
ENDPOINT = "penguins-endpoint"
DATA_QUALITY_LOCATION = f"{S3_FILEPATH}/monitoring/data-quality"

sagemaker_client = boto3.client("sagemaker")
role = sagemaker.get_execution_role()
sagemaker_session = sagemaker.session.Session()

## Data Monitoring

### Statistics and Constraints

Our pipeline generated baseline statistics and constraints from our train set. We can inspect these files by downloading them from S3.

In [10]:
statistics = f"{DATA_QUALITY_LOCATION}/statistics.json"
JSON(json.loads(S3Downloader.read_file(statistics)))

<IPython.core.display.JSON object>

In [11]:
constraints = f"{DATA_QUALITY_LOCATION}/constraints.json"
JSON(json.loads(S3Downloader.read_file(constraints)))

<IPython.core.display.JSON object>
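The JSON widget above does not survive a static export, so here is a minimal sketch of how a constraints file could be summarized as plain text. The sample dictionary is illustrative only: its feature names and values are made up, and its layout (`features`, `inferred_type`, `completeness`, `num_constraints`) is assumed to follow the Model Monitor constraints schema. The helper `summarize_constraints` is hypothetical.

```python
# Illustrative sample only: feature names and values are made up, and the
# layout is assumed to match a Model Monitor constraints file.
sample_constraints = {
    "version": 0,
    "features": [
        {"name": "island", "inferred_type": "String", "completeness": 1.0},
        {
            "name": "body_mass_g",
            "inferred_type": "Fractional",
            "completeness": 1.0,
            "num_constraints": {"is_non_negative": True},
        },
    ],
}


def summarize_constraints(constraints):
    """Return one (name, inferred_type, completeness) tuple per feature."""
    return [
        (feature["name"], feature.get("inferred_type"), feature.get("completeness"))
        for feature in constraints.get("features", [])
    ]


for name, inferred_type, completeness in summarize_constraints(sample_constraints):
    print(f"{name}: {inferred_type}, completeness={completeness}")
```

To run it against the real baseline, we could pass `json.loads(S3Downloader.read_file(constraints))` instead of the sample dictionary.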

### Scheduling the Monitoring Job

We can now set up a schedule to continuously monitor data going into the endpoint and compare it to the baseline we generated before. This monitoring job will use the baseline statistics and constraints we generated during the Data Quality Check Step. Check [Schedule Data Quality Monitoring Jobs](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-schedule-data-monitor.html) for more information.

SageMaker looks for violations in the data captured by the endpoint. By default, it combines the input data with the endpoint output and compares the result with the baseline we generated. If we let SageMaker do this, we will get a few violations; for example, an "extra column check" violation because the fields `confidence` and `prediction` don't exist in the baseline data.

We can fix these violations by creating a preprocessing script configuring the data we want the monitoring job to use.


In [15]:
DATA_QUALITY_PREPROCESSOR = "data_quality_preprocessor.py"

Here is the preprocessing script for the Data Quality Monitoring Job. Check [Preprocessing and Postprocessing](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-pre-and-post-processing.html) for more information about how to configure these scripts.

In [20]:
%%writefile {CODE_FOLDER}/{DATA_QUALITY_PREPROCESSOR}
import json

def preprocess_handler(inference_record):
    input_data = inference_record.endpoint_input.data
    output_data = json.loads(inference_record.endpoint_output.data)
    
    response = json.loads(input_data)
    response["species"] = output_data["species"]

    # The `response` variable contains the data that we want the
    # monitoring job to use to compare with the baseline.
    return response

Overwriting code/data_quality_preprocessor.py
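Before scheduling the job, we can sanity-check the handler locally. The sketch below copies `preprocess_handler` from the script above and feeds it a stand-in record built with `SimpleNamespace`. The record only mimics the two attributes the handler reads (`endpoint_input.data` and `endpoint_output.data`); the field names and values inside it are made up for illustration.

```python
import json
from types import SimpleNamespace


def preprocess_handler(inference_record):
    input_data = inference_record.endpoint_input.data
    output_data = json.loads(inference_record.endpoint_output.data)

    response = json.loads(input_data)
    response["species"] = output_data["species"]
    return response


# Hypothetical captured record: the payload fields below are assumptions,
# not the real schema of the penguins endpoint.
fake_record = SimpleNamespace(
    endpoint_input=SimpleNamespace(
        data=json.dumps({"island": "Biscoe", "body_mass_g": 5000.0})
    ),
    endpoint_output=SimpleNamespace(
        data=json.dumps({"species": "Gentoo", "confidence": 0.97})
    ),
)

result = preprocess_handler(fake_record)
print(result)
```

Only `species` is copied from the endpoint output, so any other output field is dropped before the comparison with the baseline.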


The monitoring schedule expects an S3 location pointing to the preprocessing script. Let's upload the script to the default bucket.

In [25]:
bucket = boto3.Session().resource("s3").Bucket(sagemaker_session.default_bucket())
prefix = "penguins-monitoring"
bucket.Object(os.path.join(prefix, DATA_QUALITY_PREPROCESSOR)).upload_file(str(CODE_FOLDER / DATA_QUALITY_PREPROCESSOR))
data_quality_preprocessor = f"s3://{os.path.join(bucket.name, prefix, DATA_QUALITY_PREPROCESSOR)}"
data_quality_preprocessor

's3://sagemaker-us-east-1-325223348818/penguins-monitoring/data_quality_preprocessor.py'
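One caveat about the cell above: `os.path.join` uses the separator of the operating system running the notebook. That is fine on SageMaker's Linux instances, but it would produce backslashes on Windows, while S3 keys always use forward slashes. A minimal sketch of a platform-independent alternative follows; the helper `s3_uri` and the bucket name are hypothetical.

```python
import posixpath


def s3_uri(bucket_name, *parts):
    """Build an S3 URI with forward slashes regardless of the local OS."""
    return "s3://" + posixpath.join(bucket_name, *parts)


# "example-bucket" is a placeholder, not a real bucket.
uri = s3_uri("example-bucket", "penguins-monitoring", "data_quality_preprocessor.py")
print(uri)
```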

We can now set up the Data Quality Monitoring Job using the [DefaultModelMonitor](https://sagemaker.readthedocs.io/en/stable/api/inference/model_monitor.html#sagemaker.model_monitor.model_monitoring.DefaultModelMonitor) class. Notice how we specify the `record_preprocessor_script` using the S3 location where we uploaded our script.

In [29]:
data_monitor = DefaultModelMonitor(
    instance_type="ml.m5.xlarge",
    instance_count=1,
    max_runtime_in_seconds=3600,
    role=role,
)

data_monitor.create_monitoring_schedule(
    monitor_schedule_name="penguins-data-monitoring-schedule",
    endpoint_input=ENDPOINT,
    record_preprocessor_script=data_quality_preprocessor,
    statistics=f"{DATA_QUALITY_LOCATION}/statistics.json",
    constraints=f"{DATA_QUALITY_LOCATION}/constraints.json",
    schedule_cron_expression=CronExpressionGenerator.hourly(),
)
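For the hourly schedule, the schedule description in this notebook reports the expression `cron(0 * ? * * *)`. The sketch below splits such an expression into named fields to make it easier to read. The field names assume the six-field AWS cron layout (minutes, hours, day of month, month, day of week, year), and `parse_cron` is a hypothetical helper, not part of the SageMaker SDK.

```python
# Field order assumed from the six-field AWS cron layout.
CRON_FIELDS = ("minutes", "hours", "day_of_month", "month", "day_of_week", "year")


def parse_cron(expression):
    """Split an expression like 'cron(0 * ? * * *)' into named fields."""
    if not (expression.startswith("cron(") and expression.endswith(")")):
        raise ValueError(f"Not a cron expression: {expression!r}")

    values = expression[len("cron("):-1].split()
    if len(values) != len(CRON_FIELDS):
        raise ValueError(f"Expected {len(CRON_FIELDS)} fields, got {len(values)}")

    return dict(zip(CRON_FIELDS, values))


hourly = parse_cron("cron(0 * ? * * *)")
print(hourly)
```

Read this way, the hourly schedule fires at minute 0 of every hour.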

### Checking Monitoring Violations

We can check the results of the monitoring job by looking at whether it generated any violations.

In [38]:
description = data_monitor.describe_schedule()
description

{'MonitoringScheduleArn': 'arn:aws:sagemaker:us-east-1:325223348818:monitoring-schedule/penguins-data-monitoring-schedule',
 'MonitoringScheduleName': 'penguins-data-monitoring-schedule',
 'MonitoringScheduleStatus': 'Scheduled',
 'MonitoringType': 'DataQuality',
 'CreationTime': datetime.datetime(2023, 7, 12, 12, 45, 52, 155000, tzinfo=tzlocal()),
 'LastModifiedTime': datetime.datetime(2023, 7, 12, 14, 10, 15, 907000, tzinfo=tzlocal()),
 'MonitoringScheduleConfig': {'ScheduleConfig': {'ScheduleExpression': 'cron(0 * ? * * *)'},
  'MonitoringJobDefinitionName': 'data-quality-job-definition-2023-07-12-12-45-51-295',
  'MonitoringType': 'DataQuality'},
 'EndpointName': 'penguins-endpoint',
 'LastMonitoringExecutionSummary': {'MonitoringScheduleName': 'penguins-data-monitoring-schedule',
  'ScheduledTime': datetime.datetime(2023, 7, 12, 14, 0, tzinfo=tzlocal()),
  'CreationTime': datetime.datetime(2023, 7, 12, 14, 1, 33, 466000, tzinfo=tzlocal()),
  'LastModifiedTime': datetime.datetime(2

In [39]:
status = description["LastMonitoringExecutionSummary"]["MonitoringExecutionStatus"]
print(f"Status: {status}")

if status == "CompletedWithViolations":
    processing_job_arn = description["LastMonitoringExecutionSummary"]["ProcessingJobArn"]
    execution = MonitoringExecution.from_processing_arn(sagemaker_session=sagemaker_session, processing_job_arn=processing_job_arn)
    execution_destination = execution.output.destination
    
    violations_filepath = os.path.join(execution_destination, "constraint_violations.json")
    violations = json.loads(S3Downloader.read_file(violations_filepath))["violations"]
    
    print(json.dumps(violations, indent=2))

Status: CompletedWithViolations
[
  {
    "feature_name": "Extra columns",
    "constraint_check_type": "extra_column_check",
    "description": "There are extra columns in current dataset. Number of columns in current dataset: 8, Number of columns in baseline constraints: 6"
  }
]
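When a job reports many violations, it can help to group them by check type before reading the details. Here is a minimal sketch using the violation printed above as input; `count_by_check_type` is a hypothetical helper.

```python
import json
from collections import Counter

# The violation reported above, reproduced as a JSON string.
violations_json = """
[
  {
    "feature_name": "Extra columns",
    "constraint_check_type": "extra_column_check",
    "description": "There are extra columns in current dataset. Number of columns in current dataset: 8, Number of columns in baseline constraints: 6"
  }
]
"""


def count_by_check_type(violations):
    """Count how many violations were reported for each check type."""
    return Counter(v["constraint_check_type"] for v in violations)


counts = count_by_check_type(json.loads(violations_json))

for check_type, total in counts.items():
    print(f"{check_type}: {total}")
```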


### Delete Monitoring Schedule

Let's stop the monitoring jobs by deleting the monitoring schedule we created before. The following function waits for the job to finish before deleting it.

In [40]:
def delete_monitoring_schedule(schedule):
    attempts = 30
    
    try:
        status = schedule.describe_schedule()["MonitoringScheduleStatus"]
    except Exception:
        print("Monitoring schedule deleted.")
        return
        
    while status in ("Pending", "InProgress") and attempts > 0:
        attempts -= 1
        print(f"Monitoring schedule status: {status}. Waiting for it to finish.")
        time.sleep(30)
        status = schedule.describe_schedule()["MonitoringScheduleStatus"]

    if status not in ("Pending", "InProgress"):
        schedule.delete_monitoring_schedule()
        print("Monitoring schedule deleted.")
    else:
        print("Waiting for monitoring schedule timed out")

In [41]:
delete_monitoring_schedule(data_monitor)


Deleting Monitoring Schedule with name: penguins-data-monitoring-schedule
Monitoring schedule deleted.
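The polling logic in `delete_monitoring_schedule` can be exercised without touching AWS. The sketch below is a simplified variant, not the function above: it takes the sleep function as a parameter so a test can skip the waiting, and it returns a boolean instead of printing. `FakeSchedule` and `wait_and_delete` are hypothetical names introduced for this illustration.

```python
import time


class FakeSchedule:
    """Stand-in for a monitor object, for local testing only."""

    def __init__(self, statuses):
        self._statuses = list(statuses)
        self.deleted = False

    def describe_schedule(self):
        # Return the next status, repeating the last one once exhausted.
        status = self._statuses[0]
        if len(self._statuses) > 1:
            self._statuses.pop(0)
        return {"MonitoringScheduleStatus": status}

    def delete_monitoring_schedule(self):
        self.deleted = True


def wait_and_delete(schedule, attempts=30, delay=30, sleep=time.sleep):
    """Poll until the schedule is idle, then delete it. Return True on success."""
    busy = ("Pending", "InProgress")
    status = schedule.describe_schedule()["MonitoringScheduleStatus"]

    while status in busy and attempts > 0:
        attempts -= 1
        sleep(delay)
        status = schedule.describe_schedule()["MonitoringScheduleStatus"]

    if status in busy:
        return False

    schedule.delete_monitoring_schedule()
    return True


fake = FakeSchedule(["Pending", "InProgress", "Scheduled"])
deleted = wait_and_delete(fake, sleep=lambda seconds: None)
print(f"Deleted: {deleted}")
```

Injecting `sleep` keeps the production default of 30 seconds while letting a test run instantly.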


# Model Monitoring

## Step 7 - Generating Ground Truth Data

To monitor our model, we need to generate ground truth data for the samples captured by the endpoint. We can simulate this by generating a random ground truth for every sample. Check [Ingest Ground Truth Labels and Merge Them With Predictions](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-merge.html) for more information about this.
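Because the simulated ground truth is random, we should not expect the monitored metrics to look like the baseline. With three equally likely labels drawn independently of the predictions, the expected accuracy is one in three. The small simulation below illustrates this under those assumptions; the predictions are a stand-in sequence, not real endpoint output, and `simulated_accuracy` is a hypothetical helper.

```python
import random

CLASSES = ["Adelie", "Chinstrap", "Gentoo"]


def simulated_accuracy(n_samples, seed=0):
    """Accuracy of arbitrary predictions against uniformly random labels."""
    rng = random.Random(seed)

    # Stand-in predictions: any fixed sequence works, since the labels
    # are drawn independently of them.
    predictions = [CLASSES[i % len(CLASSES)] for i in range(n_samples)]
    labels = [rng.choice(CLASSES) for _ in range(n_samples)]

    hits = sum(p == t for p, t in zip(predictions, labels))
    return hits / n_samples


accuracy = simulated_accuracy(3000)
print(f"Simulated accuracy: {accuracy:.3f}")
```

A model quality job fed with this kind of ground truth will therefore almost certainly complete with violations, which is useful for demonstrating the alerting path.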

In [128]:
ground_truth_path = f"{S3_FILEPATH}/monitoring/groundtruth" 

def generate_ground_truth_data(predictor, ground_truth_path):
    
    def _generate_ground_truth_record(inference_id):
        random.seed(inference_id)

        return {
            "groundTruthData": {
                "data": random.choice(["Adelie", "Chinstrap", "Gentoo"]),
                "encoding": "CSV",
            },
            "eventMetadata": {
                "eventId": str(inference_id),
            },
            "eventVersion": "0",
        }


    def _upload_ground_truth(records, upload_time):
        records = [json.dumps(r) for r in records]
        data = "\n".join(records)
        uri = f"{ground_truth_path}/{upload_time:%Y/%m/%d/%H/%M%S}.jsonl"

        print(f"Uploading ground truth data to {uri}...")

        S3Uploader.upload_string_as_file_body(data, uri)    

                
    def _generate_ground_truth_data(max_records, stop_ground_truth_thread):
        while True:
            records = [_generate_ground_truth_record(i) for i in range(max_records)]
            _upload_ground_truth(records, datetime.utcnow())

            if stop_ground_truth_thread.is_set():
                break

            sleep(30)

                
    stop_ground_truth_thread = Event()
    data = pd.read_csv(LOCAL_FILEPATH).dropna()
    
    groundtruth_thread = Thread(
        target=_generate_ground_truth_data,
        args=(len(data), stop_ground_truth_thread,)
    )
    
    groundtruth_thread.start()
    
    return stop_ground_truth_thread, groundtruth_thread


In [None]:
stop_traffic_thread, traffic_thread = generate_traffic(predictor)
stop_ground_truth_thread, groundtruth_thread = generate_ground_truth_data(predictor, ground_truth_path)
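To make the output of the ground truth thread concrete, here is a minimal sketch that builds a single record and the S3 URI it would be uploaded to, without uploading anything. It mirrors the record layout and the time-based path used above. As a deliberate difference, it uses a local `random.Random` instance instead of reseeding the global generator; `ground_truth_record` and `ground_truth_uri` are hypothetical helpers.

```python
import json
import random
from datetime import datetime, timezone

CLASSES = ["Adelie", "Chinstrap", "Gentoo"]


def ground_truth_record(inference_id):
    """Build one ground truth record; the label is deterministic per id."""
    rng = random.Random(inference_id)
    return {
        "groundTruthData": {"data": rng.choice(CLASSES), "encoding": "CSV"},
        "eventMetadata": {"eventId": str(inference_id)},
        "eventVersion": "0",
    }


def ground_truth_uri(base_path, upload_time):
    """Place the file under a year/month/day/hour prefix, as in the cell above."""
    return f"{base_path}/{upload_time:%Y/%m/%d/%H/%M%S}.jsonl"


record = ground_truth_record(7)
uri = ground_truth_uri(
    "s3://mlschool/penguins/monitoring/groundtruth",
    datetime(2023, 6, 28, 13, 5, 9, tzinfo=timezone.utc),
)

print(json.dumps(record))
print(uri)
```

Seeding by inference id means the same sample always receives the same simulated label, no matter how many times the thread regenerates the file.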

## Step 8 - Scheduling the Monitoring Job

Let's set up a schedule to continuously monitor the quality of the model and compare it to the baseline we generated before. This monitoring job will use the baseline constraints we generated during the Model Quality Check Step. Check [Schedule Model Quality Monitoring Jobs](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-schedule.html) for more information.

To set up a Model Quality Monitoring Job, we can use the [ModelQualityMonitor](https://sagemaker.readthedocs.io/en/stable/api/inference/model_monitor.html#sagemaker.model_monitor.model_monitoring.ModelQualityMonitor) class. The [EndpointInput](https://sagemaker.readthedocs.io/en/v2.24.2/api/inference/model_monitor.html#sagemaker.model_monitor.model_monitoring.EndpointInput) instance configures the attribute the monitoring job should use to determine the prediction from the model.

Check [Amazon SageMaker Model Quality Monitor](https://sagemaker-examples.readthedocs.io/en/latest/sagemaker_model_monitor/model_quality/model_quality_churn_sdk.html) for a complete tutorial on how to run a Model Monitoring Job in SageMaker.

In [130]:
model_monitor = ModelQualityMonitor(
    instance_type="ml.m5.xlarge",
    instance_count=1,
    max_runtime_in_seconds=1800,
    role=role
)

model_monitor.create_monitoring_schedule(
    monitor_schedule_name="penguins-model-monitoring-schedule",
    
    endpoint_input = EndpointInput(
        endpoint_name=predictor.endpoint_name,

        # The endpoint returns an attribute `species` with the
        # prediction from the model. That's the attribute we want to
        # use to compare with the groundtruth.
        inference_attribute="species",

        destination="/opt/ml/processing/input_data",
    ),
    
    problem_type="MulticlassClassification",
    ground_truth_input=ground_truth_path,
    
    constraints=f"{model_quality_location}/constraints.json",
    
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    output_s3_uri=f"{S3_FILEPATH}/monitoring/model-quality",
    enable_cloudwatch_metrics=True,
)

INFO:sagemaker.image_uris:Defaulting to the only supported framework/algorithm version: .
INFO:sagemaker.image_uris:Ignoring unnecessary instance type: None.
INFO:sagemaker.model_monitor.model_monitoring:Creating Monitoring Schedule with name: penguins-model-monitoring-schedule
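Conceptually, the monitoring job pairs each captured prediction with the ground truth record that shares its identifier, then computes metrics over the pairs. The sketch below illustrates that idea only; it is not SageMaker's implementation. The flat shape of the `captured` records and the `inference_id` key are simplifying assumptions, and `merge_and_score` is a hypothetical helper.

```python
def merge_and_score(captured, ground_truth, inference_attribute="species"):
    """Join predictions with labels by id; return (matched count, accuracy)."""
    labels = {
        record["eventMetadata"]["eventId"]: record["groundTruthData"]["data"]
        for record in ground_truth
    }

    # Keep only the predictions that have a matching ground truth record.
    pairs = [
        (sample[inference_attribute], labels[sample["inference_id"]])
        for sample in captured
        if sample["inference_id"] in labels
    ]

    if not pairs:
        return 0, None

    correct = sum(prediction == label for prediction, label in pairs)
    return len(pairs), correct / len(pairs)


# Made-up data: three captured predictions, two ground truth records.
captured = [
    {"inference_id": "0", "species": "Adelie"},
    {"inference_id": "1", "species": "Gentoo"},
    {"inference_id": "2", "species": "Chinstrap"},
]
ground_truth = [
    {"groundTruthData": {"data": "Adelie", "encoding": "CSV"},
     "eventMetadata": {"eventId": "0"}, "eventVersion": "0"},
    {"groundTruthData": {"data": "Adelie", "encoding": "CSV"},
     "eventMetadata": {"eventId": "1"}, "eventVersion": "0"},
]

matched, accuracy = merge_and_score(captured, ground_truth)
print(f"Matched {matched} records, accuracy {accuracy}")
```

The `inference_attribute` argument plays the same role here as in `EndpointInput`: it names the field that holds the prediction.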


## Step 9 - Checking Monitoring Violations

We can check the results of the monitoring job by looking at whether it generated any violations.

In [138]:
description = model_monitor.describe_schedule()
description

{'MonitoringScheduleArn': 'arn:aws:sagemaker:us-east-1:325223348818:monitoring-schedule/penguins-model-monitoring-schedule',
 'MonitoringScheduleName': 'penguins-model-monitoring-schedule',
 'MonitoringScheduleStatus': 'Scheduled',
 'MonitoringType': 'ModelQuality',
 'CreationTime': datetime.datetime(2023, 6, 28, 12, 40, 7, 350000, tzinfo=tzlocal()),
 'LastModifiedTime': datetime.datetime(2023, 6, 28, 13, 32, 55, 997000, tzinfo=tzlocal()),
 'MonitoringScheduleConfig': {'ScheduleConfig': {'ScheduleExpression': 'cron(0 * ? * * *)'},
  'MonitoringJobDefinitionName': 'model-quality-job-definition-2023-06-28-12-40-06-773',
  'MonitoringType': 'ModelQuality'},
 'EndpointName': 'penguins-endpoint',
 'LastMonitoringExecutionSummary': {'MonitoringScheduleName': 'penguins-model-monitoring-schedule',
  'ScheduledTime': datetime.datetime(2023, 6, 28, 13, 0, tzinfo=tzlocal()),
  'CreationTime': datetime.datetime(2023, 6, 28, 13, 8, 8, 325000, tzinfo=tzlocal()),
  'LastModifiedTime': datetime.dateti

In [139]:
status = description["LastMonitoringExecutionSummary"]["MonitoringExecutionStatus"]
print(f"Status: {status}")

if status == "CompletedWithViolations":
    processing_job_arn = description["LastMonitoringExecutionSummary"]["ProcessingJobArn"]
    execution = MonitoringExecution.from_processing_arn(sagemaker_session=sagemaker_session, processing_job_arn=processing_job_arn)
    execution_destination = execution.output.destination
    
    violations_filepath = os.path.join(execution_destination, "constraint_violations.json")
    violations = json.loads(S3Downloader.read_file(violations_filepath))["violations"]
    
    print(json.dumps(violations, indent=2))

Status: CompletedWithViolations
[
  {
    "constraint_check_type": "LessThanThreshold",
    "description": "Metric weightedF2 with 0.35210111607474637 +/- 2.489284148228981E-5 was LessThanThreshold '0.9807900477958932'",
    "metric_name": "weightedF2"
  },
  {
    "constraint_check_type": "LessThanThreshold",
    "description": "Metric accuracy with 0.35944503735325506 +/- 1.8368026615194536E-5 was LessThanThreshold '0.9807692307692307'",
    "metric_name": "accuracy"
  },
  {
    "constraint_check_type": "LessThanThreshold",
    "description": "Metric weightedRecall with 0.359445037353255 +/- 1.8368026615206323E-5 was LessThanThreshold '0.9807692307692308'",
    "metric_name": "weightedRecall"
  },
  {
    "constraint_check_type": "LessThanThreshold",
    "description": "Metric weightedPrecision with 0.35481246824881185 +/- 3.635184951719884E-5 was LessThanThreshold '0.9835164835164835'",
    "metric_name": "weightedPrecision"
  },
  {
    "constraint_check_type": "LessThanThreshold"

## Step 10 - Cleaning up

Let's stop the monitoring job by deleting the monitoring schedule we created before.

In [62]:
delete_monitoring_schedule(model_monitor)


In [63]:
stop_traffic_thread.set()
stop_ground_truth_thread.set()

traffic_thread.join()
groundtruth_thread.join()
