# Prototype: MLOps CDK Pipeline Model Quality Monitor (SageMaker SDK)

## Overview
The purpose of this notebook is to formulate and test the data preparation and setup for the Model Quality Monitor Baseline as it relates to the MLOps CDK Pipeline MKV testing. The notebook uses the SageMaker SDK to verify that the solution works before porting the functionality to a Lambda Function.

><div class="alert alert-block alert-info"><b>NOTE: </b> The code is roughly based on the official <a href="https://github.com/aws/amazon-sagemaker-examples/blob/master/sagemaker_model_monitor/model_quality/model_quality_churn_sdk.ipynb"><b>SageMaker Example</b></a></div> 

## Section 1 - Setup

### Section 1.1 - Necessary Libraries

In [None]:
from datetime import datetime, timedelta, timezone
import json
import os
import re
import boto3
from time import sleep
from threading import Thread

import pandas as pd

from sagemaker import get_execution_role, Session, image_uris
from sagemaker.s3 import S3Downloader, S3Uploader
from sagemaker.processing import ProcessingJob
from sagemaker.serializers import CSVSerializer

from sagemaker.model import Model
from sagemaker.model_monitor import DataCaptureConfig

session = Session()
ssm = boto3.client('ssm')

### Section 1.2 - AWS Region and IAM Role

In [None]:
# Get the execution role
role = get_execution_role()
print("RoleArn:", role)

region = session.boto_region_name
print("Region:", region)

### Section 1.3 - Bucket and Prefixes

For the sake of testing, we will leverage the SageMaker Production Endpoint and the Production Logs S3 Bucket.

><div class="alert alert-block alert-warning"><b>NOTE: </b>The following section assumes that the CDK Pipeline has already been deployed into production.</div> 

In [None]:
# Set the Endpoint name
model_name = 'abalone'
endpoint_name = f'{model_name}-prod-endpoint'

# Get the Baseline Data URI SSM Parameter
parameter_name = 'BaselineDataUri'
baseline_data_uri = ssm.get_parameter(Name=parameter_name)['Parameter']['Value']

# Leverage the existing "Prod" parameters
# Production Logs bucket name
bucket = 'proddeploymentstage-prodappl-logss3bucket004b0f70-3nv3l2whchah'

# S3 Prefixes
data_capture_prefix = 'endpoint-data-capture'
reports_prefix = 'reports'

# S3 URIs
s3_capture_uri = f's3://{bucket}/{data_capture_prefix}'
s3_ground_truth_uri = f's3://{bucket}/ground-truth-data/{datetime.now():%Y-%m-%d-%H-%M-%S}'
s3_report_uri = f's3://{bucket}/{reports_prefix}'

# Get the Model Monitor URI
monitor_image_uri = image_uris.retrieve(framework='model-monitor', region=region)

# Print these outputs to remember
print(f'Baseline Data Location: {baseline_data_uri}')
print(f'Model Monitor Image URI: {monitor_image_uri}')
print(f'Endpoint Data Capture Location: {s3_capture_uri}')
print(f'Ground Truth Data Location: {s3_ground_truth_uri}')
print(f'Model Quality Report Location: {s3_report_uri}')

### Section 1.4 - Download the Baseline Dataset

The baseline dataset should have already been created by the "Evaluation Step" of the pipeline, with its location stored as an SSM parameter. Next, we download this baseline data and use it to suggest our baseline constraints.

In [None]:
# Test downloading the baseline data
S3Downloader.download(baseline_data_uri, '.')
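
When this step is ported to a Lambda Function, the SageMaker SDK's `S3Downloader` may not be available, so the same download could be done with plain `boto3`. The following is a minimal sketch, assuming the SSM parameter holds an `s3://bucket/key` URI that points at a single object. The helper name `split_s3_uri` and the example URI are hypothetical.

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri):
    """Split an s3://bucket/key URI into a (bucket, key) tuple."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not a valid S3 URI: {s3_uri}")
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical URI, for illustration only
example_bucket, example_key = split_s3_uri("s3://example-bucket/evaluation/baseline.csv")
print(example_bucket, example_key)

# In a Lambda Function, the download itself could then be (not run here):
#   import boto3
#   boto3.client("s3").download_file(example_bucket, example_key, "/tmp/baseline.csv")
```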

In [None]:
# Look at the data
!head baseline.csv
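
Before submitting the baseline job, it may be worth confirming that the downloaded file has the columns the job will be told to read. The sketch below assumes the CSV has a header row containing `prediction` and `label`, which are the attribute names passed to `suggest_baseline` in Section 2.3. The helper name `missing_baseline_columns` and the sample rows are hypothetical.

```python
import csv
import io

# Column names the baseline job is told to look for (see Section 2.3)
REQUIRED_COLUMNS = ("prediction", "label")


def missing_baseline_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required columns that are absent from the CSV header row."""
    header = next(csv.reader(csv_file), [])
    return [name for name in required if name not in header]


# Hypothetical two-row sample standing in for baseline.csv
sample_csv = io.StringIO("prediction,label\n9.8,10\n7.1,7\n")
print(missing_baseline_columns(sample_csv))

# Against the real file this could be (not run here):
#   with open("baseline.csv", newline="") as f:
#       print(missing_baseline_columns(f))
```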

---

## Section 2 - Generate the baseline for model quality performance

### Section 2.1 - Create a SageMaker Predictor object for the Production Endpoint

><div class="alert alert-block alert-warning"><b>NOTE: </b>The following uses the SageMaker SDK <em>Predictor</em> along with the <em>CSVSerializer</em>. It is unknown at this stage how this will translate to the function call in the Lambda Function. Additionally, it is unknown at this point whether creating a Baseline Suggestion actually requires the Endpoint.</div>

In [None]:
# Instantiate a `Predictor` to call the endpoint
from sagemaker.predictor import Predictor
predictor = Predictor(endpoint_name=endpoint_name, sagemaker_session=session, serializer=CSVSerializer())
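
One possible way the `Predictor` call could translate to a Lambda Function is through the `boto3` `sagemaker-runtime` client and its `invoke_endpoint` call, sending the records as `text/csv`. This is only a sketch under that assumption and has not been verified against the production endpoint. The helper names `to_csv_payload` and `invoke_csv_endpoint` and the feature values are hypothetical.

```python
def to_csv_payload(rows):
    """Serialize rows of feature values as CSV text, one record per line."""
    return "\n".join(",".join(str(value) for value in row) for row in rows)


def invoke_csv_endpoint(runtime_client, target_endpoint, rows):
    """Send CSV records to an endpoint and return the raw response text."""
    response = runtime_client.invoke_endpoint(
        EndpointName=target_endpoint,
        ContentType="text/csv",
        Body=to_csv_payload(rows),
    )
    return response["Body"].read().decode("utf-8")


# Hypothetical feature values, for illustration only
print(to_csv_payload([[0.455, 0.365, 0.095], [0.35, 0.265, 0.09]]))

# In a Lambda Function this could be called as (not run here):
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   result = invoke_csv_endpoint(runtime, "abalone-prod-endpoint", rows)
```

Passing the client in as an argument keeps the helper easy to exercise with a stub, without needing a live endpoint.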

### Section 2.2 - Setup S3 locations for capturing baseline results

In [None]:
# Set up the locations for capturing the baseline results
baseline_prefix = 'baselining'
baseline_data_prefix = baseline_prefix + '/data'
baseline_results_prefix = baseline_prefix + '/results'

# NOTE: this overwrites `baseline_data_uri`, which until now held the SSM parameter value
baseline_data_uri = f's3://{bucket}/{baseline_data_prefix}'
baseline_results_uri = f's3://{bucket}/{baseline_results_prefix}'
print(f'Baseline data uri: {baseline_data_uri}')
print(f'Baseline results uri: {baseline_results_uri}')

In [None]:
baseline_dataset_uri = S3Uploader.upload('baseline.csv', baseline_data_uri)
baseline_dataset_uri

### Section 2.3 - Create the Baseline Suggestion

In [None]:
from sagemaker.model_monitor import ModelQualityMonitor
from sagemaker.model_monitor import EndpointInput
from sagemaker.model_monitor.dataset_format import DatasetFormat

# Create the model quality monitoring object
model_quality_monitor = ModelQualityMonitor(
    role=role,
    instance_count=1,
    instance_type='ml.m5.xlarge',
    volume_size_in_gb=20,
    max_runtime_in_seconds=1800,
    sagemaker_session=session
)

In [None]:
# Name of the model quality baseline job (timestamped in UTC)
baseline_job_name = f"abalone-baseline-job-{datetime.now(timezone.utc):%Y-%m-%d-%H%M}"
baseline_job_name

In [None]:
# Execute the baseline suggestion job.
# Specify the problem type (Regression in this case) and the other required attributes.
job = model_quality_monitor.suggest_baseline(
    job_name=baseline_job_name,
    baseline_dataset=baseline_dataset_uri,
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri=baseline_results_uri,
    problem_type='Regression',
    inference_attribute='prediction',
    ground_truth_attribute='label'
)
job.wait(logs=False)

### Section 2.4 - Review the Results

#### Baseline Statistics

In [None]:
baseline_job = model_quality_monitor.latest_baselining_job
regression_metrics = baseline_job.baseline_statistics().body_dict['regression_metrics']
pd.json_normalize(regression_metrics).T

#### Baseline Constraints

In [None]:
pd.DataFrame(baseline_job.suggested_constraints().body_dict["regression_constraints"]).T

---

## Section 3 - Conclusion

From the generated constraints, the Model Monitor checks that the $R^2$ score does not drop below $0.617261$. Since the main evaluation metric for this use case is the $RMSE$, it is the easiest indicator of concept drift in model quality. If the Model Monitor detects that the $RMSE$ exceeds the constraint of $2.04811$ (which, incidentally, is similar to the evaluation threshold of $2.1$ in the Model Framing Example), an alert is raised.
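
The threshold checks described in this conclusion can be illustrated with a small, self-contained sketch. It is not the Model Monitor implementation. The dictionary layout (a `threshold` plus a `comparison_operator` per metric) is an assumption about the suggested constraints, and the helper names, predictions, and labels are hypothetical. Only the two threshold values come from this notebook.

```python
import math


def compute_regression_metrics(predictions, labels):
    """Compute RMSE and R^2 for paired predictions and ground-truth labels."""
    n = len(labels)
    mean_label = sum(labels) / n
    ss_res = sum((y - p) ** 2 for p, y in zip(predictions, labels))
    ss_tot = sum((y - mean_label) ** 2 for y in labels)
    return {"rmse": math.sqrt(ss_res / n), "r2": 1.0 - ss_res / ss_tot}


def violated_constraints(metrics, constraints):
    """Return the names of metrics that breach their threshold."""
    violations = []
    for name, rule in constraints.items():
        value = metrics[name]
        operator = rule["comparison_operator"]
        if operator == "GreaterThanThreshold" and value > rule["threshold"]:
            violations.append(name)
        elif operator == "LessThanThreshold" and value < rule["threshold"]:
            violations.append(name)
    return violations


# Thresholds quoted in the conclusion; the dictionary layout is an assumption
example_constraints = {
    "rmse": {"threshold": 2.04811, "comparison_operator": "GreaterThanThreshold"},
    "r2": {"threshold": 0.617261, "comparison_operator": "LessThanThreshold"},
}

# Hypothetical predictions and labels, for illustration only
example_metrics = compute_regression_metrics(
    predictions=[9.0, 11.5, 7.0, 14.0], labels=[10, 11, 8, 9]
)
print(example_metrics, violated_constraints(example_metrics, example_constraints))
```

Note that the $R^2$ calculation divides by the variance of the labels, so this sketch assumes the labels are not all identical.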