# Step 6: Add data monitoring
After executing the six previous notebooks, you have a production-ready solution with automated model building and model deployment CI/CD pipelines.
This notebook adds continuous monitoring of data quality in real time. [Amazon SageMaker Model Monitor](https://aws.amazon.com/sagemaker/model-monitor/) enables you to set up automated alerts that are triggered when data or model quality deviates from a baseline, for example because of data drift or anomalies.



![](img/six-steps-6.png)

In [148]:
import boto3
import sagemaker 
import json
from sagemaker.model_monitor import DataCaptureConfig
from sagemaker.predictor import Predictor
import time
from time import gmtime, strftime
import uuid
import pandas as pd
import numpy as np
from sagemaker.model_monitor import DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat
from sagemaker.model_monitor import CronExpressionGenerator

In [3]:
sm = boto3.client("sagemaker")

In [97]:
%store -r 

%store

try:
    initialized
except NameError:
    print("+++++++++++++++++++++++++++++++++++++++++++++++++")
    print("[ERROR] YOU HAVE TO RUN 00-start-here notebook   ")
    print("+++++++++++++++++++++++++++++++++++++++++++++++++")

Stored variables and their in-db values:
abalone_dataset_file_name                -> 'abalone.csv'
abalone_dataset_local_url                -> '../dataset/abalone.csv'
auto_ml_job_name                         -> 'automl-asinproc-24-14-33-56'
baseline_s3_url                          -> 's3://sagemaker-us-east-1-906545278380/from-idea-t
bucket_name                              -> 'sagemaker-us-east-1-906545278380'
bucket_prefix                            -> 'from-idea-to-prod/xgboost'
customers_count                          -> 10000
customers_feature_group_name             -> 'fscw-customers-07-20-17-46'
data_bucket                              -> 'sagemaker-us-east-1-906545278380'
data_capture_prefix                      -> 'from-idea-to-prod/datacapture'
data_uploaded                            -> True
domain_id                                -> 'd-r8pbvl3oamh6'
dw_flow_file_url                         -> 's3://sagemaker-us-east-1-906545278380/feature-sto
dw_output_name               

## How Model Monitor works
Amazon SageMaker Model Monitor automatically monitors ML models in production and notifies you when quality issues arise. Model Monitor uses rules to detect drift in your models and data and alerts you when it happens. The following figure shows how this process works.

![](img/model-monitor.png)

The process for setting up the data monitoring:
1. Enable the endpoint to capture data from incoming requests to a trained ML model and the resulting model predictions
2. Create a baseline from the dataset that was used to train the model. The baseline computes metrics and suggests constraints for the metrics. Real-time predictions from your model are compared to the constraints, and are reported as violations if they are outside the constrained values
3. Create a monitoring schedule specifying what data to collect, how often to collect it, how to analyze it, and which reports to produce
4. Inspect the reports, which compare the latest data with the baseline, and watch for any violations reported and for metrics and notifications from Amazon CloudWatch.

## Real-time inference data capture from a SageMaker endpoint
To demonstrate the usage of Model Monitor, we use the existing endpoints deployed by the deployment pipeline in the step 5 notebook.

The model deployment MLOps project implemented in the step 5 notebook contains a data capture configuration for the deployed endpoints. If you clone the project's code repository to the Studio file system, you can see the project files.

The CloudFormation deployment template `endpoint-config-template.yml` enables data capture for the endpoint configuration:
```yaml
EndpointConfig:
    Type: AWS::SageMaker::EndpointConfig
    Properties:
      ProductionVariants:
        - InitialInstanceCount: !Ref EndpointInstanceCount
          InitialVariantWeight: 1.0
          InstanceType: !Ref EndpointInstanceType
          ModelName: !GetAtt Model.ModelName
          VariantName: AllTraffic
      DataCaptureConfig:
          EnableCapture: !Ref EnableDataCapture 
          InitialSamplingPercentage: !Ref SamplingPercentage
          DestinationS3Uri: !Ref DataCaptureUploadPath
          CaptureOptions:
            - CaptureMode: Input
            - CaptureMode: Output
          CaptureContentTypeHeader:
            CsvContentTypes:
              - "text/csv"
```

The configuration files `prod-config.json` and `staging-config.json` provide the actual values for `EnableCapture`, `InitialSamplingPercentage`, and `DestinationS3Uri`:
```json
{
  "Parameters": {
    "StageName": "prod",
    "EndpointInstanceCount": "1",
    "EndpointInstanceType": "ml.m5.large",
    "SamplingPercentage": "80",
    "EnableDataCapture": "true"
  }
}
```
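The parameter values in these files are plain strings, so a quick check before deployment can catch typos. The following is a minimal sketch and not part of the project code: `parse_capture_params` is a hypothetical helper, and the sample JSON only mirrors the file shown above.

```python
import json

# Sample content mirroring the configuration file shown above
CONFIG_JSON = """
{
  "Parameters": {
    "StageName": "prod",
    "EndpointInstanceCount": "1",
    "EndpointInstanceType": "ml.m5.large",
    "SamplingPercentage": "80",
    "EnableDataCapture": "true"
  }
}
"""


def parse_capture_params(config_text):
    """Hypothetical helper: convert the string parameters to typed values and validate them."""
    params = json.loads(config_text)["Parameters"]
    sampling = int(params["SamplingPercentage"])
    if not 0 <= sampling <= 100:
        raise ValueError(f"SamplingPercentage must be between 0 and 100, got {sampling}")
    return {
        "enable_capture": params["EnableDataCapture"].lower() == "true",
        "sampling_percentage": sampling,
    }


capture_params = parse_capture_params(CONFIG_JSON)
print(capture_params)
```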

Let's check the endpoint configuration and see how data capture is configured.

In [93]:
for ep in sm.list_endpoints(StatusEquals="InService")["Endpoints"]:
    print(f"Data capture configuration for {ep['EndpointName']}:")
    print(f"{json.dumps(sm.describe_endpoint(EndpointName=ep['EndpointName'])['DataCaptureConfig'], indent=2)}")

Data capture configuration for step5-deploy-model-prod:
{
  "EnableCapture": true,
  "CaptureStatus": "Started",
  "CurrentSamplingPercentage": 80,
  "DestinationS3Uri": "s3://sagemaker-project-p-yh0gano2qax2/datacapture-prod"
}
Data capture configuration for step5-deploy-model-staging:
{
  "EnableCapture": true,
  "CaptureStatus": "Started",
  "CurrentSamplingPercentage": 100,
  "DestinationS3Uri": "s3://sagemaker-project-p-yh0gano2qax2/datacapture-staging"
}


In [78]:
# Get the configuration for a specific endpoint name
endpoint_name = "step5-deploy-model-prod"
data_capture_uri = sm.describe_endpoint(EndpointName=endpoint_name)['DataCaptureConfig']['DestinationS3Uri']
data_capture_bucket = data_capture_uri.split('/')[2]
data_capture_prefix = '/'.join(data_capture_uri.split('/')[3:])
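Splitting the URI by hand works, but the standard library can do the same with a little validation. A minimal sketch, assuming a hypothetical helper named `split_s3_uri` (it is not part of the SageMaker SDK):

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri):
    """Split an s3://bucket/prefix URI into a (bucket, prefix) tuple."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not a valid S3 URI: {s3_uri}")
    # The path starts with "/", which is not part of the object key prefix
    return parsed.netloc, parsed.path.lstrip("/")


bucket, prefix = split_s3_uri("s3://sagemaker-project-p-yh0gano2qax2/datacapture-prod")
print(bucket, prefix)
```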

### Generate captured data
Data capture files appear only after the endpoint receives inference requests, so you must send some data to the endpoint for inference.

In [94]:
!aws s3 ls {data_capture_uri}/ --recursive

2022-09-27 10:58:57    5846610 datacapture-prod/step5-deploy-model-prod/AllTraffic/2022/09/27/10/57-23-493-215917dc-5813-474e-a145-2f21e7eaa2a3.jsonl
2022-09-27 11:00:36    4677288 datacapture-prod/step5-deploy-model-prod/AllTraffic/2022/09/27/10/58-58-063-04b60d38-d87f-4add-9707-82a8a3c30103.jsonl


In [39]:
predictor = Predictor(
    endpoint_name=endpoint_name, 
    serializer=sagemaker.serializers.CSVSerializer(),
    deserializer=sagemaker.deserializers.CSVDeserializer()
)

In [48]:
test_x = pd.read_csv("tmp/test_x.csv", names=[f'{i}' for i in range(59)])
test_y = pd.read_csv("tmp/test_y.csv", names=['y'])

In [154]:
def generate_endpoint_traffic(predictor, repeats=10):
    for _ in range(0,repeats):
        print("sending inference data to the endpoint")
        predictions = np.array(predictor.predict(test_x.values), dtype=float).squeeze()
        print(predictions)
        time.sleep(3)

In [155]:
generate_endpoint_traffic(predictor, 10)

sending inference data to the endpoint
[0.05137555 0.09782112 0.22581661 ... 0.04346842 0.04000453 0.03656681]
sending inference data to the endpoint
[0.05137555 0.09782112 0.22581661 ... 0.04346842 0.04000453 0.03656681]
sending inference data to the endpoint
[0.05137555 0.09782112 0.22581661 ... 0.04346842 0.04000453 0.03656681]
sending inference data to the endpoint
[0.05137555 0.09782112 0.22581661 ... 0.04346842 0.04000453 0.03656681]
sending inference data to the endpoint
[0.05137555 0.09782112 0.22581661 ... 0.04346842 0.04000453 0.03656681]
sending inference data to the endpoint
[0.05137555 0.09782112 0.22581661 ... 0.04346842 0.04000453 0.03656681]
sending inference data to the endpoint
[0.05137555 0.09782112 0.22581661 ... 0.04346842 0.04000453 0.03656681]
sending inference data to the endpoint
[0.05137555 0.09782112 0.22581661 ... 0.04346842 0.04000453 0.03656681]
sending inference data to the endpoint
[0.05137555 0.09782112 0.22581661 ... 0.04346842 0.04000453 0.03656681]

### View captured data
Now list the data capture files stored in Amazon S3. The data is stored as `jsonl` files, and the Amazon S3 path format is `s3://{destination-bucket-prefix}/{endpoint-name}/{variant-name}/yyyy/mm/dd/hh/filename.jsonl`.

In [95]:
s3_client = boto3.Session().client("s3")

capture_files = [
    capture_file.get("Key") 
    for capture_file in s3_client.list_objects(Bucket=data_capture_bucket, Prefix=data_capture_prefix).get("Contents", [])
]
print("Found data capture files:")
print("\n ".join(capture_files))

Found data capture files:
datacapture-prod/step5-deploy-model-prod/AllTraffic/2022/09/27/10/57-23-493-215917dc-5813-474e-a145-2f21e7eaa2a3.jsonl
 datacapture-prod/step5-deploy-model-prod/AllTraffic/2022/09/27/10/58-58-063-04b60d38-d87f-4add-9707-82a8a3c30103.jsonl
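Because the object keys follow a fixed layout, you can recover the endpoint, variant, and capture hour from a key. The following is a minimal sketch; `parse_capture_key` is a hypothetical helper that assumes the key matches the path format described above exactly.

```python
from datetime import datetime


def parse_capture_key(key, capture_prefix):
    """Hypothetical helper: extract endpoint, variant, and capture hour from an object key."""
    if not key.startswith(capture_prefix):
        raise ValueError(f"Key does not start with prefix {capture_prefix!r}")
    parts = key[len(capture_prefix):].strip("/").split("/")
    if len(parts) != 7:
        raise ValueError(f"Unexpected key layout: {key}")
    endpoint, variant, year, month, day, hour, filename = parts
    return {
        "endpoint": endpoint,
        "variant": variant,
        "hour": datetime(int(year), int(month), int(day), int(hour)),
        "file": filename,
    }


key = (
    "datacapture-prod/step5-deploy-model-prod/AllTraffic/2022/09/27/10/"
    "57-23-493-215917dc-5813-474e-a145-2f21e7eaa2a3.jsonl"
)
info = parse_capture_key(key, "datacapture-prod")
print(info["endpoint"], info["variant"], info["hour"])
```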


Each inference request is captured as one line in the `jsonl` file. The line contains both the input and the output merged together. In the example, you provided the ContentType as `text/csv`, which is reflected in the `observedContentType` value. The capture format also exposes the encoding used for the input and output payloads in the `encoding` value.

In [96]:
def get_obj_body(obj_key):
    return s3_client.get_object(Bucket=data_capture_bucket, Key=obj_key).get("Body").read().decode("utf-8")

capture_file = get_obj_body(capture_files[-1])
print(capture_file[:1000])

{"captureData":{"endpointInput":{"observedContentType":"text/csv","mode":"INPUT","data":"25,1,999,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,1,0\n28,3,999,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,1,0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,1,0\n38,1,999,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,0,0,1,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0\n32,1,999,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,1,1,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0\n40,1,999,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,1,0\n55,3,999,0,1,1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,1,0\n31,3,999,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,0,1,0,0,1,0,0,1,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,1,0\n30,2,999,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,
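To work with a captured record programmatically, you can parse each line as JSON and then parse the CSV payloads. The sketch below uses a small synthetic record: the feature values, predictions, and event ID are made up for illustration, and `parse_capture_record` is a hypothetical helper.

```python
import csv
import io
import json

# Synthetic record for illustration; real records carry the full feature rows
sample_line = json.dumps({
    "captureData": {
        "endpointInput": {
            "observedContentType": "text/csv",
            "mode": "INPUT",
            "data": "25,1,999\n28,3,999",
            "encoding": "CSV",
        },
        "endpointOutput": {
            "observedContentType": "text/csv; charset=utf-8",
            "mode": "OUTPUT",
            "data": "0.05,0.09",
            "encoding": "CSV",
        },
    },
    "eventMetadata": {"eventId": "example-id", "inferenceTime": "2022-09-27T10:57:23Z"},
    "eventVersion": "0",
})


def parse_capture_record(line):
    """Hypothetical helper: return the CSV rows of the request and of the response."""
    record = json.loads(line)["captureData"]
    inputs = list(csv.reader(io.StringIO(record["endpointInput"]["data"])))
    outputs = list(csv.reader(io.StringIO(record["endpointOutput"]["data"])))
    return inputs, outputs


inputs, outputs = parse_capture_record(sample_line)
print(len(inputs), "input rows;", outputs)
```

With the real capture file you would apply the same function to each line of `capture_file.splitlines()`.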

## Model Monitor - monitor data quality
In this example you learn how to set up data quality monitoring.

To enable inference data quality monitoring and evaluation you must:
1. Enable [data capture](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-data-capture.html)
1. [Create a baseline](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-create-baseline.html) with which you compare the real-time traffic
1. Once a baseline is ready, [schedule monitoring jobs](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-scheduling.html) to continuously evaluate and compare against the baseline
1. [Interpret the results](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-interpreting-results.html) of monitoring jobs
1. [Integrate data quality monitoring](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-interpreting-cloudwatch.html) with Amazon CloudWatch

### Create a baselining job with the training dataset
The training dataset with which you trained the model is usually a good baseline dataset. Note that the training dataset data schema and the inference dataset schema should exactly match (i.e. the number and order of the features).

From the training dataset you can ask Amazon SageMaker to suggest a set of baseline _constraints_ and generate descriptive _statistics_ to explore the data. Model Monitor provides a [built-in container](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-pre-built-container.html) that provides the ability to suggest the constraints automatically for CSV and flat JSON input. This `sagemaker-model-monitor-analyzer` container also provides you with a range of model monitoring capabilities, including constraint validation against a baseline, and emitting Amazon CloudWatch metrics. This container is based on Spark and is built with [Deequ](https://github.com/awslabs/deequ). All column names in your baseline dataset must be compliant with Spark. For column names, use only lowercase characters, and _ as the only special character.
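If your own baseline dataset has a header row, you may need to rename columns before baselining. A minimal sketch of one way to do that, assuming a hypothetical helper `to_spark_safe_name`; keeping digits is an assumption of this example, and distinct original names can collide after cleaning, so check the result for duplicates.

```python
import re


def to_spark_safe_name(name):
    """Hypothetical helper: lowercase a column name and replace other characters with '_'."""
    cleaned = re.sub(r"[^a-z0-9_]", "_", name.strip().lower())
    # Collapse runs of underscores and drop leading or trailing ones
    return re.sub(r"_+", "_", cleaned).strip("_")


columns = ["Age", "Job Type", "cons.price.idx", "emp-var-rate"]
safe_columns = [to_spark_safe_name(c) for c in columns]
print(safe_columns)
```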

We use the training dataset that you created during data processing in the step 2 notebook.

In [127]:
!aws s3 ls {train_s3_url}/

2022-09-28 07:07:51    3544984 train.csv


In [128]:
baseline_results_s3_url = f"{baseline_s3_url}/results"
reports_s3_url = f"{baseline_s3_url}/reports"

In [129]:
baseline_dataset_uri = f"{train_s3_url}/train.csv"

In [142]:
baseline_job_name = f"from-idea-to-prod-processing-baselining-{strftime('%d-%H-%M-%S', gmtime())}-{str(uuid.uuid4())[:8]}"

Start a SageMaker processing job on the baseline data to profile the data and suggest constraints.

In [144]:
data_monitor = DefaultModelMonitor(
    role=sm_role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    volume_size_in_gb=20,
    max_runtime_in_seconds=3600,
)

data_monitor.suggest_baseline(
    baseline_dataset=baseline_dataset_uri,
    dataset_format=DatasetFormat.csv(header=False),
    output_s3_uri=baseline_results_s3_url,
    wait=True,
    logs=False,
    job_name=baseline_job_name,
)


Job Name:  from-idea-to-prod-processing-baselining-28-08-25-51-a13f8446
Inputs:  [{'InputName': 'baseline_dataset_input', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-us-east-1-906545278380/from-idea-to-prod/xgboost/train/train.csv', 'LocalPath': '/opt/ml/processing/input/baseline_dataset_input', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}]
Outputs:  [{'OutputName': 'monitoring_output', 'AppManaged': False, 'S3Output': {'S3Uri': 's3://sagemaker-us-east-1-906545278380/from-idea-to-prod/xgboost/baseline/results', 'LocalPath': '/opt/ml/processing/output', 'S3UploadMode': 'EndOfJob'}}]
......................................................................!

<sagemaker.processing.ProcessingJob at 0x7f798832f0d0>

### See the generated statistics and constraints
The baselining job saves the baseline statistics to the `statistics.json` file and the suggested baseline constraints to the `constraints.json` file in the location you specify with `output_s3_uri`.

In [145]:
!aws s3 ls {baseline_results_s3_url}/

2022-09-28 08:31:50       9423 constraints.json
2022-09-28 08:31:50    1558448 statistics.json


In [146]:
baseline_job = data_monitor.latest_baselining_job
# pd.json_normalize replaces the deprecated pd.io.json.json_normalize
schema_df = pd.json_normalize(baseline_job.baseline_statistics().body_dict["features"])
schema_df.head(20)

  


| | name | inferred_type | numerical_statistics.common.num_present | numerical_statistics.common.num_missing | numerical_statistics.mean | numerical_statistics.sum | numerical_statistics.std_dev | numerical_statistics.min | numerical_statistics.max | numerical_statistics.distribution.kll.buckets | numerical_statistics.distribution.kll.sketch.parameters.c | numerical_statistics.distribution.kll.sketch.parameters.k | numerical_statistics.distribution.kll.sketch.data |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | _c0 | Integral | 28831 | 0 | 0.111651 | 3219.0 | 0.314936 | 0.0 | 1.0 | [{'lower_bound': 0.0, 'upper_bound': 0.1, 'cou... | 0.64 | 2048.0 | [[0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0,... |
| 1 | _c1 | Integral | 28831 | 0 | 40.007943 | 1153469.0 | 10.390958 | 17.0 | 98.0 | [{'lower_bound': 17.0, 'upper_bound': 25.1, 'c... | 0.64 | 2048.0 | [[44.0, 29.0, 30.0, 37.0, 37.0, 44.0, 54.0, 36... |
| 2 | _c2 | Integral | 28831 | 0 | 2.573376 | 74193.0 | 2.803649 | 1.0 | 56.0 | [{'lower_bound': 1.0, 'upper_bound': 6.5, 'cou... | 0.64 | 2048.0 | [[8.0, 1.0, 1.0, 2.0, 2.0, 1.0, 2.0, 5.0, 2.0,... |
| 3 | _c3 | Integral | 28831 | 0 | 961.082307 | 27708964.0 | 190.294629 | 0.0 | 999.0 | [{'lower_bound': 0.0, 'upper_bound': 99.9, 'co... | 0.64 | 2048.0 | [[999.0, 4.0, 999.0, 999.0, 999.0, 999.0, 999.... |
| 4 | _c4 | Integral | 28831 | 0 | 0.176407 | 5086.0 | 0.506651 | 0.0 | 7.0 | [{'lower_bound': 0.0, 'upper_bound': 0.7, 'cou... | 0.64 | 2048.0 | [[1.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,... |
| 5 | _c5 | Integral | 28831 | 0 | 0.961812 | 27730.0 | 0.19165 | 0.0 | 1.0 | [{'lower_bound': 0.0, 'upper_bound': 0.1, 'cou... | 0.64 | 2048.0 | [[1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,... |
| 6 | _c6 | Integral | 28831 | 0 | 0.086851 | 2504.0 | 0.281617 | 0.0 | 1.0 | [{'lower_bound': 0.0, 'upper_bound': 0.1, 'cou... | 0.64 | 2048.0 | [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,... |
| 7 | _c7 | Integral | 28831 | 0 | 0.251916 | 7263.0 | 0.434113 | 0.0 | 1.0 | [{'lower_bound': 0.0, 'upper_bound': 0.1, 'cou... | 0.64 | 2048.0 | [[1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0,... |
| 8 | _c8 | Integral | 28831 | 0 | 0.223787 | 6452.0 | 0.416781 | 0.0 | 1.0 | [{'lower_bound': 0.0, 'upper_bound': 0.1, 'cou... | 0.64 | 2048.0 | [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0,... |
| 9 | _c9 | Integral | 28831 | 0 | 0.035899 | 1035.0 | 0.186038 | 0.0 | 1.0 | [{'lower_bound': 0.0, 'upper_bound': 0.1, 'cou... | 0.64 | 2048.0 | [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,... |


In [147]:
# pd.json_normalize replaces the deprecated pd.io.json.json_normalize
constraints_df = pd.json_normalize(
    baseline_job.suggested_constraints().body_dict["features"]
)
constraints_df.head(10)

  


| | name | inferred_type | completeness | num_constraints.is_non_negative |
|---|---|---|---|---|
| 0 | _c0 | Integral | 1.0 | True |
| 1 | _c1 | Integral | 1.0 | True |
| 2 | _c2 | Integral | 1.0 | True |
| 3 | _c3 | Integral | 1.0 | True |
| 4 | _c4 | Integral | 1.0 | True |
| 5 | _c5 | Integral | 1.0 | True |
| 6 | _c6 | Integral | 1.0 | True |
| 7 | _c7 | Integral | 1.0 | True |
| 8 | _c8 | Integral | 1.0 | True |
| 9 | _c9 | Integral | 1.0 | True |
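To build intuition for what the suggested constraints mean, the following sketch applies the two constraint types shown above (completeness and non-negativity) to a tiny batch of rows. It is a simplified illustration only, not how the Model Monitor container evaluates constraints; `check_batch` and the sample batch are hypothetical.

```python
# Constraint entries shaped like the "features" list of the suggested constraints
suggested = [
    {"name": "_c0", "inferred_type": "Integral", "completeness": 1.0,
     "num_constraints": {"is_non_negative": True}},
    {"name": "_c1", "inferred_type": "Integral", "completeness": 1.0,
     "num_constraints": {"is_non_negative": True}},
]


def check_batch(rows, constraints):
    """Illustrative check: return a list of (feature_name, failed_check) tuples."""
    problems = []
    for idx, constraint in enumerate(constraints):
        values = [row[idx] for row in rows]
        present = [v for v in values if v is not None]
        # Completeness is the share of non-missing values in the column
        completeness = len(present) / len(values) if values else 1.0
        if completeness < constraint["completeness"]:
            problems.append((constraint["name"], "completeness"))
        non_negative = constraint.get("num_constraints", {}).get("is_non_negative", False)
        if non_negative and any(v < 0 for v in present):
            problems.append((constraint["name"], "non_negative"))
    return problems


# Made-up batch: the second column has one missing and one negative value
batch = [[0, 25], [1, None], [0, -3]]
problems = check_batch(batch, suggested)
print(problems)
```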


### Create a monitoring schedule
With a monitoring schedule, SageMaker launches processing jobs at a specified frequency to analyze the data collected during a given period. SageMaker provides a [built-in container](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-pre-built-container.html) for performing analysis on tabular datasets. In the processing job, SageMaker compares the dataset for the current analysis with the baseline statistics and constraints and generates a violations report. In addition, CloudWatch metrics are emitted for each data feature under analysis.

In [153]:
mon_schedule_name = "from-idea-to-prod-monitor-schedule-" + strftime(
    "%Y-%m-%d-%H-%M-%S", gmtime()
)

data_monitor.create_monitoring_schedule(
    monitor_schedule_name=mon_schedule_name,
    endpoint_input=predictor.endpoint_name,
    # record_preprocessor_script=pre_processor_script,
    # post_analytics_processor_script=s3_code_postprocessor_uri,
    output_s3_uri=reports_s3_url,
    statistics=data_monitor.baseline_statistics(),
    constraints=data_monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,
)

In [None]:
generate_endpoint_traffic(predictor, 10)

In [172]:
desc_schedule_result = data_monitor.describe_schedule()
print("Schedule status: {}".format(desc_schedule_result["MonitoringScheduleStatus"]))

Schedule status: Scheduled


### List schedule executions
You created an hourly schedule above that begins executions on the hour (plus a 0-20 minute buffer). You have to wait until the clock hits the hour before the first execution appears. You can also change the schedule.
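To estimate when the first execution of an hourly schedule can start, you can compute the next top of the hour and add the buffer. A minimal sketch, assuming the 0-20 minute buffer mentioned above; `next_hourly_window` is a hypothetical helper and the timestamp is only an example.

```python
from datetime import datetime, timedelta


def next_hourly_window(now, max_buffer_minutes=20):
    """Return the (earliest, latest) start time of the next hourly execution."""
    top_of_next_hour = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    return top_of_next_hour, top_of_next_hour + timedelta(minutes=max_buffer_minutes)


# Example timestamp, shortly after a schedule was created
earliest, latest = next_hourly_window(datetime(2022, 9, 28, 9, 4, 36))
print(f"First execution expected between {earliest} and {latest}")
```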

In [173]:
mon_executions = data_monitor.list_executions()

No executions found for schedule. monitoring_schedule_name: from-idea-to-prod-monitor-schedule-2022-09-28-09-04-36


### View a monitoring job execution

In [174]:
if len(mon_executions):
    latest_execution = mon_executions[-1]  # get the latest execution
    latest_execution.wait(logs=False)

    # Describe the job once and reuse the result
    latest_job = latest_execution.describe()
    print("Latest execution status: {}".format(latest_job["ProcessingJobStatus"]))
    # ExitMessage can be absent, so use .get() to avoid a KeyError
    print("Latest execution result: {}".format(latest_job.get("ExitMessage")))

    if latest_job["ProcessingJobStatus"] != "Completed":
        print("No completed executions to inspect further")
    else:
        report_uri = latest_execution.output.destination
        print(f"Report Uri: {report_uri}")
else:
    print("No executions found")

No executions found


### View a violation report
Model Monitor writes any violations it finds, compared to the baseline, to a violation report.

In [175]:
!aws s3 ls {reports_s3_url}/

In [169]:
violations = data_monitor.latest_monitoring_constraint_violations()

No executions found for schedule. monitoring_schedule_name: from-idea-to-prod-monitor-schedule-2022-09-28-09-04-36
No executions found for schedule. monitoring_schedule_name: from-idea-to-prod-monitor-schedule-2022-09-28-09-04-36


In [176]:
if violations:
    pd.set_option("display.max_colwidth", None)
    violations_df = pd.json_normalize(violations.body_dict["violations"])
    # An expression inside an if block is not rendered automatically, so call display()
    display(violations_df.head(10))
else:
    print("No violations report found")

No violations report found
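Once a report exists, a quick summary by check type helps you decide where to look first. The sketch below uses a synthetic report with the same top-level `violations` list that the cell above reads; the feature names, check types, and descriptions are made-up sample values, and `summarize_violations` is a hypothetical helper.

```python
from collections import Counter

# Synthetic report for illustration only
sample_report = {
    "violations": [
        {"feature_name": "_c1", "constraint_check_type": "data_type_check",
         "description": "synthetic example"},
        {"feature_name": "_c3", "constraint_check_type": "baseline_drift_check",
         "description": "synthetic example"},
        {"feature_name": "_c7", "constraint_check_type": "data_type_check",
         "description": "synthetic example"},
    ]
}


def summarize_violations(report):
    """Hypothetical helper: count violations per constraint check type."""
    return Counter(v["constraint_check_type"] for v in report.get("violations", []))


summary = summarize_violations(sample_report)
print(summary.most_common())
```

With a real report you could pass `violations.body_dict` to the same function.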


## Model Monitor - monitor model quality
Model quality monitoring jobs monitor the performance of a model by comparing the predictions that the model makes with the actual ground truth labels that the model attempts to predict. To do this, model quality monitoring merges data that is captured from real-time inference with actual labels that you store in an Amazon S3 bucket, and then compares the predictions with the actual labels.

Model quality monitoring follows the same steps as data quality monitoring, but adds the additional step of merging the actual labels from Amazon S3 with the predictions captured from the real-time inference endpoint.

To monitor model quality, follow these steps:
1. Enable [data capture](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-data-capture.html)
1. [Create a baseline](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-baseline.html). A baseline job compares predictions from the model with ground truth labels in a baseline dataset
1. [Schedule monitoring jobs](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-schedule.html)
1. [Ingest ground truth labels](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-merge.html) that Model Monitor merges with captured prediction data from the real-time inference endpoint
1. [Interpret the results](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-interpreting-results.html)
1. [Integrate model quality monitoring](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-cw.html) with Amazon CloudWatch
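The merge step can be pictured as a join between captured predictions and labels that arrive later. The sketch below is a simplified illustration, not the Model Monitor merge job: it assumes each prediction and label share an event identifier, and the field names, values, and the 0.5 threshold are hypothetical.

```python
# Made-up captured predictions, keyed by a hypothetical event identifier
captured = [
    {"eventId": "a1", "prediction": 0.91},
    {"eventId": "b2", "prediction": 0.62},
    {"eventId": "c3", "prediction": 0.77},
]
# Ground truth labels that arrived so far; the label for "c3" is still missing
ground_truth = {"a1": 1, "b2": 0}


def merge_with_labels(predictions, labels, threshold=0.5):
    """Pair each thresholded prediction with its label, skipping unlabeled events."""
    merged = []
    for item in predictions:
        if item["eventId"] in labels:
            predicted_class = int(item["prediction"] >= threshold)
            merged.append((predicted_class, labels[item["eventId"]]))
    return merged


pairs = merge_with_labels(captured, ground_truth)
accuracy = sum(pred == label for pred, label in pairs) / len(pairs)
print(pairs, accuracy)
```

Only events with a label contribute to the metric, which is why delayed labels also delay the model quality results.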

## Additional monitoring
In addition to data and model quality monitoring with Model Monitor, you can use Amazon SageMaker Clarify to:
- [Monitor bias drift](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-model-monitor-bias-drift.html)
- [Monitor feature attribution drift](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-model-monitor-feature-attribution-drift.html)

Refer to the sample notebook [Monitoring bias drift and feature attribution drift Amazon SageMaker Clarify](https://sagemaker-examples.readthedocs.io/en/latest/sagemaker_model_monitor/fairness_and_explainability/SageMaker-Model-Monitor-Fairness-and-Explainability.html) for a hands-on example and more details.

## Use SageMaker Studio for data and model monitoring
You can use Studio UX to enable and configure data and model monitoring and to visualize results. You can view the details of any monitoring job run, and you can create charts that show the baseline and captured values for any metric that the monitoring job calculates.

Navigate to **SageMaker resources** in the left sidebar and choose **Endpoints** in the drop-down menu. Double-click the endpoint for which you would like to configure model monitoring:

<img src="img/endpoints.png" width="400"/>

In the displayed **Endpoint details** tab you can configure data and model monitoring:

![](img/model-monitoring-ux.png)

## Clean-up resources

In [152]:
data_monitor.stop_monitoring_schedule()
data_monitor.delete_monitoring_schedule()


Stopping Monitoring Schedule with name: from-idea-to-prod-monitor-schedule-2022-09-28-08-56-21

Deleting Monitoring Schedule with name: from-idea-to-prod-monitor-schedule-2022-09-28-08-56-21


### Final clean-up
This is the last notebook in this workshop. If you are finished with exploration, to avoid charges on your AWS account, run the [clean-up notebook](99-clean-up.ipynb).

## Further development ideas for your real-world projects
- Add [visualizations](https://sagemaker-examples.readthedocs.io/en/latest/sagemaker_model_monitor/visualization/SageMaker-Model-Monitor-Visualize.html) for model monitoring reports
- Add data baselining, explainability report generation, and bias report to the model building pipeline
- Implement [model quality monitoring](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality.html)
- Try different inference options such as [serverless](https://docs.aws.amazon.com/sagemaker/latest/dg/serverless-endpoints.html) or [asynchronous](https://docs.aws.amazon.com/sagemaker/latest/dg/async-inference.html) inference
- Address security considerations for your ML environment and solutions. Start with the developer guide [Security in Amazon SageMaker](https://docs.aws.amazon.com/sagemaker/latest/dg/security.html)
- Implement [deployment guardrails](https://docs.aws.amazon.com/sagemaker/latest/dg/deployment-guardrails.html) to control how to update your models in production

## Additional resources
- [Monitor models for data and model quality, bias, and explainability](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor.html)
- [Monitor data quality](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-data-quality.html)
- [Model Monitor visualizations](https://sagemaker-examples.readthedocs.io/en/latest/sagemaker_model_monitor/visualization/SageMaker-Model-Monitor-Visualize.html)
- [Monitor Amazon SageMaker](https://docs.aws.amazon.com/sagemaker/latest/dg/monitoring-overview.html)
- [Monitoring a Model in Production](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-model-monitor.html)
- [Security in Amazon SageMaker](https://docs.aws.amazon.com/sagemaker/latest/dg/security.html)
- [Deployment guardrails](https://docs.aws.amazon.com/sagemaker/latest/dg/deployment-guardrails.html)

# Shutdown kernel

In [5]:
%%html

<p><b>Shutting down your kernel for this notebook to release resources.</b></p>
<button class="sm-command-button" data-commandlinker-command="kernelmenu:shutdown" style="display:none;">Shutdown Kernel</button>
        
<script>
try {
    els = document.getElementsByClassName("sm-command-button");
    els[0].click();
}
catch(err) {
    // NoOp
}    
</script>