# Deploying and Monitoring

In this notebook, we deploy the network traffic classification model trained in the previous steps to Amazon SageMaker hosting, which exposes a fully managed real-time endpoint for inference.

Amazon SageMaker provides capabilities that monitor ML models in production and detect deviations in data quality relative to a baseline dataset (e.g. the training dataset). These capabilities let you capture the metadata, input, and output of invocations of the models that you deploy with Amazon SageMaker, and then analyze the captured data and monitor its quality.

We will deploy the model to a real-time endpoint with data capture enabled and start collecting some inference inputs/outputs. Then, we will create a baseline and finally enable model monitoring to compare inference data with respect to the baseline and analyze the quality.

## Deployment with Data Capture

We are going to deploy the latest network traffic classification model that we have trained. To deploy a model using the SageMaker Python SDK, we need the Amazon S3 URI where the model artifacts are stored and the URI of the Docker container that will be used for hosting this model.

First, let's determine the Amazon S3 URI of the model artifacts using two utility functions: the first queries the Amazon SageMaker service for the latest completed training job whose name contains **fraud-detection-debug**, and the second describes that training job to read the location of its artifacts.

In [52]:
import boto3

def get_latest_training_job_name(base_job_name):
    # Return the most recent completed training job whose name contains base_job_name.
    client = boto3.client('sagemaker')
    response = client.list_training_jobs(NameContains=base_job_name, SortBy='CreationTime',
                                         SortOrder='Descending', StatusEquals='Completed')
    if len(response['TrainingJobSummaries']) > 0:
        return response['TrainingJobSummaries'][0]['TrainingJobName']
    else:
        raise Exception('Training job not found.')

def get_training_job_s3_model_artifacts(job_name):
    client = boto3.client('sagemaker')
    response = client.describe_training_job(TrainingJobName=job_name)
    s3_model_artifacts = response['ModelArtifacts']['S3ModelArtifacts']
    return s3_model_artifacts

#latest_training_job_name = get_latest_training_job_name('nw-traffic-classification-xgb')

latest_training_job_name = get_latest_training_job_name('fraud-detection-debug')
print(latest_training_job_name)
model_path = get_training_job_s3_model_artifacts(latest_training_job_name)
print(model_path)

fraud-detection-debug-2020-02-04-10-37-56-447
s3://sagemaker-us-east-2-298841451579/aim362/output/fraud-detection-debug-2020-02-04-10-37-56-447/output/model.tar.gz
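
Note that `NameContains` matches a substring anywhere in the job name. If a strict prefix match is needed, it has to be applied on the client side. The following is a minimal sketch of that idea, assuming a boto3 SageMaker client is passed in; the helper name `latest_job_with_prefix` is hypothetical and not part of the workshop code.

```python
def latest_job_with_prefix(client, prefix):
    """Return the newest completed training job whose name starts with prefix.

    `client` is assumed to be a boto3 SageMaker client (or any object with a
    compatible `list_training_jobs` method).
    """
    kwargs = {
        'NameContains': prefix,
        'SortBy': 'CreationTime',
        'SortOrder': 'Descending',
        'StatusEquals': 'Completed',
    }
    while True:
        response = client.list_training_jobs(**kwargs)
        for summary in response['TrainingJobSummaries']:
            # NameContains is a substring filter, so check the prefix here.
            if summary['TrainingJobName'].startswith(prefix):
                return summary['TrainingJobName']
        token = response.get('NextToken')
        if not token:
            raise LookupError('No completed training job starts with ' + repr(prefix))
        # Ask for the next page of results.
        kwargs['NextToken'] = token
```

It could be called as `latest_job_with_prefix(boto3.client('sagemaker'), 'fraud-detection-debug')`.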


In [54]:
from time import gmtime, strftime
from sagemaker.xgboost import XGBoostModel

model_name = 'nw-traffic-classification-xgb-model-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())

code_location = 's3://{0}/{1}/code'.format(bucket_name, prefix)
xgboost_model = XGBoostModel(model_data=model_path,
                             entry_point='deploy_xgboost.py',
                             source_dir='source_dir/',
                             name=model_name,
                             code_location=code_location,
                             framework_version='0.90-2',
                             role=role, 
                             sagemaker_session=sagemaker_session)

Finally, we create an endpoint with data capture enabled so that we can monitor the model's data quality.
Data capture is enabled at the endpoint configuration level for the Amazon SageMaker real-time endpoint. You can choose to capture the request payload, the response payload, or both; captured data is stored in JSON format.

In [55]:
from time import gmtime, strftime
from sagemaker.model_monitor import DataCaptureConfig

s3_capture_upload_path = 's3://{}/{}/monitoring/datacapture'.format(bucket_name, prefix)
print(s3_capture_upload_path)

endpoint_name = 'nw-traffic-classification-xgb-ep-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print(endpoint_name)

pred = xgboost_model.deploy(initial_instance_count=1,
                            instance_type='ml.m5.xlarge',
                            endpoint_name=endpoint_name,
                            data_capture_config=DataCaptureConfig(
                                enable_capture=True,
                                sampling_percentage=100,
                                destination_s3_uri=s3_capture_upload_path))

s3://sagemaker-us-east-2-298841451579/aim362/monitoring/datacapture
nw-traffic-classification-xgb-ep-2020-02-04-10-52-18
-----------!
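
Each captured invocation is written as one JSON object per line. As a rough illustration of how such a record could be inspected, the sketch below parses a hand-written sample. The event ID and timestamp are made up, and the field names reflect the capture format as it is commonly documented, so they should be verified against a real capture file.

```python
import json

# Hand-written sample record; the ID and timestamp are illustrative only.
sample_line = json.dumps({
    "captureData": {
        "endpointInput": {
            "observedContentType": "text/csv",
            "mode": "INPUT",
            "data": "3388251889,231.0673178,12708.70248,0,0,0,0,986,650,8",
            "encoding": "CSV",
        },
        "endpointOutput": {
            "observedContentType": "text/csv; charset=utf-8",
            "mode": "OUTPUT",
            "data": "0",
            "encoding": "CSV",
        },
    },
    "eventMetadata": {
        "eventId": "00000000-0000-0000-0000-000000000000",
        "inferenceTime": "2020-02-04T11:00:00Z",
    },
    "eventVersion": "0",
})


def parse_capture_line(line):
    """Return (input_features, output_value) from one capture record."""
    record = json.loads(line)
    capture = record["captureData"]
    features = capture["endpointInput"]["data"].split(",")
    output = capture["endpointOutput"]["data"]
    return features, output


features, output = parse_capture_line(sample_line)
print(len(features), output)
```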

After the deployment has completed, we can use the RealTimePredictor object to send HTTPS requests to the deployed endpoint and get inference results.

In [72]:
from sagemaker.predictor import RealTimePredictor

pred = RealTimePredictor(endpoint_name)
pred.content_type = 'text/csv'
pred.accept = 'text/csv'


# Expecting class 0
test_values = "3388251889,231.0673178,12708.70248,0,0,0,0,986,650,8"
result = pred.predict(test_values)
print(result)

b'0'
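
The endpoint returns the prediction as raw bytes. A small sketch of turning that response into an integer class label follows; it assumes a single value per response, and the helper name `parse_class_label` is hypothetical.

```python
def parse_class_label(raw):
    """Convert a raw CSV response such as b'0' into an integer class label."""
    text = raw.decode('utf-8') if isinstance(raw, bytes) else raw
    # Go through float() so that a float-formatted value such as '0.0' also works.
    return int(float(text.strip()))


print(parse_class_label(b'0'))
```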


## Baselining

Let's ask Amazon SageMaker to suggest a set of baseline constraints and to generate descriptive statistics for our features from the validation dataset. Note that we use the validation dataset in this workshop to keep the baselining time short, and that the file extension had to be changed because baselining jobs expect the .csv extension by default.
In practice, you may want to use a larger dataset as the baseline.

In [78]:
from sagemaker.model_monitor import DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

my_default_monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type='ml.c5.4xlarge',
    volume_size_in_gb=20,
    max_runtime_in_seconds=3600,
)

In [79]:
my_default_monitor.suggest_baseline(
    baseline_dataset=baseline_data_path,
    dataset_format=DatasetFormat.csv(header=False),
    output_s3_uri=baseline_results_path,
    wait=True
)


Job Name:  baseline-suggestion-job-2020-02-04-11-35-43-300
Inputs:  [{'InputName': 'baseline_dataset_input', 'S3Input': {'S3Uri': 's3://sagemaker-us-east-2-298841451579/aim362/val/val_data.csv', 'LocalPath': '/opt/ml/processing/input/baseline_dataset_input', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}]
Outputs:  [{'OutputName': 'monitoring_output', 'S3Output': {'S3Uri': 's3://sagemaker-us-east-2-298841451579/aim362/monitoring/baselining/results', 'LocalPath': '/opt/ml/processing/output', 'S3UploadMode': 'EndOfJob'}}]
.................
2020-02-04 11:38:22,114 - __main__ - INFO - All params:{'ProcessingJobArn': 'arn:aws:sagemaker:us-east-2:298841451579:processing-job/baseline-suggestion-job-2020-02-04-11-35-43-300', 'ProcessingJobName': 'baseline-suggestion-job-2020-02-04-11-35-43-300', 'Environment': {'dataset_format': '{"csv": {"header": false, "output_columns_position": "START"}}', 'dataset_source': '

<sagemaker.processing.ProcessingJob at 0x7f1f7d868e80>

Let's display the statistics that were generated by the baselining job.

In [80]:
import pandas as pd

baseline_job = my_default_monitor.latest_baselining_job
# pd.json_normalize requires pandas >= 1.0; older versions expose it as pd.io.json.json_normalize.
schema_df = pd.json_normalize(baseline_job.baseline_statistics().body_dict["features"])
schema_df.head(10)

| | inferred_type | name | numerical_statistics.common.num_missing | numerical_statistics.common.num_present | numerical_statistics.distribution.kll.buckets | numerical_statistics.distribution.kll.sketch.data | numerical_statistics.distribution.kll.sketch.parameters.c | numerical_statistics.distribution.kll.sketch.parameters.k | numerical_statistics.max | numerical_statistics.mean | numerical_statistics.min | numerical_statistics.std_dev | numerical_statistics.sum |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Integral | _c0 | 0 | 615 | [{'lower_bound': 2147483647.0, 'upper_bound': ... | [[3386662126.0, 3388251889.0, 3729024090.0, 38... | 0.64 | 2048.0 | 6665254000.0 | 4955335000.0 | 3200017000.0 | 974201100.0 | 3047531000000.0 |
| 1 | Fractional | _c1 | 0 | 615 | [{'lower_bound': 10.13506361, 'upper_bound': 1... | [[222.5298151, 231.06731780000004, 360.1675018... | 0.64 | 2048.0 | 997.6217 | 516.8227 | 10.13506 | 287.7551 | 317846.0 |
| 2 | Fractional | _c2 | 0 | 615 | [{'lower_bound': 0.0, 'upper_bound': 5649.1813... | [[12461.66965, 12708.70248, 20169.3801, 20262.... | 0.64 | 2048.0 | 56491.81 | 9042.773 | 0.0 | 8934.9 | 5561306.0 |
| 3 | Integral | _c3 | 0 | 615 | [{'lower_bound': 0.0, 'upper_bound': 0.1, 'cou... | [[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,... | 0.64 | 2048.0 | 1.0 | 0.004878049 | 0.0 | 0.06967247 | 3.0 |
| 4 | Integral | _c4 | 0 | 615 | [{'lower_bound': 0.0, 'upper_bound': 0.9, 'cou... | [[9.0, 0.0, 0.0, 5.0, 9.0, 0.0, 7.0, 0.0, 8.0,... | 0.64 | 2048.0 | 9.0 | 0.8325203 | 0.0 | 1.990118 | 512.0 |
| 5 | Integral | _c5 | 0 | 615 | [{'lower_bound': 0.0, 'upper_bound': 0.1, 'cou... | [[1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0,... | 0.64 | 2048.0 | 1.0 | 0.2178862 | 0.0 | 0.4128096 | 134.0 |
| 6 | Integral | _c6 | 0 | 615 | [{'lower_bound': 0.0, 'upper_bound': 0.1, 'cou... | [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0,... | 0.64 | 2048.0 | 1.0 | 0.04065041 | 0.0 | 0.197479 | 25.0 |
| 7 | Integral | _c7 | 0 | 615 | [{'lower_bound': 0.0, 'upper_bound': 99.2, 'co... | [[0.0, 986.0, 923.0, 0.0, 0.0, 0.0, 0.0, 0.0, ... | 0.64 | 2048.0 | 992.0 | 37.29593 | 0.0 | 171.3512 | 22937.0 |
| 8 | Fractional | _c8 | 0 | 615 | [{'lower_bound': 0.0, 'upper_bound': 97.5, 'co... | [[0.0, 650.0, 542.0, 0.0, 0.0, 0.0, 0.0, 0.0, ... | 0.64 | 2048.0 | 975.0 | 25.78049 | 0.0 | 126.4123 | 15855.0 |
| 9 | Integral | _c9 | 0 | 615 | [{'lower_bound': 0.0, 'upper_bound': 0.9, 'cou... | [[0.0, 8.0, 7.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0,... | 0.64 | 2048.0 | 9.0 | 0.2406504 | 0.0 | 1.181417 | 148.0 |


Then, we can also visualize the constraints.

In [81]:
# pd.json_normalize requires pandas >= 1.0; older versions expose it as pd.io.json.json_normalize.
constraints_df = pd.json_normalize(baseline_job.suggested_constraints().body_dict["features"])
constraints_df.head(10)

| | completeness | inferred_type | name | num_constraints.is_non_negative |
|---|---|---|---|---|
| 0 | 1.0 | Integral | _c0 | True |
| 1 | 1.0 | Fractional | _c1 | True |
| 2 | 1.0 | Fractional | _c2 | True |
| 3 | 1.0 | Integral | _c3 | True |
| 4 | 1.0 | Integral | _c4 | True |
| 5 | 1.0 | Integral | _c5 | True |
| 6 | 1.0 | Integral | _c6 | True |
| 7 | 1.0 | Integral | _c7 | True |
| 8 | 1.0 | Fractional | _c8 | True |
| 9 | 1.0 | Integral | _c9 | True |


#### Results

The baselining job has inspected the validation dataset and generated constraints and statistics that will be used to monitor our endpoint.
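
To make the meaning of these constraints more concrete, here is a simplified sketch that checks a single record against rules of the same shape (completeness, inferred type, non-negativity). It is only an illustration: the monitoring job evaluates constraints over the whole captured dataset rather than record by record, and the flat dictionary layout and the helper names `is_integral` and `check_record` are assumptions made for this example.

```python
def is_integral(text):
    """True when the text parses as an integer (so '1.07' does not count)."""
    try:
        int(text)
        return True
    except ValueError:
        return False


def check_record(values, constraints):
    """Return a list of (feature_name, reason) violations for one record."""
    violations = []
    for value, rule in zip(values, constraints):
        name = rule['name']
        if value == '':
            # A completeness of 1.0 means the baseline had no missing values.
            if rule['completeness'] == 1.0:
                violations.append((name, 'missing value'))
            continue
        if rule['inferred_type'] == 'Integral' and not is_integral(value):
            violations.append((name, 'expected Integral'))
        if rule['is_non_negative'] and float(value) < 0:
            violations.append((name, 'negative value'))
    return violations


# Rules mirroring the first two rows of the constraints table above.
constraints = [
    {'name': '_c0', 'inferred_type': 'Integral', 'completeness': 1.0, 'is_non_negative': True},
    {'name': '_c1', 'inferred_type': 'Fractional', 'completeness': 1.0, 'is_non_negative': True},
]

print(check_record(['3388251889', '231.0673178'], constraints))
print(check_record(['1.07', '-5.0'], constraints))
```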

## Generating violations artificially

To get results that are relevant to the monitoring analysis, we are going to artificially generate some inference requests whose feature values cause specific violations, and then invoke the endpoint with this data.

This takes roughly 2 to 3 minutes for 1000 inferences: the 0.15-second pause between calls alone adds up to 150 seconds.

In [None]:
import time
import numpy as np
dist_values = np.random.normal(1, 0.2, 1000)
dist_values[1:5]

In [97]:
artificial_values = "{0},231.0673178,12708.70248,0,0,0,986,650,{0}"

for i in range(1000):
    pred.predict(artificial_values.format(str(dist_values[i])))
    time.sleep(0.15)
    if i > 0 and i % 100 == 0 :
        print('Executed {0} inferences.'.format(i))

Executed 100 inferences.
Executed 200 inferences.
Executed 300 inferences.
Executed 400 inferences.
Executed 500 inferences.
Executed 600 inferences.
Executed 700 inferences.
Executed 800 inferences.
Executed 900 inferences.
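
To see why payloads like these can trip a data type check, the sketch below rebuilds similar payloads offline and measures how many values in a given position parse as integers. It uses the standard library's `random.gauss` in place of `np.random.normal`, so the individual values differ from the ones sent above, and the helper names are hypothetical.

```python
import random

# Payload template copied from the cell above.
TEMPLATE = "{0},231.0673178,12708.70248,0,0,0,986,650,{0}"


def is_integral(field):
    """True when the text parses as an integer (so '1.0' does not count)."""
    try:
        int(field)
        return True
    except ValueError:
        return False


def integral_share(payloads, column):
    """Fraction of payloads whose field at `column` parses as an integer."""
    fields = [p.split(',')[column] for p in payloads]
    return sum(is_integral(f) for f in fields) / len(fields)


random.seed(0)  # arbitrary seed, only to make the sketch repeatable
payloads = [TEMPLATE.format(random.gauss(1, 0.2)) for _ in range(1000)]

print(integral_share(payloads, 0))  # first field: drawn from the normal distribution
print(integral_share(payloads, 3))  # fourth field: the constant 0
```

A position that was Integral in the baseline but now only receives fractional values would have an integral share of 0, which is the kind of mismatch a data type check reports.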


## Monitoring

### Create Monitoring Schedule

In [None]:
from sagemaker.model_monitor import CronExpressionGenerator
from time import gmtime, strftime

endpoint_name = pred.endpoint

mon_schedule_name = 'nw-traffic-classification-xgb-mon-sch-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
my_default_monitor.create_monitoring_schedule(
    monitor_schedule_name=mon_schedule_name,
    endpoint_input=endpoint_name,
    post_analytics_processor_script=postprocessor_path,
    output_s3_uri=reports_path,
    statistics=my_default_monitor.baseline_statistics(),
    constraints=my_default_monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True
)

### Describe Monitoring Schedule

In [None]:
desc_schedule_result = my_default_monitor.describe_schedule()
desc_schedule_result
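
A newly created schedule may not be active immediately. As a hedged sketch, the status could be polled until it becomes `Scheduled`. The helper below is hypothetical; it assumes the dictionary returned by `describe_schedule()` carries a `MonitoringScheduleStatus` key, as the DescribeMonitoringSchedule API response does.

```python
import time


def wait_until_scheduled(describe, poll_seconds=10, max_polls=30):
    """Poll `describe()` until the schedule reports the 'Scheduled' status.

    `describe` is any zero-argument callable returning a dict shaped like the
    DescribeMonitoringSchedule response, e.g. my_default_monitor.describe_schedule.
    """
    for _ in range(max_polls):
        status = describe()['MonitoringScheduleStatus']
        if status == 'Scheduled':
            return status
        if status in ('Failed', 'Stopped'):
            raise RuntimeError('Monitoring schedule ended in status ' + status)
        time.sleep(poll_seconds)
    raise TimeoutError('Monitoring schedule did not reach Scheduled in time')
```

In the notebook it could be invoked as `wait_until_scheduled(my_default_monitor.describe_schedule)`.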

### Triggering execution manually

In order to trigger the execution manually, we first get all paths to data capture, baseline statistics, baseline constraints, etc.
Then, we use a utility function, defined in <a href="./monitoringjob_utils.py">monitoringjob_utils.py</a>, to run the processing job.

In [99]:
from monitoringjob_utils import run_model_monitor_job_processor

run_model_monitor_job_processor(region, 'ml.m5.xlarge', role, data_capture_path, statistics_path, constraints_path, reports_path,
                                postprocessor_path=postprocessor_path)


Job Name:  sagemaker-model-monitor-analyzer-2020-02-04-12-05-18-449
Inputs:  [{'InputName': 'input_1', 'S3Input': {'S3Uri': 's3://sagemaker-us-east-2-298841451579/aim362/monitoring/datacapture/nw-traffic-classification-xgb-ep-2020-02-04-10-52-18/AllTraffic/2020/02/04/12', 'LocalPath': '/opt/ml/processing/input/endpoint/nw-traffic-classification-xgb-ep-2020-02-04-10-52-18/AllTraffic/2020/02/04/12', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'baseline', 'S3Input': {'S3Uri': 's3://sagemaker-us-east-2-298841451579/aim362/monitoring/baselining/results/statistics.json', 'LocalPath': '/opt/ml/processing/baseline/stats', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'constraints', 'S3Input': {'S3Uri': 's3://sagemaker-us-east-2-298841451579/aim362/monitoring/baselining/results/constraints.json', 'LocalPath':
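
The first input in the job log above shows how captured data is laid out in Amazon S3: capture root, endpoint name, variant name, then year, month, day and hour. A minimal sketch for building such a prefix follows. The helper name is hypothetical, and it assumes the default `AllTraffic` variant and that the hour partition is expressed in UTC.

```python
from datetime import datetime


def capture_prefix_for_hour(capture_root, endpoint_name, hour, variant='AllTraffic'):
    """Build the S3 prefix holding the data captured during `hour`."""
    # datetime supports strftime-style codes inside a format spec.
    return '{}/{}/{}/{:%Y/%m/%d/%H}'.format(
        capture_root.rstrip('/'), endpoint_name, variant, hour)


print(capture_prefix_for_hour(
    's3://sagemaker-us-east-2-298841451579/aim362/monitoring/datacapture',
    'nw-traffic-classification-xgb-ep-2020-02-04-10-52-18',
    datetime(2020, 2, 4, 12)))
```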

### Analysis

When the monitoring job completes, monitoring reports are saved to Amazon S3. Let's inspect the generated constraint violations report.

In [102]:
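import json
import pandas as pd

# None removes the column-width limit so that full descriptions are shown (-1 is deprecated in pandas >= 1.0).
pd.set_option('display.max_colwidth', None)

# Read the violations report from the local monitoring/ directory.
with open('monitoring/constraint_violations.json', 'r') as file:
    data = file.read()

# pd.json_normalize requires pandas >= 1.0; older versions expose it as pd.io.json.json_normalize.
violations_df = pd.json_normalize(json.loads(data)['violations'])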
violations_df.head(10)

| | constraint_check_type | description | feature_name |
|---|---|---|---|
| 0 | data_type_check | Data type match requirement is not met. Expected data type: Integral, Expected match: 100.0%. Observed: Only 0.0% of data is Integral. | _c3 |
| 1 | data_type_check | Data type match requirement is not met. Expected data type: Integral, Expected match: 100.0%. Observed: Only 0.0% of data is Integral. | _c9 |
| 2 | data_type_check | Data type match requirement is not met. Expected data type: Fractional, Expected match: 100.0%. Observed: Only 0.0% of data is Fractional. | _c8 |
| 3 | baseline_drift_check | Baseline drift distance: 0.9077130942973524 exceeds threshold: 0.1 | _c1 |
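
The baseline drift check compares the distribution of a feature in the captured data with its baseline distribution and reports a distance between 0 and 1. As an intuition aid only, the sketch below computes one common distance of this kind: the largest gap between two empirical cumulative distribution functions (the two-sample Kolmogorov-Smirnov statistic). Whether this matches the exact computation performed by the monitoring job is an assumption, and the function name is hypothetical.

```python
from bisect import bisect_right


def ks_distance(baseline, observed):
    """Largest absolute gap between the two empirical CDFs (value in [0, 1])."""
    base_sorted = sorted(baseline)
    obs_sorted = sorted(observed)
    largest_gap = 0.0
    # The gap can only change at a sample point, so checking those is enough.
    for point in base_sorted + obs_sorted:
        base_cdf = bisect_right(base_sorted, point) / len(base_sorted)
        obs_cdf = bisect_right(obs_sorted, point) / len(obs_sorted)
        largest_gap = max(largest_gap, abs(base_cdf - obs_cdf))
    return largest_gap


print(ks_distance([1, 2, 3, 4], [1, 2, 3, 4]))      # identical samples
print(ks_distance([1, 2, 3, 4], [10, 11, 12, 13]))  # no overlap at all
```

Identical samples give a distance of 0 and completely separated samples give 1, so a reported value close to 0.9 against a threshold of 0.1 indicates that the captured values bear little resemblance to the baseline.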
