# Table of Contents
- [**Introduction**](#Introduction)
- [**Initiate SageMaker**](#InitiateSageMaker)
- [**Downloading and Processing the Dataset**](#DownloadingandProcessingtheDataset)
- [**Create SageMaker Experiment**](#SageMakerExperiment)
- [**Evaluate the Model Performance**](#EvaluateModelPerformance)
- [**Deploy the Model**](#DeploytheModel)
- [**Monitor the Model Endpoint**](#MonitortheModelEndpoint)
- [**Summary**](#Summary)

## Introduction <a name="Introduction"></a>

This notebook shows how a cloud service, **AWS SageMaker**, helps reduce time to production for Machine Learning projects. Amazon SageMaker provides a fully integrated development environment (IDE) for Machine Learning called **SageMaker Studio**, a single, web-based visual interface for performing all the steps of ML development.

We have used SageMaker Studio to build, train, deploy, and monitor an XGBoost model. We have covered the entire machine learning (ML) workflow from feature engineering and model training to batch and live deployments for ML models.

The model will be trained on the [UCI Credit Card Default](https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients) dataset that contains information on customer demographics, payment history, and billing history.

>**This notebook is for tutorial purposes only and primarily shows screenshots of the steps taken during execution in SageMaker Studio.**

##  Initiate SageMaker <a name="InitiateSageMaker"></a>

1. Open the AWS console (create an AWS account if you do not already have one) and navigate to Amazon SageMaker.

2. In the Amazon SageMaker navigation pane, choose SageMaker Studio

3. In the Get started box, choose Quick Start and specify a user name

4. For Execution role, choose Create an IAM role. In the dialog box that appears, choose Any S3 bucket, choose Create role, and then click Submit. Amazon SageMaker creates a role with the required permissions and assigns it to your instance.

> **The resources created and used in this notebook are AWS Free Tier eligible**


## Downloading and Processing the Dataset <a name="DownloadingandProcessingtheDataset"></a>

Amazon SageMaker Studio notebooks are one-click Jupyter notebooks that contain everything you need to build and test your training scripts. SageMaker Studio also includes experiment tracking and visualization so that it’s easy to manage your entire machine learning workflow in one place.

In the following steps we will create a SageMaker notebook, download the dataset, and then upload the dataset to Amazon S3.


```Amazon Simple Storage Service (Amazon S3) is an object storage service offering industry-leading scalability, data availability, security, and performance```

**1.** In the Amazon SageMaker Studio Control Panel, choose Open Studio.

**2.** In JupyterLab, on the File menu, choose New, then Notebook. In the Select Kernel box, choose Python 3 (Data Science)
<img src="images/2.2.png">

**3.** Verify the version of the Amazon SageMaker Python SDK by executing the following code block.

In [None]:
import boto3
import sagemaker
from sagemaker import get_execution_role
import sys
import IPython

# This notebook uses the SageMaker Python SDK v1 API, so downgrade if v2 or later is installed.
if int(sagemaker.__version__.split('.')[0]) >= 2:
    print("Installing previous SageMaker Version and restarting the kernel")
    !{sys.executable} -m pip install sagemaker==1.72.0
    IPython.Application.instance().kernel.do_shutdown(True)

else:
    print("Version is good")


role = get_execution_role()
sess = sagemaker.Session()
region = boto3.session.Session().region_name
print("Region = {}".format(region))
sm = boto3.Session().client('sagemaker')

<img src="images/2.3.png">

**4.** Import the required libraries and SageMaker Experiments by running the following code.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import os
from time import sleep, gmtime, strftime
import json
import time

In [None]:
!pip install sagemaker-experiments 
from sagemaker.analytics import ExperimentAnalytics
from smexperiments.experiment import Experiment
from smexperiments.trial import Trial
from smexperiments.trial_component import TrialComponent
from smexperiments.tracker import Tracker

<img src="images/2.4.png">

**5.** Define the Amazon S3 buckets and folders for the project

In [None]:
rawbucket = sess.default_bucket() # Alternatively, you can use your own custom bucket here.

prefix = 'sagemaker-modelmonitor' # use this prefix to store all files pertaining to this workshop.

dataprefix = prefix + '/data'
traindataprefix = prefix + '/train_data'
testdataprefix = prefix + '/test_data'
testdatanolabelprefix = prefix + '/test_data_no_label'
trainheaderprefix = prefix + '/train_headers'

<img src="images/2.5.png">

**6.** Download the dataset and import it using the pandas library

In [None]:
! wget https://archive.ics.uci.edu/ml/machine-learning-databases/00350/default%20of%20credit%20card%20clients.xls
data = pd.read_excel('default of credit card clients.xls', header=1)
data = data.drop(columns = ['ID'])
data.head()

<img src="images/2.6.png">

**7.** We will use Amazon SageMaker's built-in **XGBoost algorithm** for modeling. For CSV training data it expects the label in the first column, so we rename the last column to **Label** and move it to the front of the dataframe.

In [None]:
data.rename(columns={"default payment next month": "Label"}, inplace=True)
lbl = data.Label
data = pd.concat([lbl, data.drop(columns=['Label'])], axis = 1)
data.head()

<img src="images/2.7.png" >

**8.** Now upload the CSV dataset into an Amazon S3 bucket.

In [None]:
# Write the dataset to a local CSV file once, creating the folder if needed.
if not os.path.exists('rawdata/rawdata.csv'):
    os.makedirs('rawdata', exist_ok=True)
    data.to_csv('rawdata/rawdata.csv', index=None)
# Upload the raw dataset
raw_data_location = sess.upload_data('rawdata', bucket=rawbucket, key_prefix=dataprefix)
print(raw_data_location)

<img src="images/2.8.png">

The data is now uploaded to the S3 bucket:
<img src="images/2.8_S3.png">

**9.** After uploading the data, we will preprocess the dataset, which includes scaling the columns and splitting the dataset into train and test data. SageMaker Processing lets us run our preprocessing, postprocessing, and model evaluation workloads.

```Amazon SageMaker Processing runs on separate compute instances from notebook instance. This means we can continue to experiment and run code in our notebook while the processing job is under way. This will incur additional charges for the cost of the instance which is up and running for the duration of the processing job. The instances are automatically terminated by SageMaker once the processing job completes```
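
To make the scaling part of this step concrete, here is a minimal standard-library sketch of what standard scaling and min-max scaling compute. The toy values and helper names are hypothetical and only stand in for the `PAY_AMT1` and `BILL_AMT1` columns; the real job uses scikit-learn's `StandardScaler` and `MinMaxScaler`. The parameters are learned from the training data only and then reused on the test data.

```python
from statistics import mean, pstdev

def fit_standard(values):
    """Learn the mean and population standard deviation from training values."""
    return mean(values), pstdev(values)

def apply_standard(values, mu, sigma):
    """Center on the training mean and divide by the training std."""
    return [(v - mu) / sigma for v in values]

def fit_min_max(values):
    """Learn the minimum and maximum from training values."""
    return min(values), max(values)

def apply_min_max(values, lo, hi):
    """Map the training range [lo, hi] onto [0, 1]."""
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical toy values, not taken from the dataset.
train_pay = [0.0, 1000.0, 2000.0, 3000.0]
test_pay = [1500.0]
train_bill = [100.0, 300.0, 500.0]
test_bill = [400.0, 700.0]

mu, sigma = fit_standard(train_pay)
scaled_test_pay = apply_standard(test_pay, mu, sigma)

lo, hi = fit_min_max(train_bill)
scaled_test_bill = apply_min_max(test_bill, lo, hi)

print(scaled_test_pay)   # value equal to the training mean maps to 0.0
print(scaled_test_bill)  # test values above the training max map above 1.0
```

Fitting on the training split and only applying the learned parameters to the test split avoids leaking test-set information into preprocessing.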

To perform the data processing task we use the scikit-learn processing container. The preprocessing logic lives in a **preprocessing.py** script, which the `%%writefile` cell below writes to disk.

In [None]:
from sagemaker.sklearn.processing import SKLearnProcessor
sklearn_processor = SKLearnProcessor(framework_version='0.20.0',
                                     role=role,
                                     instance_type='ml.c4.xlarge',
                                     instance_count=1)

In [None]:
%%writefile preprocessing.py

import argparse
import os
import warnings

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.exceptions import DataConversionWarning
from sklearn.compose import make_column_transformer

warnings.filterwarnings(action='ignore', category=DataConversionWarning)

if __name__=='__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--train-test-split-ratio', type=float, default=0.3)
    parser.add_argument('--random-split', type=int, default=0)
    args, _ = parser.parse_known_args()
    
    print('Received arguments {}'.format(args))

    input_data_path = os.path.join('/opt/ml/processing/input', 'rawdata.csv')
    
    print('Reading input data from {}'.format(input_data_path))
    df = pd.read_csv(input_data_path)
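    # Shuffle the rows; sample() returns a new DataFrame, so the result must be assigned.
    df = df.sample(frac=1, random_state=args.random_split)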
    
    COLS = df.columns
    newcolorder = ['PAY_AMT1','BILL_AMT1'] + list(COLS[1:])[:11] + list(COLS[1:])[12:17] + list(COLS[1:])[18:]
    
    split_ratio = args.train_test_split_ratio
    random_state=args.random_split
    
    X_train, X_test, y_train, y_test = train_test_split(df.drop('Label', axis=1), df['Label'], 
                                                        test_size=split_ratio, random_state=random_state)
    
    preprocess = make_column_transformer(
        (['PAY_AMT1'], StandardScaler()),
        (['BILL_AMT1'], MinMaxScaler()),
    remainder='passthrough')
    
    print('Running preprocessing and feature engineering transformations')
    train_features = pd.DataFrame(preprocess.fit_transform(X_train), columns = newcolorder)
    test_features = pd.DataFrame(preprocess.transform(X_test), columns = newcolorder)
    
    # concat to ensure Label column is the first column in dataframe
    train_full = pd.concat([pd.DataFrame(y_train.values, columns=['Label']), train_features], axis=1)
    test_full = pd.concat([pd.DataFrame(y_test.values, columns=['Label']), test_features], axis=1)
    
    print('Train data shape after preprocessing: {}'.format(train_features.shape))
    print('Test data shape after preprocessing: {}'.format(test_features.shape))
    
    train_features_headers_output_path = os.path.join('/opt/ml/processing/train_headers', 'train_data_with_headers.csv')
    
    train_features_output_path = os.path.join('/opt/ml/processing/train', 'train_data.csv')
    
    test_features_output_path = os.path.join('/opt/ml/processing/test', 'test_data.csv')
    
    print('Saving training features to {}'.format(train_features_output_path))
    train_full.to_csv(train_features_output_path, header=False, index=False)
    print("Complete")
    
    print("Save training data with headers to {}".format(train_features_headers_output_path))
    train_full.to_csv(train_features_headers_output_path, index=False)
                 
    print('Saving test features to {}'.format(test_features_output_path))
    test_full.to_csv(test_features_output_path, header=False, index=False)
    print("Complete")

<img src = "images/2.9.png">
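
The `newcolorder` slicing in the script can look cryptic. With `remainder='passthrough'`, the column transformer emits the transformed columns first and then the untouched columns in their original order. The sketch below rebuilds that order by name and checks it against the index-based slices; the column list is the UCI feature set after dropping `ID` and `Label`, and the helper name is hypothetical.

```python
# Feature columns of the UCI credit card default dataset (without ID and Label).
feature_cols = (
    ["LIMIT_BAL", "SEX", "EDUCATION", "MARRIAGE", "AGE",
     "PAY_0", "PAY_2", "PAY_3", "PAY_4", "PAY_5", "PAY_6"]
    + ["BILL_AMT{}".format(i) for i in range(1, 7)]
    + ["PAY_AMT{}".format(i) for i in range(1, 7)]
)

def transformed_column_order(columns, transformed):
    """Transformed columns first, then the remaining columns in original order."""
    rest = [c for c in columns if c not in transformed]
    return list(transformed) + rest

# Order derived by name.
new_order = transformed_column_order(feature_cols, ["PAY_AMT1", "BILL_AMT1"])

# Order derived with the same index slices as the preprocessing script:
# index 11 is BILL_AMT1 and index 17 is PAY_AMT1, so both are skipped.
slice_order = (["PAY_AMT1", "BILL_AMT1"]
               + feature_cols[:11] + feature_cols[12:17] + feature_cols[18:])

print(new_order == slice_order)
```

Deriving the order by name is less fragile than hard-coded indices if the input columns ever change.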

**10.** Now, we will upload this preprocessing code to the S3 bucket.

In [None]:
# Copy the preprocessing code over to the s3 bucket
codeprefix = prefix + '/code'
codeupload = sess.upload_data('preprocessing.py', bucket=rawbucket, key_prefix=codeprefix)
print(codeupload)

<img src ="images/2.10.png">
The code is now uploaded to the S3 bucket:
<img src ="images/2.10_S3.png">

**11.** Specify Training and Testing data location

In [None]:
train_data_location = rawbucket + '/' + traindataprefix
test_data_location = rawbucket+'/'+testdataprefix
print("Training data location = {}".format(train_data_location))
print("Test data location = {}".format(test_data_location))

<img src="images/2.11.png">

**12.** Now process the data with a SageMaker Processing job. The following code calls **sklearn_processor.run** and then extracts some optional metadata about the processing job, such as where the training and test outputs were stored. The code passes the locations of the script, the input data, and the train and test outputs to the processor. Also note the arguments provided to the processing script.

In [None]:
from sagemaker.processing import ProcessingInput, ProcessingOutput

sklearn_processor.run(code=codeupload,
                      inputs=[ProcessingInput(
                        source=raw_data_location,
                        destination='/opt/ml/processing/input')],
                      outputs=[ProcessingOutput(output_name='train_data',
                                                source='/opt/ml/processing/train',
                               destination='s3://' + train_data_location),
                               ProcessingOutput(output_name='test_data',
                                                source='/opt/ml/processing/test',
                                               destination="s3://"+test_data_location),
                               ProcessingOutput(output_name='train_data_headers',
                                                source='/opt/ml/processing/train_headers',
                                               destination="s3://" + rawbucket + '/' + prefix + '/train_headers')],
                      arguments=['--train-test-split-ratio', '0.2']
                     )

preprocessing_job_description = sklearn_processor.jobs[-1].describe()

output_config = preprocessing_job_description['ProcessingOutputConfig']
for output in output_config['Outputs']:
    if output['OutputName'] == 'train_data':
        preprocessed_training_data = output['S3Output']['S3Uri']
    if output['OutputName'] == 'test_data':
        preprocessed_test_data = output['S3Output']['S3Uri']


The above code takes some time to execute and produces the following console output:
<img src ="images/2.12.png">
After the execution, the following folders are created in the S3 bucket:
<img src ="images/2.12_S3.png">
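
The loop that reads `ProcessingOutputConfig` above can also be written as a small helper that maps every output name to its S3 URI. The sketch below runs on a hypothetical, trimmed-down job description (the bucket name is a placeholder), so it can be tried without starting a processing job.

```python
def output_uris(job_description):
    """Map each processing output name to the S3 URI it was written to."""
    outputs = job_description["ProcessingOutputConfig"]["Outputs"]
    return {o["OutputName"]: o["S3Output"]["S3Uri"] for o in outputs}

# Hypothetical sample shaped like the relevant part of a describe() response.
sample_description = {
    "ProcessingOutputConfig": {
        "Outputs": [
            {"OutputName": "train_data",
             "S3Output": {"S3Uri": "s3://example-bucket/sagemaker-modelmonitor/train_data"}},
            {"OutputName": "test_data",
             "S3Output": {"S3Uri": "s3://example-bucket/sagemaker-modelmonitor/test_data"}},
        ]
    }
}

uris = output_uris(sample_description)
print(uris["train_data"])
print(uris["test_data"])
```

With the real job, `output_uris(preprocessing_job_description)` would return the same locations as the loop in the cell above.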

## Create SageMaker Experiment <a name="SageMakerExperiment"></a>

**1.** After downloading and staging the dataset in Amazon S3, we will create a **SageMaker Experiment**. An experiment is a collection of processing and training jobs related to the same machine learning project. SageMaker Experiments is a capability of Amazon SageMaker that lets you **organize, track, compare, and evaluate** your machine learning experiments.

In [None]:
# Create a SageMaker Experiment
cc_experiment = Experiment.create(
    experiment_name=f"Build-train-deploy-{int(time.time())}", 
    description="Predict credit card default from payments data", 
    sagemaker_boto_client=sm)
print(cc_experiment)

Every training job is logged as a trial. Each trial is an iteration of our end-to-end training job. In addition to the training job, it can also track pre-processing and post-processing jobs as well as datasets and other metadata. A single experiment can include multiple trials which makes it easy for us to track multiple iterations over time within the Amazon SageMaker Studio Experiments pane. 

```This feature enables work done in one project to be reused in another project, which is one of the advantages of using a cloud service during the POC phase```

<img src="images/3.1.png">

**2.** The following code tracks our pre-processing job under Experiments as a step in the training pipeline.

In [None]:
# Start Tracking parameters used in the Pre-processing pipeline.
with Tracker.create(display_name="Preprocessing", sagemaker_boto_client=sm) as tracker:
    tracker.log_parameters({
        "train_test_split_ratio": 0.2,
        "random_state":0
    })
    # we can log the s3 uri to the dataset we just uploaded
    tracker.log_input(name="ccdefault-raw-dataset", media_type="s3/uri", value=raw_data_location)
    tracker.log_input(name="ccdefault-train-dataset", media_type="s3/uri", value=train_data_location)
    tracker.log_input(name="ccdefault-test-dataset", media_type="s3/uri", value=test_data_location)

To view details of the experiment, navigate to SageMaker Resources and select `Experiments and trials`. Right-click the listed Trial Component and select the `Open in trial component list` option.
<img src="images/3.2.png">

**3.** To train an **XGBoost classifier**, we first need to import the XGBoost container maintained by Amazon SageMaker. Then, we will log the training run under a Trial so SageMaker Experiments can track it under a Trial name. The pre-processing job is included under the same trial name since it is part of the pipeline. Next, we will create a SageMaker Estimator object, which automatically provisions the underlying instance type of our choosing, copies over the training data from the specified output location of the processing job, trains the model, and outputs the model artifacts. The following code does all of this.

In [None]:
from sagemaker.amazon.amazon_estimator import get_image_uri
container = get_image_uri(boto3.Session().region_name, 'xgboost', '1.0-1')
s3_input_train = sagemaker.s3_input(s3_data='s3://' + train_data_location, content_type='csv')
preprocessing_trial_component = tracker.trial_component

trial_name = f"cc-default-training-job-{int(time.time())}"
cc_trial = Trial.create(
    trial_name=trial_name,
    experiment_name=cc_experiment.experiment_name,
    sagemaker_boto_client=sm
)

cc_trial.add_trial_component(preprocessing_trial_component)
cc_training_job_name = "cc-training-job-{}".format(int(time.time()))

xgb = sagemaker.estimator.Estimator(container,
                                    role, 
                                    train_instance_count=1, 
                                    train_instance_type='ml.m4.xlarge',
                                    train_max_run=86400,
                                    output_path='s3://{}/{}/models'.format(rawbucket, prefix),
                                    sagemaker_session=sess)

xgb.set_hyperparameters(max_depth=5,
                        eta=0.2,
                        gamma=4,
                        min_child_weight=6,
                        subsample=0.8,
                        verbosity=0,
                        objective='binary:logistic',
                        num_round=100)

xgb.fit(inputs = {'train':s3_input_train},
       job_name=cc_training_job_name,
        experiment_config={
            "TrialName": cc_trial.trial_name, #log training job in Trials for lineage
            "TrialComponentDisplayName": "Training",
        },
        wait=True,
    )
time.sleep(2)

The console output will look like this:
<img src ="images/3.3.png">

**4.** In the Experiment, choosing the `Open in trial component list` option opens a tab that lists the Trial Components. The job will be listed as completed in the `Trial and Component List` tab.
<img src="images/3.4.png">

**5.** To explore the metadata associated with the training job, right-click on one of the completed `Training jobs` and choose `Describe Trial Component`. The page needs to be refreshed to see the latest results.

<img src="images/3.5.png">

## Evaluate the Model Performance <a name="EvaluateModelPerformance"></a>

**1.** To evaluate the model performance on unseen data, we will run **offline/batch inference** with the trained model. Refer to [Batch Transform](https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html) in the Amazon documentation.

The following code copies the test dataset from the Amazon S3 location into our local folder.


In [None]:
test_data_path = 's3://' + test_data_location + '/test_data.csv'
! aws s3 cp $test_data_path .

<img src="images/4.1.png">

**2.** After downloading the data, check the dataset along with all columns and extract the label column

In [None]:
test_full = pd.read_csv('test_data.csv', names = [str(x) for x in range(len(data.columns))])
test_full.head()

<img src="images/4.2.png">

In [None]:
label = test_full['0'] 

**3.** To create the Batch Transform job we will run the following code. As with the training job, SageMaker provisions all the underlying resources, copies over the trained model artifacts, sets up a Batch endpoint locally, copies over the data, runs inference on the data, and pushes the outputs to Amazon S3. Note that by setting `input_filter`, we tell Batch Transform to ignore the first column in the test data, which is the label column.

In [None]:
%%time

sm_transformer = xgb.transformer(1, 'ml.m5.xlarge', accept = 'text/csv')

# start a transform job
sm_transformer.transform(test_data_path, split_type='Line', input_filter='$[1:]', content_type='text/csv')
sm_transformer.wait()
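
The `input_filter='$[1:]'` setting is a JSONPath-style expression that treats every CSV record as an array and keeps everything from index 1 onward. The sketch below only mimics that effect locally on a single line; `apply_input_filter` is a hypothetical helper for illustration, not part of the SageMaker SDK, and the row values are made up.

```python
def apply_input_filter(csv_line, start=1):
    """Mimic the effect of input_filter='$[1:]' on one CSV record."""
    fields = csv_line.rstrip("\n").split(",")
    return ",".join(fields[start:])

# Hypothetical test row: the label comes first, followed by the features.
row = "1,0.5,20000,2\n"
features_only = apply_input_filter(row)
print(features_only)  # the label column has been removed
```

This is why the test file can keep its label column in S3 while the model still receives only features.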

**4.** To evaluate the model metrics we will run the following code. It defines a function that pulls the output of the Batch Transform job, which is stored in a file with a **.out** extension, from the Amazon S3 bucket. It then extracts the predicted labels into a dataframe and appends the true labels to this dataframe.

In [None]:
import json
import io
from urllib.parse import urlparse

def get_csv_output_from_s3(s3uri, file_name):
    # Read one object from the Batch Transform output location and return it as text.
    parsed_url = urlparse(s3uri)
    bucket_name = parsed_url.netloc
    prefix = parsed_url.path[1:]
    s3 = boto3.resource('s3')
    obj = s3.Object(bucket_name, '{}/{}'.format(prefix, file_name))
    return obj.get()["Body"].read().decode('utf-8')

output = get_csv_output_from_s3(sm_transformer.output_path, 'test_data.csv.out')
output_df = pd.read_csv(io.StringIO(output), sep=",", header=None)
output_df.head(8)

# Round the predicted probabilities (column 0) to hard 0/1 predictions.
output_df['Predicted'] = np.round(output_df[0])
output_df['Label'] = label

from sklearn.metrics import accuracy_score

# Rows are the predicted classes and columns are the true labels.
confusion_df = pd.crosstab(output_df['Predicted'], output_df['Label'], rownames=['Predicted'], colnames=['Actual'], margins=True)
confusion_df

The console output will look like this, showing the number of **Predicted** True and False values compared to the **Actual** values.
<img src="images/4.4.png">
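
As a small illustration of how such a table is built, the sketch below thresholds predicted probabilities and counts every (actual, predicted) pair using only the standard library. The labels and scores are hypothetical toy values, not results from the model.

```python
from collections import Counter

def confusion_counts(actual, scores, threshold=0.5):
    """Count (actual, predicted) pairs after thresholding the scores.

    A score strictly above the threshold becomes class 1, so a score of
    exactly 0.5 becomes class 0, which matches what np.round(0.5) gives.
    """
    predicted = [1 if s > threshold else 0 for s in scores]
    return Counter(zip(actual, predicted))

# Hypothetical toy data.
actual = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.7, 0.8, 0.3, 0.2, 0.9]

counts = confusion_counts(actual, scores)
print("true negatives :", counts[(0, 0)])
print("false positives:", counts[(0, 1)])
print("false negatives:", counts[(1, 0)])
print("true positives :", counts[(1, 1)])
```

The four counts correspond to the four inner cells of the cross-tabulation.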

**5.** After model metrics evaluation, we will use the following code to extract baseline model accuracy and the model accuracy.

```A helpful model for the baseline accuracy can be the fraction of non-default cases. A model that always predicts that a user will not default has that accuracy.```

In [None]:
print("Baseline Accuracy = {}".format(1- np.unique(data['Label'], return_counts=True)[1][1]/(len(data['Label']))))
print("Accuracy Score = {}".format(accuracy_score(label, output_df['Predicted'])))

The console output will look like this. The results show that a simple model can already beat the baseline accuracy. To improve the results further, we can tune the hyperparameters using **Hyperparameter optimization (HPO)** on SageMaker for automatic model tuning.
<img src="images/4.5.png">
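
The two numbers printed above can be reproduced by hand on a tiny example. In the sketch below the labels and predictions are hypothetical toy values; the baseline is the accuracy of a model that always predicts class 0 (no default).

```python
def baseline_accuracy(labels):
    """Accuracy of always predicting class 0 (no default)."""
    return labels.count(0) / len(labels)

def accuracy(labels, predictions):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)

# Hypothetical toy data: 7 non-default cases and 3 default cases.
labels      = [0, 0, 0, 1, 0, 1, 0, 0, 0, 1]
predictions = [0, 0, 0, 1, 0, 0, 0, 0, 0, 1]

base = baseline_accuracy(labels)
acc = accuracy(labels, predictions)
print("Baseline Accuracy = {}".format(base))
print("Accuracy Score = {}".format(acc))
```

On an imbalanced dataset like this one, a model is only useful if its accuracy clearly exceeds this baseline.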

***
> ```At this point a typical POC ends and, based on the model evaluation results, stakeholders decide whether to deploy into production. Had we executed the above steps in a local notebook, it would take some time to move the model, code, and dataset for deployment, which would increase time to production. Since we have used cloud infrastructure, there is no need to move data, model, or code for deployment, so time to production is reduced.```
***

## Deploy the Model <a name="DeploytheModel"></a>

**1.** Amazon SageMaker automatically handles model hosting and creation of the endpoint. With the help of the code below, we will deploy the model as a **RESTful HTTPS** endpoint to serve live inferences.

In [None]:
from sagemaker.model_monitor import DataCaptureConfig
from sagemaker import RealTimePredictor
from sagemaker.predictor import csv_serializer

sm_client = boto3.client('sagemaker')

latest_training_job = sm_client.list_training_jobs(MaxResults=1,
                                                SortBy='CreationTime',
                                                SortOrder='Descending')

training_job_name = latest_training_job['TrainingJobSummaries'][0]['TrainingJobName']

training_job_description = sm_client.describe_training_job(TrainingJobName=training_job_name)

model_data = training_job_description['ModelArtifacts']['S3ModelArtifacts']
container_uri = training_job_description['AlgorithmSpecification']['TrainingImage']

# create a model.
def create_model(role, model_name, container_uri, model_data):
    return sm_client.create_model(
        ModelName=model_name,
        PrimaryContainer={
        'Image': container_uri,
        'ModelDataUrl': model_data,
        },
        ExecutionRoleArn=role)
    

try:
    model = create_model(role, training_job_name, container_uri, model_data)
except Exception as e:
    # A model with this name may already exist: delete it and create it again.
    sm_client.delete_model(ModelName=training_job_name)
    model = create_model(role, training_job_name, container_uri, model_data)
        

print('Model created: '+model['ModelArn'])

The console output will be:
<img src="images/5.1.png">

**2.** The following code specifies the data capture configuration: SageMaker captures 100% of the inference payloads received by the endpoint, captures both inputs and outputs, and records the input content type as CSV.

In [None]:
s3_capture_upload_path = 's3://{}/{}/monitoring/datacapture'.format(rawbucket, prefix)
data_capture_configuration = {
    "EnableCapture": True,
    "InitialSamplingPercentage": 100,
    "DestinationS3Uri": s3_capture_upload_path,
    "CaptureOptions": [
        { "CaptureMode": "Output" },
        { "CaptureMode": "Input" }
    ],
    "CaptureContentTypeHeader": {
       "CsvContentTypes": ["text/csv"],
       "JsonContentTypes": ["application/json"]}}

**3.** The following code creates the endpoint configuration. Here, we specify the instance type and the traffic allocation for this endpoint.

In [None]:
def create_endpoint_config(model_config, data_capture_config):
    return sm_client.create_endpoint_config(
        EndpointConfigName=model_config,
        ProductionVariants=[
            {
                'VariantName': 'AllTraffic',
                'ModelName': model_config,
                'InitialInstanceCount': 1,
                'InstanceType': 'ml.m4.xlarge',
                'InitialVariantWeight': 1.0,
            },
        ],
        DataCaptureConfig=data_capture_config
    )




try:
    endpoint_config = create_endpoint_config(training_job_name, data_capture_configuration)
except Exception as e:
    # A configuration with this name may already exist: delete it by name, then recreate it.
    sm_client.delete_endpoint_config(EndpointConfigName=training_job_name)
    endpoint_config = create_endpoint_config(training_job_name, data_capture_configuration)

print('Endpoint configuration created: '+ endpoint_config['EndpointConfigArn'])

<img src="images/5.3.png">

**4.** After Endpoint configuration, we need to create an actual endpoint

In [None]:
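# Create the endpoint from the endpoint configuration defined above
# (that configuration enables data capture, sampling 100% of the requests).

endpoint_name = training_job_name
endpoint_config_name = training_job_name  # the endpoint configuration above was created under this name

def create_endpoint(endpoint_name, config_name):
    return sm_client.create_endpoint(
                                    EndpointName=endpoint_name,
                                    EndpointConfigName=config_name
                                )


try:
    endpoint = create_endpoint(endpoint_name, endpoint_config_name)
except Exception as e:
    # An endpoint with this name may already exist: delete it, wait for the
    # asynchronous deletion to finish, then create it again.
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.get_waiter('endpoint_deleted').wait(EndpointName=endpoint_name)
    endpoint = create_endpoint(endpoint_name, endpoint_config_name)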

print('Endpoint created: '+ endpoint['EndpointArn'])

<img src="images/5.4.png">

**5.** Navigate to SageMaker Resources and select `Endpoints`. The **Endpoints** list displays all of the endpoints in service. Once the endpoint is created, its status changes to **InService**.
<img src="images/5.5.png">
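
Instead of watching the console, the status can also be polled from code. The sketch below is a minimal polling loop, assuming a callable that behaves like boto3's `describe_endpoint` and returns a dictionary with an `EndpointStatus` key; the function name `wait_until_in_service` and its parameters are hypothetical. boto3 also offers a ready-made waiter via `sm_client.get_waiter('endpoint_in_service')`.

```python
import time

def wait_until_in_service(describe_endpoint, endpoint_name,
                          poll_seconds=30, max_polls=40):
    """Poll the endpoint status until it is InService, Failed, or we give up."""
    for _ in range(max_polls):
        status = describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError("Endpoint {} failed to deploy".format(endpoint_name))
        time.sleep(poll_seconds)
    raise TimeoutError("Endpoint {} was not InService in time".format(endpoint_name))

# Local dry run with a fake describe function (no AWS call is made).
fake_statuses = iter(["Creating", "Creating", "InService"])

def fake_describe(EndpointName):
    return {"EndpointStatus": next(fake_statuses)}

final_status = wait_until_in_service(fake_describe, "demo-endpoint", poll_seconds=0)
print(final_status)

# In the notebook this would be called as:
# wait_until_in_service(sm_client.describe_endpoint, endpoint_name)
```

Sending inference requests before the endpoint is **InService** fails, so a wait like this is useful when the notebook is run top to bottom.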

**6.** We will take the first 10 rows of the test dataset as a sample in `test_sample.csv` and send inference requests from this sample to our endpoint.

In [None]:
!head -10 test_data.csv > test_sample.csv

In [None]:
from sagemaker import RealTimePredictor
from sagemaker.predictor import csv_serializer

predictor = RealTimePredictor(endpoint=endpoint_name, content_type = 'text/csv')

with open('test_sample.csv', 'r') as f:
    for row in f:
        payload = row.rstrip('\n')
        # Drop the leading label column and send only the features.
        response = predictor.predict(data=payload.split(',', 1)[1])
        sleep(0.5)
print('done!')

**7.** The following code verifies that Model Monitor is correctly capturing the incoming data. In the code, `current_endpoint_capture_prefix` holds the directory path where our Model Monitor outputs are stored.

In [None]:
# Extract the captured json files.
data_capture_prefix = '{}/monitoring'.format(prefix)
s3_client = boto3.Session().client('s3')
current_endpoint_capture_prefix = '{}/datacapture/{}/AllTraffic'.format(data_capture_prefix, endpoint_name)
print(current_endpoint_capture_prefix)
result = s3_client.list_objects(Bucket=rawbucket, Prefix=current_endpoint_capture_prefix)
capture_files = [capture_file.get("Key") for capture_file in result.get('Contents')]
print("Found Capture Files:")
print("\n ".join(capture_files))

capture_files[0]

The console output indicates that data capture is configured and saving the incoming requests
<img src="images/5.7.png">

**8.** Now we will extract the content of one of the json files and view the captured outputs. 

In [None]:
# View contents of the captured file.
def get_obj_body(bucket, obj_key):
    return s3_client.get_object(Bucket=bucket, Key=obj_key).get('Body').read().decode("utf-8")

capture_file = get_obj_body(rawbucket, capture_files[0])
print(json.dumps(json.loads(capture_file.split('\n')[5]), indent = 2, sort_keys =True))

Console output indicates that data capture is capturing both the input payload and the output of the model. 
<img src="images/5.8.png">
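
Each line of a capture file is one JSON record. The sketch below parses such a line into the input features and the model output. The sample record is hypothetical, and its field names (`captureData`, `endpointInput`, `endpointOutput`, `data`) are an assumption about the capture layout that should be checked against the output printed by the cell above.

```python
import json

# Hypothetical capture record; values are made up for illustration.
sample_line = json.dumps({
    "captureData": {
        "endpointInput": {"observedContentType": "text/csv", "mode": "INPUT",
                          "data": "0.1,0.2,0.3", "encoding": "CSV"},
        "endpointOutput": {"observedContentType": "text/csv; charset=utf-8",
                           "mode": "OUTPUT", "data": "0.02", "encoding": "CSV"},
    },
    "eventMetadata": {"eventId": "example-id",
                      "inferenceTime": "2020-01-01T00:00:00Z"},
    "eventVersion": "0",
})

def parse_capture_line(line):
    """Return (features, prediction) from one captured JSON line."""
    capture = json.loads(line)["captureData"]
    features = [float(x) for x in capture["endpointInput"]["data"].split(",")]
    prediction = float(capture["endpointOutput"]["data"])
    return features, prediction

features, prediction = parse_capture_line(sample_line)
print(features, prediction)
```

Parsing the records this way makes it easy to load captured traffic into a dataframe for ad hoc inspection.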

## Monitor the Model Endpoint <a name="MonitortheModelEndpoint"></a>

**1.** Now we will enable **SageMaker Model Monitor** to monitor the deployed endpoint for data drift. To do so we will compare the payload and outputs sent to the model against a baseline and determine whether there is any drift in the input data or the label.
The following code defines two S3 folders: one stores the baseline data that was used to train our model; the other stores any violations from that baseline.

In [None]:
model_prefix = prefix + "/" + endpoint_name
baseline_prefix = model_prefix + '/baselining'
baseline_data_prefix = baseline_prefix + '/data'
baseline_results_prefix = baseline_prefix + '/results'

baseline_data_uri = 's3://{}/{}'.format(rawbucket,baseline_data_prefix)
baseline_results_uri = 's3://{}/{}'.format(rawbucket, baseline_results_prefix)
train_data_header_location = "s3://" + rawbucket + '/' + prefix + '/train_headers'
print('Baseline data uri: {}'.format(baseline_data_uri))
print('Baseline results uri: {}'.format(baseline_results_uri))
print(train_data_header_location)

<img src="images/6.1.png">

**2.** The following code sets up a baseline job for Model Monitor to capture the statistics of the training data. To do this, Model Monitor uses the **Deequ** library, built on top of Apache Spark, for conducting unit tests on data.

In [None]:
from sagemaker.model_monitor import DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

my_default_monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type='ml.m5.xlarge',
    volume_size_in_gb=20,
    max_runtime_in_seconds=3600)

my_default_monitor.suggest_baseline(
    baseline_dataset=os.path.join(train_data_header_location, 'train_data_with_headers.csv'),
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri=baseline_results_uri,
    wait=True
)

Model Monitor sets up a separate instance, copies over the training data, and generates some statistics. The service generates a lot of Apache Spark logs, which can be ignored. Once the job is completed, we can see **Spark job completed** output.
<img src="images/6.2.png">

**3.** The following code checks the output generated by the baseline job, which creates two files: **constraints.json** and **statistics.json**. It then converts the JSON statistics into a pandas dataframe.

In [None]:
s3_client = boto3.Session().client('s3')
result = s3_client.list_objects(Bucket=rawbucket, Prefix=baseline_results_prefix)
report_files = [report_file.get("Key") for report_file in result.get('Contents')]
print("Found Files:")
print("\n ".join(report_files))

baseline_job = my_default_monitor.latest_baselining_job
schema_df = pd.io.json.json_normalize(baseline_job.baseline_statistics().body_dict["features"])
schema_df

Content of **statistics.json**
<img src="images/6.3_statistics.png">
Content of **constraints.json**
<img src="images/6.3_constraint.png">
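
To give an intuition for how baseline statistics can be used to detect drift, here is a deliberately simplified sketch. It is not the algorithm Model Monitor or Deequ uses; it only flags a feature when the mean of new data moves more than a chosen number of baseline standard deviations. All names, values, and the tolerance are hypothetical.

```python
from statistics import mean, pstdev

def baseline_stats(rows):
    """Compute (mean, std) per feature column from the training rows."""
    return [(mean(col), pstdev(col)) for col in zip(*rows)]

def drifted_features(stats, new_rows, tolerance=3.0):
    """Return indices of features whose new mean is far from the baseline mean."""
    flagged = []
    for idx, ((mu, sigma), col) in enumerate(zip(stats, zip(*new_rows))):
        if abs(mean(col) - mu) > tolerance * sigma:
            flagged.append(idx)
    return flagged

# Hypothetical toy data with two features.
training_rows = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0], [2.0, 11.0]]
live_rows = [[2.1, 30.0], [1.9, 28.0]]  # the second feature has shifted

stats = baseline_stats(training_rows)
violations = drifted_features(stats, live_rows)
print(violations)  # indices of features considered drifted
```

The real baseline job records richer statistics and constraints per feature, but the idea of comparing live data against a training-time reference is the same.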

**4.** We can set up the frequency for endpoint monitoring. Here we have specified an hourly frequency, but it can be changed for production applications, as an hourly frequency will generate a lot of data. Model Monitor will produce a report consisting of all the violations it finds.

In [None]:
reports_prefix = '{}/reports'.format(prefix)
s3_report_path = 's3://{}/{}'.format(rawbucket,reports_prefix)
print(s3_report_path)

from sagemaker.model_monitor import CronExpressionGenerator
from time import gmtime, strftime

mon_schedule_name = 'Built-train-deploy-model-monitor-schedule-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
my_default_monitor.create_monitoring_schedule(
    monitor_schedule_name=mon_schedule_name,
    endpoint_input=predictor.endpoint,
    output_s3_uri=s3_report_path,
    statistics=my_default_monitor.baseline_statistics(),
    constraints=my_default_monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True,)

Note that this code enables Amazon CloudWatch Metrics, which instructs Model Monitor to send outputs to CloudWatch. You can use this approach to trigger alarms using CloudWatch Alarms to let engineers or admins know when data drift has been detected.
<img src="images/6.4.png">

## Summary <a name="Summary"></a>

We have done the following as a part of POC and Deployment. 
 
- Download and preprocess the data
- Create reusable Experiment Trials with model building
- Evaluate model performance
- Deploy and monitor the model

The first three are mainly part of the POC and the last one belongs to deployment. All these activities are done within a single notebook without worrying about additional resources and infrastructure: Amazon SageMaker manages an S3 bucket for storage and separate compute instances for processing.

For this demo we used the `ml.t3.medium` instance; one day of usage cost us only around **1.2 USD**, which is very **cost-effective**. Refer to [this](https://aws.amazon.com/sagemaker/pricing/) link for AWS resource pricing.

<img src="images/pricing.png">
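
The daily cost quoted above can be sanity-checked with simple arithmetic. The hourly rate below is an assumption chosen to be consistent with the 1.2 USD figure; actual prices depend on region and change over time, so check the pricing page for current values.

```python
# Assumed on-demand rate for ml.t3.medium in USD per hour (not an official figure).
HOURLY_RATE_USD = 0.05
HOURS_PER_DAY = 24

# Round to cents to avoid floating-point noise in the product.
daily_cost = round(HOURLY_RATE_USD * HOURS_PER_DAY, 2)
print("Estimated daily cost: {} USD".format(daily_cost))
```

Processing, training, and endpoint instances are billed separately from the notebook instance, so they would need to be added for a full estimate.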

As with XGBoost, SageMaker provides containers and libraries for other industry-standard algorithms, along with plenty of documentation, which means a **short learning curve**. We referred to the official AWS [tutorial](https://aws.amazon.com/getting-started/hands-on/build-train-deploy-monitor-machine-learning-model-sagemaker-studio/?trk=gs_card) for this demo.

We have covered the following advantages of using a cloud service right from the POC phase:
***
- **Reduced time to production**
- **Cost-Effective**
- **Avoid Local infrastructure**
- **Shorter Learning Curve**
- **Early adoption of MLOps**
***