# Problem: Predicting Remaining Useful Life for Turbofan Engines

### Introduction

In many real-life situations you will not have labeled data, yet you still need to detect anomalous activity. This notebook walks you through how to learn features and build a model from an unlabeled dataset. We will start by converting the problem from an unsupervised learning problem to a supervised learning problem.

The data for this notebook comes from a well-known NASA competition held in 2008. The dataset was generated by simulation using C-MAPSS. Four different sets were simulated under different combinations of operational conditions and fault modes, and several sensor channels were recorded to characterize fault evolution. The dataset was provided by the Prognostics CoE at NASA Ames. We will use this dataset to learn the behavior of a normal engine.

### Background

<img src="https://aws-machine-learning-immersion-day.s3.amazonaws.com/resources/engine_failure.gif" width="600" align="center">
<p style="text-align: center;">fan blade containment failure test</p>


Predictive maintenance is important for safety systems and for systems where unexpected maintenance and downtime impact business objectives. A turbofan engine is both a safety system and one where unexpected downtime impacts the operation of the airline.

For predictive maintenance there are typically three approaches; the right one depends on how much you know about the failures and the system:
1. Similarity - Use this approach if your data captures degradation from the healthy state to the failed state.
2. Survival - Use this approach when you only have data from the failure event.
3. Degradation - Use this approach when you want the machine to keep operating above some threshold.

### Contact

* Aaron Sengstacken
* awsaaron@amazon.com

### References
* [Predictive Maintenance - wikipedia](https://en.wikipedia.org/wiki/Predictive_maintenance)
* A. Saxena and K. Goebel (2008). "Turbofan Engine Degradation Simulation Data Set", NASA Ames Prognostics Data Repository (http://ti.arc.nasa.gov/project/prognostic-data-repository), NASA Ames Research Center, Moffett Field, CA
* https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/


In [None]:
# Import Libraries
%matplotlib inline

# general python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime
import io
import os

# sklearn
from sklearn.metrics import f1_score
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.metrics import precision_score,accuracy_score
from sklearn.metrics import confusion_matrix, precision_recall_curve,cohen_kappa_score
from sklearn.metrics import (confusion_matrix, precision_recall_curve, auc,
                             roc_curve, recall_score, classification_report, f1_score,
                             precision_recall_fscore_support)

# aws
import boto3
import sagemaker
from sagemaker import KMeans
from sagemaker import get_execution_role
import sagemaker.amazon.common as smac
from sagemaker.tensorflow import TensorFlow

# tensorflow
import tensorflow as tf
import keras
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Input
from keras.models import Model

RANDOM_SEED = 1234 #used to help randomly select the data points
np.random.seed(RANDOM_SEED)
tf.random.set_seed(RANDOM_SEED)

In [None]:
import sagemaker

sess = sagemaker.Session()
role = sagemaker.get_execution_role()

# Data

The data for this notebook comes from a well-known NASA competition held in 2008. The dataset was generated by simulation using C-MAPSS. Four different sets were simulated under different combinations of operational conditions and fault modes, and several sensor channels were recorded to characterize fault evolution. The dataset was provided by the Prognostics CoE at NASA Ames.

The dataset consists of multiple multivariate time series. Each data set is further divided into training and test subsets. Each time series is from a different engine i.e., the data can be considered to be from a fleet of engines of the same type. Each engine starts with different degrees of initial wear and manufacturing variation which is unknown to the user. This wear and variation is considered normal, i.e., it is not considered a fault condition. There are three operational settings that have a substantial effect on engine performance. These settings are also included in the data. The data is contaminated with sensor noise.

The engine is operating normally at the start of each time series and develops a fault at some point during the series. In the training set, the fault grows in magnitude until system failure. In the test set, the time series ends some time prior to system failure. The objective of the competition is to predict the number of remaining operational cycles before failure in the test set, i.e., the number of operational cycles after the last cycle that the engine will continue to operate. A vector of true Remaining Useful Life (RUL) values is also provided for the test data.
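
As a small illustration of the RUL definition above, the sketch below computes the remaining cycles for two made-up engines with NumPy. The arrays `unit` and `cycle` are illustrative values, not rows from the NASA files, and the approach assumes that each engine's last recorded cycle is its failure point, as in the training set.

```python
import numpy as np

# Made-up run-to-failure records: engine 1 runs 3 cycles, engine 2 runs 2 cycles
unit = np.array([1, 1, 1, 2, 2])
cycle = np.array([1, 2, 3, 1, 2])

# Last recorded cycle per engine (treated as the failure point)
last_cycle = {u: cycle[unit == u].max() for u in np.unique(unit)}

# Remaining useful life = last cycle of that engine - current cycle
rul = np.array([last_cycle[u] - c for u, c in zip(unit, cycle)])
print(rul)
```

Each engine counts down to zero at its final row, which is the same quantity the notebook later stores per row for the training data.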

### Load Data

In [None]:
# download data from URL
!wget https://ti.arc.nasa.gov/c/6/ -O CMAPSSData.zip

In [None]:
!conda install -y -c conda-forge unzip

In [None]:
# unpack zip file
!unzip -o CMAPSSData.zip

In [None]:
role = get_execution_role()
session = sagemaker.session.Session()
bucket_name = session.default_bucket()
bucket = 's3://{}'.format(session.default_bucket())
print('default_s3_bucket: {}'.format(bucket))

The datasets don't contain any headers, so we have to define them manually.

In [None]:
cols=["unit","cycle","op1","op2","op3","sensor1","sensor2","sensor3","sensor4","sensor5","sensor6","sensor7","sensor8","sensor9","sensor10","sensor11","sensor12","sensor13","sensor14","sensor15","sensor16","sensor17","sensor18","sensor19","sensor20","sensor21","sensor22","sensor23"]

In [None]:
train_df = pd.read_csv('train_FD001.txt',sep=' ',header=None,names=cols)
rul_df = pd.read_csv('RUL_FD001.txt',header=None,names=['rul'])

In [None]:
print('Training Data')
display(train_df.head())
print('Remaining Useful Life Data')
display(rul_df.head())

In [None]:
print('Training Data')
print(train_df.shape)
print('Remaining Useful Life Data')
print(rul_df.shape)

In [None]:
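# remaining cycles = last recorded cycle of the engine - current cycle
# (the string 'max' avoids passing the Python builtin to transform)
train_df['remaining_cycles'] = train_df.groupby('unit')['cycle'].transform('max') - train_df['cycle']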

In [None]:
train_df.info()

**Take Away:** The data was loaded successfully. The last two sensor columns contain null values that we will need to address.

### Data Exploration and Preparation

**Questions:**

1. How many engines are in the dataset?
2. How long was each engine run?
3. What is the distribution of total cycles on each engine?
4. What are the operational settings for the engines and how do they vary?

In [None]:
# Question 1 - How many engines are in the dataset?
train_df['unit'].unique()

In [None]:
# Question 2 - How long was each engine run for?
max_cycles = train_df.groupby('unit')['cycle'].max()

In [None]:
fig,ax = plt.subplots(figsize=(20,10))
plt.bar(max_cycles.index,max_cycles.values)
plt.xlabel('Engine Unit #')
plt.ylabel('Max Cycles')
plt.title('Total Cycles Per Engine')
plt.xlim((0,101))

In [None]:
# Question 3 - What is the distribution of cycles across engines?
fig,ax = plt.subplots(figsize=(10,10))
plt.hist(max_cycles.values,bins=20)
plt.title('Histogram of Max Cycles')
plt.xlabel('Number of Cycles')
plt.ylabel('Count')

In [None]:
max_cycles.describe()

In [None]:
#Question 4. What are the operational settings for the engines and how do they vary?
fig,ax = plt.subplots(figsize=(10,10))
plt.hist(train_df['op1'].values,bins=30)
plt.title('Histogram of Op Setting #1')
plt.xlabel('Op Setting #1')
plt.ylabel('Count')
train_df['op1'].describe()

In [None]:
fig,ax = plt.subplots(figsize=(10,10))
plt.hist(train_df['op2'].values,bins=30)
plt.title('Histogram of Op Setting #2')
plt.xlabel('Op Setting #2')
plt.ylabel('Count')
train_df['op2'].describe()

In [None]:
fig,ax = plt.subplots(figsize=(10,10))
plt.hist(train_df['op3'].values,bins=30)
plt.title('Histogram of Op Setting #3')
plt.xlabel('Op Setting #3')
plt.ylabel('Count')
train_df['op3'].describe()

**Take Away:** The Op3 setting is constant for all data in the dataset; we suggest removing it.

In [None]:
train_df.hist(figsize=(20,20),bins=25)

In [None]:
train_df.describe().T

**Take Away:**
1. Sensors 1, 5, 6, 10, 16, 18, and 19, as well as Op3, have constant values across the dataset; we suggest removing them.
2. Sensors 22 and 23 are missing; we suggest removing them.

Note: The unit and cycle histograms are not informative, since they only contain the engine number and cycle count (recall from above that the max cycle per engine was useful).

In [None]:
drop_cols = ['unit','cycle','op3','sensor1','sensor5','sensor6','sensor10','sensor16','sensor18','sensor19','sensor22','sensor23']

In [None]:
train_df.drop(drop_cols,axis=1,inplace=True)

#### Prepare Target Variable

For this task we want to predict that the engine is near failure before it actually fails. To do that, we use the remaining-cycles column derived from the cycle count. Next, we add an additional binary column that is 1 when the engine has X or fewer remaining cycles left. X can be defined by the user. In this example we will use 10 remaining cycles as the threshold.
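
To make the labeling rule concrete, here is a minimal sketch on made-up remaining-cycle values. The names `remaining` and `limit` are illustrative. The comparison is inclusive, so with a limit of 10 every run-to-failure engine contributes 11 rows (remaining cycles 0 through 10) to the positive class.

```python
import numpy as np

limit = 10  # illustrative limit, same value as used in this example
remaining = np.array([12, 11, 10, 9, 1, 0])  # made-up remaining-cycle counts

# 1 when the engine is within `limit` cycles of failure, else 0
failed = (remaining <= limit).astype(int)
print(failed)
```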

In [None]:
cycle_limit = 10
train_df['failed'] = train_df['remaining_cycles'].apply(lambda x: 1 if x <= cycle_limit else 0)

In [None]:
print('Total number of failed cases in training dataset:  '+str(train_df['failed'].sum()))
print('Fraction of total in training dataset:  '+str(train_df['failed'].sum()/train_df['failed'].count()))

#### Missing Values

In [None]:
train_df.isnull().values.any()

In [None]:
print('Training Dataset Shape:', train_df.shape)

In [None]:
train_df.head()

#### Save off the datasets

In [None]:
train_df.to_csv('cleaned_train.csv',index=False)

#### OPTIONAL - Load the saved datasets

In [None]:
train_df = pd.read_csv('cleaned_train.csv')

#### Shuffle / Randomize

In [None]:
train_df = train_df.sample(frac=1,random_state=RANDOM_SEED).reset_index(drop=True)

#### Select only non-failed rows and drop other columns

In [None]:
train_x = train_df.loc[train_df['failed'] == 0]
train_x = train_x.drop(['failed','remaining_cycles'], axis=1)

In [None]:
train_x.head()

#### Split Dataset

In [None]:
# 80% for the training set and 20% for testing set
TEST_PCT = 0.2 # 20% of the data
train_x, val_x = train_test_split(train_x, test_size=TEST_PCT, random_state=RANDOM_SEED)

#### Scale Data

IMPORTANT! - You must apply the same scaling from the training dataset to the testing / validation dataset. Fit the scaler on the training data only, then reuse it.
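
The point above can be illustrated with plain NumPy. This is a hand-rolled stand-in for `StandardScaler` on made-up numbers, not the notebook's pipeline: the mean and standard deviation are computed from the training rows only and then reused unchanged on the validation rows.

```python
import numpy as np

train_x = np.array([[1.0, 10.0], [3.0, 30.0]])  # made-up training rows
val_x = np.array([[5.0, 20.0]])                 # made-up validation row

# "fit": statistics come from the training data only
mean = train_x.mean(axis=0)
std = train_x.std(axis=0)  # population std (ddof=0), which StandardScaler also uses

# "transform": the same statistics are applied to both splits
train_scaled = (train_x - mean) / std
val_scaled = (val_x - mean) / std
print(val_scaled)
```

The validation row is not centered at zero after scaling. That is expected, because it is expressed in units of the training distribution.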

In [None]:
scaler = StandardScaler()

In [None]:
scaler.fit(train_x)

In [None]:
train = scaler.transform(train_x)
val = scaler.transform(val_x)

#### Save and Upload Data

In [None]:
os.makedirs("./data", exist_ok = True)

# save
with open('./data/val.npy', 'wb') as f: np.save(f, val)
with open('./data/train.npy', 'wb') as f: np.save(f, train)

In [None]:
# load
with open('./data/val.npy', 'rb') as f: val = np.load(f)
with open('./data/train.npy', 'rb') as f: train = np.load(f)

In [None]:
prefix = 'turbofan-RUL-AE'

training_input_path   = sess.upload_data('data/train.npy', key_prefix=prefix+'/training')
validation_input_path = sess.upload_data('data/val.npy', key_prefix=prefix+'/validation')

print(training_input_path)
print(validation_input_path)

# Model Training

Autoencoders are a special kind of neural network in which the input is 'x' and the output is also 'x'. In other words, we are trying to learn a function whose input and output are the same.

A few things to note:

- We reduce the number of nodes in the middle of the network, which forces the network to learn features from the dataset. The intuition is that this "code" is a set of abstracted features that acts as a fingerprint for "failure" or "non-failure" activity.
- Because we start with the input 'x', reduce it to abstracted features, and then reconstruct 'x', we do not need a labeled dataset.
- The "code" is intuitively a representation of abstracted features.

For our engine dataset, we take all the non-failed data and try to re-create it. During this process the network should learn a representation of what non-failed activity looks like. Once the model is trained on what is 'normal', anything that does not match this normal representation can be declared abnormal.

For inference, we give both failed and non-failed data to the model and compute the reconstruction error from the model's prediction. This is where we set the threshold, which lets a domain expert define how much error is tolerated as normal and when to declare data abnormal.
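
The idea can be sketched without a neural network. The example below swaps in a linear projection (PCA computed with an SVD) as a stand-in for the autoencoder; it is not the model trained in this notebook, and all data is synthetic. The "code" is a 2-dimensional projection learned from normal data only, and the threshold (99th percentile of the normal error) is an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "normal" data: close to a 2-D plane embedded in 5-D space
basis = rng.normal(size=(2, 5))
normal = rng.normal(size=(500, 2)) @ basis + 0.05 * rng.normal(size=(500, 5))
# Synthetic "abnormal" data: does not follow the plane
abnormal = 3.0 * rng.normal(size=(50, 5))

# "Train" on normal data only: the top-2 principal directions act as encoder/decoder
mean = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
components = vt[:2]

def reconstruction_error(x):
    """Per-row MSE between x and its reconstruction from the 2-D code."""
    code = (x - mean) @ components.T       # encode
    recon = code @ components + mean       # decode
    return np.mean((x - recon) ** 2, axis=1)

err_normal = reconstruction_error(normal)
err_abnormal = reconstruction_error(abnormal)

# Illustrative threshold: 99th percentile of the error on normal data
threshold = np.percentile(err_normal, 99)
flags = err_abnormal > threshold
print('flagged fraction of abnormal rows:', flags.mean())
```

Rows that resemble the training data reconstruct well and stay below the threshold, while rows that do not fit the learned structure produce a large error.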

In [None]:
!pygmentize turbofan_autoencoder_keras_tf.py

## Train with TensorFlow on the notebook instance (aka 'local mode')

In [None]:
tf_estimator = TensorFlow(entry_point='turbofan_autoencoder_keras_tf.py', 
                          role=role,
                          train_instance_count=1, 
                          train_instance_type='local',
                          framework_version='1.12', 
                          py_version='py3',
                          script_mode=True,
                          hyperparameters={'epochs': 2}
                         )

In [None]:
tf_estimator.fit({'training': training_input_path, 'validation': validation_input_path})

## Train with TensorFlow on a GPU instance

In [None]:
tf_estimator = TensorFlow(entry_point='turbofan_autoencoder_keras_tf.py', 
                          role=role,
                          train_instance_count=1, 
                          train_instance_type='ml.p3.2xlarge',
                          framework_version='1.12', 
                          py_version='py3',
                          script_mode=True,
                          hyperparameters={
                              'epochs': 30,
                              'batch-size': 256,
                              'learning-rate': 0.001}
                         )

In [None]:
tf_estimator.fit({'training': training_input_path, 'validation': validation_input_path})

### Model Training Performance

#### Training Evaluation

In [None]:
# plot validation and training progress
client = boto3.client('logs')
BASE_LOG_NAME = '/aws/sagemaker/TrainingJobs'

def plot_log(model):
    logs = client.describe_log_streams(logGroupName=BASE_LOG_NAME, logStreamNamePrefix=model._current_job_name)
    cw_log = client.get_log_events(logGroupName=BASE_LOG_NAME, logStreamName=logs['logStreams'][0]['logStreamName'])

    val = []
    train = []
    iteration = []
    count = 0
    for e in cw_log['events']:
        msg = e['message']
        if '/step' in msg:
            msg = msg.split(' ')
            #print(msg)
            train.append(float(msg[-4]))
            val.append(float(msg[-1]))
            iteration.append(count)
            count+=1

    fig, ax = plt.subplots(figsize=(15,10))
    plt.xlabel('Epoch')
    plt.ylabel('Error')
    train_plot,   = ax.plot(iteration,   train,   label='train')
    val_plot,   = ax.plot(iteration,   val,   label='validation')
    plt.legend(handles=[train_plot,val_plot])
    plt.grid()
    plt.show()

In [None]:
plot_log(tf_estimator)

# Deploy

### Deploy the trained model with Elastic Inference and Data Capture

Pricing for SageMaker by region is found [here](https://aws.amazon.com/sagemaker/pricing/)

In [None]:
%%time

import time
from sagemaker.model_monitor import DataCaptureConfig

tf_endpoint_name = 'turbo-fan-RUL-AE-'+time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
s3_capture_upload_path = 's3://{}/{}/monitoring/datacapture'.format(bucket_name, prefix)
print(s3_capture_upload_path)

#predictor = tf_estimator.deploy(initial_instance_count=1,
#                                   instance_type='ml.p2.xlarge')      # $1.125/hour in us-east-1

predictor = tf_estimator.deploy(initial_instance_count=1,
                         instance_type='ml.c5.large',        # $0.119/hour in us-east-1
                         accelerator_type='ml.eia1.medium',  # + $0.168/hour in us-east-1
                         endpoint_name=tf_endpoint_name,     # = ~74% cheaper than ml.p2.xlarge at these prices
                         data_capture_config=DataCaptureConfig(
                                enable_capture=True,
                                sampling_percentage=100,
                                destination_s3_uri=s3_capture_upload_path))

### OPTIONAL - Connect to deployed endpoint

In [None]:
# How do you connect to an already deployed endpoint
end_point_name = 'ENDPOINT-NAME'
predictor = sagemaker.tensorflow.model.TensorFlowPredictor(end_point_name,sagemaker_session=sess)

# Predict

We are going to calculate the mean squared error between the predicted and the expected values. This will be our reconstruction error.
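
As a worked example of this error, the sketch below computes the per-row mean squared error for two made-up samples. `x` and `x_hat` are illustrative arrays standing in for the scaled inputs and the model's reconstructions.

```python
import numpy as np

x = np.array([[1.0, 2.0], [3.0, 4.0]])      # made-up inputs
x_hat = np.array([[1.0, 2.0], [1.0, 4.0]])  # made-up reconstructions

# Mean of the squared differences across features, one value per row
mse = np.mean(np.power(x - x_hat, 2), axis=1)
print(mse)
```

The first row is reconstructed exactly, so its error is 0. In the second row one of the two features is off by 2, giving (4 + 0) / 2 = 2.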

In [None]:
# single predictions
test1 = np.array([ 0.23181664, -1.031415  ,  0.0200345 ,  0.49338209,  1.00451033,
       -1.45819216,  0.0041503 ,  0.03854353,  0.54958493, -0.95076123,
        0.16660914, -0.18772613,  0.64182997,  0.67217714, -0.82575155,
       -0.16161417])

result = predictor.predict(test1)
print(result)

test2 = np.array([ 1.05574627,  1.35571432, -0.57110458,  0.74633819,  0.67483162,
       -0.51925459,  0.77610555, -0.76845284, -0.2522413 , -0.04313786,
        0.31915843, -0.43000929,  0.683185  ,  0.67217714,  0.20974389,
        0.09380091])

result = predictor.predict(test2)
print(result)

Now let's list the data capture files stored in S3. You should expect to see different files from different time periods organized based on the hour in which the invocation occurred.

**Note that the delivery of capture data to Amazon S3 can take a couple of minutes, so the next cell might raise an error. If this happens, please retry after a minute.**

In [None]:
s3_client = boto3.Session().client('s3')
current_endpoint_capture_prefix = '{}/monitoring/datacapture/{}'.format(prefix, tf_endpoint_name)

result = s3_client.list_objects(Bucket=bucket_name, Prefix=current_endpoint_capture_prefix)
capture_files = ['s3://{0}/{1}'.format(bucket_name, capture_file.get("Key")) for capture_file in result.get('Contents')]

print("Capture Files: ")
print("\n ".join(capture_files))

We can also read the contents of one of these files and see how capture records are organized in JSON lines format.

In [None]:
!aws s3 cp {capture_files[0]} datacapture/captured_data_example.jsonl

import json
with open ("datacapture/captured_data_example.jsonl", "r") as myfile:
    data=myfile.read()

print(json.dumps(json.loads(data.split('\n')[0]), indent=2))

For each inference request, we get input data, output data and some metadata like the inference time captured and saved.
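
A captured file can be processed line by line. The sketch below parses a made-up two-line JSON Lines string; the field names (`captureData`, `endpointInput`, `endpointOutput`, `eventMetadata`) are assumed to follow the capture record layout and should be checked against the record printed by the previous cell. All values are invented for illustration.

```python
import json

# Two made-up capture records in JSON Lines form (one JSON object per line)
sample = "\n".join(
    json.dumps({
        "captureData": {
            "endpointInput": {"mode": "INPUT", "encoding": "JSON",
                              "data": "[0.1, 0.2]"},
            "endpointOutput": {"mode": "OUTPUT", "encoding": "JSON",
                               "data": '{"predictions": [[0.1, 0.3]]}'},
        },
        "eventMetadata": {"eventId": "example-%d" % i},
    })
    for i in range(2)
)

# Each non-empty line is an independent JSON document
records = [json.loads(line) for line in sample.splitlines() if line.strip()]

for rec in records:
    # The payloads are stored as strings, so they are decoded a second time
    request = json.loads(rec["captureData"]["endpointInput"]["data"])
    response = json.loads(rec["captureData"]["endpointOutput"]["data"])
    print(rec["eventMetadata"]["eventId"], request, response["predictions"])
```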


#### Reconstruction error without failure

In [None]:
# run the validation data through the trained model
y_pred_val = predictor.predict(val)['predictions']
mse = np.mean(np.power(val - y_pred_val, 2), axis=1)

error_df_val = pd.DataFrame({'reconstruction_error': mse,'true_class': np.zeros(len(mse))})
error_df_val.describe(percentiles=[.50,.90,.95,.99,.999,.9999])

#### Reconstruction error with failure

In [None]:
# run the failed examples through the trained model
train_fail = train_df.loc[train_df['failed'] == 1]
train_fail = train_fail.drop(['failed','remaining_cycles'], axis=1)
train_fail_scaled = scaler.transform(train_fail)

In [None]:
y_pred = predictor.predict(train_fail_scaled)['predictions']
mse = np.mean(np.power(train_fail_scaled - y_pred, 2), axis=1)

error_df_val_fail = pd.DataFrame({'reconstruction_error': mse,'true_class': np.ones(len(mse))})
error_df_val_fail.describe(percentiles=[.50,.90,.95,.99,.999,.9999])

In [None]:
val_df = pd.concat([error_df_val,error_df_val_fail],ignore_index=True,axis=0)

In [None]:
fig = plt.figure(figsize=(15,10))
ax = fig.add_subplot(111)
_ = ax.hist(val_df[val_df['true_class']==0]['reconstruction_error'].values, bins=20,density=True,color='blue',edgecolor='black',alpha=0.5,label='normal')
_ = ax.hist(val_df[val_df['true_class']==1]['reconstruction_error'], bins=50,density=True,color='red',edgecolor='black',alpha=0.5,label='failed')
plt.legend()
plt.xlabel('reconstruction error, MSE')
plt.ylabel('normalized count')
#plt.ylim((0,50))

### Evaluation

In [None]:
fpr, tpr, thresholds = roc_curve(val_df.true_class, val_df.reconstruction_error)
roc_auc = auc(fpr, tpr)

plt.title('Receiver Operating Characteristic')
plt.plot(fpr, tpr, label='AUC = %0.4f'% roc_auc)
plt.legend(loc='lower right')
plt.plot([0,1],[0,1],'r--')
plt.xlim([-0.001, 1])
plt.ylim([0, 1.001])
plt.ylabel('True Positive Rate')
plt.xlabel('False Positive Rate')
plt.show();

In [None]:
precision, recall, th = precision_recall_curve(val_df.true_class, val_df.reconstruction_error)
plt.plot(recall, precision, 'b', label='Precision-Recall curve')
plt.title('Recall vs Precision')
plt.xlabel('Recall')
plt.ylabel('Precision')
plt.show()

In [None]:
# precision has one more element than th; the final element has no threshold, so drop it
plt.plot(th, precision[:-1], 'b', label='Threshold-Precision curve')
plt.title('Precision for different threshold values')
plt.xlabel('Threshold')
plt.ylabel('Precision')
plt.show()

In [None]:
# recall has one more element than th; the final element has no threshold, so drop it
plt.plot(th, recall[:-1], 'b', label='Threshold-Recall curve')
plt.title('Recall for different threshold values')
plt.xlabel('Reconstruction error')
plt.ylabel('Recall')
plt.show()

In [None]:
threshold = 1

In [None]:
groups = val_df.groupby('true_class')
fig, ax = plt.subplots(figsize=(15,10))

for name, group in groups:
    ax.plot(group.index, group.reconstruction_error, marker='o', ms=3.5, linestyle='',
            label= "Fail" if name == 1 else "Normal")
ax.hlines(threshold, ax.get_xlim()[0], ax.get_xlim()[1], colors="r", zorder=100, label='Threshold')
ax.legend()
plt.title("Reconstruction error for different classes")
plt.ylabel("Reconstruction error")
plt.xlabel("Data point index")
plt.show();

In [None]:
# apply a threshold to the scores (here: reconstruction errors) to create labels
def to_labels(pos_probs, threshold):
    return (pos_probs >= threshold).astype('int')

In [None]:
# define thresholds
thresholds = np.arange(0, 2, 0.001)
# evaluate each threshold
scores = [f1_score(val_df['true_class'], to_labels(val_df.reconstruction_error, t)) for t in thresholds]
# get best threshold
ix = np.argmax(scores)
print('Threshold=%.3f, F-Score=%.5f' % (thresholds[ix], scores[ix]))

In [None]:
t = 0.965
pd.crosstab(index=val_df['true_class'], columns=to_labels(val_df.reconstruction_error,t), rownames=['actuals'], colnames=['predictions'])
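
The threshold search above can be reproduced on a toy example. The sketch below uses made-up reconstruction errors and labels and computes F1 by hand with NumPy rather than with `f1_score`, so the helper `f1_at` is illustrative only.

```python
import numpy as np

errors = np.array([0.12, 0.25, 0.31, 0.84, 0.93, 1.27])  # made-up reconstruction errors
labels = np.array([0, 0, 0, 1, 0, 1])                     # made-up true classes (1 = failed)

def f1_at(threshold):
    """F1 score when rows with error >= threshold are predicted as failed."""
    pred = (errors >= threshold).astype(int)
    tp = np.sum((pred == 1) & (labels == 1))
    fp = np.sum((pred == 1) & (labels == 0))
    fn = np.sum((pred == 0) & (labels == 1))
    return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + fn)

# Evaluate a grid of candidate thresholds and keep the first best one
thresholds = np.arange(0, 2, 0.1)
scores = [f1_at(t) for t in thresholds]
best = thresholds[int(np.argmax(scores))]
print('Threshold=%.1f, F-Score=%.3f' % (best, max(scores)))
```

Several neighboring thresholds can give the same F1 score; `np.argmax` returns the first of them, so the reported threshold is the lowest one that reaches the maximum.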

# Model Monitor 

The generic steps for configuring the Model Monitor are:

1.  Deploy a model with data capture enabled
2.  Make predictions on the deployed model
3.  Create a baseline
4.  Create a monitor

## Baseline

From our validation dataset, let's ask Amazon SageMaker to suggest a set of baseline constraints and generate descriptive statistics for our features. Note that we use the validation dataset in this workshop to keep baselining time short, and that the data is saved with a .csv file extension, since baselining jobs require that extension by default.

In practice, you would likely use a larger dataset as the baseline.

In [None]:
s3 = boto3.resource('s3')

df = pd.DataFrame(val)
df.to_csv('val.csv',index=False)
baseline_path = sess.upload_data('val.csv', key_prefix=prefix+'/monitoring/baselining/data')

In [None]:
baseline_path

In [None]:
baseline_data_path = 's3://{0}/{1}/monitoring/baselining/data'.format(bucket_name, prefix)
baseline_results_path = 's3://{0}/{1}/monitoring/baselining/results'.format(bucket_name, prefix)

print(baseline_data_path)
print(baseline_results_path)

Please note that running the baselining job will require 8-10 minutes. In the meantime, you can take a look at the Deequ library, used to execute these analyses with the default Model Monitor container: https://github.com/awslabs/deequ

In [None]:
from sagemaker.model_monitor import DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

my_default_monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type='ml.c5.4xlarge',
    volume_size_in_gb=5,
    max_runtime_in_seconds=3600,
)

In [None]:
my_default_monitor.suggest_baseline(
    baseline_dataset=baseline_data_path,
    dataset_format=DatasetFormat.csv(header=True),
    output_s3_uri=baseline_results_path,
    wait=True
)

Let's display the statistics that were generated by the baselining job.

In [None]:
baseline_job = my_default_monitor.latest_baselining_job
schema_df = pd.json_normalize(baseline_job.baseline_statistics().body_dict["features"])
schema_df.head()

Then, we can also visualize the constraints.

In [None]:
constraints_df = pd.json_normalize(baseline_job.suggested_constraints().body_dict["features"])
constraints_df.head()



The baselining job has inspected the validation dataset and generated constraints and statistics that will be used to monitor our endpoint.

## Monitor

Once we have built the baseline for our data, we can enable endpoint monitoring by creating a monitoring schedule.
When the schedule fires, a monitoring job is kicked off and inspects the data captured at the endpoint against the baseline; it then generates report files that can be used to analyze the monitoring results.

### Create Monitoring Schedule - DO NOT RUN

In [None]:
from sagemaker.model_monitor import CronExpressionGenerator

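endpoint_name = predictor.endpoint

# S3 location for the monitoring reports; reports_path was not defined earlier,
# so this path is an assumed choice that follows the other monitoring prefixes
reports_path = 's3://{}/{}/monitoring/reports'.format(bucket_name, prefix)

mon_schedule_name = 'turbofan-RUL-AE-monitor' + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())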
my_default_monitor.create_monitoring_schedule(
    monitor_schedule_name=mon_schedule_name,
    endpoint_input=endpoint_name,
    output_s3_uri=reports_path,
    statistics=my_default_monitor.baseline_statistics(),
    constraints=my_default_monitor.suggested_constraints(),
    schedule_cron_expression=CronExpressionGenerator.hourly(),
    enable_cloudwatch_metrics=True
)

### Describe Monitoring Schedule

In [None]:
desc_schedule_result = my_default_monitor.describe_schedule()
desc_schedule_result

### Delete Monitoring Schedule

Once the schedule is created, it will kick off jobs at the specified intervals. Note that if you check right after creating the hourly schedule, you might find the list of executions empty.
You might have to wait until you cross the hour boundary (in UTC) to see executions kick off.

In [None]:
# Note: this is just for the purpose of running this example.
my_default_monitor.delete_monitoring_schedule()

### What is produced by the Monitor Job?  
Since we won't have time today to feed the endpoint with predictions and see the monitor execute, we've included some example files to show what the monitor outputs.

In [None]:
!aws s3 cp s3://aws-machine-learning-immersion-day/resources/statistics.json ./data/statistics.json
!aws s3 cp s3://aws-machine-learning-immersion-day/resources/constraints.json ./data/constraints.json
!aws s3 cp s3://aws-machine-learning-immersion-day/resources/constraint_violations.json ./data/constraint_violations.json

In [None]:
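import json
import pandas as pd

# None disables column-width truncation (-1 is deprecated since pandas 1.0)
pd.set_option('display.max_colwidth', None)

# read the example violations report; the context manager closes the file
with open('./data/constraint_violations.json', 'r') as file:
    data = file.read()

violations_df = pd.json_normalize(json.loads(data)['violations'])
violations_df.head(10)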

You might be asking yourself what types of violations are monitored and how drift from the baseline is computed.

The types of violations monitored are listed here: https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-interpreting-violations.html. Most of them use configurable thresholds, that are specified in the monitoring configuration section of the baseline constraints JSON. Let's take a look at this configuration from the baseline constraints file:

In [None]:
import json
with open ("./data/constraints.json", "r") as myfile:
    data=myfile.read()

print(json.dumps(json.loads(data)['monitoring_config'], indent=2))

This configuration is interpreted when the monitoring job is executed and is used to compare captured data to the baseline. If you want to customize this section, you will have to update the **constraints.json** file and upload it back to Amazon S3 before launching the monitoring job.

When data distributions are compared to detect potential drift, you can choose either a _Simple_ or a _Robust_ comparison method; the latter is preferred when dealing with small datasets. Additional info: https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-byoc-constraints.html.
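
Customizing that section amounts to editing the JSON and writing it back. The sketch below works on a made-up, minimal constraints document whose keys (`monitoring_config`, `distribution_constraints`, `comparison_method`, `comparison_threshold`) are assumed to match the configuration printed above; the new values are arbitrary, and the upload back to Amazon S3 is left out.

```python
import json

# Made-up minimal constraints document; a real file also lists per-feature constraints
constraints = {
    "version": 0.0,
    "features": [],
    "monitoring_config": {
        "evaluate_constraints": "Enabled",
        "emit_metrics": "Enabled",
        "distribution_constraints": {
            "perform_comparison": "Enabled",
            "comparison_threshold": 0.1,
            "comparison_method": "Robust",
        },
    },
}

# Switch the drift comparison method and loosen the threshold (illustrative values)
dist = constraints["monitoring_config"]["distribution_constraints"]
dist["comparison_method"] = "Simple"
dist["comparison_threshold"] = 0.2

# Serialize the updated document; this string is what would be written to constraints.json
updated = json.dumps(constraints, indent=2)
print(updated)
```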

# Clean Up

If you are done with this notebook, please run the cell below.  This will remove the hosted endpoint you created and avoid any charges from a stray instance being left on.

In [None]:
predictor.delete_endpoint()