## Train, tune, and deploy a custom ML model using QML based RUL Forcast Using IOT Sensors Algorithm from AWS Marketplace 

A hybrid quantum algorithm-based solution that forecasts the remaining useful life (RUL) of a system from IoT sensor data in order to predict failures.

This sample notebook shows you how to train a custom ML model using the QML based RUL Forcast Using IOT Sensors algorithm from AWS Marketplace.

> **Note**: This is a reference notebook and it cannot run unless you make the changes suggested in the notebook.

#### Pre-requisites:
1. **Note**: This notebook contains elements that render correctly in the Jupyter interface. Open this notebook from an Amazon SageMaker Notebook Instance or Amazon SageMaker Studio.
1. Ensure that the IAM role used has the **AmazonSageMakerFullAccess** policy attached.
1. Some hands-on experience using [Amazon SageMaker](https://aws.amazon.com/sagemaker/).
1. To use this algorithm successfully, ensure that:
    1. Either your IAM role has these three permissions and you have the authority to make AWS Marketplace subscriptions in the AWS account used:
        1. **aws-marketplace:ViewSubscriptions**
        1. **aws-marketplace:Unsubscribe**
        1. **aws-marketplace:Subscribe**
    2. or your AWS account has a subscription to QML based RUL Forcast Using IOT Sensors.
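
The three Marketplace permissions listed above could be granted with an IAM policy along the following lines. This is a minimal sketch: the action names come from the list above, while the `Sid`, the wildcard `Resource`, and the helper name `marketplace_subscription_policy` are illustrative assumptions, not something the listing prescribes.

```python
import json

# Action names are taken from the prerequisites list above.
MARKETPLACE_ACTIONS = [
    "aws-marketplace:ViewSubscriptions",
    "aws-marketplace:Unsubscribe",
    "aws-marketplace:Subscribe",
]


def marketplace_subscription_policy():
    """Build an IAM policy document that allows the three Marketplace actions.

    The Sid and the wildcard Resource are assumptions made for illustration.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowMarketplaceSubscriptions",
                "Effect": "Allow",
                "Action": MARKETPLACE_ACTIONS,
                "Resource": "*",
            }
        ],
    }


# Serialize the policy so it can be pasted into the IAM console or a template.
policy_json = json.dumps(marketplace_subscription_policy(), indent=2)
print(policy_json)
```

Your organization may require a narrower policy, so treat this only as a starting point.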

#### Contents:
1. [Subscribe to the algorithm](#1.-Subscribe-to-the-algorithm)
1. [Prepare dataset](#2.-Prepare-dataset)
	1. [Dataset format expected by the algorithm](#A.-Dataset-format-expected-by-the-algorithm)
	1. [Configure and visualize train and test dataset](#B.-Configure-and-visualize-train-and-test-dataset)
	1. [Upload datasets to Amazon S3](#C.-Upload-datasets-to-Amazon-S3)
1. [Train a machine learning model](#3:-Train-a-machine-learning-model)
	1. [Set up environment](#3.1-Set-up-environment)
	1. [Train a model](#3.2-Train-a-model)
1. [Deploy model and verify results](#4:-Deploy-model-and-verify-results)
    1. [Deploy trained model](#A.-Deploy-trained-model)
    1. [Create input payload](#B.-Create-input-payload)
    1. [Perform real-time inference](#C.-Perform-real-time-inference)
    1. [Visualize output](#D.-Visualize-output)
    1. [Calculate relevant metrics](#E.-Calculate-relevant-metrics)
    1. [Delete the endpoint](#F.-Delete-the-endpoint)
1. [Tune your model! (optional)](#5:-Tune-your-model!-(optional))
	1. [Tuning Guidelines](#A.-Tuning-Guidelines)
	1. [Define Tuning configuration](#B.-Define-Tuning-configuration)
	1. [Run a model tuning job](#C.-Run-a-model-tuning-job)
1. [Perform Batch inference](#6.-Perform-Batch-inference)
1. [Clean-up](#7.-Clean-up)
	1. [Delete the model](#A.-Delete-the-model)
	1. [Unsubscribe to the listing (optional)](#B.-Unsubscribe-to-the-listing-(optional))


#### Usage instructions
You can run this notebook one cell at a time (press Shift+Enter to run a cell).

### 1. Subscribe to the algorithm

To subscribe to the algorithm:
1. Open the algorithm listing page for QML based RUL Forcast Using IOT Sensors.
1. On the AWS Marketplace listing, click the **Continue to subscribe** button.
1. On the **Subscribe to this software** page, review the terms and click **Accept Offer** if you agree with the EULA, pricing, and support terms.
1. Once you click the **Continue to configuration** button and choose a **region**, you will see a **Product Arn**. This is the algorithm ARN that you need to specify when training a custom ML model. Copy the ARN corresponding to your region and specify it in the following cell.

In [1]:
algo_arn='arn:aws:sagemaker:us-east-2:786796469737:algorithm/q-pred-maintenance'
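
Because the ARN is region-specific, it can help to check it before training. The sketch below splits an algorithm ARN into its parts so the region can be compared with the region of your SageMaker session. The helper name `parse_algorithm_arn` is hypothetical; the ARN used is the one from the cell above.

```python
def parse_algorithm_arn(arn):
    """Split a SageMaker algorithm ARN into partition, region, account, and name."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sagemaker":
        raise ValueError(f"Not a SageMaker ARN: {arn!r}")
    resource_type, _, resource_name = parts[5].partition("/")
    if resource_type != "algorithm" or not resource_name:
        raise ValueError(f"Not an algorithm ARN: {arn!r}")
    return {
        "partition": parts[1],
        "region": parts[3],
        "account": parts[4],
        "name": resource_name,
    }


arn_parts = parse_algorithm_arn(
    "arn:aws:sagemaker:us-east-2:786796469737:algorithm/q-pred-maintenance"
)
print(arn_parts["region"], arn_parts["name"])
```

In the notebook you could then compare `arn_parts["region"]` with `sagemaker_session.boto_region_name` once the session has been created.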

### 2. Prepare dataset

In [2]:
import base64
import json
import uuid
from sagemaker import ModelPackage
import sagemaker as sage
from sagemaker import get_execution_role
from urllib.parse import urlparse
import boto3
from IPython.display import Image
from PIL import Image as ImageEdit
import urllib.request
import numpy as np

#### A. Dataset format expected by the algorithm

Usage Instructions:
- This algorithm takes a ZIP file named "train.zip" as input. The ZIP file should contain the training file, named “train.csv”.
- The target column should be named “RUL”.
- The output is a CSV file containing the test data with the predicted RUL appended as an extra column.
- A hyperparameters file holds the user-controlled parameters.
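
The packaging rules above can be sketched as follows: write the rows to `train.csv`, check for the `RUL` target column, and store the file in a ZIP archive. The helper name `build_training_zip` and the toy rows are assumptions for illustration; real training data would carry the full set of sensor columns.

```python
import csv
import io
import os
import tempfile
import zipfile


def build_training_zip(rows, fieldnames, zip_path):
    """Write rows to train.csv inside a ZIP archive, checking for the RUL column."""
    if "RUL" not in fieldnames:
        raise ValueError('the training data needs a target column named "RUL"')
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # The algorithm expects the member to be named exactly train.csv.
        archive.writestr("train.csv", buffer.getvalue())
    return zip_path


# Toy rows; the feature columns and values are made up for illustration.
sample_rows = [
    {"id": 1, "cycle": 1, "s2": 0.54, "RUL": 191},
    {"id": 1, "cycle": 2, "s2": 0.15, "RUL": 190},
]
zip_path = build_training_zip(
    sample_rows,
    ["id", "cycle", "s2", "RUL"],
    os.path.join(tempfile.mkdtemp(), "train.zip"),
)
with zipfile.ZipFile(zip_path) as archive:
    member_names = archive.namelist()
    header_line = archive.read("train.csv").decode().splitlines()[0]
print(member_names, header_line)
```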

#### B. Configure and visualize train and test dataset

In [3]:
training_dataset='training/train.zip'

In [4]:
test_dataset='testing/test.csv'

In [5]:
import pandas as pd
df = pd.read_csv(test_dataset)
df.head()

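| Unnamed: 0.1 | Unnamed: 0 | id | cycle | setting1 | setting2 | setting3 | s1 | s2 | s3 | s4 | ... | s15 | s16 | s17 | s18 | s19 | s20 | s21 | cycle_norm | label1 | label2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 1 | 0.632184 | 0.75 | 0 | 0 | 0.545181 | 0.310661 | 0.269413 | ... | 0.308965 | 0 | 0.333333 | 0 | 0 | 0.55814 | 0.661834 | 0.0 | 0 | 0 |
| 1 | 1 | 1 | 2 | 0.344828 | 0.25 | 0 | 0 | 0.150602 | 0.379551 | 0.222316 | ... | 0.213159 | 0 | 0.416667 | 0 | 0 | 0.682171 | 0.686827 | 0.00277 | 0 | 0 |
| 2 | 2 | 1 | 3 | 0.517241 | 0.583333 | 0 | 0 | 0.376506 | 0.346632 | 0.322248 | ... | 0.458638 | 0 | 0.416667 | 0 | 0 | 0.728682 | 0.721348 | 0.00554 | 0 | 0 |
| 3 | 3 | 1 | 4 | 0.741379 | 0.5 | 0 | 0 | 0.370482 | 0.285154 | 0.408001 | ... | 0.257022 | 0 | 0.25 | 0 | 0 | 0.666667 | 0.66211 | 0.00831 | 0 | 0 |
| 4 | 4 | 1 | 5 | 0.58046 | 0.5 | 0 | 0 | 0.391566 | 0.352082 | 0.332039 | ... | 0.300885 | 0 | 0.166667 | 0 | 0 | 0.658915 | 0.716377 | 0.01108 | 0 | 0 |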


#### C. Upload datasets to Amazon S3

In [6]:
sagemaker_session = sage.Session()
bucket=sagemaker_session.default_bucket()
bucket

'sagemaker-us-east-2-786796469737'

In [7]:
training_data=sagemaker_session.upload_data(training_dataset, bucket=bucket, key_prefix='QML_based_RUL_Forcast_Using_IOT_Sensors')
test_data=sagemaker_session.upload_data(test_dataset, bucket=bucket, key_prefix='QML_based_RUL_Forcast_Using_IOT_Sensors')

In [8]:
print("Training input uploaded to " + training_data)

Training input uploaded to s3://sagemaker-us-east-2-786796469737/QML_based_RUL_Forcast_Using_IOT_Sensors/train.zip
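
`upload_data` returns a full S3 URI, whereas many boto3 calls take the bucket and key separately. The sketch below splits a URI with `urlparse`, which the notebook already imports; the helper name `split_s3_uri` is hypothetical.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Return (bucket, key) for an s3:// URI."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # The path starts with a slash that is not part of the object key.
    return parsed.netloc, parsed.path.lstrip("/")


bucket_name, object_key = split_s3_uri(
    "s3://sagemaker-us-east-2-786796469737/QML_based_RUL_Forcast_Using_IOT_Sensors/train.zip"
)
print(bucket_name)
print(object_key)
```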


## 3: Train a machine learning model

Now that the dataset is available in an accessible Amazon S3 bucket, we are ready to train a machine learning model.

### 3.1 Set up environment

In [9]:
role = get_execution_role()


In [10]:
output_location = 's3://{}/QML_based_RUL_Forcast_Using_IOT_Sensors/{}'.format(bucket, 'output')

### 3.2 Train a model

You can find more information about the hyperparameters in the **Hyperparameters** section of the QML based RUL Forcast Using IOT Sensors listing.

In [11]:
#Define hyperparameters
hyperparameters={"batch_size": "1",
                 "epochs": "1",
                 "number_qubits": "2",
                 "reps": "1"}

For information on creating an `Estimator` object, see the [documentation](https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html).

In [12]:
#Create an estimator object for running a training job
estimator = sage.algorithm.AlgorithmEstimator(
    algorithm_arn=algo_arn,
    base_job_name="QML-based-RUL-Forcast-Using-IOT-Sensors",
    role=role,
    input_mode="File",
    output_path=output_location,
    sagemaker_session=sagemaker_session,
    hyperparameters=hyperparameters,
    # SageMaker Python SDK v2 argument names (the v1 train_instance_* names were renamed)
    instance_count=1,
    instance_type='ml.m5.large'
)
#Run the training job on the "training" channel.
estimator.fit({"training": training_data})

2022-06-01 09:38:21 Starting - Starting the training job...
2022-06-01 09:38:45 Starting - Preparing the instances for training
ProfilerReport-1654076300: InProgress
......
2022-06-01 09:39:45 Downloading - Downloading input data...
2022-06-01 09:40:05 Training - Downloading the training image.....
Reading file
File read
Starting the training.
batch size is 1
  return F.mse_loss(input, target, reduction=self.reduction)
Training [0]    Loss: 31501.6504
Time per Epoch is 1.6819391250610352
Training [0]    Loss: 20899.3740
Time per Epoch is 1.7915678024291992
Training [0]    Loss: 25647.5384
Time per Epoch is 1.8783807754516602
Training [0]    Loss: 21758.8374
Time per Epoch is 1.9788920879364014
Training [0]    Loss: 24662.1785
Time per Epoch is 2.0566887855529785
Training [0]    Loss: 24225.3167
Time per Epoch is 2.136986255645752
Trainin


2022-06-01 09:41:45 Completed - Training job completed
Training seconds: 107
Billable seconds: 107
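
To track the loss values printed in the training log, the lines can be parsed with a regular expression. This is a minimal sketch: the pattern is inferred from the log lines shown above, the bracketed number is assumed to be an epoch index, and the helper name `extract_losses` is hypothetical. Raw CloudWatch-style logs encode a tab as `#011`, so the pattern accepts that as well as ordinary whitespace.

```python
import re

# Accept "#011" (an encoded tab), a real tab, or spaces between the fields.
LOSS_PATTERN = re.compile(
    r"Training \[(\d+)\](?:#011|\s)+Loss:\s*([0-9]+(?:\.[0-9]+)?)"
)


def extract_losses(log_text):
    """Return a list of (index, loss) tuples found in the training log text."""
    return [
        (int(index), float(loss))
        for index, loss in LOSS_PATTERN.findall(log_text)
    ]


# Two lines in the shape of the log above, one per separator style.
sample_log = (
    "Training [0]#011Loss: 31501.6504\n"
    "Time per Epoch is 1.6819391250610352\n"
    "Training [0]    Loss: 20899.3740\n"
)
losses = extract_losses(sample_log)
print(losses)
```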


See this [blog-post](https://aws.amazon.com/blogs/machine-learning/easily-monitor-and-visualize-metrics-while-training-models-on-amazon-sagemaker/) for more information on how to visualize metrics during training. You can also open the training job from the [Amazon SageMaker console](https://console.aws.amazon.com/sagemaker/home?#/jobs/) and monitor the metrics/logs in the **Monitor** section.

### 4: Deploy model and verify results

Now you can deploy the model for performing real-time inference.

In [13]:
model_name='QML_based_RUL_Forcast_Using_IOT_Sensors_Inference'

content_type='text/csv'

real_time_inference_instance_type='ml.m5.large'
batch_transform_inference_instance_type='ml.m5.large'

#### A. Deploy trained model

In [14]:
# SageMaker Python SDK v2 replaces csv_serializer with the CSVSerializer class
from sagemaker.serializers import CSVSerializer
predictor = estimator.deploy(1, real_time_inference_instance_type, serializer=CSVSerializer())

..........
-----!

Once the endpoint is created, you can perform real-time inference.

#### B. Create input payload

In [15]:
df = pd.read_csv("testing/test.csv")

In [16]:
df

| Unnamed: 0.1 | Unnamed: 0 | id | cycle | setting1 | setting2 | setting3 | s1 | s2 | s3 | s4 | ... | s15 | s16 | s17 | s18 | s19 | s20 | s21 | cycle_norm | label1 | label2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 1 | 0.632184 | 0.750000 | 0 | 0 | 0.545181 | 0.310661 | 0.269413 | ... | 0.308965 | 0 | 0.333333 | 0 | 0 | 0.558140 | 0.661834 | 0.000000 | 0 | 0 |
| 1 | 1 | 1 | 2 | 0.344828 | 0.250000 | 0 | 0 | 0.150602 | 0.379551 | 0.222316 | ... | 0.213159 | 0 | 0.416667 | 0 | 0 | 0.682171 | 0.686827 | 0.002770 | 0 | 0 |
| 2 | 2 | 1 | 3 | 0.517241 | 0.583333 | 0 | 0 | 0.376506 | 0.346632 | 0.322248 | ... | 0.458638 | 0 | 0.416667 | 0 | 0 | 0.728682 | 0.721348 | 0.005540 | 0 | 0 |
| 3 | 3 | 1 | 4 | 0.741379 | 0.500000 | 0 | 0 | 0.370482 | 0.285154 | 0.408001 | ... | 0.257022 | 0 | 0.250000 | 0 | 0 | 0.666667 | 0.662110 | 0.008310 | 0 | 0 |
| 4 | 4 | 1 | 5 | 0.580460 | 0.500000 | 0 | 0 | 0.391566 | 0.352082 | 0.332039 | ... | 0.300885 | 0 | 0.166667 | 0 | 0 | 0.658915 | 0.716377 | 0.011080 | 0 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 683 | 683 | 8 | 9 | 0.511494 | 0.666667 | 0 | 0 | 0.259036 | 0.427731 | 0.237002 | ... | 0.301654 | 0 | 0.416667 | 0 | 0 | 0.612403 | 0.579536 | 0.022161 | 0 | 0 |
| 684 | 684 | 8 | 10 | 0.528736 | 0.500000 | 0 | 0 | 0.433735 | 0.270329 | 0.440918 | ... | 0.416314 | 0 | 0.333333 | 0 | 0 | 0.565891 | 0.630351 | 0.024931 | 0 | 0 |
| 685 | 685 | 8 | 11 | 0.488506 | 0.583333 | 0 | 0 | 0.358434 | 0.357968 | 0.311951 | ... | 0.462101 | 0 | 0.416667 | 0 | 0 | 0.542636 | 0.654653 | 0.027701 | 0 | 0 |
| 686 | 686 | 8 | 12 | 0.632184 | 0.416667 | 0 | 0 | 0.409639 | 0.264879 | 0.559757 | ... | 0.394382 | 0 | 0.333333 | 0 | 0 | 0.558140 | 0.623723 | 0.030471 | 0 | 0 |


#### C. Perform real-time inference

In [17]:
file_name = "testing/test.csv"
output_file_name = "inference_out.csv"

In [18]:
!aws sagemaker-runtime invoke-endpoint \
    --endpoint-name $predictor.endpoint_name \
    --body fileb://$file_name \
    --content-type $content_type \
    --region $sagemaker_session.boto_region_name \
    $output_file_name


{
    "ContentType": "text/csv; charset=utf-8",
    "InvokedProductionVariant": "AllTraffic"
}
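
The same request can be made from Python instead of the AWS CLI. The sketch below assumes a boto3 `sagemaker-runtime` client is passed in and uses its `invoke_endpoint` call; the helper name `invoke_with_csv` and the `StubRuntimeClient` class are hypothetical, and the stub exists only so the example runs offline without an endpoint.

```python
import io
import os
import tempfile


def invoke_with_csv(runtime_client, endpoint_name, csv_path, output_path,
                    content_type="text/csv"):
    """Send a CSV file to an endpoint and save the response body to disk."""
    with open(csv_path, "rb") as source:
        payload = source.read()
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType=content_type,
        Body=payload,
    )
    with open(output_path, "wb") as target:
        target.write(response["Body"].read())
    return response["ContentType"]


class StubRuntimeClient:
    """Offline stand-in that echoes the payload back in a boto3-shaped response."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        return {
            "ContentType": "text/csv; charset=utf-8",
            "Body": io.BytesIO(Body),
        }


# Offline demonstration with a tiny made-up CSV file.
workdir = tempfile.mkdtemp()
csv_in = os.path.join(workdir, "test.csv")
csv_out = os.path.join(workdir, "inference_out.csv")
with open(csv_in, "w") as handle:
    handle.write("id,cycle\n1,1\n")
returned_type = invoke_with_csv(StubRuntimeClient(), "demo-endpoint", csv_in, csv_out)
print(returned_type)
```

Against a live endpoint you would instead pass `boto3.client("sagemaker-runtime", region_name=sagemaker_session.boto_region_name)` together with `predictor.endpoint_name`, `file_name`, and `output_file_name`.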


#### D. Visualize output

In [19]:
result = pd.read_csv("inference_out.csv", header=None)
result

| Unnamed: 0 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Unnamed: 0 | id | cycle | setting1 | setting2 | setting3 | s1 | s2 | s3 | s4 | ... | s16 | s17 | s18 | s19 | s20 | s21 | cycle_norm | label1 | label2 | Result |
| 1 | 0 | 1 | 1 | 0.632183908 | 0.75 | 0 | 0 | 0.545180723 | 0.310660562 | 0.269412559 | ... | 0 | 0.333333333 | 0 | 0 | 0.558139535 | 0.6618337479999999 | 0.0 | 0 | 0 | 0.6407925 |
| 2 | 1 | 1 | 2 | 0.344827586 | 0.25 | 0 | 0 | 0.15060241 | 0.379550905 | 0.222316003 | ... | 0 | 0.41666666700000005 | 0 | 0 | 0.682170543 | 0.686826843 | 0.002770083 | 0 | 0 | 0.6407925 |
| 3 | 2 | 1 | 3 | 0.517241379 | 0.583333333 | 0 | 0 | 0.376506024 | 0.346631785 | 0.32224848100000003 | ... | 0 | 0.41666666700000005 | 0 | 0 | 0.728682171 | 0.7213476940000001 | 0.005540166 | 0 | 0 | 0.6407925 |
| 4 | 3 | 1 | 4 | 0.74137931 | 0.5 | 0 | 0 | 0.37048192799999996 | 0.285153695 | 0.40800135 | ... | 0 | 0.25 | 0 | 0 | 0.666666667 | 0.662109914 | 0.008310249 | 0 | 0 | 0.6407925 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 684 | 683 | 8 | 9 | 0.511494253 | 0.666666667 | 0 | 0 | 0.259036145 | 0.42773054299999996 | 0.237002026 | ... | 0 | 0.41666666700000005 | 0 | 0 | 0.6124031010000001 | 0.57953604 | 0.022160665 | 0 | 0 | 0.6407927 |
| 685 | 684 | 8 | 10 | 0.5287356320000001 | 0.5 | 0 | 0 | 0.43373494 | 0.270329191 | 0.440918298 | ... | 0 | 0.333333333 | 0 | 0 | 0.565891473 | 0.630350732 | 0.024930748 | 0 | 0 | 0.6407927 |
| 686 | 685 | 8 | 11 | 0.48850574700000005 | 0.583333333 | 0 | 0 | 0.358433735 | 0.357968171 | 0.311951384 | ... | 0 | 0.41666666700000005 | 0 | 0 | 0.542635659 | 0.654653411 | 0.027700831000000002 | 0 | 0 | 0.6407927 |
| 687 | 686 | 8 | 12 | 0.632183908 | 0.41666666700000005 | 0 | 0 | 0.409638554 | 0.26487900600000003 | 0.559756921 | ... | 0 | 0.333333333 | 0 | 0 | 0.558139535 | 0.623722729 | 0.030470913999999998 | 0 | 0 | 0.6407927 |
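
In the output above, the last column (`Result`) holds the prediction, and the visible values are nearly identical (0.6407925 to 0.6407927). A quick summary of that column makes this kind of behavior easy to spot. The sketch below uses only the standard library and reads the first row as a header; the helper name `summarize_predictions` is hypothetical, and the sample keeps just three of the rows shown above with most columns dropped.

```python
import csv
import io
import statistics


def summarize_predictions(csv_text, column="Result"):
    """Return count, min, max, mean, and spread for one numeric CSV column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in reader]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "spread": max(values) - min(values),
    }


# Three rows taken from the output above, reduced to id, cycle, and Result.
sample_output = (
    "id,cycle,Result\n"
    "1,1,0.6407925\n"
    "1,2,0.6407925\n"
    "8,12,0.6407927\n"
)
summary = summarize_predictions(sample_output)
print(summary)
```

A spread close to zero suggests the model is predicting almost the same value for every row, which would be consistent with the very short training run configured earlier (one epoch).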


#### F. Delete the endpoint

Now that you have successfully performed a real-time inference, you do not need the endpoint anymore. You can delete it to avoid being charged.

In [20]:
predictor.delete_endpoint(delete_endpoint_config=True)

Since this is an experiment, you do not need to run a hyperparameter tuning job. However, if you would like to see how to tune a model trained using a third-party algorithm with Amazon SageMaker's hyperparameter tuning functionality, you can run the optional tuning step.

### 6. Perform Batch inference

In this section, you will perform batch inference using multiple input payloads together.

In [21]:
#upload the batch-transform job input files to S3
transform_input_folder = "testing/test.csv"
transform_input = sagemaker_session.upload_data(transform_input_folder, key_prefix=model_name) 
print("Transform input uploaded to " + transform_input)

Transform input uploaded to s3://sagemaker-us-east-2-786796469737/QML_based_RUL_Forcast_Using_IOT_Sensors_Inference/test.csv


In [None]:
#Run the batch-transform job
transformer = estimator.transformer(1, batch_transform_inference_instance_type)
transformer.transform(transform_input, content_type=content_type)
transformer.wait()

..........
........................
Starting the inference server with 2 workers.
[2022-06-01 09:50:01 +0000] [9] [INFO] Starting gunicorn 20.1.0
[2022-06-01 09:50:01 +0000] [9] [INFO] Listening at: unix:/tmp/gunicorn.sock (9)
[2022-06-01 09:50:01 +0000] [9] [INFO] Using worker: gevent
[2022-06-01 09:50:01 +0000] [13] [INFO] Booting worker with pid: 13
[2022-06-01 09:50:01 +0000] [14] [INFO] Booting worker with pid: 14


In [None]:
#output is available on following path
transformer.output_path
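
Batch transform writes one result object per input file, named after the input with an `.out` suffix, under the transformer's output path. The sketch below builds that location; the helper name `expected_transform_output` and the `example-bucket` paths are hypothetical placeholders.

```python
import posixpath


def expected_transform_output(output_path, input_uri):
    """Return the S3 URI where batch transform should write results for one input."""
    file_name = posixpath.basename(input_uri)
    return output_path.rstrip("/") + "/" + file_name + ".out"


# Placeholder locations; substitute transformer.output_path and transform_input.
result_uri = expected_transform_output(
    "s3://example-bucket/example-transform-job/",
    "s3://example-bucket/QML_based_RUL_Forcast_Using_IOT_Sensors_Inference/test.csv",
)
print(result_uri)
```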

### 7. Clean-up

#### A. Delete the model

In [None]:
# Delete the SageMaker model that was created for the batch transform job
transformer.delete_model()

#### B. Unsubscribe to the listing (optional)

If you would like to unsubscribe from the algorithm, follow these steps. Before you cancel the subscription, ensure that you do not have any [deployable model](https://console.aws.amazon.com/sagemaker/home#/models) created from the model package or using the algorithm. Note: you can find this information by looking at the container name associated with the model.

**Steps to unsubscribe from the product on AWS Marketplace**:
1. Navigate to the __Machine Learning__ tab on the [__Your Software subscriptions page__](https://aws.amazon.com/marketplace/ai/library?productType=ml&ref_=mlmp_gitdemo_indust).
2. Locate the listing whose subscription you want to cancel, and then choose __Cancel Subscription__.

