## Execute Robustness Metrics for Tabular data Algorithm from AWS Marketplace 


The solution measures prominent robustness metrics for a Keras-based classifier for tabular data.


This sample notebook shows you how to execute the Robustness Metrics for Tabular data algorithm from AWS Marketplace.

> **Note**: This is a reference notebook and it will not run until you make the changes suggested in the notebook.

#### Pre-requisites:
1. **Note**: This notebook contains elements that render correctly in the Jupyter interface. Open this notebook from an Amazon SageMaker Notebook Instance or Amazon SageMaker Studio.
1. Ensure that the IAM role used has the **AmazonSageMakerFullAccess** policy attached.
1. Some hands-on experience using [Amazon SageMaker](https://aws.amazon.com/sagemaker/).
1. To use this algorithm successfully, ensure that:
    1. Either your IAM role has these three permissions and you have the authority to make AWS Marketplace subscriptions in the AWS account used:
        1. **aws-marketplace:ViewSubscriptions**
        1. **aws-marketplace:Unsubscribe**
        1. **aws-marketplace:Subscribe**
    2. Or your AWS account already has a subscription to Robustness Metrics for Tabular data.

#### Contents:
1. [Subscribe to the algorithm](#1.-Subscribe-to-the-algorithm)
1. [Prepare dataset](#2.-Prepare-dataset)
	1. [Dataset format expected by the algorithm](#A.-Dataset-format-expected-by-the-algorithm)
	1. [Configure dataset](#B.-Configure-dataset)
	1. [Upload datasets to Amazon S3](#C.-Upload-datasets-to-Amazon-S3)
1. [Calculate Robustness Measures](#3.-Calculate-Robustness-Measures)
	1. [Set up environment](#3.1-Set-up-environment)
	1. [Calculate Measures](#3.2-Calculate-Measures)
	1. [Inspect the Output in S3](#3.3-Inspect-the-Output-in-S3)
1. [Clean-up](#4.-Clean-up)
	1. [Unsubscribe to the listing (optional)](#Unsubscribe-to-the-listing-(optional))


#### Usage instructions
You can run this notebook one cell at a time (press Shift+Enter to run a cell).

### 1. Subscribe to the algorithm

To subscribe to the algorithm:
1. Open the algorithm listing page for Robustness Metrics for Tabular data.
1. On the AWS Marketplace listing, click the **Continue to subscribe** button.
1. On the **Subscribe to this software** page, review and click **Accept Offer** if you agree with the EULA, pricing, and support terms.
1. Once you click the **Continue to configuration** button and choose a **region**, you will see a **Product Arn**. This is the algorithm ARN that you need to specify when creating the training job. Copy the ARN corresponding to your region and specify it in the following cell.

In [1]:
algo_arn ='arn:aws:sagemaker:us-east-2:786796469737:algorithm/robustness-measures-tabular'
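The ARN is region-specific, so a quick check that it matches the region you are working in can prevent a failed job later. The following is a minimal sketch, not part of the algorithm; `arn_region` is a hypothetical helper name introduced here for illustration.

```python
def arn_region(arn: str) -> str:
    """Return the region field of an ARN (arn:partition:service:region:account:resource)."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"Not a valid ARN: {arn!r}")
    return parts[3]


# The ARN from the cell above; replace it with the one shown for your region.
example_arn = "arn:aws:sagemaker:us-east-2:786796469737:algorithm/robustness-measures-tabular"
print(arn_region(example_arn))  # prints us-east-2
```

Once a SageMaker session exists, the returned value can be compared with `sagemaker_session.boto_region_name`.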

### 2. Prepare dataset

In [2]:
import base64
import json 
import uuid
from sagemaker import ModelPackage
import sagemaker as sage
from sagemaker import get_execution_role
from urllib.parse import urlparse
import boto3
import urllib.request
import numpy as np
import tarfile
from zipfile import ZipFile
import pandas as pd
from pprint import pprint

#### A. Dataset format expected by the algorithm

For best results, provide the data in the following format:
* The input file must be named data.zip.
* Within the zip file, the following four files must be provided: train.csv, test.csv, model.h5 and eps.json.
* For detailed instructions, refer to the sample notebook and the algorithm input details.
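The packaging step described above can be sketched with the standard library. This is an illustration only: `build_input_archive` and `REQUIRED_FILES` are hypothetical names, and placing the four files at the top level of the archive is an assumption, since the listing does not state the expected layout inside data.zip.

```python
import os
from zipfile import ZipFile, ZIP_DEFLATED

# File names taken from the format description above.
REQUIRED_FILES = ("train.csv", "test.csv", "model.h5", "eps.json")


def build_input_archive(source_dir: str, archive_path: str) -> list:
    """Zip the four required files from source_dir into archive_path."""
    missing = [name for name in REQUIRED_FILES
               if not os.path.isfile(os.path.join(source_dir, name))]
    if missing:
        raise FileNotFoundError(f"Missing required input files: {missing}")
    os.makedirs(os.path.dirname(archive_path) or ".", exist_ok=True)
    with ZipFile(archive_path, "w", ZIP_DEFLATED) as archive:
        for name in REQUIRED_FILES:
            # arcname puts each file at the top level of the archive (an assumption).
            archive.write(os.path.join(source_dir, name), arcname=name)
    # Re-open the archive to confirm what was written.
    with ZipFile(archive_path) as archive:
        return archive.namelist()
```

For example, `build_input_archive("my_files", "Input/data.zip")` would write the archive to the path configured in the next step, where `my_files` is a placeholder for the directory holding your four files.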

#### B. Configure dataset

In [3]:
training_dataset='Input/data.zip'

#### C. Upload datasets to Amazon S3

In [4]:
sagemaker_session = sage.Session()
bucket=sagemaker_session.default_bucket()

In [5]:
# training input location
common_prefix = "robustness-measures-tabular"
training_input_prefix = common_prefix + "/training-input-data"
TRAINING_WORKDIR = "Input"
training_input = sagemaker_session.upload_data(TRAINING_WORKDIR, key_prefix=training_input_prefix)
print("Training input uploaded to " + training_input)

Training input uploaded to s3://sagemaker-us-east-2-786796469737/robustness-measures-tabular/training-input-data


## 3. Calculate Robustness Measures

Now that the dataset is available in an accessible Amazon S3 bucket, we are ready to calculate the robustness measures.

### 3.1 Set up environment

In [9]:
role = get_execution_role()

In [6]:
output_location = 's3://{}/robustness-measures-tabular/{}'.format(bucket, 'output')

### 3.2 Calculate Measures

For information on creating an `Estimator` object, see the [documentation](https://sagemaker.readthedocs.io/en/stable/api/training/estimators.html).

In [7]:
training_instance_type='ml.m5.large'

In [10]:
# Create an estimator object for running a training job.
estimator = sage.algorithm.AlgorithmEstimator(
    algorithm_arn=algo_arn,
    base_job_name="robustness-measures-tabular",
    role=role,
    # SageMaker Python SDK v2 argument names; the deprecated train_instance_* duplicates were removed.
    instance_count=1,
    instance_type=training_instance_type,
    input_mode="File",
    output_path=output_location,
    sagemaker_session=sagemaker_session,
)
# Run the training job.
estimator.fit({"training": training_input})

2022-07-28 11:07:05 Starting - Starting the training job...
2022-07-28 11:07:28 Starting - Preparing the instances for trainingProfilerReport-1659006425: InProgress
......
2022-07-28 11:08:29 Downloading - Downloading input data...
2022-07-28 11:08:51 Training - Downloading the training image......
2022-07-28 11:10:02 Training - Training image download completed. Training in progress..
2022-07-28 11:10:05.632117: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-07-28 11:10:05.632181: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Instructions for updating:
non-resource variables are not supported in the long term
Starting the train functiom.
2022-07-28 11:10:14.814063: W tensorflow/stream_executor/pla

See this [blog post](https://aws.amazon.com/blogs/machine-learning/easily-monitor-and-visualize-metrics-while-training-models-on-amazon-sagemaker/) for more information on how to visualize metrics during the process. You can also open the training job from the [Amazon SageMaker console](https://console.aws.amazon.com/sagemaker/home?#/jobs/) and monitor the metrics and logs in the **Monitor** section.

In [11]:
# The output is available at the following path.
estimator.output_path

's3://sagemaker-us-east-2-786796469737/robustness-measures-tabular/output'

> **Note**: Inference is performed within the training pipeline. A real-time inference endpoint or batch transform job is not required.

### 3.3 Inspect the Output in S3

In [12]:
from urllib.parse import urlparse

# Split the output location into the bucket name and the key of the job artifact.
parsed_url = urlparse(estimator.output_path)
bucket_name = parsed_url.netloc
file_key = parsed_url.path[1:]+'/'+estimator.latest_training_job.job_name+'/output/'+"model.tar.gz"

s3_client = sagemaker_session.boto_session.client('s3')

# Use the bucket parsed from the output path rather than assuming the default bucket.
response = s3_client.get_object(Bucket=bucket_name, Key=file_key)

In [13]:
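# Build the artifact key from the parsed output path instead of a fixed split index,
# so it still works if the output prefix has more than one path segment.
bucketFolder = urlparse(estimator.output_path).path.lstrip('/') + '/' + estimator.latest_training_job.job_name + '/output/model.tar.gz'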
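SageMaker writes the job artifact to `<output_path>/<job_name>/output/model.tar.gz`, which is the key the cells above assemble. The sketch below shows the same derivation as a small function that can be checked without AWS access; `artifact_location` and the job name `example-job-name` are hypothetical and used only for illustration.

```python
from urllib.parse import urlparse


def artifact_location(output_path: str, job_name: str):
    """Split an s3:// output path into (bucket, key) for the job's model.tar.gz."""
    parsed = urlparse(output_path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Expected an s3:// URI, got {output_path!r}")
    prefix = parsed.path.strip("/")
    # Skip the prefix when the output path is the bucket root.
    key = "/".join(part for part in (prefix, job_name, "output", "model.tar.gz") if part)
    return parsed.netloc, key


# Output path from the notebook; the job name is a made-up placeholder.
bucket_name, key = artifact_location(
    "s3://sagemaker-us-east-2-786796469737/robustness-measures-tabular/output",
    "example-job-name",
)
print(bucket_name)
print(key)
```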

In [20]:
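import os
import boto3

s3_conn = boto3.client("s3")
bucket_name = bucket
# Create the local output directory if it does not exist yet; open() fails otherwise.
os.makedirs('Output', exist_ok=True)
with open('Output/output.tar.gz', 'wb') as f:
    s3_conn.download_fileobj(bucket_name, bucketFolder, f)
print("Output file loaded from bucket")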

Output file loaded from bucket


In [22]:
# Unpack the job artifact, then read the JSON results from the nested zip file.
with tarfile.open('Output/output.tar.gz') as tar:
    tar.extractall('./Output')
with ZipFile('./Output/result.zip', "r") as output_zip:
    res = json.loads(output_zip.read('result.txt').decode('utf-8'))
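As an alternative to extracting the archive to disk, the nested files can be read in memory. This is a sketch under the assumption, taken from the cell above, that the tarball contains `result.zip` and that `result.zip` contains a JSON document named `result.txt`; `read_result` is a hypothetical helper name.

```python
import io
import json
import tarfile
from zipfile import ZipFile


def read_result(tar_path: str) -> dict:
    """Load result.txt (JSON) from result.zip inside a .tar.gz without extracting to disk."""
    with tarfile.open(tar_path, "r:gz") as tar:
        # Match on the base name so both 'result.zip' and './result.zip' are found.
        member = next(
            (m for m in tar.getmembers()
             if m.isfile() and m.name.split("/")[-1] == "result.zip"),
            None,
        )
        if member is None:
            raise FileNotFoundError("result.zip not found in " + tar_path)
        zip_bytes = tar.extractfile(member).read()
    with ZipFile(io.BytesIO(zip_bytes)) as output_zip:
        return json.loads(output_zip.read("result.txt").decode("utf-8"))
```

Calling `read_result('Output/output.tar.gz')` should return the same dictionary as `res` above, provided the archive has the layout assumed here.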

In [23]:
print('Results:')
pprint(res)

Results:
{'Clever Score': '0.28338301539885163',
 'Difference in Accuracy': '14.2%',
 'Loss Sensitivity': '6.8732057',
 'Robustness Score': '0.07968053'}
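The metric values are returned as strings, one of them with a percent sign, so they need converting before any comparison or plotting. A minimal sketch follows; `to_numeric` is a hypothetical helper, and it simply drops a trailing `%` without rescaling the value.

```python
def to_numeric(results: dict) -> dict:
    """Convert string metric values to floats; a trailing '%' is dropped, not rescaled."""
    return {name: float(str(value).strip().rstrip("%")) for name, value in results.items()}


# Values copied from the output above.
res_example = {
    "Clever Score": "0.28338301539885163",
    "Difference in Accuracy": "14.2%",
    "Loss Sensitivity": "6.8732057",
    "Robustness Score": "0.07968053",
}
numeric = to_numeric(res_example)
print(numeric["Difference in Accuracy"])  # prints 14.2
```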


### 4. Clean-up

#### Unsubscribe to the listing (optional)

If you would like to unsubscribe from the algorithm, follow these steps. Before you cancel the subscription, ensure that you do not have any [deployable model](https://console.aws.amazon.com/sagemaker/home#/models) created from the model package or using the algorithm. Note: you can find this information by looking at the container name associated with the model.

**Steps to unsubscribe from the product on AWS Marketplace**:
1. Navigate to the __Machine Learning__ tab on [__Your Software subscriptions page__](https://aws.amazon.com/marketplace/ai/library?productType=ml&ref_=mlmp_gitdemo_indust)
2. Locate the listing that you want to cancel the subscription for, and then choose __Cancel Subscription__ to cancel the subscription.

