# Label Inspector in AWS Marketplace 


cleanlab's [Label Inspector (Tabular)](TODO: marketplace listing url) automatically detects label errors in your tabular classification dataset. All you need is your dataset containing class labels and the feature values of each datapoint, and we will flag examples that potentially have erroneous labels.

This sample notebook will show you how to use the Label Inspector in Amazon SageMaker.

> **Note**: This is a reference notebook. It will not run until you make the changes indicated in the notebook, such as specifying the model package ARN.

## Prerequisites:
1. **Note**: This notebook contains elements that render correctly in the Jupyter interface. Open this notebook from an Amazon SageMaker Notebook Instance or Amazon SageMaker Studio.
1. Ensure that the IAM role used has the **AmazonSageMakerFullAccess** policy attached.
1. To deploy this ML model successfully, ensure that:
    1. Either your IAM role has these three permissions and you have the authority to make AWS Marketplace subscriptions in the AWS account used:
        1. **aws-marketplace:ViewSubscriptions**
        1. **aws-marketplace:Unsubscribe**
        1. **aws-marketplace:Subscribe**  
    2. or your AWS account already has a subscription to [Label Inspector](TODO: marketplace listing url). If so, skip the step [Subscribe to the model package](#1.-Subscribe-to-the-model-package).
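
As a rough illustration of the first option, the three AWS Marketplace permissions could be grouped into a single IAM policy document. This is only a sketch: the helper name `build_marketplace_policy` is hypothetical, and the wildcard `Resource` is an assumption that your administrator may want to tighten.

```python
import json

# The three AWS Marketplace actions listed in the prerequisites above.
MARKETPLACE_ACTIONS = [
    "aws-marketplace:ViewSubscriptions",
    "aws-marketplace:Unsubscribe",
    "aws-marketplace:Subscribe",
]


def build_marketplace_policy(actions=MARKETPLACE_ACTIONS):
    """Return an IAM policy document (as a dict) that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                # Assumption: the actions are not scoped to a specific resource.
                "Resource": "*",
            }
        ],
    }


# Serialize the policy so it can be pasted into the IAM console or reviewed.
policy_json = json.dumps(build_marketplace_policy(), indent=2)
print(policy_json)
```

The printed JSON is meant for review with whoever manages IAM in your account, not as a policy to attach unchanged.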

## Contents:
1. [Subscribe to the model package](#1.-Subscribe-to-the-model-package)
2. [Define the model and input data](#2.-Define-the-model-and-input-data)
   1. [Define model package](#A.-Define-model-package)
   2. [Create input payload](#B.-Create-input-payload)
3. [Perform batch inference](#3.-Perform-batch-inference) 
    1. [Run batch transform job](#A.-Run-batch-transform-job)
    2. [Visualize output](#B.-Visualize-output)
4. [Clean-up](#4.-Clean-up)
    1. [Delete the model](#A.-Delete-the-model)
    2. [Unsubscribe to the listing (optional)](#B.-Unsubscribe-to-the-listing-(optional))
    

## Usage instructions
You can run this notebook one cell at a time by pressing Shift+Enter in each cell.

## 1. Subscribe to the model package

To subscribe to the model package:
1. Open the model package listing page [Label Inspector](TODO: marketplace listing url).
1. On the AWS Marketplace listing, click on the **Continue to subscribe** button.
1. On the **Subscribe to this software** page, review the terms and click on **Accept Offer** if you and your organization agree with the EULA, pricing, and support terms.
1. Once you click on the **Continue to configuration** button and choose a **region**, you will see a **Product Arn** displayed. This is the model package ARN that you need to specify when creating a deployable model using Boto3. Copy the ARN corresponding to your region and specify it in the following cell.

In [None]:
model_package_arn = "<Specify the Model package ARN for Label Inspector obtained from AWS Marketplace>"
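
Before going further, it can help to confirm that the value you pasted looks like a model package ARN and not the unedited placeholder. The sketch below is a minimal check, assuming the usual ARN layout `arn:<partition>:sagemaker:<region>:<account-id>:model-package/<name>`; the helper name `parse_model_package_arn` and the example ARN (account ID and package name) are made up for illustration.

```python
import re

# Assumed general shape of a SageMaker model package ARN:
#   arn:<partition>:sagemaker:<region>:<account-id>:model-package/<name>
_ARN_PATTERN = re.compile(
    r"^arn:(?P<partition>aws[a-z-]*):sagemaker:(?P<region>[a-z0-9-]+):"
    r"(?P<account>\d{12}):model-package/(?P<name>.+)$"
)


def parse_model_package_arn(arn):
    """Split a model package ARN into its parts, or raise ValueError."""
    match = _ARN_PATTERN.match(arn)
    if match is None:
        raise ValueError(f"Not a model package ARN: {arn!r}")
    return match.groupdict()


# Hypothetical ARN used only to demonstrate the helper.
example_arn = "arn:aws:sagemaker:us-east-1:123456789012:model-package/example-package"
parts = parse_model_package_arn(example_arn)
print(parts["region"], parts["name"])
```

Calling `parse_model_package_arn(model_package_arn)` with the placeholder string still in place raises `ValueError`, which is an early hint that the ARN was not filled in.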

In [None]:
from sagemaker import ModelPackage
import sagemaker as sage
from sagemaker import get_execution_role
import boto3

import pandas as pd

In [None]:
role = get_execution_role()

sagemaker_session = sage.Session()

bucket = sagemaker_session.default_bucket()
runtime = boto3.client("runtime.sagemaker")

## 2. Define the model and input data

If you want to understand how real-time inference with Amazon SageMaker works, see [Documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-hosting.html).

In [None]:
model_name = "label-inspector-tabular"
content_type = "text/csv"
batch_transform_inference_instance_type = "ml.m5.xlarge"

### A. Define model package

In [None]:
# Create a model from the model package.
model = ModelPackage(
    role=role, model_package_arn=model_package_arn, sagemaker_session=sagemaker_session
)

### B. Create input payload

Here is a sample input that is accepted by the Label Inspector package.

In [None]:
sample_payload = pd.read_csv("data/input/payload.csv")

sample_payload.head(5)

## 3. Perform batch inference

In this section, you will perform batch inference using multiple input payloads together. If you are not familiar with batch transform, and want to learn more, see these links:
1. [How it works](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-batch.html)
2. [How to run a batch transform job](https://docs.aws.amazon.com/sagemaker/latest/dg/ex1-batch-transform.html)

### A. Run batch transform job

In [None]:
# Upload the batch-transform job input files to S3
transform_input_folder = "data/input/"
transform_input = sagemaker_session.upload_data(transform_input_folder, key_prefix=model_name)
print("Transform input uploaded to " + transform_input)
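
The uploaded location is an `s3://` URI made of a bucket name followed by a key prefix. If you need those two parts separately, for example to list objects with Boto3, a small helper along these lines can split them. This is a sketch only: `split_s3_uri` is a hypothetical helper and `example-bucket` is a made-up bucket name.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # The path starts with "/", which is not part of the object key.
    return parsed.netloc, parsed.path.lstrip("/")


# Made-up URI in the same shape as the upload location printed above.
bucket_name, key_prefix = split_s3_uri("s3://example-bucket/label-inspector-tabular")
print(bucket_name, key_prefix)
```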

In [None]:
# Run the batch-transform job
transformer = model.transformer(
    instance_count=1, instance_type=batch_transform_inference_instance_type
)
transformer.transform(transform_input, content_type=content_type)
transformer.wait()
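
If you would rather check on the job from a separate session instead of blocking the notebook, the job status can also be polled through the Boto3 `describe_transform_job` call. The following is a minimal sketch, not part of the Label Inspector package: the helper `wait_for_transform_job`, its polling defaults, and the choice of terminal statuses are assumptions made for illustration.

```python
import time

# Statuses after which the job will not change any further (assumed set).
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_transform_job(sm_client, job_name, poll_seconds=30, max_polls=240):
    """Poll a batch transform job until it finishes; return the final description.

    `sm_client` is expected to behave like boto3.client("sagemaker").
    """
    for _ in range(max_polls):
        response = sm_client.describe_transform_job(TransformJobName=job_name)
        status = response["TransformJobStatus"]
        if status in TERMINAL_STATUSES:
            if status != "Completed":
                reason = response.get("FailureReason", "no reason given")
                raise RuntimeError(f"Transform job ended as {status}: {reason}")
            return response
        time.sleep(poll_seconds)
    raise TimeoutError(f"Transform job {job_name!r} did not finish in time")
```

A possible call would be `wait_for_transform_job(boto3.client("sagemaker"), "<your transform job name>")`, where the job name is the one shown in the SageMaker console.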

In [None]:
# The batch transform output is available at the following S3 path
transformer.output_path

### B. Visualize output

After the batch transform job is complete, we can get the output data from the S3 bucket specified above. Then we will show a sample of what the output data will look like.

In [None]:
!aws s3 cp --recursive $transformer.output_path ./data/output/

In [None]:
!head ./data/output/payload.csv.out
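
Batch transform names each output file after its input file with `.out` appended, so the downloaded results can also be located and previewed from Python. The sketch below assumes the outputs were copied to a local directory such as `./data/output/`; the helpers `find_transform_outputs` and `preview` are hypothetical names, and no assumption is made about the columns inside the output.

```python
from itertools import islice
from pathlib import Path


def find_transform_outputs(output_dir):
    """Return all downloaded batch transform outputs (*.out), sorted by path."""
    return sorted(Path(output_dir).rglob("*.out"))


def preview(path, n_lines=5):
    """Return the first n_lines of a text file, without trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in islice(handle, n_lines)]


# Print a short preview of every output file found under ./data/output/.
for out_file in find_transform_outputs("./data/output/"):
    print(out_file)
    for line in preview(out_file):
        print("   ", line)
```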

## 4. Clean-up

### A. Delete the model

In [None]:
model.delete_model()

### B. Unsubscribe to the listing (optional)

If you would like to unsubscribe from the model package, follow these steps. Before you cancel the subscription, ensure that you do not have any batch transform jobs still in progress.

**Steps to unsubscribe from the product on AWS Marketplace**:
1. Navigate to the __Machine Learning__ tab on [__Your Software subscriptions page__](https://aws.amazon.com/marketplace/ai/library?productType=ml&ref_=mlmp_gitdemo_indust).
2. Locate the listing whose subscription you want to cancel, and then choose __Cancel Subscription__.

