## Package a machine learning model for listing on the AWS Marketplace

This sample notebook provides scripts you can use to package and verify your ML model for listing on AWS Marketplace. It walks through the end-to-end process by building a sample ML model based on the Iris plants dataset.

The following diagram provides an overview of the ML model packaging process. In [Step 1](#step1) you will train a simple model and store the model artifacts in a joblib file. In [Step 2](#step2) you will learn how to author scoring logic that loads the ML model, performs inference, and returns the prediction. In [Step 3](#step3) you will learn how to package the ML model into a Docker image. In [Step 4](#step4) you will push this Docker image into Amazon ECR. In [Step 5](#step5) you will learn how to package the ML model into a Model Package. In [Step 6](#step6) you will validate this ML model by deploying it with Amazon SageMaker. In [Step 7](#step7) you will learn about resources that guide you on how to list the ML model in AWS Marketplace.

<img src="images/ml-model-publishing-workflow.png"/>



**Pre-requisites** 
1. Before you start building an ML model, we strongly recommend watching this [video](https://www.youtube.com/watch?v=npilyL5xvV4) to understand the overall end-to-end ML model building and listing process.
2. You need to add the managed policy **[AmazonEC2ContainerRegistryFullAccess](https://docs.aws.amazon.com/AmazonECR/latest/userguide/security-iam-awsmanpol.html#security-iam-awsmanpol-AmazonEC2ContainerRegistryFullAccess)** to the role associated with your notebook instance.
3. To create listings on AWS Marketplace, you need to register your AWS account as a seller account by following the [seller registration process](https://docs.aws.amazon.com/marketplace/latest/userguide/seller-registration-process.html). This guide assumes that you run the notebook in the registered seller account.

**Note** - This example shows how to package a simple Python example that showcases a decision tree model built with the scikit-learn machine learning package. We recommend following the notebook once as written and then customizing it for your own ML model.

**Table of contents**
1. [Step 1 - Build ML model](#step1)
2. [Step 2 - Implement scoring logic](#step2)
3. [Step 3 - Package model artifacts and scoring logic into a Docker image](#step3)
    1. [Step 3.1: Build Docker image to be included in the ML model](#step31)
    2. [Step 3.2: Run Docker container](#step32)
4. [Step 4 - Push the Docker image into Amazon ECR](#step4)
5. [Step 5 - Create an ML Model Package](#step5)
6. [Step 6 - Validate model in Amazon SageMaker environment](#step6)
    1. [Step 6.1 Validate Real-time inference via Amazon SageMaker Endpoint](#step61)
    2. [Step 6.2 Validate batch inference via batch transform job](#step62)
7. [Step 7 - List ML model on AWS Marketplace](#step7)

Here we import all of the libraries needed throughout the notebook to complete the model packaging process. We also create the clients necessary to interact with the various services needed (e.g., ECR, SageMaker, and S3).

In [None]:
import base64
import boto3
import docker
import json
import pandas as pd
import requests
import sagemaker as sage
from sagemaker import get_execution_role, ModelPackage
import socket
import time
from urllib.parse import urlparse

# Training specific imports
from joblib import dump, load
from sklearn import tree
import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets
from src.scoring_logic import IrisLabel

# Common variables
session = sage.Session()
s3_bucket = session.default_bucket()
region = session.boto_region_name
account_id = boto3.client("sts").get_caller_identity().get("Account")
role = get_execution_role()

sagemaker = boto3.client("sagemaker")
s3_client = session.boto_session.client("s3")
ecr = boto3.client("ecr")
sm_runtime = boto3.client("sagemaker-runtime")

The model name will be re-used through various parts of the packaging and publishing process.

In [None]:
# Define parameters
model_name = "my-flower-detection-model"

### <a name="step1"></a> Step 1: Build ML model

For the purposes of this sample, this section builds a simple classification model using the [Iris plants dataset](https://scikit-learn.org/stable/datasets/toy_dataset.html#iris-plants-dataset) and then serializes it using joblib.

In [None]:
iris = pd.read_csv("s3://sagemaker-sample-files/datasets/tabular/iris/iris.data", header=None)

features = iris.iloc[:, 0:4]
label = iris.iloc[:, 4].apply(
    lambda x: IrisLabel[x.replace("Iris-", "")].value
)  # Integer encode the labels

classifier = tree.DecisionTreeClassifier(random_state=0)
classifier = classifier.fit(features, label)

# Store the model
dump(classifier, "src/model-artifacts.joblib")

# Show the model
plt.figure(figsize=[15.4, 14.0])
tree.plot_tree(classifier, filled=True)
plt.show()

### <a name="step2"></a>Step 2: Implement scoring logic 

The supported input and output content types are left to the scoring logic. It is recommended to follow the [SageMaker standards](https://docs.aws.amazon.com/sagemaker/latest/dg/cdf-inference.html) for request and response formats where possible to provide a consistent experience to end users. The sample scoring logic provided in this example follows this standard.

In [None]:
!ls src

`scoring_logic.py` contains all the logic needed to take the HTTP requests that arrive via the SageMaker endpoint, translate them as needed to perform an inference, and return a properly formatted response.

In [None]:
!pygmentize src/scoring_logic.py
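As a rough illustration of the translation step described above, the following sketch parses the three input content types used later in this notebook (`text/csv`, `application/json`, and `application/jsonlines`) into rows of feature values. The helper name `parse_request` is hypothetical and is not taken from `src/scoring_logic.py`; the payload shapes are assumed to match the requests sent in Step 3.3.

```python
import json
from typing import List


def parse_request(body: str, content_type: str) -> List[List[float]]:
    """Turn a request body into a list of feature rows (hypothetical helper)."""
    if content_type == "text/csv":
        # One record per line, comma-separated feature values
        return [
            [float(value) for value in line.split(",")]
            for line in body.strip().splitlines()
            if line.strip()
        ]
    if content_type == "application/json":
        # Assumed shape: {"instances": [{"features": [...]}, ...]}
        instances = json.loads(body)["instances"]
        return [[float(value) for value in item["features"]] for item in instances]
    if content_type == "application/jsonlines":
        # Assumed shape: one {"features": [...]} object per line
        return [
            [float(value) for value in json.loads(line)["features"]]
            for line in body.strip().splitlines()
            if line.strip()
        ]
    raise ValueError(f"Unsupported content type: {content_type}")


rows = parse_request("5.1, 3.5, 1.4, 0.2\n6.5, 2.8, 4.6, 1.5", "text/csv")
print(rows)
```

Rejecting unknown content types explicitly makes it easier to return a clear client error instead of failing later during inference.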

Amazon SageMaker uses two URLs in the container:

* `/ping` will receive `GET` requests from the infrastructure. Your program returns 200 if the container is up and accepting requests.
* `/invocations` is the endpoint that receives client inference `POST` requests. The format of the request and the response is up to the algorithm. If the client supplied `ContentType` and `Accept` headers, these will be passed in as well. For advanced usage like request tracing, `CustomAttributes` can be used (more [details](https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-runtime-now-supports-the-customattribute-header/)).  All other headers will be stripped off by the SageMaker Endpoint.
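The two routes above can be sketched as a small WSGI application using only the Python standard library. This is an illustrative sketch, not the Flask application shipped in `src/scoring_logic.py`: the `predict` function is a hypothetical placeholder that only counts records, and the `call` helper exists just to exercise the app in-process without starting a server.

```python
import io
import json


def predict(rows):
    """Hypothetical stand-in for model inference: reports how many records arrived."""
    return {"received_records": len(rows)}


def app(environ, start_response):
    """Minimal WSGI app exposing the two routes SageMaker calls."""
    path = environ.get("PATH_INFO", "")
    method = environ.get("REQUEST_METHOD", "GET")

    if path == "/ping" and method == "GET":
        # Health check: a 200 response signals the container is up
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b""]

    if path == "/invocations" and method == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length).decode("utf-8")
        rows = [line for line in body.splitlines() if line.strip()]
        payload = json.dumps(predict(rows)).encode("utf-8")
        start_response("200 OK", [("Content-Type", "application/json")])
        return [payload]

    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


def call(path, method="GET", body=b""):
    """Invoke the WSGI app in-process, without starting a server."""
    environ = {
        "PATH_INFO": path,
        "REQUEST_METHOD": method,
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    payload = b"".join(app(environ, start_response))
    return captured["status"], payload


print(call("/ping"))
print(call("/invocations", "POST", b"5.1, 3.5, 1.4, 0.2"))
```

Because `app` is a plain WSGI callable, a WSGI server such as gunicorn could serve it on port 8080 inside the container.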

The container will have the model files in the same place they were written during training:

    /opt/ml
     -- model
        -- <model files>


#### Note on Inference pricing

When the buyer runs your software by hosting an endpoint to perform real-time inference, you can choose to set a price per inference or per hour that the endpoint is active. Batch transform processes always use hourly pricing.

With inference pricing, AWS Marketplace charges your buyer for each invocation of your endpoint with an HTTP response code of 2XX. However, in some cases, your software may process a batch of inferences in a single invocation. For an endpoint deployment, you can indicate a custom number of inferences that AWS Marketplace should charge the buyer for that single invocation. To do this, include a custom metering header in the HTTP response headers of your invocation, as in the following example.

```
X-Amzn-Inference-Metering: {"Dimension": "inference.count", "ConsumedUnits": 3}
```
This example shows an invocation that charges the buyer for three inferences. You can find more information in the [documentation](https://docs.aws.amazon.com/marketplace/latest/userguide/machine-learning-pricing.html).
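A minimal sketch of how scoring logic might build this header is shown below. The function name `metering_header` is hypothetical; the header name and the `Dimension` and `ConsumedUnits` fields are taken from the example above. How the header is attached to the response depends on the web framework you use.

```python
import json


def metering_header(consumed_units: int) -> dict:
    """Build the custom metering response header for one invocation (hypothetical helper)."""
    if consumed_units < 0:
        raise ValueError("consumed_units must not be negative")
    value = json.dumps({"Dimension": "inference.count", "ConsumedUnits": consumed_units})
    return {"X-Amzn-Inference-Metering": value}


# A request that carried three records is billed as three inferences
headers = metering_header(3)
print(headers["X-Amzn-Inference-Metering"])
```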

### <a name="step3"></a>Step 3: Package model artifacts and scoring logic into a Docker Image

##### Docker image

The provided Dockerfile packages the model artifacts and serving logic as well as installing all the dependencies needed at inference time (flask, gunicorn, sklearn).

In [None]:
!pygmentize src/Dockerfile

For clarity, this notebook shows a minimal example of how to create an inference image. However, for models that use common machine learning frameworks such as scikit-learn, TensorFlow, TensorFlow 2, PyTorch, and Apache MXNet, AWS provides [Deep Learning Containers](https://docs.aws.amazon.com/deep-learning-containers/latest/devguide/what-is-dlc.html) as well as [Scikit-learn and SparkML Containers](https://docs.aws.amazon.com/sagemaker/latest/dg/pre-built-docker-containers-scikit-learn-spark.html). These optimized Docker images greatly simplify the setup needed for model serving. Use them as base images when possible, because they are performance optimized for CPU, GPU, and Inferentia. [Detailed instructions](https://docs.aws.amazon.com/sagemaker/latest/dg/pre-built-containers-frameworks-deep-learning.html) are available for using the Deep Learning Containers.

Select the appropriate image (CPU/GPU/Inferentia/framework combination) and replace the ubuntu:18.04 base image when adapting this example notebook for your own model to take advantage of the prebuilt SageMaker containers. 

For additional performance optimization, [SageMaker Neo](https://docs.aws.amazon.com/sagemaker/latest/dg/neo.html) provides the ability to automatically optimize an existing model implemented in any common machine learning framework for deployment on cloud instances (including Inferentia).  To take advantage of SageMaker Neo, follow the instructions for [compilation](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-job-compilation.html) and [serving](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-deployment-hosting-services-prerequisites.html).

##### Serving application

`serve` is a minimal script for starting up an HTTP server to handle requests.  

Here we use [gunicorn](https://gunicorn.org/) as it is appropriate for a production deployment of [Flask](https://flask.palletsprojects.com/) applications. For more complex deployments the prebuilt SageMaker containers include the [SageMaker Inference Toolkit](https://github.com/aws/sagemaker-inference-toolkit).

In [None]:
!pygmentize -l bash src/serve

### How Amazon SageMaker runs your Docker container

Amazon SageMaker runs your container with the argument `serve`. How your container processes this argument depends on the container:

* In the example here, we don't define an `ENTRYPOINT` in the Dockerfile so Docker will run the command `train` at training time and `serve` at serving time. In this example, we define these as executable bash scripts, but they could be any program that we want to start in that environment.
* If you specify a program as an `ENTRYPOINT` in the Dockerfile, that program will be run at startup and the first argument will be `train` or `serve`. The program can then look at that argument and decide what to do.
* If you are building separate containers for training and hosting (or building only for one or the other), you can define a program as an `ENTRYPOINT` in the Dockerfile and ignore (or verify) the first argument passed in. 
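The `ENTRYPOINT` variant described in the second bullet can be sketched as a single program that inspects its first argument. This is an illustrative sketch, not the `serve` script used in this example; `run_train` and `run_serve` are hypothetical placeholders for the real training and serving start-up code.

```python
import sys


def run_train() -> int:
    """Placeholder for the training routine (hypothetical)."""
    print("Starting training (placeholder)")
    return 0


def run_serve() -> int:
    """Placeholder for starting the inference server (hypothetical)."""
    print("Starting inference server (placeholder)")
    return 0


COMMANDS = {"train": run_train, "serve": run_serve}


def main(argv) -> int:
    """Dispatch on the first argument passed to the container."""
    if len(argv) < 2 or argv[1] not in COMMANDS:
        program = argv[0] if argv else "entrypoint"
        print(f"usage: {program} [train|serve]", file=sys.stderr)
        return 2
    return COMMANDS[argv[1]]()


# In a real entrypoint script this would be: sys.exit(main(sys.argv))
exit_code = main(["entrypoint", "serve"])
print(exit_code)
```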


#### <a name="step31"></a>Step 3.1: Build Docker Image to be included in the ML model

In [None]:
docker_client = docker.from_env()

In [None]:
image, build_logs = docker_client.images.build(path="./src", tag=model_name)

#### <a name="step32"></a>Step 3.2: Run Docker container

In [None]:
port = 8080
SECONDS = 1000000000  # One second in nanoseconds

container = docker_client.containers.run(
    image,
    detach=True,
    name=model_name,
    command="serve",
    healthcheck={
        "test": f"curl -f http://localhost:{port}/ping || exit 1",
        "interval": 1 * SECONDS,  # One second
        "timeout": 1 * SECONDS,  # One second
    },
    ports={f"{port}/tcp": port},
)

# Wait until our server is ready
while docker_client.api.inspect_container(container.name)["State"]["Health"]["Status"] != "healthy":
    print("Waiting for server to become ready...")
    time.sleep(1)
    container.reload()
    print(
        f"Container is {docker_client.api.inspect_container(container.name)['State']['Health']['Status']}"
    )
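The polling loop above keeps waiting for as long as the container stays unhealthy. If you adapt it for your own image, a bounded wait may be easier to debug. The following is a minimal sketch of a generic helper; the name `wait_until` and its parameters are hypothetical and not part of the Docker SDK.

```python
import time


def wait_until(is_ready, timeout_s: float = 60.0, interval_s: float = 1.0) -> bool:
    """Poll is_ready() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Simulated readiness check that succeeds on the third poll
attempts = iter([False, False, True])
print(wait_until(lambda: next(attempts), timeout_s=5.0, interval_s=0.01))
```

In this notebook, the readiness check could be a lambda that compares the container's `State.Health.Status` value to `"healthy"`, as the loop above does.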

#### Step 3.3: Perform inference on the container

Test that we can send a single record in a request.

In [None]:
container_invocation_url = f"http://127.0.0.1:{port}/invocations"

r = requests.post(
    container_invocation_url,
    headers={"Content-Type": "text/csv"},
    data="5.1, 3.5, 1.4, 0.2",  # setosa labeled record from training set
)

print(r.json())

Next, try sending multiple records in a request.

In [None]:
# Three records from the training set corresponding to setosa, versicolor, and virginica labels respectively
csv_input_data = """
5.1, 3.5, 1.4, 0.2
6.5, 2.8, 4.6, 1.5
6.3, 2.9, 5.6, 1.8
""".strip()

print(csv_input_data)

In [None]:
r = requests.post(
    container_invocation_url, headers={"Content-Type": "text/csv"}, data=csv_input_data
)

print(r.json())

Next, try sending different supported input content types.

##### JSON input Content-Type

In [None]:
json_input_data = json.dumps(
    {
        "instances": [
            {"features": [5.1, 3.5, 1.4, 0.2]},  # setosa labeled record from training set
            {"features": [6.5, 2.8, 4.6, 1.5]},  # versicolor
            {"features": [6.3, 2.9, 5.6, 1.8]},  # virginica
        ]
    }
)

In [None]:
r = requests.post(
    container_invocation_url,
    headers={"Content-Type": "application/json"},
    data=json_input_data,
)

print(r.json())

##### JSON Lines input Content-Type

In [None]:
# Three records from the training set corresponding to setosa, versicolor, and virginica labels respectively
jsonlines_input_data = """
{\"features\": [5.1, 3.5, 1.4, 0.2]}
{\"features\": [6.5, 2.8, 4.6, 1.5]}
{\"features\": [6.3, 2.9, 5.6, 1.8]}
""".strip()

print(jsonlines_input_data)

In [None]:
r = requests.post(
    container_invocation_url,
    headers={"Content-Type": "application/jsonlines"},
    data=jsonlines_input_data,
)

print(r.json())

##### CSV output Content-Type

Test the response types by setting the `Accept` header to the desired type.

In [None]:
r = requests.post(
    container_invocation_url,
    headers={"Content-Type": "application/jsonlines", "Accept": "text/csv"},
    data=jsonlines_input_data,
)

print(r.text)

##### JSON Lines output Content-Type

In [None]:
r = requests.post(
    container_invocation_url,
    headers={
        "Content-Type": "application/jsonlines",
        "Accept": "application/jsonlines",
    },
    data=jsonlines_input_data,
)

print(r.text)

Note - If the container did not return the expected response, run the following command to see the logs.

In [None]:
print(container.logs().decode("utf-8"))

Congratulations! Now that you have successfully tested the container locally, you can stop and remove it.

In [None]:
container.stop()
container.remove()

### <a name="step4"></a>Step 4: Push the Docker image into Amazon ECR

Now that your Docker image is ready, you can push it to an Amazon ECR repository.

**NOTE:** The ECR repository must belong to the AWS account that is registered as a seller on the AWS Marketplace.

In [None]:
docker_image_arn = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{model_name}"
docker_image_arn

The following code shows how to tag the container image and push it to ECR using the Docker SDK for Python.

This code looks for an ECR repository in the account you're using and the current default region (if you're using an Amazon SageMaker notebook instance, this will be the region where the notebook instance was created). If the repository doesn't exist, the script will create it.

In [None]:
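# describe_repositories is paginated, so check every page for the repository
repo_exists = any(
    repo["repositoryName"] == model_name
    for page in ecr.get_paginator("describe_repositories").paginate()
    for repo in page["repositories"]
)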

if not repo_exists:
    ecr.create_repository(repositoryName=model_name)

In [None]:
ecr_auth_data = ecr.get_authorization_token()["authorizationData"][0]
# The decoded token has the form "<username>:<password>"; split only on the first colon
username, password = (
    base64.b64decode(ecr_auth_data["authorizationToken"]).decode("utf-8").split(":", 1)
)

docker_client.api.tag(model_name, docker_image_arn, tag="latest")
status = docker_client.api.push(
    docker_image_arn,
    tag="latest",
    auth_config={"username": username, "password": password},
)

### <a name="step5"></a>Step 5: Create an ML Model Package

In this section, we will see how you can package your artifacts (ECR image and the trained artifact from your previous training job) into a ModelPackage. Once you complete this, you can list your product as a pretrained model in the AWS Marketplace.

**NOTE:** If your model can be deployed on multiple hardware types (CPU/GPU/Inferentia), create a ModelPackage for each one and add them to the AWS Marketplace listing as separate versions, because the container image is generally different for each hardware type.

#### Model Package Definition
A Model Package is a reusable abstraction for model artifacts that packages all the ingredients necessary for inference. It consists of an inference specification that defines the inference image to use along with an optional model data location.

The ModelPackage must be created in the AWS account that is registered to be a seller on the AWS Marketplace.
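To make the definition above concrete, the following sketch assembles an inference specification dictionary in the shape that `create_model_package` accepts, with the model data location left optional. The helper name `build_inference_specification` and all example values (account ID, bucket, key) are placeholders chosen for illustration; `ModelDataUrl` is the container field that points to a model artifact tarball in S3.

```python
from typing import List, Optional


def build_inference_specification(
    image: str,
    content_types: List[str],
    response_types: List[str],
    realtime_instance_types: List[str],
    transform_instance_types: List[str],
    model_data_url: Optional[str] = None,
) -> dict:
    """Assemble an InferenceSpecification dictionary (hypothetical helper)."""
    container = {"Image": image}
    if model_data_url:
        # Only needed when artifacts are not baked into the image
        container["ModelDataUrl"] = model_data_url
    return {
        "Containers": [container],
        "SupportedTransformInstanceTypes": transform_instance_types,
        "SupportedRealtimeInferenceInstanceTypes": realtime_instance_types,
        "SupportedContentTypes": content_types,
        "SupportedResponseMIMETypes": response_types,
    }


# Placeholder values for illustration only
spec = build_inference_specification(
    image="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-flower-detection-model:latest",
    content_types=["text/csv"],
    response_types=["application/json"],
    realtime_instance_types=["ml.m4.xlarge"],
    transform_instance_types=["ml.m4.xlarge"],
    model_data_url="s3://example-bucket/example-prefix/model.tar.gz",
)
print(spec["Containers"])
```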

#### Step 5.1 Define parameters 

In [None]:
model_description = "This model accepts petal length, petal width, sepal length, sepal width and predicts whether flower is of type setosa, versicolor, or virginica"

supported_content_types = ["text/csv", "application/json", "application/jsonlines"]
supported_response_MIME_types = [
    "application/json",
    "text/csv",
    "application/jsonlines",
]

Creating a Model Package requires you to specify the following:
  1. Docker image
  2. Model artifacts
    - You can either package these inside the docker image, as we have done in this example, or provide them as a gzipped tarball.
  3. Validation specification 
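
For the tarball option in item 2 above, the following sketch shows one way to bundle artifact files into a gzipped tarball with the standard library. The helper name `package_model_artifacts` is hypothetical, and the file written here contains placeholder bytes rather than a real model. The resulting archive would still need to be uploaded to S3 and referenced from the model package.

```python
import tarfile
import tempfile
from pathlib import Path


def package_model_artifacts(artifact_paths, output_path) -> str:
    """Bundle artifact files into a gzipped tarball (hypothetical helper)."""
    with tarfile.open(str(output_path), "w:gz") as archive:
        for path in artifact_paths:
            # Store each file at the archive root, without its local directory
            archive.add(str(path), arcname=Path(path).name)
    return str(output_path)


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "model-artifacts.joblib"
    artifact.write_bytes(b"placeholder bytes, not a real model")
    archive_path = package_model_artifacts([artifact], Path(tmp) / "model.tar.gz")
    with tarfile.open(archive_path) as archive:
        member_names = archive.getnames()

print(member_names)
```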
        
Before a product can be listed on AWS Marketplace, SageMaker performs basic validations to give sellers and buyers confidence that the product works in Amazon SageMaker. The product can be listed only if this validation succeeds. The validation process uses the validation profile and sample data you provide to run a transform job in your account, which verifies that your inference image works with SageMaker.

Next, identify the right instance sizes for your ML model. You can do so by running performance tests against your model.
A [sample notebook](https://github.com/aws-samples/aws-marketplace-machine-learning/blob/master/right_size_your_sagemaker_endpoints/Right-sizing%20your%20Amazon%20SageMaker%20Endpoints.ipynb) is available to help identify the minimum suggested instance types.

**NOTE:** In addition to tuning, take into account the requirements of your model when identifying instance types.  If your model does not use GPU resources, then do not include GPU instance types.  Similarly, if your model does use GPU resources, but can only make use of a single GPU, do not include instance types that have multiple GPUs as it will lead to increased infrastructure charges for your customers with no performance benefit.

In [None]:
supported_realtime_inference_instance_types = ["ml.m4.xlarge"]
supported_batch_transform_instance_types = ["ml.m4.xlarge"]

In [None]:
validation_file_name = "input.csv"
validation_input_path = f"s3://{s3_bucket}/validation-input-csv/"
validation_output_path = f"s3://{s3_bucket}/validation-output-csv/"

First, we create sample data to be used in the validation stage of the ModelPackage creation and upload it to S3.

In [None]:
csv_line = "5.1, 3.5, 1.4, 0.2"

with open("input.csv", "w") as f:
    f.write(csv_line)

s3_client.put_object(Bucket=s3_bucket, Key="validation-input-csv/input.csv", Body=csv_line)

#### Step 5.2 Create Model Package 

In [None]:
model_package = sagemaker.create_model_package(
    ModelPackageName=model_name,
    ModelPackageDescription=model_description,
    InferenceSpecification={
        "Containers": [
            {
                "Image": f"{docker_image_arn}:latest",
            }
        ],
        "SupportedTransformInstanceTypes": supported_batch_transform_instance_types,
        "SupportedRealtimeInferenceInstanceTypes": supported_realtime_inference_instance_types,
        "SupportedContentTypes": supported_content_types,
        "SupportedResponseMIMETypes": supported_response_MIME_types,
    },
    CertifyForMarketplace=True,  # Make sure to set this to True for Marketplace models!
    ValidationSpecification={
        "ValidationRole": role,
        "ValidationProfiles": [
            {
                "ProfileName": "Validation-test",
                "TransformJobDefinition": {
                    "BatchStrategy": "SingleRecord",
                    "TransformInput": {
                        "DataSource": {
                            "S3DataSource": {
                                "S3DataType": "S3Prefix",
                                "S3Uri": validation_input_path,
                            }
                        },
                        "ContentType": supported_content_types[0],
                    },
                    "TransformOutput": {
                        "S3OutputPath": validation_output_path,
                    },
                    "TransformResources": {
                        "InstanceType": supported_batch_transform_instance_types[0],
                        "InstanceCount": 1,
                    },
                },
            },
        ],
    },
)

In [None]:
session.wait_for_model_package(model_package_name=model_name)

Once you have executed the preceding cell, open the [Model Packages console from Amazon SageMaker](https://console.aws.amazon.com/sagemaker/home?region=us-east-1#/model-packages/my-resources) and check if model creation succeeded. 

Choose the Model and then open the **Validation** tab to see the validation results.
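The same check can also be done programmatically. The sketch below summarizes the response of the SageMaker `describe_model_package` call; the helper name `summarize_model_package` is hypothetical, and a stub client with made-up response values stands in for `boto3.client("sagemaker")` so the example runs offline.

```python
def summarize_model_package(sagemaker_client, model_package_name: str) -> dict:
    """Return the overall status and per-profile validation results (hypothetical helper)."""
    description = sagemaker_client.describe_model_package(
        ModelPackageName=model_package_name
    )
    details = description.get("ModelPackageStatusDetails", {})
    return {
        "status": description["ModelPackageStatus"],
        "validation": [
            (item["Name"], item["Status"], item.get("FailureReason"))
            for item in details.get("ValidationStatuses", [])
        ],
    }


class StubSageMakerClient:
    """Offline stand-in for boto3.client("sagemaker"); response values are made up."""

    def describe_model_package(self, ModelPackageName):
        return {
            "ModelPackageStatus": "Completed",
            "ModelPackageStatusDetails": {
                "ValidationStatuses": [{"Name": "Validation-test", "Status": "Completed"}]
            },
        }


summary = summarize_model_package(StubSageMakerClient(), "my-flower-detection-model")
print(summary)
```

In this notebook, you could pass the `sagemaker` client and `model_name` defined earlier instead of the stub.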

### <a name="step6"></a>Step 6: Validate model in Amazon SageMaker environment

##### Create a deployable model from the model package.

In [None]:
model = ModelPackage(
    role=role,
    model_package_arn=model_package["ModelPackageArn"],
    sagemaker_session=session,
)

#### <a name='step61'></a>Step 6.1 Validate Real-time inference via Amazon SageMaker Endpoint

##### Deploy the SageMaker model to an endpoint

In [None]:
model.deploy(
    initial_instance_count=1,
    instance_type=supported_realtime_inference_instance_types[0],
    endpoint_name=model_name,
)
model.endpoint_name

In [None]:
content_type = supported_content_types[0]

##### Example invocation via boto3

In [None]:
response = sm_runtime.invoke_endpoint(
    EndpointName=model.endpoint_name,
    ContentType=content_type,
    Accept="application/json",
    Body=csv_input_data,
)

json.load(response["Body"])

##### Example invocation via the AWS CLI

In [None]:
# Perform inference
!aws sagemaker-runtime invoke-endpoint \
    --endpoint-name $model.endpoint_name \
    --body fileb://$validation_file_name \
    --content-type $content_type \
    --region $session.boto_region_name \
    out.out
    
    
# Print inference
!head out.out

Clean up the endpoint and endpoint configuration created.

In [None]:
model.sagemaker_session.delete_endpoint(model.endpoint_name)
model.sagemaker_session.delete_endpoint_config(model.endpoint_name)

#### <a name='step62'></a>Step 6.2 Validate batch inference via batch transform job 

##### Run a batch-transform job

In [None]:
transformer = model.transformer(
    instance_count=1,
    instance_type=supported_batch_transform_instance_types[0],
    accept="application/jsonlines",
)
transformer.transform(validation_input_path, content_type=content_type)
transformer.wait()

##### Retrieve the results from S3

In [None]:
parsed_url = urlparse(transformer.output_path)
file_key = f"{parsed_url.path[1:]}/{validation_file_name}.out"
response = s3_client.get_object(Bucket=s3_bucket, Key=file_key)

print(response["Body"].read().decode("utf-8"))
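The cell above assumes the transform output lives in the session's default bucket. As a sketch of a more general approach, the bucket and key can both be derived from the transformer's output path. The helper names `split_s3_uri` and `transform_output_key` and the example URI are hypothetical; the `.out` suffix follows the naming used in the cell above.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str):
    """Split an s3://bucket/prefix URI into (bucket, prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")


def transform_output_key(output_path: str, input_file_name: str):
    """Return (bucket, key) of the output object for one input file."""
    bucket, prefix = split_s3_uri(output_path)
    prefix = prefix.rstrip("/")
    key = f"{prefix}/{input_file_name}.out" if prefix else f"{input_file_name}.out"
    return bucket, key


# Placeholder output path for illustration only
print(transform_output_key("s3://example-bucket/example-transform-job", "input.csv"))
```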

Congratulations! You just verified that the batch transform job works as expected. Because the deployable model is no longer required, you can delete it. Note that this deletes only the deployable model, not the model package.

In [None]:
model.delete_model()

To publish the model to AWS Marketplace, you need to specify the model package ARN. Copy the Model Package ARN printed by the following cell.

In [None]:
model_package["ModelPackageArn"]

### <a name="step7"></a>Step 7: List ML Model on AWS Marketplace

In the [Model Packages](https://console.aws.amazon.com/sagemaker/home?region=us-east-1#/model-packages/my-resources) section of the SageMaker console you'll find the entity you created in this notebook. If it was successfully created and validated, you should be able to select the entity and choose **Publish new ML Marketplace listing**.

<img src="images/publish-to-marketplace-action.png"/>

You will be redirected to the [AWS Marketplace Management portal](https://aws.amazon.com/marketplace/management/ml-products/) where you will be able to build a listing.

If your model targets multiple hardware types, remember to add each ModelPackage to the listing as separate versions.

#### Creating a High-quality ML Model Listing

Your AWS Marketplace model listing should appeal to two audiences: data scientists with deep expertise who want a pretrained ML model because they lack the data needed to train one from scratch, and developers with little or no ML background who want to add powerful new features to their applications.

You need to provide a sample notebook and instructions that your users can follow to interact with your model. [Sample notebook templates](https://github.com/aws/amazon-sagemaker-examples/tree/master/aws_marketplace/curating_aws_marketplace_listing_and_sample_notebook/ModelPackage/Sample_Notebook_Template) are available to assist in creating an effective sample notebook.

To make your listing stand out, present the information on it thoughtfully. [Best practice recommendations](https://github.com/aws/amazon-sagemaker-examples/blob/master/aws_marketplace/curating_aws_marketplace_listing_and_sample_notebook/ModelPackage/curating_good_model_package_listing.md) are provided for curating your listing.


**Resources**
* [Publishing your product in AWS Marketplace](https://docs.aws.amazon.com/marketplace/latest/userguide/ml-publishing-your-product-in-aws-marketplace.html)

Most importantly, once your model is listed in AWS Marketplace, explore how you can spread the word about your new ML listing.