## Testing your algorithm on your local machine or on an Amazon SageMaker notebook instance

While you're first packaging an algorithm for use with Amazon SageMaker, you probably want to test it yourself to make sure it works. The directory `container/local_test` contains a framework for doing this: three shell scripts for running and using the container, and a directory structure that mimics the one outlined above.

The scripts are:

* `train_local.sh`: Run this with the name of the image and it will run training on the local tree. You'll want to modify the directory `test_dir/input/data/...` so that it contains the correct channels and data for your algorithm. You'll also want to modify the file `test_dir/input/config/hyperparameters.json` so that it holds the hyperparameter settings you want to test (as strings).
* `serve_local.sh`: Run this with the name of the image once you've trained the model and it should serve the model. It will run and wait for requests. Simply use the keyboard interrupt to stop it.
* `predict.sh`: Run this with the name of a payload file and (optionally) the HTTP content type you want. The content type will default to `image/jpeg`. For example, you can run `$ ./predict.sh payload.jpg image/jpeg`.
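
If you prefer to script the request yourself, the sketch below builds the kind of HTTP request that a call such as `./predict.sh payload.jpg image/jpeg` might send. It assumes the locally served container listens on `http://localhost:8080/invocations` (the usual SageMaker serving convention, which this section does not state), and the helper name `build_invocation_request` is hypothetical. The request is only constructed here, not sent.

```python
import urllib.request

# Assumption: the local container accepts POST requests on this URL.
LOCAL_INVOCATIONS_URL = "http://localhost:8080/invocations"


def build_invocation_request(payload, content_type="image/jpeg",
                             url=LOCAL_INVOCATIONS_URL):
    """Build (but do not send) a POST request carrying the raw payload bytes."""
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": content_type},
        method="POST",
    )


# A few placeholder bytes stand in for the contents of a real payload file.
request = build_invocation_request(b"\xff\xd8\xff", "image/jpeg")
print(request.get_method(), request.full_url)
```

With `serve_local.sh` running, passing the request to `urllib.request.urlopen(request)` would send it and return the container's response.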

The directories as shipped are set up to test the image classification sample algorithm presented here.
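
Because hyperparameters must be given as strings, it can help to generate `hyperparameters.json` rather than edit it by hand. The following is a minimal sketch under stated assumptions: the helper `write_hyperparameters` is hypothetical, the `epochs` key is made up for illustration, and a temporary directory stands in for `test_dir`.

```python
import json
import tempfile
from pathlib import Path


def write_hyperparameters(test_dir, hyperparameters):
    """Write input/config/hyperparameters.json with every value as a string."""
    config_dir = Path(test_dir) / "input" / "config"
    config_dir.mkdir(parents=True, exist_ok=True)
    as_strings = {key: str(value) for key, value in hyperparameters.items()}
    path = config_dir / "hyperparameters.json"
    path.write_text(json.dumps(as_strings, indent=2))
    return path


# A temporary directory stands in for container/local_test/test_dir.
demo_dir = tempfile.mkdtemp()
hp_path = write_hyperparameters(
    demo_dir,
    {"model_name": "resnet18_v1b", "epochs": 5},  # "epochs" is illustrative only
)
print(hp_path.read_text())
```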

# Part 2: Training, Batch Inference and Hosting your Algorithm in Amazon SageMaker

Once you have your container packaged, you can use it to train and serve models. Let's do that with the algorithm we made above.

## Set up the environment

Here we specify the S3 prefix to use and the role that will be used for working with Amazon SageMaker.

In [1]:
# S3 prefix
common_prefix = "DEMO-gluoncv-model-zoo"
training_input_prefix = common_prefix + "/training-input-data"

import os
from sagemaker import get_execution_role

role = get_execution_role()

## Create the session

The session remembers our connection parameters to Amazon SageMaker. We'll use it to perform all of our SageMaker operations.

In [2]:
import sagemaker as sage

sess = sage.Session()

## Create an estimator and fit the model

In order to use Amazon SageMaker to fit our algorithm, we'll create an `Estimator` that defines how to use the container to train. This includes the configuration we need to invoke SageMaker training:

* The __container name__. This is constructed as in the shell commands above.
* The __role__. As defined above.
* The __instance count__ which is the number of machines to use for training.
* The __instance type__ which is the type of machine to use for training.
* The __output path__ determines where the model artifact will be written.
* The __session__ is the SageMaker session object that we defined above.

Then we upload the training data and call `fit()` on the estimator to train against it.

In [3]:
account = sess.boto_session.client('sts').get_caller_identity()['Account']
region = sess.boto_session.region_name
image = '{}.dkr.ecr.{}.amazonaws.com/gluoncv-image-classification:latest'.format(account, region)
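
The image URI built above follows the pattern `<account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>`. The small helper below (a hypothetical name, not part of the SageMaker SDK) makes the parts explicit; the account ID in the example is a placeholder.

```python
def ecr_image_uri(account, region, repository, tag="latest"):
    """Compose an ECR image URI from its four components."""
    return "{}.dkr.ecr.{}.amazonaws.com/{}:{}".format(account, region, repository, tag)


# Placeholder account ID; substitute the values returned by STS and the session.
example_uri = ecr_image_uri("123456789012", "us-west-2", "gluoncv-image-classification")
print(example_uri)
```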

In [4]:
TRAINING_WORKDIR = "data/training"
MODEL_NAME = 'resnet18_v1b'

training_input = sess.upload_data(TRAINING_WORKDIR, key_prefix=training_input_prefix)
print("Training Data Location " + training_input)
classifier = sage.estimator.Estimator(image,
                       role, 1, 'ml.c4.2xlarge',
                       output_path="s3://{}/output".format(sess.default_bucket()),
                       sagemaker_session=sess,
                       hyperparameters={'model_name': MODEL_NAME})
classifier.fit(training_input)

Training Data Location s3://sagemaker-us-west-2-218569190993/DEMO-gluoncv-model-zoo/training-input-data
2020-05-04 21:33:27 Starting - Starting the training job...
2020-05-04 21:33:29 Starting - Launching requested ML instances......
2020-05-04 21:34:28 Starting - Preparing the instances for training...
2020-05-04 21:35:24 Downloading - Downloading input data
2020-05-04 21:35:24 Training - Downloading the training image...
2020-05-04 21:35:57 Uploading - Uploading generated training model.
Starting the training.
Filling weights from resnet18_v1b
Downloading /root/.mxnet/models/resnet18_v1b-2d9d980c.zip from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1b-2d9d980c.zip...
  0%|          | 0/42432 [00:00<?, ?KB/s]
 12%|#2        | 5283/42432 [00:00<00:00, 52822.49KB/s]
 29%|##8       | 12277/42432 [00:00<00:00, 57007.96KB/s]
 46%|####5     | 19474/42432 [00:00<00:00, 60798.83KB/s]
 63%|######2   | 26729/42

## Batch Transform
Here we simply use a demo image for transform input.

In [5]:
TRANSFORM_WORKDIR = "data/transform"
batch_inference_input_prefix = common_prefix + "/batch-inference-input-data"
transform_input = sess.upload_data(TRANSFORM_WORKDIR, key_prefix=batch_inference_input_prefix) + "/cat1.jpg"
print("Transform input uploaded to " + transform_input)

Transform input uploaded to s3://sagemaker-us-west-2-218569190993/DEMO-gluoncv-model-zoo/batch-inference-input-data/cat1.jpg
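
The cell above only uploads the input image. A batch transform job against it could be started roughly as sketched below. This is not part of the original notebook: the function name `run_batch_transform` and its default instance type are illustrative, and it assumes an estimator that has already been fitted. `transformer()`, `transform()` and `wait()` are SageMaker Python SDK calls.

```python
def run_batch_transform(estimator, input_uri,
                        instance_type="ml.c4.xlarge",
                        content_type="image/jpeg"):
    """Start a batch transform job on a fitted estimator and wait for it.

    Returns the S3 location where the transform output is written.
    """
    # Create a Transformer backed by the model artifact from training.
    transformer = estimator.transformer(instance_count=1,
                                        instance_type=instance_type)
    # Launch the job on the uploaded input and block until it finishes.
    transformer.transform(input_uri, content_type=content_type)
    transformer.wait()
    return transformer.output_path
```

Calling `run_batch_transform(classifier, transform_input)` would launch a real job in your account, so it incurs the usual instance charges.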


## Deploy the model

Deploying the model to Amazon SageMaker hosting just requires a `deploy` call on the fitted model. This call takes an instance count, instance type, and optionally serializer and deserializer functions. These are used when the resulting predictor is created on the endpoint.

In [6]:
# from sagemaker.predictor import csv_serializer

model = classifier.create_model()
predictor = classifier.deploy(1, 'ml.m4.xlarge')

-----------!

### Choose some data and use it for a prediction

To make a prediction, we send a demo JPEG image to the endpoint.

In [7]:
with open('data/transform/cat1.jpg', 'rb') as f:
    x = f.read()
    print(predictor.predict(x, initial_args={'ContentType':'image/jpeg'}).decode('utf-8'))

[lynx], with probability 0.253.
[Egyptian cat], with probability 0.252.
[tiger cat], with probability 0.106.
[tabby], with probability 0.063.
[soft-coated wheaten terrier], with probability 0.041.
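
The endpoint returns plain text in the form shown above, one `[label], with probability p.` line per class. If you need the results as data, a small parser such as the following sketch can extract them; the function name `parse_predictions` is hypothetical and the pattern assumes exactly the line format shown in this output.

```python
import re

# Matches lines such as "[lynx], with probability 0.253."
PREDICTION_LINE = re.compile(
    r"^\[(?P<label>.+)\], with probability (?P<prob>\d+\.\d+)\.$"
)


def parse_predictions(text):
    """Return a list of (label, probability) tuples, in the order received."""
    results = []
    for line in text.splitlines():
        match = PREDICTION_LINE.match(line.strip())
        if match:
            results.append((match.group("label"), float(match.group("prob"))))
    return results


sample = (
    "[lynx], with probability 0.253.\n"
    "[Egyptian cat], with probability 0.252.\n"
    "[soft-coated wheaten terrier], with probability 0.041.\n"
)
predictions = parse_predictions(sample)
print(predictions[0])
```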



### Cleanup Endpoint

When you're done with the endpoint, you'll want to clean it up.

In [8]:
sess.delete_endpoint(predictor.endpoint)

# Part 4: Package your resources as an Amazon SageMaker ModelPackage

In this section, we will see how you can package your artifacts (ECR image and the trained artifact from your previous training job) into a ModelPackage. Once you complete this, you can list your product as a pretrained model in the AWS Marketplace.

## Model Package Definition
A Model Package is a reusable model artifacts abstraction that packages all ingredients necessary for inference. It consists of an inference specification that defines the inference image to use along with an optional model weights location.


#### Region Limitation
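Seller onboarding is limited to the us-east-2 region (CMH) only, so the client should be hard-coded to talk to that regional endpoint. Note that the client created below is configured for us-west-2, the region of the ECR image and model artifact used in this run; adjust `region_name` and `endpoint_url` if you onboard in us-east-2. (Note: You may have previously done this step in Part 3. It is repeated here to keep Part 4 self-contained.)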

In [9]:
import boto3
smmp = boto3.client('sagemaker', region_name='us-west-2', endpoint_url="https://sagemaker.us-west-2.amazonaws.com")

#### Inference Specification

You specify details pertinent to your inference code in this section.
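
The notebook builds this structure with a helper from `src.inference_specification`, whose source is not shown here. As a rough guide to the resulting shape, the sketch below assembles an equivalent dictionary by hand. It is not the helper's actual implementation: the function name is hypothetical, the image and S3 values are placeholders, and the key names are those accepted by the `CreateModelPackage` API.

```python
import json


def build_inference_specification(image, model_data_url, content_types,
                                  response_mime_types,
                                  realtime_instance_types,
                                  transform_instance_types):
    """Assemble the InferenceSpecification portion of a model package request."""
    return {
        "InferenceSpecification": {
            "Containers": [{"Image": image, "ModelDataUrl": model_data_url}],
            "SupportedContentTypes": list(content_types),
            "SupportedResponseMIMETypes": list(response_mime_types),
            "SupportedRealtimeInferenceInstanceTypes": list(realtime_instance_types),
            "SupportedTransformInstanceTypes": list(transform_instance_types),
        }
    }


# Placeholder values for illustration only.
spec = build_inference_specification(
    image="123456789012.dkr.ecr.us-west-2.amazonaws.com/gluoncv-image-classification:latest",
    model_data_url="s3://example-bucket/output/model.tar.gz",
    content_types=["image/jpeg", "image/png"],
    response_mime_types=["text/plain"],
    realtime_instance_types=["ml.m4.xlarge"],
    transform_instance_types=["ml.c4.xlarge"],
)
print(json.dumps(spec, indent=4, sort_keys=True))
```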


In [10]:
from src.inference_specification import InferenceSpecification

import json

modelpackage_inference_specification = InferenceSpecification().get_inference_specification_dict(
    ecr_image=image,
    supports_gpu=True,
    supported_content_types=["image/jpeg", "image/png"],
    supported_mime_types=["text/plain"])

# Specify the model data resulting from the previously completed training job
modelpackage_inference_specification["InferenceSpecification"]["Containers"][0]["ModelDataUrl"]=classifier.model_data
print(json.dumps(modelpackage_inference_specification, indent=4, sort_keys=True))

{
    "InferenceSpecification": {
        "Containers": [
            {
                "Image": "218569190993.dkr.ecr.us-west-2.amazonaws.com/gluoncv-image-classification:latest",
                "ModelDataUrl": "s3://sagemaker-us-west-2-218569190993/output/gluoncv-image-classification-2020-05-04-21-33-26-992/output/model.tar.gz"
            }
        ],
        "SupportedContentTypes": [
            "image/jpeg",
            "image/png"
        ],
        "SupportedRealtimeInferenceInstanceTypes": [
            "ml.m4.xlarge",
            "ml.m4.2xlarge",
            "ml.m4.4xlarge",
            "ml.m4.10xlarge",
            "ml.m4.16xlarge",
            "ml.m5.large",
            "ml.m5.xlarge",
            "ml.m5.2xlarge",
            "ml.m5.4xlarge",
            "ml.m5.12xlarge",
            "ml.m5.24xlarge",
            "ml.c4.xlarge",
            "ml.c4.2xlarge",
            "ml.c4.4xlarge",
            "ml.c4.8xlarge",
            "ml.c5.xlarge",
            "ml.c5.2xlarge",
  

#### Validation Specification

In order to provide confidence to the sellers (and buyers) that the products work in Amazon SageMaker before listing them on AWS Marketplace, SageMaker needs to perform basic validations. The product can be listed in the AWS Marketplace only if this validation process succeeds. This validation process uses the validation profile and sample data provided by you to run the following validations:

* Create a transform job in your account using the above Model to verify your inference image works with SageMaker.
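
To make the validation profile concrete, the sketch below assembles a `ValidationSpecification` by hand with one transform job definition. It is an illustration, not the code of the `src.modelpackage_validation_specification` helper used in this notebook: the function name is hypothetical, the role ARN and S3 URIs in the example are placeholders, and the concurrency and payload limits simply mirror the values printed in this notebook's output.

```python
def build_validation_specification(validation_role, input_s3_uri, output_s3_uri,
                                   content_type, instance_type,
                                   profile_name="ValidationProfile1"):
    """Assemble a ValidationSpecification with a single transform job profile."""
    transform_job = {
        "MaxConcurrentTransforms": 1,
        "MaxPayloadInMB": 10,
        "TransformInput": {
            "ContentType": content_type,
            "CompressionType": "None",
            "SplitType": "None",
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_s3_uri}
            },
        },
        "TransformOutput": {"S3OutputPath": output_s3_uri, "AssembleWith": "None"},
        "TransformResources": {"InstanceCount": 1, "InstanceType": instance_type},
    }
    return {
        "ValidationSpecification": {
            "ValidationRole": validation_role,
            "ValidationProfiles": [
                {"ProfileName": profile_name, "TransformJobDefinition": transform_job}
            ],
        }
    }


# Placeholder role ARN and S3 locations for illustration only.
validation = build_validation_specification(
    validation_role="arn:aws:iam::123456789012:role/ExampleRole",
    input_s3_uri="s3://example-bucket/batch-inference-input-data/cat1.jpg",
    output_s3_uri="s3://example-bucket/validation-output",
    content_type="image/jpeg",
    instance_type="ml.c4.xlarge",
)
print(validation["ValidationSpecification"]["ValidationProfiles"][0]["ProfileName"])
```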


In [11]:
from src.modelpackage_validation_specification import ModelPackageValidationSpecification
import json

modelpackage_validation_specification = ModelPackageValidationSpecification().get_validation_specification_dict(
    validation_role = role,
    batch_transform_input = transform_input,
    content_type = "image/jpeg",
    instance_type = "ml.c4.xlarge",
    output_s3_location = 's3://{}/{}'.format(sess.default_bucket(), common_prefix))

print(json.dumps(modelpackage_validation_specification, indent=4, sort_keys=True))

{
    "ValidationSpecification": {
        "ValidationProfiles": [
            {
                "ProfileName": "ValidationProfile1",
                "TransformJobDefinition": {
                    "MaxConcurrentTransforms": 1,
                    "MaxPayloadInMB": 10,
                    "TransformInput": {
                        "CompressionType": "None",
                        "ContentType": "image/jpeg",
                        "DataSource": {
                            "S3DataSource": {
                                "S3DataType": "S3Prefix",
                                "S3Uri": "s3://sagemaker-us-west-2-218569190993/DEMO-gluoncv-model-zoo/batch-inference-input-data/cat1.jpg"
                            }
                        },
                        "SplitType": "None"
                    },
                    "TransformOutput": {
                        "Accept": "image/jpeg",
                        "AssembleWith": "None",
                        "KmsKeyId": "",
 

## Putting it all together

Now we put all the pieces together in the next cell and create an Amazon SageMaker Model Package.

In [12]:
import json
import time

model_package_name = "gluoncv-image-classification" + str(round(time.time()))
create_model_package_input_dict = {
    "ModelPackageName" : model_package_name,
    "ModelPackageDescription" : "Model to perform image classification or extract image features by deep learning",
    "CertifyForMarketplace" : True
}
create_model_package_input_dict.update(modelpackage_inference_specification)
create_model_package_input_dict.update(modelpackage_validation_specification)
print(json.dumps(create_model_package_input_dict, indent=4, sort_keys=True))

smmp.create_model_package(**create_model_package_input_dict)

{
    "CertifyForMarketplace": true,
    "InferenceSpecification": {
        "Containers": [
            {
                "Image": "218569190993.dkr.ecr.us-west-2.amazonaws.com/gluoncv-image-classification:latest",
                "ModelDataUrl": "s3://sagemaker-us-west-2-218569190993/output/gluoncv-image-classification-2020-05-04-21-33-26-992/output/model.tar.gz"
            }
        ],
        "SupportedContentTypes": [
            "image/jpeg",
            "image/png"
        ],
        "SupportedRealtimeInferenceInstanceTypes": [
            "ml.m4.xlarge",
            "ml.m4.2xlarge",
            "ml.m4.4xlarge",
            "ml.m4.10xlarge",
            "ml.m4.16xlarge",
            "ml.m5.large",
            "ml.m5.xlarge",
            "ml.m5.2xlarge",
            "ml.m5.4xlarge",
            "ml.m5.12xlarge",
            "ml.m5.24xlarge",
            "ml.c4.xlarge",
            "ml.c4.2xlarge",
            "ml.c4.4xlarge",
            "ml.c4.8xlarge",
            "ml.c5.xlarg

{'ModelPackageArn': 'arn:aws:sagemaker:us-west-2:218569190993:model-package/gluoncv-image-classification1588628531',
 'ResponseMetadata': {'RequestId': '1b6f5e3c-a51f-4895-88dc-b6a8308af03b',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '1b6f5e3c-a51f-4895-88dc-b6a8308af03b',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '115',
   'date': 'Mon, 04 May 2020 21:42:11 GMT'},
  'RetryAttempts': 0}}

#### Describe the ModelPackage 

The next cell describes the ModelPackage and waits until it reaches a terminal state (Completed or Failed).

In [13]:
import time
import json

while True:
    response = smmp.describe_model_package(ModelPackageName=model_package_name)
    status = response["ModelPackageStatus"]
    print(status)
    if status in ("Completed", "Failed"):
        print(response["ModelPackageStatusDetails"])
        break
    time.sleep(5)


Pending
InProgress
... (InProgress printed 50 times in total) ...
Completed
{'ValidationStatuses': [{'Name': 'ValidationProfile1', 'Status': 'Completed'}], 'ImageScanStatuses': [{'Name': '218569190993.dkr.ecr.us-west-2.amazonaws.com/gluoncv-image-classification@sha256:6da5cb7c722bee70ea8b25355596acb60e417b5cd6c88fc08f54ea00de55b695', 'Status': 'Completed'}]}
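
The polling loop in this section runs until a terminal state is reached, however long that takes. A bounded variant is sketched below under stated assumptions: the function name `wait_for_model_package` and the default timeout are illustrative, and the demo uses a stand-in for `smmp.describe_model_package` so that it runs without AWS access.

```python
import time


def wait_for_model_package(describe, name, poll_seconds=5,
                           timeout_seconds=3600, sleep=time.sleep):
    """Poll describe(ModelPackageName=name) until Completed/Failed or timeout."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = describe(ModelPackageName=name)
        if response["ModelPackageStatus"] in ("Completed", "Failed"):
            return response
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "{} did not reach a terminal state within {}s".format(
                    name, timeout_seconds))
        sleep(poll_seconds)


# Stand-in for smmp.describe_model_package, returning a scripted status sequence.
_statuses = iter(["Pending", "InProgress", "Completed"])


def fake_describe(ModelPackageName):
    return {"ModelPackageStatus": next(_statuses)}


result = wait_for_model_package(fake_describe, "demo-package",
                                sleep=lambda seconds: None)
print(result["ModelPackageStatus"])
```

Against the real service, the call would be `wait_for_model_package(smmp.describe_model_package, model_package_name)`.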


## Debugging Creation Issues

Entity creation rarely fails in the synchronous path. However, the validation process can fail for many reasons. If creation of the Algorithm or ModelPackage above fails, you can investigate the cause by looking at the "AlgorithmStatusDetails" field in the Algorithm object or the "ModelPackageStatusDetails" field in the ModelPackage object. You can also look at the Training Jobs / Transform Jobs created in your account as part of the validation and inspect their logs for more hints on what went wrong.

If all else fails, please contact AWS Customer Support for assistance!
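
When inspecting "ModelPackageStatusDetails", it can be convenient to list only the checks that did not complete. The sketch below assumes the structure printed by the describe call in this notebook (lists of items with `Name` and `Status`, plus an optional `FailureReason`); the function name `failed_checks` is hypothetical and the sample data is made up for illustration.

```python
def failed_checks(status_details):
    """Return (group, name, status, reason) for every check that is not Completed."""
    failures = []
    for group in ("ValidationStatuses", "ImageScanStatuses"):
        for item in status_details.get(group, []):
            if item.get("Status") != "Completed":
                failures.append((group, item.get("Name"), item.get("Status"),
                                 item.get("FailureReason")))
    return failures


# Made-up sample data; a real failure reason would come from SageMaker.
details = {
    "ValidationStatuses": [
        {"Name": "ValidationProfile1", "Status": "Failed",
         "FailureReason": "example reason (illustrative only)"}
    ],
    "ImageScanStatuses": [{"Name": "example-image", "Status": "Completed"}],
}
failures = failed_checks(details)
for failure in failures:
    print(failure)
```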


## List on AWS Marketplace

Next, go back to the Amazon SageMaker console, click "Algorithms" (or "Model Packages"), and find the entity you created above. If it was successfully created and validated, you should be able to select the entity and choose "Publish new ML Marketplace listing" from the SageMaker console.
<img src="images/publish-to-marketplace-action.png"/>