## Part 1: Prerequisites for SageMaker Training and Deployment

Before diving into the details of SageMaker training and deployment, it is crucial to make sure the training and deployment "container" is set up. This container provides up-to-date versions of GluonCV, MXNet, and the other required dependencies, which let us train and deploy state-of-the-art (SOTA) models.
Let's take a look at the process of setting up a container.

# Part 2: Training, Batch Inference and Hosting your Algorithm in Amazon SageMaker

Once you have your container packaged, you can use it to train and serve models. Let's do that with the algorithm we made above.

## Set up the environment

Here we obtain the IAM role that will be used for working with Amazon SageMaker. The S3 bucket used later is the SageMaker session's default bucket.

In [2]:
import os
from sagemaker import get_execution_role

role = get_execution_role()

## Create the session

The session remembers our connection parameters to Amazon SageMaker. We'll use it to perform all of our SageMaker operations.

In [3]:
import sagemaker as sage

sess = sage.Session()

## Create an estimator and fit the model

In order to use Amazon SageMaker to fit our algorithm, we'll create an `Estimator` that defines how to use the container to train. This includes the configuration we need to invoke SageMaker training:

* The __container name__. This is constructed as in the shell commands above.
* The __role__. As defined above.
* The __instance count__, which is the number of machines to use for training.
* The __instance type__, which is the type of machine to use for training.
* The __output path__, which determines where the model artifact will be written.
* The __session__, which is the SageMaker session object that we defined above.

Then we call `fit()` on the estimator to train against the data that we upload to S3.
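The configuration items listed above can be gathered into a plain dictionary before being passed to `Estimator`. The sketch below is one hedged way to do that. The helper name `estimator_kwargs` and its `sdk_major` switch are hypothetical, introduced only for illustration. It assumes the keyword names of the SageMaker Python SDK v1 (`image_name`, `train_instance_count`, `train_instance_type`) as used in this notebook, and the renamed v2 equivalents (`image_uri`, `instance_count`, `instance_type`). All values in the example call are placeholders.

```python
def estimator_kwargs(image, role, instance_count, instance_type,
                     output_path, session=None, hyperparameters=None,
                     sdk_major=1):
    """Collect Estimator keyword arguments (hypothetical helper).

    sdk_major selects the keyword spelling: 1 for the names used in this
    notebook, 2 for the names used by SageMaker Python SDK v2.
    """
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    if not output_path.startswith("s3://"):
        raise ValueError("output_path must be an S3 URI")

    if sdk_major == 1:
        names = ("image_name", "train_instance_count", "train_instance_type")
    else:
        names = ("image_uri", "instance_count", "instance_type")

    # Map the version-specific names onto the three values they describe.
    kwargs = dict(zip(names, (image, instance_count, instance_type)))
    kwargs.update(role=role, output_path=output_path,
                  sagemaker_session=session,
                  hyperparameters=hyperparameters or {})
    return kwargs


example = estimator_kwargs(
    image="<account>.dkr.ecr.<region>.amazonaws.com/mla-cv:latest",  # placeholder URI
    role="<execution-role-arn>",                                      # placeholder role
    instance_count=1,
    instance_type="ml.c4.2xlarge",
    output_path="s3://<bucket>/sagemaker-deploy-gluoncv/model",       # placeholder path
    hyperparameters={"epochs": 1, "model_name": "resnet18_v1b"},
)
print(sorted(example))
```

With real values in place of the placeholders, `Estimator(**example)` would play the same role as the explicit keyword call used in this notebook.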

In [4]:
account = sess.boto_session.client('sts').get_caller_identity()['Account']
region = sess.boto_session.region_name
ecr_name = "mla-cv"
ecr_image = '{}.dkr.ecr.{}.amazonaws.com/{}:latest'.format(account, region, ecr_name)

Upload the training data to the S3 bucket: https://s3.console.aws.amazon.com/s3/buckets/sagemaker-us-east-1-058295922468/sagemaker-deploy-gluoncv/data/?region=us-east-1

In [14]:
# S3 prefix (a key prefix inside the session's default bucket, not a bucket name)
s3_bucket = "sagemaker-deploy-gluoncv"
train_data_local = "./data/minc-2500/train"
train_data_dir_prefix = s3_bucket + "/data/train"

# Upload the local training directory; upload_data returns the resulting S3 URI
train_data_upload = sess.upload_data(path=train_data_local,
                                     key_prefix=train_data_dir_prefix)
print("Train input uploaded to " + train_data_upload)
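As a rough illustration of where the uploaded data ends up, the sketch below reconstructs the S3 URI that `upload_data` reports. It assumes that a directory upload yields `s3://<bucket>/<key_prefix>` and that a single-file upload appends the file name, which matches the demo-image output shown later in this notebook. `expected_upload_uri` is a hypothetical helper, not part of the SageMaker SDK, and the bucket name is taken from this notebook's output.

```python
import posixpath


def expected_upload_uri(bucket, key_prefix, local_path, is_file):
    """Return the S3 URI we expect for an upload (hypothetical helper)."""
    uri = "s3://{}/{}".format(bucket, key_prefix.strip("/"))
    if is_file:
        # Single files keep their base name under the key prefix.
        uri = "{}/{}".format(uri, posixpath.basename(local_path))
    return uri


bucket = "sagemaker-us-east-1-058295922468"  # default bucket seen in this notebook's output
prefix = "sagemaker-deploy-gluoncv"

print(expected_upload_uri(bucket, prefix + "/data/train",
                          "./data/minc-2500/train", is_file=False))
print(expected_upload_uri(bucket, prefix + "/data/test",
                          "data/demo/cat1.jpg", is_file=True))
```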

In [22]:
from sagemaker.estimator import Estimator

# SageMaker passes these to the training container as hyperparameters
hyperparameters = {'epochs': 1,
                   'model_name': 'resnet18_v1b'}
instance_type = 'ml.c4.2xlarge'  # 'ml.p2.xlarge'
s3_path = "s3://{}/{}/model".format(sess.default_bucket(), s3_bucket)
# S3 URIs always use forward slashes, so format the string instead of using os.path.join
model_path = "{}/model.tar.gz".format(s3_path)
print(model_path)

# Argument names follow SageMaker Python SDK v1; v2 renames them to
# image_uri, instance_count, and instance_type
classifier = Estimator(role=role,
                       sagemaker_session=sess,
                       image_name=ecr_image,
                       train_instance_count=1,
                       train_instance_type=instance_type,
                       hyperparameters=hyperparameters,
                       output_path=s3_path)
# Train against the data uploaded to S3 in the previous cell
classifier.fit(train_data_upload)


s3://sagemaker-us-east-1-058295922468/sagemaker-deploy-gluoncv/model/model.tar.gz
2020-05-14 03:48:12 Starting - Starting the training job...
2020-05-14 03:48:15 Starting - Launching requested ML instances......
2020-05-14 03:49:20 Starting - Preparing the instances for training...
2020-05-14 03:49:58 Downloading - Downloading input data...
2020-05-14 03:50:33 Training - Downloading the training image...
2020-05-14 03:51:08 Uploading - Uploading generated training model
Starting the training.
Filling weights from resnet18_v1b
Downloading /root/.mxnet/models/resnet18_v1b-2d9d980c.zip from https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/models/resnet18_v1b-2d9d980c.zip...

## Batch Transform
Here we simply upload a demo image to S3 to use as the transform input.

In [15]:
demo_dir = "data/demo"
test_image = "cat1.jpg"
sample_inference_input_prefix = s3_bucket + "/data/test"

demo_input = sess.upload_data(os.path.join(demo_dir, test_image),
                              key_prefix=sample_inference_input_prefix)
print("Demo input uploaded to " + demo_input)

Demo input uploaded to s3://sagemaker-us-east-1-058295922468/sagemaker-deploy-gluoncv/data/test/cat1.jpg
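The cell above only uploads the input; the transform job itself is not shown in this notebook. A minimal sketch of how it might be launched follows, assuming the fitted `classifier` from the training step and the `transformer()`, `transform()`, and `wait()` methods of the SageMaker Python SDK. The function name `run_batch_transform`, the default instance type, and the output location are illustrative assumptions, and the sketch presumes the container accepts `image/jpeg` input for batch jobs as it does for the hosted endpoint.

```python
def run_batch_transform(estimator, input_uri, output_uri,
                        instance_type="ml.m4.xlarge",
                        content_type="image/jpeg"):
    """Run a batch transform job with a fitted estimator (sketch).

    `estimator` is expected to behave like a fitted sagemaker Estimator;
    the instance type and content type defaults are assumptions.
    """
    transformer = estimator.transformer(instance_count=1,
                                        instance_type=instance_type,
                                        output_path=output_uri)
    # Start the job on the uploaded input, then block until it finishes.
    transformer.transform(input_uri, content_type=content_type)
    transformer.wait()
    return output_uri
```

In this notebook the call would look something like `run_batch_transform(classifier, demo_input, "s3://<bucket>/<prefix>/transform-output")`, with the output location chosen by you.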


## Deploy the model

Deploying the model to Amazon SageMaker hosting just requires a `deploy` call on the fitted model. This call takes an instance count, instance type, and optionally serializer and deserializer functions. These are used when the resulting predictor is created on the endpoint.

In [23]:
# deploy() creates the model, the endpoint configuration, and the endpoint in one call
predictor = classifier.deploy(1, 'ml.m4.xlarge')

---------------!

### Choose some data and use it for a prediction

To run a prediction, we'll send the demo JPEG image to the endpoint.

In [24]:
with open(os.path.join(demo_dir, test_image), 'rb') as f:
    x = f.read()
    print(predictor.predict(x, initial_args={'ContentType':'image/jpeg'}).decode('utf-8'))

[lynx], with probability 0.253.
[Egyptian cat], with probability 0.252.
[tiger cat], with probability 0.106.
[tabby], with probability 0.063.
[soft-coated wheaten terrier], with probability 0.041.
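The endpoint returns plain text, so downstream code may want structured results. The sketch below parses lines of the form shown above into `(label, probability)` pairs. `parse_predictions` is a hypothetical helper, and the regular expression assumes the response always follows the `[label], with probability p.` layout seen in this output.

```python
import re

# One prediction per line: "[label], with probability 0.253."
_LINE = re.compile(r"^\[(?P<label>.+)\], with probability (?P<prob>\d*\.?\d+)\.?\s*$")


def parse_predictions(text):
    """Turn the endpoint's text response into (label, probability) tuples."""
    results = []
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match:  # silently skip blank or unrecognized lines
            results.append((match.group("label"), float(match.group("prob"))))
    return results


response = """[lynx], with probability 0.253.
[Egyptian cat], with probability 0.252.
[tiger cat], with probability 0.106."""

top_label, top_prob = parse_predictions(response)[0]
print(top_label, top_prob)
```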



### Cleanup Endpoint

When you're done with the endpoint, delete it so that it stops incurring charges.

In [25]:
sess.delete_endpoint(predictor.endpoint)
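To make sure the endpoint is removed even if a prediction raises an error, deployment and cleanup can be wrapped together. The following is a minimal sketch, assuming the estimator's `deploy()` method and the predictor's `delete_endpoint()` method from the SageMaker Python SDK. `temporary_endpoint` is a hypothetical name, and the default instance type simply mirrors the one used in this notebook.

```python
from contextlib import contextmanager


@contextmanager
def temporary_endpoint(estimator, instance_count=1, instance_type="ml.m4.xlarge"):
    """Deploy an endpoint and always delete it on exit (sketch)."""
    predictor = estimator.deploy(instance_count, instance_type)
    try:
        yield predictor
    finally:
        # Runs on normal exit and on exceptions alike.
        predictor.delete_endpoint()
```

Usage would look like `with temporary_endpoint(classifier) as predictor:`, with the prediction code placed inside the `with` block.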