## Overview

The **SageMaker Python SDK** helps you train models and deploy them for hosting in optimized, production-ready containers in SageMaker. The SDK is easy to use, modular, extensible, and compatible with TensorFlow, MXNet, PyTorch, and Chainer.

This notebook shows how to deploy a pre-trained model in Amazon SageMaker.

Based on:

https://github.com/awslabs/amazon-sagemaker-examples/tree/master/sagemaker-python-sdk/pytorch_cnn_cifar10

https://jovian.ml/aakashns/05-cifar10-cnn

## Set up the environment

Let's start by specifying:

- The S3 bucket and prefix that you want to use for training and model data. The bucket should be in the same region as the notebook instance, training, and hosting.
- The IAM role ARN used to give training and hosting access to your data. See the documentation for how to create these. Note: if more than one role is required for notebook instances, training, and/or hosting, replace sagemaker.get_execution_role() with the appropriate full IAM role ARN string(s).
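
As a minimal sketch of the role substitution described above: the helper below returns an explicit ARN when one is supplied and falls back to the notebook's execution role otherwise. The name `resolve_role`, the regular expression, and the example ARN are illustrative assumptions, not part of the SDK.

```python
import re

# Loose pattern for an IAM role ARN. This is an assumption for illustration,
# not an official validator.
_ROLE_ARN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/.+$")


def resolve_role(explicit_arn=None):
    """Return explicit_arn if it looks like a role ARN, else the notebook role."""
    if explicit_arn is not None:
        if not _ROLE_ARN.match(explicit_arn):
            raise ValueError(f"Not an IAM role ARN: {explicit_arn!r}")
        return explicit_arn
    # Imported lazily so the helper also works where the SDK is not installed.
    import sagemaker
    return sagemaker.get_execution_role()
```

For example, `role = resolve_role("arn:aws:iam::123456789012:role/MyHostingRole")` would use a dedicated hosting role (the account ID and role name here are placeholders), while `resolve_role()` keeps the notebook default.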

In [1]:
import sagemaker

sagemaker_session = sagemaker.Session()

bucket = sagemaker_session.default_bucket()
prefix = 'sagemaker/DEMO-pytorch-cnn-cifar10-deploy'

role = sagemaker.get_execution_role()

## Create model artifacts

Now package the model artifacts, such as the model weights file 'model.pth' and optionally your inference code, into a tar archive.

In [2]:
# create a tar file from the model file
import tarfile
with tarfile.open('model.tar.gz', mode='w:gz') as archive:
    archive.add('model.pth', recursive=True)

In [3]:
model_path = 'model.tar.gz'
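
The packaging step, including the optional inference code, could be sketched as follows. The function name `make_model_archive` is hypothetical, and placing the code under `code/` inside the archive is an assumption about the layout; this notebook instead passes the code through `source_dir`. The function returns the archive's member names so the contents can be checked before uploading.

```python
import os
import tarfile


def make_model_archive(model_file, archive_path="model.tar.gz", code_dir=None):
    """Pack the weights file, and optionally a code directory, into a gzip tar."""
    with tarfile.open(archive_path, mode="w:gz") as archive:
        # Store the weights at the archive root, whatever their local path is.
        archive.add(model_file, arcname=os.path.basename(model_file))
        if code_dir is not None:
            # Assumption: inference code lives under code/ inside the archive.
            archive.add(code_dir, arcname="code")
    # Re-open the archive and list its members as a quick sanity check.
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return sorted(archive.getnames())
```

Calling `make_model_archive('model.pth')` would produce the same single-file archive as the cell above.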

## Upload model artifacts to S3

Upload the model artifacts file 'model.tar.gz' to the S3 bucket.

In [4]:
# upload model artifacts to S3
model_artifact = sagemaker_session.upload_data(path=model_path, bucket=bucket, key_prefix=prefix)

In [5]:
model_artifact

's3://sagemaker-us-east-1-495274804427/sagemaker/DEMO-pytorch-cnn-cifar10-deploy/model.tar.gz'
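
The returned value is an S3 URI made up of the bucket, the key prefix, and the file name. If the bucket and key are needed separately (for example, to inspect the object with another tool), they can be split with the standard library. A small sketch; `split_s3_uri` is a hypothetical helper, not an SDK function.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # The path starts with '/', which is not part of the object key.
    return parsed.netloc, parsed.path.lstrip("/")
```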

## Create SageMaker PyTorch model

In [6]:
from sagemaker.predictor import RealTimePredictor, json_deserializer
from sagemaker.pytorch import PyTorchModel

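class ImagePredictor(RealTimePredictor):
    """Predictor that sends raw JPEG bytes and parses the JSON response."""
    def __init__(self, endpoint_name, sagemaker_session):
        # serializer=None sends the payload unchanged; the response is decoded as JSON
        super().__init__(endpoint_name, sagemaker_session=sagemaker_session, serializer=None,
                         deserializer=json_deserializer, content_type='image/jpeg')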

Create a PyTorch model object by specifying the model artifacts, role, framework version, and entry point. Optionally, you can give it a name (here 'cifar-deploy').
The entry point contains the inference code. The dependencies and any special packages that it requires are listed in your requirements.txt file.

In [7]:
# build the sagemaker model
model = PyTorchModel(model_data=model_artifact,
                     role = role,
                     framework_version='1.5.1',
                     entry_point='predict.py',
                     source_dir = 'serve',
                     predictor_cls=ImagePredictor,
                     py_version='py3',
                     name='cifar-deploy')

Parameter image will be renamed to image_uri in SageMaker Python SDK v2.
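
The entry point `predict.py` is not shown in this notebook. The sketch below illustrates the four handler functions that the later cells import from it (`model_fn`, `input_fn`, `predict_fn`, `output_fn`). It is not the author's implementation: the model is replaced by a stand-in callable that always answers 'airplane', and image decoding is omitted, so the example only demonstrates how the handlers chain together.

```python
import json

JPEG = "image/jpeg"


def model_fn(model_dir):
    """Load the model from model_dir."""
    # A real handler would build the network and load its weights from
    # model_dir. Stand-in (assumption): a callable with a fixed answer.
    return lambda image_bytes: "airplane"


def input_fn(request_body, content_type=JPEG):
    """Turn the request payload into model input."""
    if content_type != JPEG:
        raise ValueError(f"Unsupported content type: {content_type}")
    # A real handler would decode the image and apply its transforms here.
    return bytes(request_body)


def predict_fn(input_data, model):
    """Run the model on the prepared input."""
    return model(input_data)


def output_fn(prediction, accept="application/json"):
    """Serialize the prediction; the predictor above expects JSON back."""
    return json.dumps(prediction)
```

Chained by hand, `output_fn(predict_fn(input_fn(payload), model_fn('.')))` yields a JSON string, which matches the `json_deserializer` configured on `ImagePredictor`.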


In [8]:
%%time
# deploy the model as an endpoint
predictor = model.deploy(initial_instance_count=1, instance_type='ml.c5.xlarge')

'create_image_uri' will be deprecated in favor of 'ImageURIProvider' class in SageMaker Python SDK v2.
Using already existing model: cifar-deploy


-----------------!
CPU times: user 2.16 s, sys: 175 ms, total: 2.33 s
Wall time: 8min 35s
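
Once the endpoint is in service, the predictor can be called with the raw bytes of an image. A minimal sketch, assuming `predictor` exposes `predict(data)` as `RealTimePredictor` does; the helper name `classify_image` and the image path are hypothetical.

```python
def classify_image(predictor, image_path):
    """Send the raw bytes of a local JPEG to the endpoint and return its response."""
    with open(image_path, "rb") as f:
        payload = f.read()
    # ImagePredictor uses no serializer, so the bytes are sent unchanged
    # with content type image/jpeg.
    return predictor.predict(payload)
```

For example, `classify_image(predictor, 'plane.jpg')` would return whatever the endpoint's `output_fn` produced, decoded from JSON.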


## Clean-up

In [10]:
predictor.delete_endpoint()
predictor.delete_model()

## Inference without deployment

It is good practice to run a sanity check before deploying the model as a SageMaker endpoint. Load the model locally and run inference to make sure everything works, because problems are much harder to debug once the model is running in SageMaker.

In [11]:
import sys
sys.path.append('./serve')

In [12]:
import torch

In [13]:
print(torch.__version__)

1.4.0


In [14]:
import torchvision

In [15]:
print(torchvision.__version__)

0.5.0


In [16]:
!pip install sagemaker_containers -q



In [17]:
from predict import model_fn, input_fn, output_fn, predict_fn

In [18]:
model = model_fn('./')

Loading model.
Done loading model.


In [19]:
import requests
from PIL import Image
import io
url = 'https://images.unsplash.com/photo-1525396524423-64f7b55f5b33?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&w=1000&q=80'
img_bytes = requests.get(url).content
img = Image.open(io.BytesIO(img_bytes))

In [20]:
input_data = input_fn(img_bytes)

In [21]:
output = predict_fn(input_data, model)

Inferring class of input data.


In [22]:
output

'airplane'
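
The local check in the cells above follows the same order in which the handlers run for a request: load the model, convert the payload, then predict. It can be wrapped in one function so it is easy to repeat after each change to the inference code. A sketch; `run_local_inference` is a hypothetical name, and the handlers are passed in as arguments rather than imported.

```python
def run_local_inference(payload, model_dir, model_fn, input_fn, predict_fn):
    """Chain the handlers locally: load, preprocess, predict."""
    model = model_fn(model_dir)
    input_data = input_fn(payload)
    return predict_fn(input_data, model)
```

With the handlers imported from `predict`, the call would be `run_local_inference(img_bytes, './', model_fn, input_fn, predict_fn)`.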

In [23]:
from model import Cifar10CnnModel, predict_image, classes
from utils import get_default_device, to_device, get_transform

In [24]:
model = Cifar10CnnModel()
model.load_state_dict(torch.load('model.pth', map_location='cpu'))

<All keys matched successfully>

In [25]:
tfms = get_transform()
input_data = tfms(img)
output = predict_image(input_data, model)

In [26]:
output

'airplane'