# Deploying pre-trained PyTorch vision models with Amazon SageMaker Neo

Amazon SageMaker Neo is an API that compiles machine learning models and optimizes them for a chosen hardware target. Currently, Neo supports pre-trained PyTorch models from [TorchVision](https://pytorch.org/docs/stable/torchvision/models.html). General support for other PyTorch models is forthcoming.

In [2]:
import sys
!{sys.executable} -m pip install torch==1.4.0 torchvision==0.5.0

Collecting torch==1.4.0
  Downloading torch-1.4.0-cp37-cp37m-manylinux1_x86_64.whl (753.4 MB)
Collecting torchvision==0.5.0
  Downloading torchvision-0.5.0-cp37-cp37m-manylinux1_x86_64.whl (4.0 MB)
Installing collected packages: torch, torchvision
Successfully installed torch-1.4.0 torchvision-0.5.0


In [3]:
!{sys.executable} -m pip install --upgrade sagemaker

Collecting sagemaker
  Downloading sagemaker-2.16.3.post0.tar.gz (309 kB)
Building wheels for collected packages: sagemaker
  Building wheel for sagemaker (setup.py) ... done
  Created wheel for sagemaker: filename=sagemaker-2.16.3.post0-py2.py3-none-any.whl size=435625 sha256=be232c3f49dce0fb3ccc5646d16aee879f76c6a7d7c18e1d2953799da9d1119d
  Stored in directory: /root/.cache/pip/wheels/fc/84/3c/5c8b33bed13b51a2bf1c1491100bb01085ac4a20e6a1890d46
Successfully built sagemaker
Installing collected packages: sagemaker
  Attempting uninstall: sagemaker
    Found existing installation: sagemaker 2.15.2
    Uninstalling sagemaker-2.15.2:
      Successfully uninstalled sagemaker-2.15.2
Successfully installed sagemaker-2.16.3.post0


In [1]:
# !~/anaconda3/envs/pytorch_p36/bin/pip install torch==1.4.0 torchvision==0.5.0



### SageMaker SDK >= 2.0 is required for this notebook

In [None]:
# !~/anaconda3/envs/pytorch_p36/bin/pip install --upgrade sagemaker

In [4]:
import sagemaker
current_version = sagemaker.__version__
if current_version.split('.')[0] == '1':
    raise Exception("Please upgrade SageMaker SDK by running the above code cell and restart the kernel")
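
The check above compares the leading component as a string, so it only catches 1.x releases. A minimal sketch of a numeric comparison follows, assuming the version string starts with an integer major component. The helpers `major_version` and `require_major` are hypothetical and not part of the SageMaker SDK.

```python
def major_version(version: str) -> int:
    """Return the leading integer component of a dotted version string."""
    return int(version.split(".")[0])


def require_major(version: str, minimum: int = 2) -> None:
    """Raise RuntimeError if the major version is below the minimum."""
    if major_version(version) < minimum:
        raise RuntimeError(
            "SageMaker SDK >= {}.0 is required, found {}".format(minimum, version)
        )


# Passes silently for the version installed earlier in this notebook.
require_major("2.16.3.post0")
```

In the notebook, the argument would be `sagemaker.__version__` rather than a literal string.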

## Import ResNet18 from TorchVision

We'll import the [ResNet18](https://arxiv.org/abs/1512.03385) model from TorchVision, trace it with `torch.jit.trace`, and package the traced model as a model artifact, `model.tar.gz`.

In [5]:
import torch
import torchvision.models as models
import tarfile

resnet18 = models.resnet18(pretrained=True)
input_shape = [1,3,224,224]
trace = torch.jit.trace(resnet18.float().eval(), torch.zeros(input_shape).float())
trace.save('model.pth')

with tarfile.open('model.tar.gz', 'w:gz') as f:
    f.add('model.pth')

Downloading: "https://download.pytorch.org/models/resnet18-5c106cde.pth" to /root/.cache/torch/checkpoints/resnet18-5c106cde.pth
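
Before uploading, it can be worth confirming that the archive holds the traced model at its root, as the cell above produces. The sketch below uses only the standard library. `list_archive` is a hypothetical helper, and the placeholder file stands in for a real traced model so the example runs without PyTorch.

```python
import os
import tarfile
import tempfile


def list_archive(path):
    """Return the member names stored in a gzip-compressed tar archive."""
    with tarfile.open(path, "r:gz") as tar:
        return tar.getnames()


# Build a throwaway archive containing a placeholder file (an assumption for
# illustration; in the notebook you would call list_archive('model.tar.gz')).
workdir = tempfile.mkdtemp()
placeholder = os.path.join(workdir, "model.pth")
with open(placeholder, "wb") as f:
    f.write(b"placeholder bytes, not a real traced model")

archive = os.path.join(workdir, "model.tar.gz")
with tarfile.open(archive, "w:gz") as tar:
    # arcname stores the file at the archive root, without directory prefixes.
    tar.add(placeholder, arcname="model.pth")

members = list_archive(archive)
print(members)
```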



### Upload the model archive to S3

In [6]:
import boto3
import sagemaker
import time
from sagemaker.utils import name_from_base

role = sagemaker.get_execution_role()
sess = sagemaker.Session()
region = sess.boto_region_name
bucket = sess.default_bucket()

compilation_job_name = name_from_base('TorchVision-ResNet18-Neo')
prefix = compilation_job_name+'/model'

model_path = sess.upload_data(path='model.tar.gz', key_prefix=prefix)

data_shape = '{"input0":[1,3,224,224]}'
target_device = 'ml_c5'
framework = 'PYTORCH'
framework_version = '1.4.0'
compiled_model_path = 's3://{}/{}/output'.format(bucket, compilation_job_name)
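
The `data_shape` string above repeats the shape that was used for tracing. One way to keep the two in sync is to derive the JSON string from the list, sketched here with the standard `json` module. The input name `input0` is taken from the cell above; nothing else is assumed.

```python
import json

input_shape = [1, 3, 224, 224]  # same shape used when tracing the model

# Compact separators reproduce the literal string used in the notebook.
data_shape = json.dumps({"input0": input_shape}, separators=(",", ":"))
print(data_shape)
```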

## Invoke Neo Compilation API

### Create a PyTorch SageMaker model

In [8]:
from sagemaker.pytorch.model import PyTorchModel
from sagemaker.predictor import Predictor

sagemaker_model = PyTorchModel(model_data=model_path,
                               predictor_cls=Predictor,
                               framework_version = framework_version,
                               role=role,
                               sagemaker_session=sess,
                               entry_point='resnet18.py',
                               source_dir='code',
                               py_version='py3',
                               env={'MMS_DEFAULT_RESPONSE_TIMEOUT': '500'}
                              )

### Use Neo compiler to compile the model

In [9]:
compiled_model = sagemaker_model.compile(target_instance_family=target_device, 
                                         input_shape=data_shape,
                                         job_name=compilation_job_name,
                                         role=role,
                                         framework=framework.lower(),
                                         framework_version=framework_version,
                                         output_path=compiled_model_path
                                        )

?..........!
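
The compilation target (`ml_c5`) should correspond to the instance family the model is later deployed on (`ml.c5.9xlarge` in this notebook). The sketch below derives one from the other, assuming the target name is always `ml_` followed by the instance family. That pattern matches the values used here but is an assumption, and `neo_target_family` is a hypothetical helper, not an SDK function.

```python
def neo_target_family(instance_type: str) -> str:
    """Map an instance type such as 'ml.c5.9xlarge' to 'ml_c5'.

    Assumes the 'ml.<family>.<size>' naming pattern.
    """
    parts = instance_type.split(".")
    if len(parts) != 3 or parts[0] != "ml":
        raise ValueError(
            "expected 'ml.<family>.<size>', got {!r}".format(instance_type)
        )
    return "ml_{}".format(parts[1])


instance_type = "ml.c5.9xlarge"
target_device = neo_target_family(instance_type)
print(target_device)
```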

## Deploy the model

In [10]:
predictor = compiled_model.deploy(initial_instance_count = 1,
                                  instance_type = 'ml.c5.9xlarge'
                                 )

---------------!

## Send requests

Let's send a picture of a cat to the endpoint as raw JPEG bytes.

![title](cat.jpg)

In [11]:
import numpy as np
import json

with open('cat.jpg', 'rb') as f:
    payload = f.read()
    payload = bytearray(payload) 

response = predictor.predict(payload)
result = json.loads(response.decode())
print('Most likely class: {}'.format(np.argmax(result)))

Most likely class: 282
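
`np.argmax` reports only the single best class. To see the runners-up as well, the decoded scores can be ranked, as in the standard-library sketch below. The response body here is made up for illustration, and the sketch assumes the endpoint returns either a flat list of scores or a single-row batch. `top_k` is a hypothetical helper.

```python
import json


def top_k(scores, k=5):
    """Return (class_index, score) pairs for the k highest scores."""
    # Accept either a flat list or a single-row batch such as [[...]].
    if scores and isinstance(scores[0], list):
        scores = scores[0]
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up response body standing in for predictor.predict(payload).
response = json.dumps([[0.05, 0.70, 0.25]]).encode()
result = json.loads(response.decode())
top = top_k(result, k=2)
print(top)
```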


In [12]:
# Load names for ImageNet classes
object_categories = {}
with open("imagenet1000_clsidx_to_labels.txt", "r") as f:
    for line in f:
        key, val = line.strip().split(':')
        object_categories[key] = val
print("Result: label - " + object_categories[str(np.argmax(result))]+ " probability - " + str(np.amax(result)))

Result: label -  'tiger cat', probability - 0.6455850005149841
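
The printed label keeps the quotes and trailing comma from the labels file. The sketch below shows one way to clean each line, assuming the file is laid out like a Python dict literal with one `index: 'label',` entry per line. That layout is inferred from the output above rather than confirmed, and `parse_label_line` is a hypothetical helper.

```python
def parse_label_line(line):
    # Turns a line such as  282: 'tiger cat',  into (282, 'tiger cat').
    key, val = line.split(":", 1)  # split once so a label may contain ':'
    key = int(key.strip().lstrip("{"))
    val = val.strip().rstrip(",}").strip().strip("'\"")
    return key, val


# Sample lines in the assumed file layout (not read from the real file).
sample_lines = [
    "{0: 'tench, Tinca tinca',",
    " 282: 'tiger cat',",
    " 999: 'toilet tissue'}",
]
labels = dict(parse_label_line(line) for line in sample_lines)
print(labels[282])
```

With integer keys, the lookup becomes `labels[int(np.argmax(result))]` instead of converting the index to a string.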


## Delete the Endpoint
A running endpoint incurs charges for as long as it exists, so delete it once you have finished sending requests.

In [None]:
sess.delete_endpoint(predictor.endpoint_name)