# Serve a PyTorch model trained on SageMaker

The model for this example was trained using this sample notebook on SageMaker: https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/pytorch_mnist/pytorch_mnist.ipynb

If you are following that example, it is certainly easier to call estimator.deploy() with the standard SageMaker SDK. Consider this approach instead if you already have a PyTorch model (or two) on S3 and want an easy way to test and deploy it.

In [None]:
!pip install torch

In [None]:
!pip show sagemaker

## Step 1 : Write a model transform script

#### Make sure you have a ...

- "load_model" function
    - input args are model path
    - returns loaded model object
    - model name is the same as what you saved the model file as (see above step)
<br><br>
- "predict" function
    - input args are the loaded model object and a payload
    - returns the result of model.predict
    - format the return value as a list: a single string for real-time inference, or multiple strings for mini-batch
    - a list, string, or np.array sent from a client for prediction arrives as bytes; convert it back to a list, string, or np.array as needed
    - on failure, return the error message so it can be used for debugging
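
The contract above can be sketched with a stand-in model, which makes it easy to check the payload handling before any real weights are involved. This is a minimal sketch, not ezsmdeploy's own code: `MeanBrightnessModel` is a hypothetical placeholder, and the `[{'body': <bytes>}]` payload shape is an assumption based on how the script in the next cell reads `payload[0]['body']`.

```python
import numpy as np


class MeanBrightnessModel:
    """Hypothetical stand-in for a real model; it only averages pixel values."""

    def predict(self, batch):
        return float(batch.mean())


def load_model(modelpath):
    # A real script would read the saved weights from modelpath here.
    return MeanBrightnessModel()


def predict(model, payload):
    try:
        if isinstance(payload, list):
            # Assumed server-side shape: a list of dicts carrying raw bytes.
            raw = payload[0]["body"]
            data = np.frombuffer(raw, dtype=np.float32).reshape(1, 1, 28, 28)
        else:
            data = np.asarray(payload, dtype=np.float32)
        out = model.predict(data)
    except Exception as e:
        # Return the error text so it shows up in the response for debugging.
        out = str(e)
    return [out]


# Client side: an array becomes bytes on the wire.
image = np.full((1, 1, 28, 28), 0.5, dtype=np.float32)
model = load_model("./")
result = predict(model, [{"body": image.tobytes()}])

# A malformed payload comes back as an error string instead of raising.
bad_result = predict(model, [{"body": b"\x00\x01\x02"}])
```

Here `result` is a one-element list holding the prediction, while `bad_result` holds the exception text, so a client sees the failure reason in the response.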


In [None]:
%%writefile modelscript_pytorch.py
import os

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
    
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Return the loaded model
def load_model(modelpath):
    model = torch.nn.DataParallel(Net())
    with open(os.path.join(modelpath, 'model.pth'), 'rb') as f:
        # map_location lets weights saved on a GPU load on a CPU-only host
        model.load_state_dict(torch.load(f, map_location=device))
    print("loaded")
    return model.to(device)

# Return a prediction from the loaded model (see load_model above) and an input payload
def predict(model, payload):
    try:
        if isinstance(payload, list):
            # Endpoint requests arrive as raw bytes; rebuild the 1x1x28x28 float32 image
            data = np.frombuffer(payload[0]['body'], dtype=np.float32).reshape(1, 1, 28, 28)
        elif isinstance(payload, np.ndarray):
            data = payload
        else:
            raise TypeError("Unsupported payload type: {}".format(type(payload)))
        input_data = torch.Tensor(data)
        model.eval()
        with torch.no_grad():
            out = model(input_data.to(device)).argmax(dim=1)[0].tolist()
    except Exception as e:
        # Return the error text so the caller can debug
        out = str(e)
    return [out]

### Download model locally

In [None]:
!aws s3 cp s3://ezsmdeploy/pytorchmnist/input.html ./
!aws s3 cp s3://ezsmdeploy/pytorchmnist/model.tar.gz ./
!tar xvf model.tar.gz

### Input data for prediction

Draw a digit from 0 to 9 in the box that appears when you run the next cell.
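
When the notebook runs without a browser, the drawing widget cannot populate `data`. One hedged workaround is to build a placeholder by hand. The sketch below assumes `data` should hold a single 28x28 channel, so that `np.array([data])` has shape `(1, 1, 28, 28)` and matches the reshape in `predict`. The vertical stroke is only a rough stand-in for a hand-drawn "1", and the prediction for it is not guaranteed.

```python
import numpy as np

# Assumption: `data` is one channel of 28x28 pixel intensities in [0, 1].
canvas = np.zeros((28, 28), dtype=np.float32)
canvas[4:24, 13:15] = 1.0  # a vertical stroke, loosely resembling the digit 1

# Nested lists with shape (1, 28, 28); assumed to mirror what the widget produces.
data = [canvas.tolist()]

image = np.array([data], dtype=np.float32)
print(image.shape)
```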

In [None]:
from IPython.display import HTML
import numpy as np
HTML(open("input.html").read())

## Does this work locally? (not "_in a container locally_", but _actually_ on the local machine)

In [None]:
image = np.array([data], dtype=np.float32)

In [None]:
from modelscript_pytorch import *
model = load_model('./')

In [None]:
predict(model,image)

### OK, great! Now let's install ezsmdeploy

In [None]:
!pip install ezsmdeploy

In [None]:
import ezsmdeploy

#### If you have been running other inference containers in local mode, stop existing containers to avoid conflicts (the command below stops every container on this machine)

In [None]:
!docker container stop $(docker container ls -aq) >/dev/null

## Upload to your S3 bucket

In [None]:
import sagemaker
modelpath = sagemaker.session.Session().upload_data('./model.tar.gz')

## Deploy locally

In [None]:
ez = ezsmdeploy.Deploy(model = [modelpath], #loading pretrained MNIST model
                  script = 'modelscript_pytorch.py',
                  requirements = ['numpy','torch','joblib'], #or pass in the path to requirements.txt
                  instance_type = 'local',
                  wait = True)

## Test containerized version locally

The first invocation is typically slow while the container loads the model, so invoke it again to get an inference without all of the container logs.
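
To see the warm-up effect rather than guess at it, the calls can be timed. The helper below is an illustrative sketch: `time_calls` is a hypothetical name, and `fake_invoke` stands in for a call such as `ez.predictor.predict(image.tobytes())` so that the snippet runs without an endpoint.

```python
import time


def time_calls(invoke, repeats=3):
    """Call invoke() several times and return each call's latency in seconds."""
    latencies = []
    for _ in range(repeats):
        start = time.perf_counter()
        invoke()
        latencies.append(time.perf_counter() - start)
    return latencies


# Stand-in for a real endpoint call: slow the first time, fast afterwards.
state = {"warm": False}


def fake_invoke():
    if not state["warm"]:
        time.sleep(0.05)  # simulated one-time model load
        state["warm"] = True
    return b"7"


latencies = time_calls(fake_invoke)
print([round(t, 3) for t in latencies])
```

With a deployed endpoint, passing `lambda: ez.predictor.predict(image.tobytes())` as `invoke` would time the real calls instead.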

In [None]:
out = ez.predictor.predict(image.tobytes()).decode()
out

## Deploy on SageMaker

In [None]:
ezonsm = ezsmdeploy.Deploy(model = [modelpath],
                  script = 'modelscript_pytorch.py',
                  requirements = ['numpy','torch','joblib'], #or pass in the path to requirements.txt
                  wait = True,
                  ei = 'ml.eia2.medium') # Attach an Elastic Inference accelerator

In [None]:
out = ezonsm.predictor.predict(image.tobytes(), target_model='model1.tar.gz').decode() 
out

In [None]:
ezonsm.predictor.delete_endpoint()