# fast.ai lesson 1 - training on a Notebook Instance and deploying on Amazon SageMaker

## Pre-requisites

This notebook shows how to use the SageMaker Python SDK to train your fast.ai model on a SageMaker notebook instance, then deploy it to Amazon SageMaker for production.

To deploy locally (with `instance_type` set to `local`), you'll need to install docker-compose (and nvidia-docker if training with a GPU).

**Note: you can only run a single local notebook at a time.**

In [None]:
!/bin/bash ./setup.sh

## Overview

We are going to train a fast.ai model as per [Lesson 1 of the fast.ai MOOC](https://course.fast.ai/videos/?lesson=1) locally on the SageMaker Notebook instance. We will then save the model weights and upload them to S3 so that we can deploy the model as a SageMaker Endpoint.

### Set up the environment

To set up a new SageMaker notebook instance with fast.ai installed, follow the steps outlined [here](https://course.fast.ai/start_sagemaker.html).

This notebook was created and tested on a single ml.p3.2xlarge notebook instance. 

Let's start by specifying:

- The S3 bucket and prefix that you want to use for training and model data. The bucket should be in the same region as the Notebook Instance, training, and hosting.
- The IAM role ARN used to give training and hosting access to your data. See the documentation for how to create these. Note: if more than one role is required for notebook instances, training, and/or hosting, replace `sagemaker.get_execution_role()` with the appropriate full IAM role ARN string(s).

In [None]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [None]:
import os
import io
import random
import re
import tarfile

import PIL
import requests  # used later to download the test images

import sagemaker
from sagemaker.pytorch import PyTorchModel
from sagemaker.predictor import RealTimePredictor, json_deserializer
from sagemaker.utils import name_from_base

from fastai.vision import *

In [None]:
path = untar_data(URLs.PETS); path

In [None]:
path_anno = path/'annotations'
path_img = path/'images'
fnames = get_image_files(path_img)
np.random.seed(2)
pat = re.compile(r'/([^/]+)_\d+\.jpg$')  # capture the breed name; the dot is escaped to match a literal "."
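
To see what the pattern captures, here is a small standalone check. The file path is a hypothetical example in the dataset's `<breed>_<index>.jpg` naming style, and the dot before `jpg` is escaped here so that it only matches a literal dot.

```python
import re

# Label pattern for the pets images, with the dot before "jpg" escaped
pat = re.compile(r'/([^/]+)_\d+\.jpg$')

# Hypothetical path, for illustration only
example = '/data/oxford-iiit-pet/images/great_pyrenees_173.jpg'

# The first capture group is the class label
match = pat.search(example)
label = match.group(1)
print(label)  # great_pyrenees
```

The greedy `[^/]+` group backtracks until `_\d+\.jpg` can match the end of the name, so breed names that contain underscores are kept whole.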

In [None]:
bs=64

In [None]:
data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(),
                                   size=299, bs=bs//2).normalize(imagenet_stats)

In [None]:
learn = create_cnn(data, models.resnet50, metrics=error_rate)

In [None]:
learn.lr_find()
learn.recorder.plot()

In [None]:
learn.fit_one_cycle(8)

In [None]:
learn.unfreeze()
learn.fit_one_cycle(3, max_lr=slice(1e-6,1e-4))
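
The `slice(1e-6,1e-4)` argument gives the earliest layers the smallest learning rate and the head the largest. The sketch below illustrates the idea under two assumptions: that fastai v1 spaces the rates geometrically across layer groups, and that this learner has three layer groups. `log_spaced` is a hypothetical helper written for this illustration, not a fastai function.

```python
def log_spaced(start, stop, n):
    """Return n values from start to stop with a constant ratio between neighbours."""
    step = (stop / start) ** (1 / (n - 1))
    return [start * step ** i for i in range(n)]

# Assumption: three layer groups (early body, late body, head)
layer_group_lrs = log_spaced(1e-6, 1e-4, 3)

for group, lr in enumerate(layer_group_lrs):
    print(f'layer group {group}: lr = {lr:.1e}')
```

Under these assumptions the middle group trains at roughly 1e-5, ten times faster than the first group and ten times slower than the head.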

# Export model and upload to S3

Now that we have trained our model, we need to export it, create a tarball of the artefacts, and upload it to S3.

First we need to get the S3 bucket and prefix where the model will be uploaded.

In [None]:
sagemaker_session = sagemaker.Session()

bucket = sagemaker_session.default_bucket()
model_name = name_from_base('fastai-pets-model')
prefix = f'sagemaker/{model_name}'

role = sagemaker.get_execution_role()

Now we need to export the data object, which is required to run predictions on SageMaker, and save the model weights.

In [None]:
data.export()
learn.save('resnet50')

The next step is to create a tarball containing the exported data object and the model weights. We also add an empty `models` folder to the archive.

In [None]:
tar_file=path_img/'models/model.tar.gz'
model_file='resnet50.pth'
data_export_file='export.pkl'

In [None]:
with tarfile.open(tar_file, 'w:gz') as f:
    t = tarfile.TarInfo('models')
    t.type = tarfile.DIRTYPE
    f.addfile(t)
    f.add(path_img/f'models/{model_file}', arcname=model_file)
    f.add(path_img/data_export_file, arcname=data_export_file)
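
To check the archive layout without touching the real artefacts, the sketch below repeats the same steps on placeholder files in a temporary directory and then lists the archive members. The file contents are dummies; only the file names match the ones used above.

```python
import tarfile
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    # Placeholder files standing in for the real weights and data export
    (tmp / 'resnet50.pth').write_bytes(b'weights')
    (tmp / 'export.pkl').write_bytes(b'data')

    tar_path = tmp / 'model.tar.gz'
    with tarfile.open(str(tar_path), 'w:gz') as f:
        # Empty "models" directory entry, as in the cell above
        t = tarfile.TarInfo('models')
        t.type = tarfile.DIRTYPE
        f.addfile(t)
        # arcname stores each file at the archive root
        f.add(str(tmp / 'resnet50.pth'), arcname='resnet50.pth')
        f.add(str(tmp / 'export.pkl'), arcname='export.pkl')

    # Re-open the archive and list what ended up in it
    with tarfile.open(str(tar_path), 'r:gz') as f:
        members = f.getnames()

print(members)  # ['models', 'resnet50.pth', 'export.pkl']
```

Running the same `getnames()` check on the real `model.tar.gz` before uploading is a cheap way to confirm that both files sit at the archive root.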

Now we need to upload the model tarball to S3.

In [None]:
model_artefact = sagemaker_session.upload_data(path=str(tar_file), bucket=bucket, key_prefix=prefix)
print('model artefact path on S3: {}'.format(model_artefact))
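
For a single file, `upload_data` returns an S3 URI made up of the bucket, the key prefix, and the original file name. The sketch below rebuilds that URI by hand so you know what to expect in the printed path; the bucket and prefix values are hypothetical placeholders, not values produced by this notebook.

```python
import posixpath

# Hypothetical values, for illustration only
bucket = 'sagemaker-us-east-1-123456789012'
prefix = 'sagemaker/fastai-pets-model-2019-01-01-00-00-00-000'
filename = 'model.tar.gz'

# S3 keys always use forward slashes, hence posixpath
expected_uri = 's3://' + posixpath.join(bucket, prefix, filename)
print(expected_uri)
```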

# Construct a script for inference
Here is the full script that performs model inference.

In [None]:
!pygmentize source/pets.py

## Script Functions

When training on SageMaker, the main function defined in your training script is invoked. When you deploy your trained model to an endpoint, `model_fn()` is called to load the model. `model_fn()` and the other functions listed below are called to serve predictions on SageMaker.

### [Predicting Functions](https://github.com/aws/sagemaker-pytorch-containers/blob/master/src/sagemaker_pytorch_container/serving.py)
* model_fn(model_dir) - loads your model.
* input_fn(serialized_input_data, content_type) - deserializes the request data for predict_fn.
* output_fn(prediction_output, accept) - serializes predictions from predict_fn.
* predict_fn(input_data, model) - calls a model on data deserialized in input_fn.

The model_fn() is the only function that doesn't have a default implementation, so you must provide it to use PyTorch on SageMaker.
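
The order in which these functions are called can be sketched with stand-ins. This is not the contents of `source/pets.py`: the model returned here is a hypothetical placeholder, the model directory path is only illustrative, and a real script would load the exported data object and weights in `model_fn` and run the fast.ai learner in `predict_fn`.

```python
import json

def model_fn(model_dir):
    # Hypothetical stand-in: a real script would load the model from model_dir
    return lambda image_bytes: {'class': 'placeholder', 'n_bytes': len(image_bytes)}

def input_fn(serialized_input_data, content_type):
    # Turn the raw request body into whatever predict_fn expects
    if content_type == 'image/jpeg':
        return serialized_input_data
    raise ValueError(f'Unsupported content type: {content_type}')

def predict_fn(input_data, model):
    # Call the model on the deserialized input
    return model(input_data)

def output_fn(prediction_output, accept):
    # Serialize the prediction for the response
    if accept == 'application/json':
        return json.dumps(prediction_output)
    raise ValueError(f'Unsupported accept type: {accept}')

# model_fn runs once when the model is loaded ...
model = model_fn('/opt/ml/model')  # illustrative path

# ... then each request flows through input_fn -> predict_fn -> output_fn
data = input_fn(b'\xff\xd8fake-jpeg-bytes', 'image/jpeg')
prediction = predict_fn(data, model)
body = output_fn(prediction, 'application/json')
print(body)
```

Keeping deserialization and serialization in their own functions means the prediction logic does not change when the request or response format does.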

# Deploy the trained model to prepare for predictions

First we need to create a `PyTorchModel` object from the model artefact uploaded to S3. The `deploy()` method on the model object creates an endpoint that serves prediction requests in real time. If `instance_type` is set to a SageMaker instance type (e.g. ml.m5.large), the model is deployed on SageMaker. If `instance_type` is set to `local`, the model is deployed locally as a Docker container, ready for local testing.

We also need a `RealTimePredictor` subclass that accepts `jpeg` images as input and returns JSON. The default behaviour is to accept a numpy array.

In [None]:
class ImagePredictor(RealTimePredictor):
    def __init__(self, endpoint_name, sagemaker_session):
        super(ImagePredictor, self).__init__(endpoint_name, sagemaker_session=sagemaker_session, serializer=None, 
                                            deserializer=json_deserializer, content_type='image/jpeg')

To deploy your model locally, run the `instance_type` declaration below and skip the SageMaker declaration that follows it.

In [None]:
# Local deployment (runs as a Docker container on this notebook instance)
instance_type = 'local'

To deploy your model on SageMaker, run the `instance_type` declaration below. It overrides any earlier `instance_type` value.

In [None]:
# SageMaker deployment (overrides any earlier instance_type value)
instance_type = 'ml.c5.large'

In [None]:
pets_model=PyTorchModel(model_data=model_artefact,
                        name=model_name,
                        role=role,
                        framework_version='1.0.0',
                        entry_point='source/pets.py',
                        predictor_cls=ImagePredictor)

pets_predictor = pets_model.deploy(initial_instance_count=1,
                                   instance_type=instance_type)

# Invoking the endpoint

In [None]:
urls = []
# English Cocker Spaniel
urls.append('https://s3.amazonaws.com/cdn-origin-etr.akc.org/wp-content/uploads/2017/11/16105011/English-Cocker-Spaniel-Slide03.jpg')
# Shiba Inu
urls.append('https://upload.wikimedia.org/wikipedia/commons/thumb/6/6b/Taka_Shiba.jpg/1200px-Taka_Shiba.jpg')
# German Shorthaired Pointer
urls.append('https://vetstreet.brightspotcdn.com/dims4/default/232fcc6/2147483647/crop/0x0%2B0%2B0/resize/645x380/quality/90/?url=https%3A%2F%2Fvetstreet-brightspot.s3.amazonaws.com%2Fda%2Fa44590a0d211e0a2380050568d634f%2Ffile%2FGerman-Shorthair-Pointer-2-645mk062111.jpg')

In [None]:
# get a random selection
img_bytes = requests.get(random.choice(urls)).content
img = PIL.Image.open(io.BytesIO(img_bytes))
img

We will call either the local or SageMaker endpoint for inference.

In [None]:
response = pets_predictor.predict(img_bytes)
response

# Clean-up

Deleting the local endpoint when you're finished is important since you can only run one local endpoint at a time.

In [None]:
pets_predictor.delete_endpoint()