# fast.ai lesson 1 - training on a SageMaker notebook instance and exporting to a torch.jit model

## Pre-requisites

This notebook shows how to train a fast.ai model on a SageMaker notebook instance, export it as a torch.jit model, and upload it to S3 with the SageMaker Python SDK so that it can be used for inference on AWS Lambda.

## Overview

We are going to train a fast.ai model as per [Lesson 1 of the fast.ai MOOC](https://course.fast.ai/videos/?lesson=1) locally on the SageMaker notebook instance. We will then export the trained model and upload it to S3.

### Set up the environment

To set up a new SageMaker notebook instance with fast.ai installed, follow the steps outlined [here](https://course.fast.ai/start_sagemaker.html).

This notebook was created and tested on a single ml.p3.2xlarge notebook instance. 

Let's start by specifying:

- The S3 bucket and prefix that you want to use for training and model data. This should be within the same region as the Notebook Instance, training, and hosting.
- The IAM role ARN used to give training and hosting access to your data. See the documentation for how to create these. Note that if more than one role is required for notebook instances, training, and/or hosting, you should replace `sagemaker.get_execution_role()` with the appropriate full IAM role ARN string(s).

In [None]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [None]:
import os
import io
import tarfile

import PIL

import sagemaker

from fastai.vision import *

In [None]:
path = untar_data(URLs.PETS); path

In [None]:
path_anno = path/'annotations'
path_img = path/'images'
fnames = get_image_files(path_img)
np.random.seed(2)
pat = re.compile(r'/([^/]+)_\d+\.jpg$')
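
The pattern above is what turns each image file name into a class label. As a minimal sketch, the snippet below applies the same kind of pattern (with the dot escaped so it only matches a literal `.`) to a few made-up paths; the helper `label_from_path` and the sample paths are hypothetical and not part of the original notebook.

```python
import re

# Same shape as the notebook's pattern: capture everything between the last
# '/' and the trailing '_<digits>.jpg'.
pat = re.compile(r'/([^/]+)_\d+\.jpg$')

def label_from_path(path):
    """Return the class label encoded in an image path, or None if no match."""
    match = pat.search(path)
    return match.group(1) if match else None

# Hypothetical paths, for illustration only.
labels = [
    label_from_path('images/great_pyrenees_173.jpg'),
    label_from_path('images/Bengal_12.jpg'),
    label_from_path('images/notes.txt'),
]
```

Because `[^/]+` is greedy, labels that themselves contain underscores are captured whole, and only the final `_<digits>` suffix is dropped.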

In [None]:
bs=64
img_size=299

In [None]:
data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(),
                                   size=img_size, bs=bs//2).normalize(imagenet_stats)

In [None]:
learn = create_cnn(data, models.resnet50, metrics=error_rate)

In [None]:
learn.lr_find()
learn.recorder.plot()

In [None]:
learn.fit_one_cycle(8)

In [None]:
learn.unfreeze()
learn.fit_one_cycle(3, max_lr=slice(1e-6,1e-4))
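
The `slice(1e-6,1e-4)` passed to `fit_one_cycle` above requests discriminative learning rates: earlier layer groups train with smaller rates than later ones. As far as I understand fastai v1, the rates are spread geometrically between the two ends of the slice. The sketch below reproduces that arithmetic in plain Python, assuming three layer groups; the `even_mults` function here is a standalone illustration written for this example, not an import from the library.

```python
def even_mults(start, stop, n):
    """Return n values from start to stop, evenly spaced on a log scale."""
    if n < 2:
        raise ValueError('need at least two values')
    # Constant multiplicative step between consecutive values.
    step = (stop / start) ** (1 / (n - 1))
    return [start * step ** i for i in range(n)]

# Assumption: the learner has three layer groups.
group_lrs = even_mults(1e-6, 1e-4, 3)
```

Under that assumption the middle group trains at roughly `1e-5`, between the two ends of the slice.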

## Export model and upload to S3

Now that we have trained our model, we need to export it, create a tarball of the artefacts, and upload it to S3.

First, we get the S3 bucket and key prefix to which the model will be uploaded.

In [None]:
from sagemaker.utils import name_from_base

sagemaker_session = sagemaker.Session()

bucket = sagemaker_session.default_bucket()
model_name = name_from_base('fastai-pets-jit-model')
prefix = f'sagemaker/{model_name}'

Next, we export the model as a traced PyTorch JIT module so that it can be loaded in an AWS Lambda function. We also save the class names, which are needed to map predictions back to labels.

In [None]:
save_texts(path_img/'models/classes.txt', data.classes)
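
The classes file is what lets the inference code turn a predicted index back into a breed name. The sketch below shows one way to read such a file, assuming it holds one class name per line; the helper `load_classes`, the sample class names, and the predicted index are all hypothetical.

```python
import os
import tempfile

def load_classes(path):
    """Read class names from a text file, one name per line."""
    with open(path) as fh:
        return [line.strip() for line in fh if line.strip()]

# Sample data for illustration only; the real file is written by the cell above.
sample_classes = ['Abyssinian', 'Bengal', 'beagle']

with tempfile.TemporaryDirectory() as tmp:
    classes_path = os.path.join(tmp, 'classes.txt')
    with open(classes_path, 'w') as fh:
        fh.writelines(f'{name}\n' for name in sample_classes)
    classes = load_classes(classes_path)

# Map a (hypothetical) predicted index back to its label.
predicted_index = 2
predicted_label = classes[predicted_index]
```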

In [None]:
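# Dummy batch with the training image size; assumes a GPU is available on the notebook instance.
trace_input = torch.ones(1,3,img_size,img_size).cuda()
# Trace in eval mode so dropout and batch norm behave as they do at inference time.
jit_model = torch.jit.trace(learn.model.float().eval(), trace_input)
model_file='resnet50_jit.pth'
output_path = str(path_img/f'models/{model_file}')
torch.jit.save(jit_model, output_path)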
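
Since a Lambda function runs on CPU, the saved file is typically loaded there with `map_location='cpu'`. The following is a minimal sketch of the trace, save, and load round trip using a tiny stand-in network rather than the trained ResNet-50; the network, file name, and input size are assumptions made purely for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Tiny stand-in network (hypothetical); the notebook traces learn.model instead.
model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 2),
)
model.eval()  # trace with inference-time behaviour

example = torch.ones(1, 3, 32, 32)
traced = torch.jit.trace(model, example)

with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, 'tiny_jit.pth')
    torch.jit.save(traced, model_path)
    # map_location places the loaded weights on the CPU regardless of
    # the device they were saved from.
    loaded = torch.jit.load(model_path, map_location='cpu')

with torch.no_grad():
    same = torch.allclose(model(example), loaded(example))
```

The loaded module can be called without the original Python class definitions, which is what makes the traced file convenient for a small inference package.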

The next step is to create a tarball containing the exported classes file and the traced model. The archive also includes an empty `models` directory entry.

In [None]:
tar_file=path_img/'models/model.tar.gz'
classes_file='classes.txt'

In [None]:
with tarfile.open(tar_file, 'w:gz') as f:
    # Add an empty 'models' directory entry to the archive.
    t = tarfile.TarInfo('models')
    t.type = tarfile.DIRTYPE
    f.addfile(t)
    # Store the model and classes files at the archive root.
    f.add(path_img/f'models/{model_file}', arcname=model_file)
    f.add(path_img/f'models/{classes_file}', arcname=classes_file)
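
Before uploading, it can be worth checking that the archive has the expected layout. The sketch below builds an archive with the same structure from placeholder files in a temporary directory and lists its members; the helpers `build_model_tar` and `list_members` and the placeholder contents are hypothetical.

```python
import os
import tarfile
import tempfile

def build_model_tar(tar_path, files):
    """Create a gzipped tarball with an empty 'models' dir plus the given files.

    files maps the name inside the archive to the path on disk.
    """
    with tarfile.open(tar_path, 'w:gz') as tar:
        dir_entry = tarfile.TarInfo('models')
        dir_entry.type = tarfile.DIRTYPE
        tar.addfile(dir_entry)
        for arcname, src in files.items():
            tar.add(src, arcname=arcname)

def list_members(tar_path):
    """Return (name, is_directory) for every entry in the archive."""
    with tarfile.open(tar_path, 'r:gz') as tar:
        return [(m.name, m.isdir()) for m in tar.getmembers()]

with tempfile.TemporaryDirectory() as tmp:
    files = {}
    # Placeholder files standing in for the real model and classes files.
    for name in ('resnet50_jit.pth', 'classes.txt'):
        src = os.path.join(tmp, name)
        with open(src, 'wb') as fh:
            fh.write(b'placeholder')
        files[name] = src
    tar_path = os.path.join(tmp, 'model.tar.gz')
    build_model_tar(tar_path, files)
    members = list_members(tar_path)
```

Running `list_members` against the real `model.tar.gz` gives a quick confirmation that both files sit at the archive root.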

Now we need to upload the model tarball to S3.

In [None]:
model_artefact = sagemaker_session.upload_data(path=str(tar_file), bucket=bucket, key_prefix=prefix)
print('model artefact path on S3: {}'.format(model_artefact))
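
The printed value is an `s3://` URI. Code that later downloads the artefact usually needs the bucket and key separately, so a small parser can be handy. The sketch below uses only the standard library; the helper `split_s3_uri` and the example URI are hypothetical.

```python
from urllib.parse import urlparse

def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != 's3' or not parsed.netloc:
        raise ValueError(f'not an S3 URI: {uri}')
    return parsed.netloc, parsed.path.lstrip('/')

# Hypothetical URI with the same shape as the one printed above.
example_uri = 's3://example-bucket/sagemaker/fastai-pets-jit-model/model.tar.gz'
example_bucket, example_key = split_s3_uri(example_uri)
```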