# SageMaker

## Contents

 1. [Introduction](#Introduction)
    1. [The High Level Library](#The-High-Level-Library)
 2. [Goal For This Tutorial](#Goal-For-This-Tutorial)
    1. [Imports](#Imports)
    2. [Loading the Data](#Loading-the-Data)
    3. [Training the Model](#Training-the-Model)
    4. [Deploying the Model](#Deploying-the-Model)
    5. [Summary](#Summary)
 3. [Defining the Python Script](#Defining-the-Python-Script)

## Introduction

SageMaker lets us do all of our analysis through a Jupyter Notebook interface.
All that is required is the code and an AWS account (a Free Tier is available).

### The High Level Library

For a first-time user of SageMaker, the experience needs to be
as seamless as possible. This is why Amazon has implemented a high-level
Python library that abstracts most of the complexities involved in training
and deploying machine learning models.

## Goal For This Tutorial

The goal is to train a Convolutional Neural Network on CIFAR-10 and deploy it in the cloud.
To use SageMaker's API we need a training script. This script can be written
in any of the major deep learning frameworks (TensorFlow, PyTorch, Chainer and MXNet).

For this example we will be using MXNet due to its simplicity.

Let's start with the required syntax, beginning with the library imports:

### Imports

In [None]:
import mxnet as mx
import sagemaker
from sagemaker.mxnet import MXNet as MXNetEstimator

The last line of the imports brings in the MXNet-specific class of SageMaker's Estimators.
An Estimator encapsulates everything required to train a model. This consists of:

 - Methods and functions wrapping your training code
 - A Docker image used as the environment for training your model

Using the MXNetEstimator instead of a generic Estimator class means we do not need to explicitly
provide a Docker image, because a default image is supplied. It is also possible to set up a custom image. In this case
we will use the default image and move on to loading the data.

### Loading the Data

In [None]:
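# Download CIFAR-10 into the local data/cifar directory
mx.test_utils.get_cifar10()
# Create a session that manages interactions with AWS
sagemaker_session = sagemaker.Session()
# Upload the local directory to S3; the result is passed to fit() later
inputs = sagemaker_session.upload_data(path='data/cifar', key_prefix='data/cifar10')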

In this code section the SageMaker Session is used to upload the dataset to an S3 bucket,
which will be accessed by our Estimator.
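
To make the upload step more concrete, the sketch below mimics, in plain Python, how the files in a local directory could map to object locations under a key prefix. It makes no AWS calls. The helper `plan_s3_keys`, the bucket name `example-bucket` and the placeholder file contents are hypothetical, and the exact key layout that `upload_data` produces is an assumption here rather than something stated in this tutorial.

```python
import os
import tempfile


def plan_s3_keys(local_dir, bucket, key_prefix):
    """Map every file under local_dir to the S3 URI it could be uploaded to.

    Hypothetical helper for illustration only; it performs no network calls.
    """
    planned = {}
    for root, _dirs, files in os.walk(local_dir):
        for name in files:
            local_path = os.path.join(root, name)
            # Path relative to the uploaded directory, with forward slashes for S3
            rel_path = os.path.relpath(local_path, local_dir).replace(os.sep, "/")
            planned[local_path] = "s3://{}/{}/{}".format(bucket, key_prefix, rel_path)
    return planned


# Build a tiny stand-in for the data/cifar directory and list the planned URIs
with tempfile.TemporaryDirectory() as tmp:
    for name in ("train.rec", "test.rec"):
        with open(os.path.join(tmp, name), "w") as handle:
            handle.write("placeholder")
    plan = plan_s3_keys(tmp, "example-bucket", "data/cifar10")

for uri in sorted(plan.values()):
    print(uri)
```

The point of the sketch is only that the key prefix acts like a folder name inside the bucket, which is why the same prefix can later be handed to the Estimator as the data location.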

### Training the Model

We can now start training the model. We do this by creating an MXNetEstimator and calling its
fit method.

In [None]:
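estimator = MXNetEstimator(
    entry_point='train.py',                 # training script described later
    role=sagemaker.get_execution_role(),    # AWS role used by the training job
    train_instance_count=1,                 # number of training instances
    train_instance_type='ml.p3.2xlarge',    # GPU instance type
    hyperparameters={'batch_size': 1024, 'epochs': 30})
# Launch the training job on the uploaded data
estimator.fit(inputs)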

The code segment is doing the following:

The MXNetEstimator requires an entry point, which is our training script. This Python file must follow
a specific format that will be described later. We also need to specify our AWS Role. Amongst other
things, the AWS Role determines what kind of EC2 instances we can access. In this snippet we requested one
`ml.p3.2xlarge` instance, which contains a GPU for accelerated computing.

The final thing we do is specify the hyperparameters in a dictionary, which will be passed to the training script.

Calling the fit method launches the requested EC2 instance as a training environment, sets up the Docker
image on it and starts the training loop. Once this is done, we can deploy the model with the deploy method.

### Deploying the Model

When we call the deploy method, SageMaker creates the prediction environment from the trained model stored in the S3 bucket.
It also returns a Predictor, which is the interface we will use to make predictions on new inputs.

In [None]:
predictor = estimator.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')

As we can see from the code, it is possible to deploy the model on a different type of instance than the one used for training.
This can help, because applying a model may not require as much processing power as training one.

We can now make some predictions with the deployed model.

In [None]:
# image_data is assumed to be an iterable of images prepared earlier (not shown here)
for i, img in enumerate(image_data):
    response = predictor.predict(img)
    print('image {}:class: {}'.format(i, int(response)))

Once the deployed model is no longer required, we can shut down the endpoint with the following:

In [None]:
estimator.delete_endpoint()

### Summary

In the syntax above we have seen how to fit the MXNet Estimator, deploy it and then use the Predictor on new inputs.
All of these methods rely on the Python script, which we will look at now.

## Defining the Python Script

The training loop and prediction code will be called from inside SageMaker's library, so the script should follow a certain format.
The code requires four functions to be present:

 - train
    - A function to implement the training loop
 - save
    - A function to save the model
 - model_fn
    - A function to load the model
 - transform_fn
    - A function to predict the inputs

Only the first two functions are used during the training phase. An example of what the code should look like is shown here:


In [None]:
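import json

import mxnet as mx
# `models` is assumed to be Gluon's vision model zoo
from mxnet.gluon.model_zoo import vision as models


def train(channel_input_dirs, hyperparameters):

    # Retrieve the hyperparameters
    batch_size = hyperparameters.get('batch_size', 55)
    epochs = hyperparameters.get('epochs', 100)

    # Prepare the data (get_data is a helper defined elsewhere in the script, not shown)
    data_dir = channel_input_dirs['training']
    train_data = get_data(data_dir, batch_size, train=True)

    # Create the model
    net = models.get_model('resnet34_v2', ctx=mx.gpu(), pretrained=False, classes=10)

    for epoch in range(epochs):
        # Your training loop goes here.
        pass
    return net


def save(net, model_dir):
    # Write the trained parameters to the directory provided by SageMaker
    net.save_params('%s/model.params' % model_dir)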

The content of the train function is not much different from a regular MXNet training function: you set the hyperparameters, get the data, define the model and run the training loop.

To make it work, SageMaker provides you with the information you need: the location of the data and the hyperparameters dictionary. Importantly, the train function should return the trained
network.

The save function is even simpler. It receives the trained model as well as the directory in which to save its parameters.
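
The train and save contract described above can be exercised without SageMaker. The sketch below is a stand-in, not SageMaker's actual container code: `fake_train`, `fake_save` and `run_training_job` are hypothetical names, the path `/tmp/data` is a placeholder that is never read, and the "model" is just a dictionary. The call order (train first, then save what train returned) is an assumption based on the description in this section.

```python
import json
import os
import tempfile


def fake_train(channel_input_dirs, hyperparameters):
    """Stand-in for train(): reads its inputs the same way, but trains nothing."""
    batch_size = hyperparameters.get('batch_size', 55)
    epochs = hyperparameters.get('epochs', 100)
    data_dir = channel_input_dirs['training']
    # A real script would build and train a network here
    return {'data_dir': data_dir, 'batch_size': batch_size, 'epochs_run': epochs}


def fake_save(net, model_dir):
    """Stand-in for save(): writes the 'model' into the directory it is given."""
    with open(os.path.join(model_dir, 'model.json'), 'w') as handle:
        json.dump(net, handle)


def run_training_job(train_fn, save_fn, data_dir, model_dir, hyperparameters):
    """Assumed call order: train first, then save whatever train returned."""
    net = train_fn({'training': data_dir}, hyperparameters)
    save_fn(net, model_dir)
    return net


with tempfile.TemporaryDirectory() as model_dir:
    # Only batch_size is supplied, so epochs falls back to its default
    result = run_training_job(fake_train, fake_save, '/tmp/data', model_dir,
                              {'batch_size': 1024})
    saved_files = os.listdir(model_dir)

print(result['batch_size'], result['epochs_run'], saved_files)
```

Because only `batch_size` is supplied, the run also shows how the defaults inside `hyperparameters.get` take over for any key missing from the dictionary.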

The last two functions in the script are used to deploy the model. Let's have a look at what they should contain:

In [None]:
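def model_fn(model_dir):
    # Rebuild the same architecture on the CPU, then load the saved parameters
    net = models.get_model('resnet34_v2', ctx=mx.cpu(), pretrained=False, classes=10)
    net.load_params('%s/model.params' % model_dir, ctx=mx.cpu())
    return net


def transform_fn(net, data, input_content_type, output_content_type):
    # Decode the JSON request body into an NDArray
    inputs = mx.nd.array(json.loads(data))
    outputs = net(inputs)
    # Pick the class with the highest score for each input
    predictions = mx.nd.argmax(outputs, axis=1)
    # Encode the first prediction as JSON for the response
    response = json.dumps(predictions.asnumpy().tolist()[0])
    return response, output_content_type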

The model_fn function simply loads the model, while transform_fn handles the prediction requests. Note that the data argument
of transform_fn arrives in JSON format and we need to convert it to an MXNet array with `mx.nd.array`. The reason for this is that prediction
requests in SageMaker are communicated over HTTPS. For the same reason, we need to translate the network's outputs
back to JSON so that they can be properly used by the downstream code.
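
The JSON round trip can be tried locally with the standard library alone. In the sketch below, `fake_net` and `fake_transform_fn` are hypothetical stand-ins: plain Python lists replace MXNet arrays and the "network" returns fixed scores. It therefore only illustrates the decode, predict, encode sequence, not the real inference path.

```python
import json


def fake_net(batch):
    """Stand-in network: returns one row of fixed class scores per input."""
    return [[0.1, 0.7, 0.2] for _ in batch]


def fake_transform_fn(net, data, input_content_type, output_content_type):
    """Mirror the structure of transform_fn using plain Python lists."""
    inputs = json.loads(data)              # JSON text -> nested lists
    outputs = net(inputs)                  # one score row per input
    # argmax over each row: the index of the highest score
    predictions = [row.index(max(row)) for row in outputs]
    response = json.dumps(predictions[0])  # first prediction -> JSON text
    return response, output_content_type


# Client side: serialize one fake "image" (a 2x2 nested list) as a batch of one
request_body = json.dumps([[[0, 1], [2, 3]]])
response_body, content_type = fake_transform_fn(
    fake_net, request_body, 'application/json', 'application/json')
predicted_class = json.loads(response_body)
print(predicted_class, content_type)
```

Both ends of the exchange only ever see text, which is what allows the request and the response to travel over HTTPS.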

From these examples we can see that the format the code needs to follow is neither too restrictive nor complicated to implement.
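
As a quick sanity check of this format, one could verify that a script defines all four functions before handing it to an Estimator. This is a sketch under assumptions: `missing_functions` is a hypothetical helper, not part of the SageMaker library, and the in-memory module below merely stands in for an imported `train.py`.

```python
import types

REQUIRED_FUNCTIONS = ('train', 'save', 'model_fn', 'transform_fn')


def missing_functions(module, required=REQUIRED_FUNCTIONS):
    """Return the required names that the module does not define as callables."""
    return [name for name in required
            if not callable(getattr(module, name, None))]


# Stand-in for an imported train.py that only implements the training half
script = types.ModuleType('train_script')
script.train = lambda channel_input_dirs, hyperparameters: None
script.save = lambda net, model_dir: None

missing = missing_functions(script)
print(missing)
```

A check like this catches a misspelled function name locally, before any instance is launched.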