# TensorFlow BYOM: Train with Custom Training Script, Compile with Neo, and Deploy on SageMaker

This notebook is comparable in functionality to the [TensorFlow MNIST distributed training notebook](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/tensorflow_distributed_mnist/tensorflow_distributed_mnist.ipynb). We perform the same classification task, but this time we compile the trained model with the Neo API backend to optimize it for our choice of hardware. Finally, we set up a real-time hosted endpoint in SageMaker for the compiled model using the Neo Deep Learning Runtime.

### Set up the environment

In [1]:
import os
import sagemaker
from sagemaker import get_execution_role

sagemaker_session = sagemaker.Session()

role = get_execution_role()

In [2]:
# data_sets = mnist.read_data_sets('data', dtype=tf.uint8, reshape=False, validation_size=5000)
# type(data_sets.train.images)
# data_sets.train.num_examples
# data_sets.train.images.shape

### Download the MNIST dataset

In [3]:
# import utils
# from tensorflow.contrib.learn.python.learn.datasets import mnist
# import tensorflow as tf

# data_sets = mnist.read_data_sets('data', dtype=tf.uint8, reshape=False, validation_size=5000)

# utils.convert_to(data_sets.train, 'train', 'data')
# utils.convert_to(data_sets.validation, 'validation', 'data')
# utils.convert_to(data_sets.test, 'test', 'data')

### Upload the data
We use the ```sagemaker.Session.upload_data``` function to upload our datasets to an S3 location. The return value ```inputs``` identifies that location; we will use it later when we start the training job.

In [4]:
inputs = sagemaker_session.upload_data(path='data', key_prefix='data/DEMO-mnist')
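As a rough illustration of what ```inputs``` holds: for a directory upload, ```upload_data``` returns an S3 URI made up of a bucket and the key prefix. The sketch below splits such a URI with the standard library. The helper name `split_s3_uri` and the bucket name are hypothetical placeholders, not values from this notebook.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != 's3':
        raise ValueError('not an S3 URI: {}'.format(uri))
    return parsed.netloc, parsed.path.lstrip('/')


# Hypothetical bucket name, for illustration only.
example_inputs = 's3://example-bucket/data/DEMO-mnist'
bucket, key_prefix = split_s3_uri(example_inputs)
print(bucket, key_prefix)
```

Printing the real ```inputs``` value in the notebook is a quick way to confirm where the data landed before starting the training job.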

# Construct a script for distributed training 
Here is the full code for the network model:

In [5]:
!cat 'mnist.py'

import os
import tensorflow as tf
from tensorflow.python.estimator.model_fn import ModeKeys as Modes

INPUT_TENSOR_NAME = 'inputs'
SIGNATURE_NAME = 'predictions'

LEARNING_RATE = 0.001


def model_fn(features, labels, mode, params):
    # Input Layer
    input_layer = tf.reshape(features[INPUT_TENSOR_NAME], [-1, 28, 28, 1])

    # Convolutional Layer #1
    conv1 = tf.layers.conv2d(
        inputs=input_layer,
        filters=32,
        kernel_size=[5, 5],
        padding='same',
        activation=tf.nn.relu)

    # Pooling Layer #1
    pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

    # Convolutional Layer #2 and Pooling Layer #2
    conv2 = tf.layers.conv2d(
        inputs=pool1,
        filters=64,
        kernel_size=[5, 5],
        padding='same',
        activation=tf.nn.relu)
    pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)

    # Dense Layer
    pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
    dense = tf.layers.d

The script here is an adaptation of the [TensorFlow MNIST example](https://github.com/tensorflow/models/tree/master/official/mnist). It provides a ```model_fn(features, labels, mode, params)```, which is used for training, evaluation, and inference. See the [TensorFlow MNIST distributed training notebook](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/tensorflow_distributed_mnist/tensorflow_distributed_mnist.ipynb) for more details about the training script.

At the end of the training script, there are two additional functions for use with the Neo Deep Learning Runtime:
* `neo_preprocess(payload, content_type)`: Function that takes the payload and Content-Type of each incoming request and returns a NumPy array
* `neo_postprocess(result)`: Function that takes the prediction results produced by the Deep Learning Runtime and returns the response body
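The two hooks could look roughly like the sketch below. It is not the code from `mnist.py` (which is truncated above). It assumes requests carry an array serialized with `np.save`, that the model returns raw class scores for a batch of one, and that the post-processing hook returns a content type alongside the body; all three are assumptions of this sketch.

```python
import io
import json

import numpy as np


def neo_preprocess(payload, content_type):
    # Assumption: the request body is a NumPy array written with np.save.
    if content_type != 'application/vnd+python.numpy+binary':
        raise RuntimeError('Unsupported content type: {}'.format(content_type))
    return np.load(io.BytesIO(payload))


def neo_postprocess(result):
    # Assumption: `result` holds raw class scores for a batch of one.
    scores = np.squeeze(np.asarray(result, dtype=np.float64))
    exp = np.exp(scores - np.max(scores))  # subtract the max for numerical stability
    probabilities = exp / np.sum(exp)
    return json.dumps(probabilities.tolist()), 'application/json'


# Local smoke test with made-up data; no endpoint is involved.
buf = io.BytesIO()
np.save(buf, np.zeros((1, 784), dtype=np.float32))
array = neo_preprocess(buf.getvalue(), 'application/vnd+python.numpy+binary')
body, body_type = neo_postprocess([[1.0, 2.0, 3.0]])
print(array.shape, body_type)
```

Running the hooks locally like this is a cheap way to catch content-type or shape mistakes before deploying.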

## Create a training job using the sagemaker.TensorFlow estimator

In [6]:
INPUT_TENSOR_NAME = 'inputs'
SIGNATURE_NAME = 'predictions'

In [7]:
from sagemaker.tensorflow import TensorFlow
mnist_estimator = TensorFlow(entry_point='mnist.py',
                             role=role,
                             framework_version='1.11.0',
                             training_steps=5000, 
                             evaluation_steps=100,
                             train_instance_count=1,
                             train_instance_type='ml.p3.2xlarge')
mnist_estimator.fit(inputs)

tensorflow py2 container will be deprecated soon.


2019-08-29 20:44:24 Starting - Starting the training job...
2019-08-29 20:44:25 Starting - Launching requested ML instances...
2019-08-29 20:45:24 Starting - Preparing the instances for training......
2019-08-29 20:46:19 Downloading - Downloading input data
2019-08-29 20:46:19 Training - Downloading the training image....
2019-08-29 20:46:55,879 INFO - root - running container entrypoint
2019-08-29 20:46:55,880 INFO - root - starting train task
2019-08-29 20:46:55,902 INFO - container_support.training - Training starting
Downloading s3://sagemaker-us-east-2-185004506247/sagemaker-tensorflow-2019-08-29-20-44-23-837/source/sourcedir.tar.gz to /tmp/script.tar.gz
2019-08-29 20:46:58,308 INFO - tf_container - ----------------------TF_CONFIG--------------------------
2019-08-29 20:46:58,309 INFO - tf_container - {"environment": "cloud", "cluster": {"master": ["algo-1:2222"]}, "task": {"index": 0, "type": "master"}}
2019-08-29 20:46:5

The **```fit```** method creates a training job on one **ml.p3.2xlarge** instance, as configured in the estimator above. The logs show the instance training, evaluating, and incrementing the number of **training steps**.

At the end of training, the job generates a saved model for TF Serving.

# Deploy the trained model to prepare for predictions (the old way)

The deploy() method creates an endpoint which serves prediction requests in real-time.

In [8]:
mnist_predictor = mnist_estimator.deploy(initial_instance_count=1,
                                         instance_type='ml.m4.xlarge')

The Python 2 tensorflow images will be soon deprecated and may not be supported for newer upcoming versions of the tensorflow images.
Please set the argument "py_version='py3'" to use the Python 3 tensorflow image.


--------------------------------------------------------------------------------------!

## Invoking the endpoint

In [9]:
import numpy as np
import tensorflow as tf
import pickle

with open('./data/test_dataset.pickle', 'rb') as handle:
    test_dataset = pickle.load(handle)

    
# len(test_dataset['images'][0])
for i in range(32):
    data = test_dataset['images'][i]
    # Wrap the image in a TensorProto before sending it to the endpoint
    tensor_proto = tf.make_tensor_proto(values=np.asarray(data), shape=[len(data), len(data)], dtype=tf.float32)
    predict_response = mnist_predictor.predict(tensor_proto)

    print("========================================")
    label = test_dataset['labels'][i]
    print("label is {}".format(label))
    prediction = predict_response['outputs']['classes']['int64_val'][0]
    print("prediction is {}".format(prediction))
    # tensor_proto

label is 27
prediction is 27
label is 28
prediction is 28
label is 8
prediction is 8
label is 5
prediction is 5
label is 12
prediction is 12
label is 22
prediction is 22
label is 1
prediction is 1
label is 30
prediction is 30
label is 21
prediction is 20
label is 9
prediction is 9
label is 25
prediction is 25
label is 15
prediction is 15
label is 5
prediction is 5
label is 19
prediction is 19
label is 13
prediction is 13
label is 7
prediction is 8
label is 30
prediction is 30
label is 30
prediction is 30
label is 12
prediction is 11
label is 8
prediction is 8
label is 11
prediction is 12
label is 22
prediction is 22
label is 4
prediction is 4
label is 10
prediction is 10
label is 23
prediction is 23
label is 4
prediction is 4
label is 25
prediction is 25
label is 8
prediction is 8
label is 21
prediction is 21
label is 23
prediction is 23
label is 27
prediction is 27
label is 29
prediction is 29
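The output above can be condensed into a single accuracy figure. A minimal sketch, with the label/prediction pairs transcribed by hand from the printed output rather than read from the endpoint:

```python
# (label, prediction) pairs copied from the output above
pairs = [(27, 27), (28, 28), (8, 8), (5, 5), (12, 12), (22, 22), (1, 1), (30, 30),
         (21, 20), (9, 9), (25, 25), (15, 15), (5, 5), (19, 19), (13, 13), (7, 8),
         (30, 30), (30, 30), (12, 11), (8, 8), (11, 12), (22, 22), (4, 4), (10, 10),
         (23, 23), (4, 4), (25, 25), (8, 8), (21, 21), (23, 23), (27, 27), (29, 29)]

# Count the pairs where the prediction matches the label
correct = sum(1 for label, prediction in pairs if label == prediction)
accuracy = correct / len(pairs)
print('{} of {} correct, accuracy {:.3f}'.format(correct, len(pairs), accuracy))
```

On these 32 samples, 28 predictions match their labels, which gives an accuracy of 0.875.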


## Deleting the endpoint

In [10]:
# sagemaker.Session().delete_endpoint(mnist_predictor.endpoint)

# Deploy the trained model using Neo

Now the model is ready to be compiled by Neo and optimized for our hardware of choice. We use the ``TensorFlowEstimator.compile_model`` method to do this. For this example, our target hardware is ``'ml_c5'``. You can change this to another supported target if you prefer.

## Compiling the model
The ``input_shape`` defines the model's input tensor, and ``output_path`` is where the compiled model will be stored in S3. **Important: if the following command results in a permission error, scroll up and locate the value of the execution role returned by ``get_execution_role()``. The role must have access to the S3 bucket specified in ``output_path``.**

In [11]:
output_path = '/'.join(mnist_estimator.output_path.split('/')[:-1])
optimized_estimator = mnist_estimator.compile_model(target_instance_family='ml_c5', 
                              input_shape={'data':[1, 784]},  # Batch size 1, one flattened 28x28 image (784 values).
                              output_path=output_path,
                              framework='tensorflow', framework_version='1.11.0')

The Python 2 tensorflow images will be soon deprecated and may not be supported for newer upcoming versions of the tensorflow images.
Please set the argument "py_version='py3'" to use the Python 3 tensorflow image.


?.....!
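To see what the ``output_path`` expression in the compile cell does: it drops the last `/`-separated component of the estimator's output path. The sketch below applies the same expression to hypothetical values. The helper name and bucket name are made up, and the trailing slash in the first example is an assumption about what the estimator's default output path looks like. It also shows where the 784 in ``input_shape`` comes from.

```python
def strip_last_component(s3_path):
    """Drop everything after the final '/' in the path."""
    return '/'.join(s3_path.split('/')[:-1])


# Hypothetical paths, for illustration only.
with_trailing_slash = strip_last_component('s3://example-bucket/')
with_subfolder = strip_last_component('s3://example-bucket/output')
print(with_trailing_slash)
print(with_subfolder)

# The compiled model takes one flattened 28x28 image per request.
height, width = 28, 28
input_shape = {'data': [1, height * width]}
print(input_shape)
```

Both hypothetical paths reduce to the bucket root, so the compiled model is written at the top level of the bucket.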

In [12]:
optimized_estimator

<sagemaker.tensorflow.model.TensorFlowModel at 0x7f31bbf3eba8>

## Deploying the compiled model

In [13]:
optimized_predictor = optimized_estimator.deploy(initial_instance_count = 1,
                                                 instance_type = 'ml.c5.4xlarge')

--------------------------------------------------------------------------------------!

In [14]:
import io

def numpy_bytes_serializer(data):
    # Serialize a NumPy array to bytes in the .npy format
    f = io.BytesIO()
    np.save(f, data)
    f.seek(0)
    return f.read()

optimized_predictor.content_type = 'application/vnd+python.numpy+binary'
optimized_predictor.serializer = numpy_bytes_serializer
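To check the serializer locally before calling the endpoint, the bytes can be loaded back into an array. A minimal sketch, using a made-up array as a stand-in for one test image (the real data comes from `test_dataset`):

```python
import io

import numpy as np


def numpy_bytes_serializer(data):
    # Same serializer as in the cell above, repeated so the sketch is self-contained
    f = io.BytesIO()
    np.save(f, data)
    f.seek(0)
    return f.read()


# Stand-in for one flattened 28x28 image.
image = np.arange(784, dtype=np.float32)
payload = numpy_bytes_serializer(image)

# Load the bytes back into an array, as a receiver of the payload could.
restored = np.load(io.BytesIO(payload))
print(type(payload).__name__, restored.shape, restored.dtype)
```

The `.npy` format carries the shape and dtype along with the data, so the receiving side does not need to be told them separately.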

In [23]:
# data.reshape((28,28))

## Invoking the endpoint

In [None]:
# from tensorflow.examples.tutorials.mnist import input_data
from IPython import display
import PIL.Image
import io

# mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

for i in range(10):
    data = test_dataset['images'][i]
    
    # Display image
    im = PIL.Image.fromarray(data.reshape((28,28))).convert('L')
    display.display(im)
    # Invoke endpoint with image
    predict_response = optimized_predictor.predict(data)
    
    print("========================================")
    label = test_dataset['labels'][i]
    print("label is {}".format(label))
    prediction = predict_response
    print("prediction is {}".format(prediction))

## Deleting the endpoint

In [None]:
# sagemaker.Session().delete_endpoint(optimized_predictor.endpoint)