# TensorFlow script mode training and serving

Script Mode is a training script format for TensorFlow that lets you run any TensorFlow training script in SageMaker with minimal modification. The SageMaker Python SDK transfers your script to a SageMaker training instance; on that instance, SageMaker's native TensorFlow support sets up training-related environment variables and executes your script. This tutorial uses the [SageMaker Python SDK](https://github.com/aws/sagemaker-python-sdk) to launch a training job and deploy the trained model.

Script Mode supports training with a Python script, a Python module, or a shell script. In this example we use a Python script to train a model on the [MNIST dataset](http://yann.lecun.com/exdb/mnist/). This notebook also demonstrates how to perform real-time inference with the [SageMaker TensorFlow Serving container](https://github.com/aws/sagemaker-tensorflow-serving-container). The TensorFlow Serving container is the only supported inference method for Script Mode. For full documentation on TensorFlow Serving, see the [SageMaker Python SDK TensorFlow Serving documentation](https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/deploying_tensorflow_serving.rst).

Let's start by setting up the environment:


# Set up the environment

In [None]:
import os
import sagemaker
from sagemaker import get_execution_role

sagemaker_session = sagemaker.Session()

role = get_execution_role()

## Training Data

The MNIST dataset has been loaded into the public S3 buckets ``sagemaker-sample-data-<REGION>`` under the prefix ``tensorflow/mnist``. There are four ``.npy`` files under this prefix:
* ``train_data.npy``
* ``eval_data.npy``
* ``train_labels.npy``
* ``eval_labels.npy``

In [None]:
region = sagemaker_session.boto_session.region_name
training_data_uri = 's3://sagemaker-sample-data-{}/tensorflow/mnist'.format(region)

# Construct a script for distributed training

This tutorial's training script was adapted from TensorFlow's official [CNN MNIST example](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/layers/cnn_mnist.py). We have modified it to handle the ``model_dir`` parameter passed in by SageMaker. This is an S3 path that can be used for data sharing during distributed training, for checkpointing, and for model persistence. In addition, we have added an argument-parsing function to process training-related variables.

At the end of the training job, the script also exports the trained model to the path stored in the environment variable ``SM_MODEL_DIR``, which always points to ``/opt/ml/model``. This step is critical because SageMaker uploads all model artifacts in this folder to S3 at the end of training.
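The argument handling described above can be sketched roughly as follows. This is a minimal illustration, not necessarily what ``mnist.py`` does: the function name ``parse_args`` and the ``--sm-model-dir`` option are assumptions made for this example, and only ``--model_dir`` and ``SM_MODEL_DIR`` come from the text above.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the arguments SageMaker passes to a training script (illustrative)."""
    parser = argparse.ArgumentParser()
    # S3 path supplied by SageMaker, usable for checkpoints and shared state.
    parser.add_argument('--model_dir', type=str, default=None)
    # Local folder whose contents SageMaker uploads to S3 after training.
    # The option name is hypothetical; the default comes from SM_MODEL_DIR.
    parser.add_argument('--sm-model-dir', type=str,
                        default=os.environ.get('SM_MODEL_DIR', '/opt/ml/model'))
    # Ignore any extra hyperparameters instead of failing on them.
    args, _unknown = parser.parse_known_args(argv)
    return args


args = parse_args(['--model_dir', 's3://example-bucket/example-job/model'])
print(args.model_dir, args.sm_model_dir)
```

Using ``parse_known_args`` rather than ``parse_args`` is a design choice in this sketch: it keeps the script from exiting if additional hyperparameters are passed that it does not declare.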

Here is the entire script:

In [None]:
!pygmentize 'mnist.py'

# Create a training job using the sagemaker.TensorFlow estimator

The ``sagemaker.tensorflow.TensorFlow`` estimator handles locating the script mode container, uploading your script to an S3 location, and creating a SageMaker training job. Two parameters deserve attention:

* ``py_version`` is set to ``'py3'`` to indicate that we are using script mode, since legacy mode supports only Python 2. Although Python 2 will be deprecated soon, you can still use script mode with Python 2 by setting ``py_version`` to ``'py2'`` and ``script_mode`` to ``True``.

* ``distributions`` configures the distributed training setup. It is required only if you are doing distributed training, either across a cluster of instances or across multiple GPUs. Here we use parameter servers as the distributed training strategy. SageMaker training jobs run on homogeneous clusters. To make the parameter server setup more performant in SageMaker, we run a parameter server on every instance in the cluster, so there is no need to specify the number of parameter servers to launch. Script mode also supports distributed training with [Horovod](https://github.com/horovod/horovod). You can find the full documentation on how to configure ``distributions`` [here](https://github.com/aws/sagemaker-python-sdk/tree/master/src/sagemaker/tensorflow#distributed-training).



In [None]:
from sagemaker.tensorflow import TensorFlow


mnist_estimator = TensorFlow(entry_point='mnist.py',
                             role=role,
                             train_instance_count=2,
                             train_instance_type='ml.p2.xlarge',
                             framework_version='1.12',
                             py_version='py3',
                             distributions={'parameter_server': {'enabled': True}})
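For comparison, the two ``distributions`` settings mentioned above could be produced by a small helper like the one below. This is a sketch under assumptions: ``make_distributions`` is a hypothetical helper, and the ``'mpi'`` key with ``processes_per_host`` reflects our understanding of how this SDK version configures Horovod, so check the linked documentation before relying on it.

```python
def make_distributions(strategy, processes_per_host=1):
    """Return a ``distributions`` dict for the estimator (hypothetical helper)."""
    if strategy == 'parameter_server':
        # Matches the setting used in the estimator above.
        return {'parameter_server': {'enabled': True}}
    if strategy == 'horovod':
        # Assumption: Horovod jobs are configured through the 'mpi' key.
        return {'mpi': {'enabled': True,
                        'processes_per_host': processes_per_host}}
    raise ValueError('unknown strategy: {!r}'.format(strategy))


print(make_distributions('parameter_server'))
print(make_distributions('horovod', processes_per_host=2))
```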

## Calling ``fit``

To start a training job, we call `estimator.fit(training_data_uri)`.

You can pass an ``s3_input``, a dictionary, or an S3 location to the ``fit`` method. If a dictionary is used, its keys name the **channels** and its values point to the data locations. If an S3 location is used, the estimator creates a default channel named ``'training'`` that points to this S3 location. Calling `estimator.fit(training_data_uri)` pulls the TensorFlow container (Python 3, GPU version) onto the SageMaker training job instances.

When training starts, the TensorFlow container executes ``mnist.py``, passing `hyperparameters` (none in our case) and `model_dir` as script arguments. The script is invoked as follows:
```bash
python mnist.py --model_dir s3://<DEFAULT_BUCKET>/<TRAINING_JOB>
```
At the end of training, the training job uploads the saved model to S3 for TensorFlow Serving.

In [None]:
mnist_estimator.fit(training_data_uri)
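The dictionary form of the ``fit`` input described above could be built as in the following sketch. The helper ``build_channels``, the bucket name, and the channel names ``'train'`` and ``'eval'`` are hypothetical and chosen only for illustration; this notebook itself uses the single default ``'training'`` channel. SageMaker exposes each channel's local path to the script through an environment variable named ``SM_CHANNEL_<NAME>``.

```python
def build_channels(base_uri, names):
    """Map channel names to S3 prefixes under ``base_uri`` (hypothetical helper)."""
    base = base_uri.rstrip('/')
    return {name: '{}/{}'.format(base, name) for name in names}


channels = build_channels('s3://example-bucket/mnist/', ['train', 'eval'])
for name, uri in sorted(channels.items()):
    # Each channel is made available to the script via SM_CHANNEL_<NAME>.
    print('channel {!r} -> {} (SM_CHANNEL_{})'.format(name, uri, name.upper()))

# Passing the dictionary instead of a single URI would start a job with
# two channels (left commented out so the sketch runs without AWS access):
# mnist_estimator.fit(channels)
```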

# Deploy the trained model to an endpoint

The ``deploy()`` method creates a SageMaker model, which is then deployed to an endpoint to serve prediction requests in real time. We use the TensorFlow Serving container for the endpoint because we trained with script mode. This serving container runs a web server that is compatible with the SageMaker hosting protocol. The [Using your own inference code]() document explains how inference images are run in SageMaker.

In [None]:
predictor = mnist_estimator.deploy(initial_instance_count=1, instance_type='ml.p2.xlarge')

# Invoke the endpoint

Let's download the training data and use that as input for inference.

In [None]:
import numpy as np

!aws s3 cp s3://sagemaker-sample-data-{region}/tensorflow/mnist/train_data.npy train_data.npy
!aws s3 cp s3://sagemaker-sample-data-{region}/tensorflow/mnist/train_labels.npy train_labels.npy

train_data = np.load('train_data.npy')
train_labels = np.load('train_labels.npy')


The formats of the input and the output data correspond directly to the request and response formats of the Predict method in the [TensorFlow Serving REST API](https://www.tensorflow.org/serving/api_rest). SageMaker's TensorFlow Serving endpoints can also accept some additional input formats that are not part of the TensorFlow REST API, including a simplified JSON format, line-delimited JSON objects ("jsons" or "jsonlines"), and CSV data.
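To make these input formats concrete, the sketch below encodes the same two rows in each of them using only the standard library. The tiny ``rows`` list is a stand-in for real image data, and the exact shapes the endpoint accepts for the non-REST formats are our reading of the linked documentation, so treat them as assumptions.

```python
import json

# Two tiny stand-in rows; real MNIST inputs are much larger.
rows = [[0.0, 0.5], [1.0, 0.25]]

# TensorFlow Serving REST Predict format: rows wrapped under "instances".
rest_payload = json.dumps({'instances': rows})

# Simplified JSON (assumed shape): the bare array without the wrapper.
simplified_payload = json.dumps(rows)

# JSON lines (assumed shape): one JSON value per line.
jsonlines_payload = '\n'.join(json.dumps(row) for row in rows)

# CSV: one comma-separated row per line.
csv_payload = '\n'.join(','.join(str(v) for v in row) for row in rows)

for name, payload in [('rest', rest_payload),
                      ('simplified', simplified_payload),
                      ('jsonlines', jsonlines_payload),
                      ('csv', csv_payload)]:
    print(name, repr(payload))
```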

In this example we use a ``numpy`` array as input, which is serialized into the simplified JSON format. In addition, TensorFlow Serving can process multiple items at once, as you can see in the following code. You can find the complete documentation on how to make predictions against a TensorFlow Serving SageMaker endpoint [here](https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/deploying_tensorflow_serving.rst#making-predictions-against-a-sagemaker-endpoint).

In [None]:
predictions = predictor.predict(train_data[:50])
for i in range(0, 50):
    prediction = predictions['predictions'][i]['classes']
    label = train_labels[i]
    print('prediction is {}, label is {}, matched: {}'.format(prediction, label, prediction == label))
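Instead of reading the matches line by line, the results can be summarized as a single accuracy figure. The sketch below assumes the response has the shape used in the cell above (a ``'predictions'`` list whose items carry a ``'classes'`` entry); the ``accuracy`` helper and the hand-written sample response are illustrative only.

```python
def accuracy(response, labels):
    """Fraction of predicted classes that equal the labels (illustrative helper)."""
    predicted = [item['classes'] for item in response['predictions']]
    if len(predicted) != len(labels):
        raise ValueError('number of predictions and labels differ')
    if len(labels) == 0:
        return 0.0
    matches = sum(1 for p, l in zip(predicted, labels) if p == l)
    return matches / len(labels)


# Hand-written stand-in for an endpoint response, not real model output.
sample_response = {'predictions': [{'classes': 5}, {'classes': 0}, {'classes': 4}]}
sample_labels = [5, 0, 9]
print('accuracy: {:.2f}'.format(accuracy(sample_response, sample_labels)))
```

With the real endpoint, the call would be ``accuracy(predictions, train_labels[:50])``.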

# Delete the endpoint

Delete the endpoint we just created to avoid incurring further charges.

In [None]:
sagemaker.Session().delete_endpoint(predictor.endpoint)