# Train & Serve TF Model with TensorFlow Serving

### Combining Tutorials

https://www.tensorflow.org/tfx/tutorials/serving/rest_simple  
https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/

The TensorFlow tutorial builds a simple Fashion MNIST model and shows you how to set up a server for it (using Docker).  
The second tutorial, from the AWS blog, shows you how to deploy a model you trained outside of SageMaker.  

This hybrid deploys the Fashion MNIST model to a SageMaker endpoint instead of serving it from a Docker container.


In [None]:

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
import os
import subprocess
import tarfile

# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras

tf.logging.set_verbosity(tf.logging.ERROR)
print(tf.__version__)

# AWS SageMaker
import sagemaker
from sagemaker.tensorflow.model import TensorFlowModel
from sagemaker import get_execution_role

## Globals


In [None]:
! pwd
PROJECT_DIR = os.getcwd()
BUCKET = 'cfa-eadatasciencesb-sagemaker'

TRAINED_MODEL_PATH = os.path.join(PROJECT_DIR, "trained_model")
EXPORT_MODEL_PATH = os.path.join(TRAINED_MODEL_PATH, "export/Servo")

S3_MNIST_MODEL_PATH = 's3://{}/trained-models/mnist_fashion/20190814_100/output/'.format(BUCKET)

### Fashion MNIST 
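This skips most of the modeling detail, because modeling is not the point of this notebook.

#### Note:
- the pixel values are scaled to the range 0.0 to 1.0
- the images are reshaped to (N, 28, 28, 1), which adds the channel axis the Conv2D layer expects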

In [None]:
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# scale the values to 0.0 to 1.0
train_images = train_images / 255.0
test_images = test_images / 255.0

# reshape for feeding into the model
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1)
test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

print('\ntrain_images.shape: {}, of {}'.format(train_images.shape, train_images.dtype))
print('test_images.shape: {}, of {}'.format(test_images.shape, test_images.dtype))

### Train & Evaluate your model

In [None]:
model = keras.Sequential([
  keras.layers.Conv2D(input_shape=(28,28,1), filters=8, kernel_size=3, 
                      strides=2, activation='relu', name='Conv1'),
  keras.layers.Flatten(),
  keras.layers.Dense(10, activation=tf.nn.softmax, name='Softmax')
])
model.summary()

testing = False
epochs = 5

model.compile(optimizer=tf.train.AdamOptimizer(), 
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=epochs)

test_loss, test_acc = model.evaluate(test_images, test_labels)
print('\nTest accuracy: {}'.format(test_acc))
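
The shapes and parameter counts printed by `model.summary()` can be cross-checked by hand. The sketch below is plain arithmetic, not part of the original tutorials, and it assumes the Keras default of `padding='valid'` for `Conv1`. The helper name `conv2d_output_size` is made up for illustration.

```python
def conv2d_output_size(size, kernel_size, stride):
    """Output length along one axis for a convolution with no padding ('valid')."""
    return (size - kernel_size) // stride + 1


# Conv1: 28x28x1 input, 8 filters, 3x3 kernel, stride 2
side = conv2d_output_size(28, kernel_size=3, stride=2)
filters = 8
conv_params = 3 * 3 * 1 * filters + filters   # kernel weights + one bias per filter

# Flatten, then Dense(10)
flat_units = side * side * filters
dense_params = flat_units * 10 + 10           # weights + one bias per class

print('Conv1 output: ({0}, {0}, {1}), params: {2}'.format(side, filters, conv_params))
print('Flatten units: {}'.format(flat_units))
print('Softmax params: {}'.format(dense_params))
print('Total params: {}'.format(conv_params + dense_params))
```

If the numbers in `model.summary()` differ, the layer arguments or the input shape are probably not what you expect.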

### You have a trained model - now create a saved_model.pb

In [None]:
# Fetch the Keras session and save the model
# The signature definition is defined by the input and output tensors,
# and stored with the default serving key


print('export base path:', EXPORT_MODEL_PATH)
version = 100       # can be any integer, just changing the default here
export_path = os.path.join(EXPORT_MODEL_PATH, str(version))
print('export model path = {}\n'.format(export_path))
if os.path.isdir(export_path):
  print('\nAlready saved a model, cleaning up\n')
  !rm -r {export_path}

tf.saved_model.simple_save(
    keras.backend.get_session(),
    export_path,
    inputs={'input_image': model.input},
    outputs={t.name:t for t in model.outputs})

print('\nSaved model:')
!ls -l {export_path}

### Examine the Signatures of the model

In [None]:
!saved_model_cli show --dir {export_path} --all

### Serving Your Model

https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/

This blog post shows that you can push the model to S3 in the layout SageMaker expects (as if you had trained it on SageMaker, which is really just straight TensorFlow) and then deploy it to SageMaker from there. Other tutorials suggest you need to build a Docker image. This approach appears easier.

#### This demonstrates you have the right artifacts

In [None]:
# check that saved_model.pb is where it belongs: export/Servo/<version>/

os.chdir(TRAINED_MODEL_PATH)
! ls
! ls export/Servo/
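
The same check can be done in Python rather than by reading `ls` output. This is a minimal sketch, assuming the layout used above (`export/Servo/<version>/saved_model.pb`). The helper `find_saved_models` is hypothetical, and the demo builds a throwaway directory so it runs without a trained model.

```python
import os
import tempfile


def find_saved_models(export_root):
    """Return the numeric version directories under export_root that hold a saved_model.pb."""
    found = []
    for entry in sorted(os.listdir(export_root)):
        pb_path = os.path.join(export_root, entry, 'saved_model.pb')
        if entry.isdigit() and os.path.isfile(pb_path):
            found.append(entry)
    return found


# Demo on a temporary directory that mimics the export layout (placeholder file, not a real model)
with tempfile.TemporaryDirectory() as tmp:
    version_dir = os.path.join(tmp, 'export', 'Servo', '100')
    os.makedirs(os.path.join(version_dir, 'variables'))
    open(os.path.join(version_dir, 'saved_model.pb'), 'wb').close()
    demo_versions = find_saved_models(os.path.join(tmp, 'export', 'Servo'))

print(demo_versions)
```

In the notebook you would call `find_saved_models(EXPORT_MODEL_PATH)` and expect the version you exported to be listed.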

In [None]:
with tarfile.open('model.tar.gz', mode='w:gz') as archive:
    archive.add('export', recursive=True)
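
Before uploading, it can help to confirm that the archive has `export/` at its top level, since that is the layout the tarball above is meant to have. The following is a sketch under that assumption. `archive_members` is a hypothetical helper, and the demo archives a placeholder file in a temporary directory rather than the real model.

```python
import os
import tarfile
import tempfile


def archive_members(tar_path):
    """Return the member names stored in a gzip-compressed tar archive."""
    with tarfile.open(tar_path, mode='r:gz') as archive:
        return archive.getnames()


# Demo: build a small archive with the same layout and list its contents
with tempfile.TemporaryDirectory() as tmp:
    version_dir = os.path.join(tmp, 'export', 'Servo', '100')
    os.makedirs(version_dir)
    with open(os.path.join(version_dir, 'saved_model.pb'), 'wb') as f:
        f.write(b'placeholder')

    tar_path = os.path.join(tmp, 'model.tar.gz')
    with tarfile.open(tar_path, mode='w:gz') as archive:
        # arcname keeps the stored paths relative, starting at "export"
        archive.add(os.path.join(tmp, 'export'), arcname='export')

    demo_names = archive_members(tar_path)

for name in demo_names:
    print(name)
```

Against the real tarball, `archive_members('model.tar.gz')` should list only paths that start with `export`.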

In [None]:
sagemaker_session = sagemaker.Session()
print (sagemaker_session)

### Save Model to S3
The SageMaker tutorial suggests using the SageMaker session to get a unique folder name, but that name provides no insight into the model. So I'm using a different bucket/folder scheme.  

Sorry for mixing `-` and `_` in the names; bad habits.

In [None]:
# This differs slightly from the tutorial:
#   the tutorial uses /model, but when SageMaker saves a model it puts the output in /output,
#   so I went with /output

# You need to MANUALLY make sure this bucket/folder path exists
# verify the S3 path is good
! aws s3 ls {S3_MNIST_MODEL_PATH}

# verify your tarball is ready to go
! ls -l model.tar.gz

In [None]:
! aws s3 cp model.tar.gz {S3_MNIST_MODEL_PATH}
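
If you would rather upload from Python than through the AWS CLI, the S3 path first has to be split into a bucket and a key prefix. This is a minimal standard-library sketch of that split. `split_s3_uri` is a hypothetical helper, not part of either tutorial.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/prefix/' into ('bucket', 'some/prefix/')."""
    parsed = urlparse(uri)
    if parsed.scheme != 's3' or not parsed.netloc:
        raise ValueError('not an S3 URI: {}'.format(uri))
    return parsed.netloc, parsed.path.lstrip('/')


bucket, prefix = split_s3_uri(
    's3://cfa-eadatasciencesb-sagemaker/trained-models/mnist_fashion/20190814_100/output/')
print(bucket)
print(prefix)
```

The two parts map onto the `bucket` and `key_prefix` parameters of the SageMaker SDK's `Session.upload_data`. You would probably want to strip the trailing slash from the prefix before passing it.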

In [None]:
# go back to the project directory
os.chdir(PROJECT_DIR)

In [None]:
print (PROJECT_DIR)

### Deploy model to SageMaker Hosted Endpoint
You are doing this with an odd combination of local and S3 artifacts:  
- the model is on S3 as a tarball (model.tar.gz)
- train.py is a local asset, passed as the entry point even though it is not used
- the framework version is TF 1.12, but you are on TF 1.14
- there is also a hidden Python version issue, which you probably won't see

All of this suggests the process will probably change in a near-future release!

In [None]:
role = get_execution_role()

sagemaker_model = TensorFlowModel(model_data = S3_MNIST_MODEL_PATH + 'model.tar.gz',
                                  role = role,
                                  framework_version = '1.12',
                                  entry_point = 'code/train.py',
                                  name='model-mnist-v100-20190814')

In [None]:
# This takes about 10 minutes.
# - deploy() is what creates the model on SageMaker:
#   it looks like you created it when you instantiated sagemaker_model, but you won't see it on the console
#   until deploy() runs; NOW you'll see it on the console
# - this always fails with:
#   UnexpectedStatusException: Error hosting endpoint ep-mnist-v100: Failed. Reason:  The primary container for production variant 
#      AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint

# Hack: once you have run this once, you will have the artifacts you need

# predictor = sagemaker_model.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge', endpoint_name='ep-mnist-v100')

### After the .deploy fails
1. go to the SageMaker console
2. create an endpoint configuration, if none exists
3. your model will already exist (the failed deploy created it)
4. create the endpoint (from the console)
   - reference the endpoint config
   - and the model
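
The console steps above could probably be scripted instead. The sketch below uses the boto3 SageMaker client calls `create_endpoint_config` and `create_endpoint`. It is an untested assumption that they behave the same as the console route, and it does nothing about the cause of the health-check failure. The function name, the `-config` suffix, and the `RecordingClient` stub are all hypothetical. The stub only exists so the example runs without AWS credentials.

```python
def create_endpoint_for_model(client, model_name, endpoint_name,
                              instance_type='ml.m4.xlarge', instance_count=1):
    """Create an endpoint config for an existing model, then an endpoint that uses it."""
    config_name = endpoint_name + '-config'   # hypothetical naming convention
    client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[{
            'VariantName': 'AllTraffic',
            'ModelName': model_name,
            'InitialInstanceCount': instance_count,
            'InstanceType': instance_type,
        }],
    )
    client.create_endpoint(EndpointName=endpoint_name,
                           EndpointConfigName=config_name)
    return config_name


class RecordingClient:
    """Stand-in for boto3.client('sagemaker') that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def create_endpoint_config(self, **kwargs):
        self.calls.append(('create_endpoint_config', kwargs))

    def create_endpoint(self, **kwargs):
        self.calls.append(('create_endpoint', kwargs))


stub = RecordingClient()
config_name = create_endpoint_for_model(stub, 'model-mnist-v100-20190814', 'ep-mnist-v100')
for call_name, kwargs in stub.calls:
    print(call_name, kwargs)
```

To run it for real, replace `stub` with `boto3.client('sagemaker')` and use the model name shown on the console.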