# Deploy a Keras or TensorFlow model trained anywhere using Amazon SageMaker


Amazon SageMaker makes it easier for any developer or data scientist to build, train, and deploy machine learning (ML) models. Although it is designed to remove the undifferentiated heavy lifting from the full life cycle of ML models, its capabilities can also be used independently of one another: models trained in Amazon SageMaker can be optimized and deployed outside of it, including on edge devices (mobile or IoT). Conversely, Amazon SageMaker can deploy and host pre-trained models, such as models from model zoos or models trained locally by your team.

In this notebook, we’ll demonstrate how to deploy a trained Keras (TensorFlow backend) model using Amazon SageMaker, taking advantage of Amazon SageMaker deployment features such as selecting the type and number of instances, model compilation to improve inference latency, and autoscaling.

### Step 1. Set up

In the AWS Management Console, go to the Amazon SageMaker console. Choose Notebook Instances, and create a new notebook instance. Upload the current notebook and set the kernel to ``conda_tensorflow_p36``.

The ``get_execution_role`` function retrieves the AWS Identity and Access Management (IAM) role that you assigned when you created your notebook instance.

In [1]:
from sagemaker import get_execution_role
from sagemaker import Session

role = get_execution_role()
sess = Session()
region = sess.boto_region_name
bucket = sess.default_bucket()

If you are running this locally, check your version of TensorFlow to prevent downstream framework errors.

In [2]:
import tensorflow as tf
print(tf.__version__)  # This notebook runs on TensorFlow 1.15.x or earlier

1.15.3


In [3]:
tf_framework_version = tf.__version__
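
The version check above relies on reading the printed value by eye. As a minimal sketch, the same check can be made explicit; the helper names `parse_version` and `is_supported_tf_version` are hypothetical, and the ceiling of 1.15 is taken from the comment in the cell above:

```python
import re


def parse_version(version):
    """Return (major, minor) parsed from a version string such as '1.15.3'."""
    match = re.match(r"^(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: {!r}".format(version))
    return int(match.group(1)), int(match.group(2))


def is_supported_tf_version(version, max_supported=(1, 15)):
    """True if the version is at or below the notebook's stated ceiling."""
    return parse_version(version) <= max_supported


# In the notebook this would be called with tf.__version__.
for candidate in ("1.15.3", "1.14.0", "2.4.1"):
    print(candidate, is_supported_tf_version(candidate))
```

Calling `is_supported_tf_version(tf.__version__)` before continuing gives an early, explicit signal instead of a framework error later on.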

Install a version of h5py that is compatible with your Keras model (see the issue referenced in the cell below), then import the necessary Python packages.

In [4]:
# ref: https://github.com/keras-team/keras/issues/14265
!pip install "h5py==2.10.0"
import h5py
import numpy as np



### Step 2. Load the Keras model using the json and weights file

If you saved your model in the TensorFlow ProtoBuf format, skip to Step 4 (converting the TensorFlow model to a SageMaker-readable format).

Create a directory called ``keras_model``, download the [hosted Keras model](https://s3.amazonaws.com/aws-ml-blog/artifacts/keras-tensorflow-model-deployment/model.zip), and unzip the model.json and model-weights.h5 files into ``keras_model``.

In [5]:
!mkdir keras_model

In [6]:
!wget https://s3.amazonaws.com/aws-ml-blog/artifacts/keras-tensorflow-model-deployment/model.zip

--2021-03-06 05:08:58--  https://s3.amazonaws.com/aws-ml-blog/artifacts/keras-tensorflow-model-deployment/model.zip
Resolving s3.amazonaws.com (s3.amazonaws.com)... 52.217.108.14
Connecting to s3.amazonaws.com (s3.amazonaws.com)|52.217.108.14|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 30266 (30K) [application/zip]
Saving to: ‘model.zip’


2021-03-06 05:08:59 (28.6 MB/s) - ‘model.zip’ saved [30266/30266]



In [7]:
!unzip model.zip -d keras_model

Archive:  model.zip
  inflating: keras_model/model-weights.h5  
  inflating: keras_model/model.json  
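
If shell commands are not available (for example, when running outside a notebook), the extraction step can be done with the standard library. This is a sketch under the assumption that `model.zip` has already been downloaded to the working directory; the helper names are hypothetical:

```python
import os
import zipfile


def extract_model_archive(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the sorted member names."""
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return sorted(archive.namelist())


def missing_model_files(dest_dir, expected=("model.json", "model-weights.h5")):
    """Return the expected files that are absent from dest_dir."""
    return [name for name in expected
            if not os.path.isfile(os.path.join(dest_dir, name))]


# Only runs when the archive is present, so the sketch is safe to execute anywhere.
if os.path.exists("model.zip"):
    print(extract_model_archive("model.zip", "keras_model"))
    print("missing:", missing_model_files("keras_model"))
```

An empty `missing` list indicates that both files needed in the next cells are in place.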


In [None]:
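import os
import tensorflow as tf
# Import from tf.keras so the model is built by TensorFlow's bundled Keras,
# rather than mixing it with the standalone keras package
from tensorflow.keras.models import model_from_json

# Rebuild the model architecture from its JSON description
with open(os.path.join('keras_model', 'model.json'), 'r') as fp:
    loaded_model_json = fp.read()
loaded_model = model_from_json(loaded_model_json)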

In [None]:
loaded_model.load_weights('keras_model/model-weights.h5')
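
The architecture file is plain JSON, so it can be inspected without TensorFlow installed. The sketch below assumes the usual Keras layout, where each layer entry carries a `class_name`; the function name `summarize_architecture` and the two-layer example are hypothetical and are not the contents of the hosted model:

```python
import json


def summarize_architecture(model_json):
    """Return (model class name, [layer class names]) from a Keras architecture JSON string."""
    spec = json.loads(model_json)
    config = spec.get("config", [])
    # Assumption: newer Keras versions store layers under config["layers"],
    # while older Sequential exports store the layer list directly in "config".
    layers = config.get("layers", []) if isinstance(config, dict) else config
    return spec.get("class_name"), [layer.get("class_name") for layer in layers]


# Hypothetical two-layer example; the real file is keras_model/model.json.
example = json.dumps({
    "class_name": "Sequential",
    "config": {"layers": [
        {"class_name": "Dense", "config": {"units": 8}},
        {"class_name": "Dense", "config": {"units": 1}},
    ]},
})
print(summarize_architecture(example))
```

Passing the string read from `keras_model/model.json` instead of `example` lists the layer types of the downloaded model.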

### Step 3. Export the Keras model to the TensorFlow ProtoBuf format

In [10]:
from tensorflow.python.saved_model import builder
from tensorflow.python.saved_model.signature_def_utils import predict_signature_def
from tensorflow.python.saved_model import tag_constants

In [14]:
# Note: This directory structure will need to be followed - see notes for the next section
model_version = '1'
export_dir = 'export/Servo/' + model_version

In [15]:
# Build the Protocol Buffer SavedModel at 'export_dir'
builder = builder.SavedModelBuilder(export_dir)

In [16]:
# Create prediction signature to be used by TensorFlow Serving Predict API
signature = predict_signature_def(
    inputs={"inputs": loaded_model.input}, outputs={"score": loaded_model.output})

Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.


In [17]:
# Reuse the Keras backend session, which holds the weights loaded in Step 2.
# A fresh Session followed by global_variables_initializer() would overwrite
# them with newly initialized values.
session = tf.compat.v1.keras.backend.get_session()
# Save the meta graph and variables
builder.add_meta_graph_and_variables(
    sess=session, tags=[tag_constants.SERVING], signature_def_map={"serving_default": signature})
builder.save()

INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: export/Servo/1/saved_model.pb


b'export/Servo/1/saved_model.pb'

### Step 4. Convert the TensorFlow model to a SageMaker-readable format

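SageMaker expects the exported TensorFlow model under ``export/Servo/<version>/`` and will recognize that layout as a loadable TensorFlow model. Step 3 already wrote the model to ``export/Servo/1/``, which should now contain ``saved_model.pb`` and a ``variables`` directory. The cells below point at that directory and inspect the model's signature.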

In [18]:
model_path = 'export/Servo/1/'

In [19]:
!saved_model_cli show --all --dir {model_path}


MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['inputs'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 50)
        name: dense_1_input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['score'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: dense_7/Sigmoid:0
  Method name is: tensorflow/serving/predict
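
Alongside the signature check, the on-disk layout can be verified before packaging. This is a minimal sketch that assumes the `export/Servo/<version>/saved_model.pb` layout produced in Step 3; the helper names `find_saved_models` and `print_tree` are hypothetical:

```python
import os


def find_saved_models(export_root="export/Servo"):
    """Return the version directories under export_root that contain saved_model.pb."""
    if not os.path.isdir(export_root):
        return []
    versions = []
    for name in sorted(os.listdir(export_root)):
        version_dir = os.path.join(export_root, name)
        if os.path.isfile(os.path.join(version_dir, "saved_model.pb")):
            versions.append(name)
    return versions


def print_tree(root):
    """Print every directory and file under root, indented by depth."""
    for current, dirs, files in os.walk(root):
        dirs.sort()  # visit subdirectories in a stable order
        depth = 0 if current == root else os.path.relpath(current, root).count(os.sep) + 1
        print("  " * depth + os.path.basename(current.rstrip(os.sep)) + "/")
        for filename in sorted(files):
            print("  " * (depth + 1) + filename)


print_tree("export")
print("servable versions:", find_saved_models())
```

An empty list of servable versions suggests the export step did not complete or wrote to a different directory.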


#### Tar the entire directory and upload to S3

In [24]:
import tarfile
model_archive = 'model.tar.gz'
with tarfile.open(model_archive, mode='w:gz') as archive:
    archive.add('export', recursive=True)
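
Before uploading, it can be worth confirming that the archive keeps the `export/Servo/<version>/` prefix, since the layout is what the serving container relies on. A sketch using only the standard library; the helper names are hypothetical:

```python
import os
import tarfile


def list_archive(archive_path):
    """Return the member names stored in a (possibly compressed) tar archive."""
    with tarfile.open(archive_path, mode="r:*") as archive:
        return archive.getnames()


def has_servable_layout(names, prefix="export/Servo/"):
    """True if some member is exactly <prefix><version>/saved_model.pb."""
    suffix = "/saved_model.pb"
    for name in names:
        if name.startswith(prefix) and name.endswith(suffix):
            version = name[len(prefix):-len(suffix)]
            # The version must be a single, non-empty path component.
            if version and "/" not in version:
                return True
    return False


# Only runs when the archive from the cell above exists.
if os.path.exists("model.tar.gz"):
    members = list_archive("model.tar.gz")
    print(members)
    print("servable layout:", has_servable_layout(members))
```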

In [25]:
model_data = sess.upload_data(path=model_archive, key_prefix='model')

In [26]:
model_data

's3://sagemaker-us-east-1-932174941916/model/model.tar.gz'

### Step 5. Deploy the trained model

In [27]:
from sagemaker.tensorflow.serving import Model
instance_type = 'ml.c5.xlarge'

In [28]:
sm_model = Model(model_data=model_data, framework_version=tf_framework_version, role=role)
uncompiled_predictor = sm_model.deploy(initial_instance_count=1, instance_type=instance_type)

The class sagemaker.tensorflow.serving.Model has been renamed in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.
update_endpoint is a no-op in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.


-----------!
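
The introduction mentions autoscaling, which the cells in this notebook do not configure. The following is a minimal sketch of one way to attach a target-tracking policy through boto3's Application Auto Scaling client. It assumes the endpoint uses the SDK's default variant name `AllTraffic`; the capacity limits, target value, policy name, and helper names are illustrative placeholders rather than recommendations:

```python
def build_autoscaling_requests(endpoint_name, variant_name="AllTraffic",
                               min_instances=1, max_instances=2,
                               invocations_per_instance=100.0):
    """Build keyword arguments for register_scalable_target and put_scaling_policy."""
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": "endpoint/{}/variant/{}".format(endpoint_name, variant_name),
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    target = dict(common, MinCapacity=min_instances, MaxCapacity=max_instances)
    policy = dict(
        common,
        PolicyName="{}-invocations-target".format(endpoint_name),  # hypothetical name
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
            },
        },
    )
    return target, policy


def apply_autoscaling(endpoint_name, **kwargs):
    """Register the endpoint variant as a scalable target and attach the policy."""
    import boto3  # imported lazily so the builder above has no dependencies

    client = boto3.client("application-autoscaling")
    target, policy = build_autoscaling_requests(endpoint_name, **kwargs)
    client.register_scalable_target(**target)
    client.put_scaling_policy(**policy)


target, policy = build_autoscaling_requests("my-endpoint")  # placeholder endpoint name
print(target["ResourceId"])
```

With version 2 of the SageMaker Python SDK, the endpoint name should be available as `uncompiled_predictor.endpoint_name`. Keeping the request construction separate from the API call makes the configuration easy to review before anything is changed in the account.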

### Step 6. Invoke the endpoint

#### Invoke the SageMaker endpoint from the notebook

In [31]:
# The sample model expects an input of shape [1,50]
data = np.random.randn(1, 50)
data.shape

(1, 50)

In [None]:
uncompiled_predictor.predict(data)
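
Applications outside the notebook typically do not have the predictor object and call the endpoint through the SageMaker runtime API instead. The sketch below assumes the endpoint accepts TensorFlow Serving's JSON row format (`{"instances": [...]}`) with content type `application/json`; the function names are hypothetical, and the endpoint name must be supplied by the caller:

```python
import json


def build_predict_body(rows):
    """Serialize rows (a list of feature lists) in TensorFlow Serving's row format."""
    return json.dumps({"instances": [[float(value) for value in row] for row in rows]})


def invoke(endpoint_name, rows, region_name=None):
    """Send rows to a SageMaker endpoint and return the decoded JSON response."""
    import boto3  # imported lazily so build_predict_body works without boto3

    runtime = boto3.client("sagemaker-runtime", region_name=region_name)
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_predict_body(rows),
    )
    return json.loads(response["Body"].read().decode("utf-8"))


# One row of 50 features, matching the (-1, 50) input of the sample model.
body = build_predict_body([[0.0] * 50])
print(len(json.loads(body)["instances"][0]))
```

In the notebook, `data.tolist()` converts the NumPy array into the nested list that `invoke` expects.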

#### Compile model using SageMaker Neo

[SageMaker Neo](https://aws.amazon.com/sagemaker/neo/) compiles pre-trained TensorFlow models and builds an inference-optimized container, without the need for any custom model serving or inference code.

In [41]:
instance_family = 'ml_c5'
framework = 'tensorflow'
compilation_job_name = 'keras-compile'
# output path for compiled model artifact
compiled_model_path = 's3://{}/{}/output'.format(bucket, compilation_job_name)
# Neo needs a fixed input shape; the signature input 'inputs' has shape (-1, 50), so use [1, 50]
data_shape = {'inputs': [data.shape[0], data.shape[1]]}

In [42]:
optimized_estimator = sm_model.compile(target_instance_family=instance_family,
                                         input_shape=data_shape,
                                         job_name=compilation_job_name,
                                         role=role,
                                         framework=framework,
                                         framework_version=tf_framework_version,
                                         output_path=compiled_model_path
                                        )

?................................................!

In [45]:
optimized_predictor = optimized_estimator.deploy(initial_instance_count=1, instance_type=instance_type)

update_endpoint is a no-op in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.


-----------!

#### Invoke optimized SageMaker endpoint

In [None]:
optimized_predictor.predict(data)
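
To see whether compilation changes latency for this model, both endpoints can be timed with the same payload. This is a rough client-side sketch, so the numbers include network overhead; the helper name `measure_latency_ms` is hypothetical and the percentile is an approximation based on sorted samples:

```python
import statistics
import time


def measure_latency_ms(predict_fn, payload, warmup=2, repeats=20):
    """Time repeated calls to predict_fn(payload) and summarize them in milliseconds."""
    for _ in range(warmup):
        predict_fn(payload)  # warm-up calls are not recorded
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict_fn(payload)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    p90_index = min(len(samples) - 1, int(0.9 * len(samples)))
    return {
        "median_ms": statistics.median(samples),
        "p90_ms": samples[p90_index],
        "max_ms": samples[-1],
    }


# Stand-in callable so the sketch runs without an endpoint.
print(measure_latency_ms(lambda payload: sum(payload), [0.0] * 50))
```

In the notebook, passing `uncompiled_predictor.predict` and then `optimized_predictor.predict` with `data` gives two summaries that can be compared directly.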

### Step 7. Clean up

To avoid incurring charges to your AWS account for the resources used in this tutorial, delete both SageMaker endpoints created above: the uncompiled one and the Neo-optimized one.

In [None]:
uncompiled_predictor.delete_endpoint()

In [None]:
optimized_predictor.delete_endpoint()

### Conclusion

In this notebook, we demonstrated converting a Keras model to the TensorFlow SavedModel format, deploying the trained model to a SageMaker endpoint, and compiling the same model with SageMaker Neo to improve inference latency. Using Amazon SageMaker, you can take a trained model and, in a few lines of code, have a scalable, managed inference deployment. This gives you the flexibility to keep your existing model training workflows while deploying trained models to production with the benefits and optimizations offered by a managed platform.