# Attempting to Serve with Standard SageMaker TensorFlow Support and Script Mode

This notebook tries the simplest way to deploy a pre-built model from [TFHub](https://www.tensorflow.org/hub) to SageMaker. All it requires is the model archive on S3.

To make sure we have the right model, we first load it locally and check that inference works. That requires updating or installing a few packages.

In [1]:
!pip install -U "tensorflow-gpu>=2.2.0" "tensorflow-hub>=0.8.0" "tensorflow-text==2.2.0"

You should consider upgrading via the 'pip install --upgrade pip' command.


In [1]:
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text
import numpy as np
from sagemaker.tensorflow.serving import Model as TFSModel
import sagemaker as sm
import json

In [2]:
print(tf.__version__)

2.2.0


In [3]:
tf.config.list_physical_devices('GPU')

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

## Retrieving the Model

In [4]:
bucket = sm.session.Session().default_bucket()
print(f"Default bucket: {bucket}")

Default bucket: sagemaker-eu-west-1-113147044314


In [5]:
MUSE_VERSION = 2
MUSE_BASE_URL = f"https://tfhub.dev/google/universal-sentence-encoder-multilingual-large/{MUSE_VERSION}"
muse_url = f"{MUSE_BASE_URL}\\?tf-hub-format=compressed"  # the backslash escapes '?' for the shell in the curl call
model_s3_path = f's3://{bucket}/MUSE/large/{MUSE_VERSION:0>6d}/model.tar.gz'
local_model_path = f"../../models/MUSE/large/{MUSE_VERSION:0>6d}"
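The `{MUSE_VERSION:0>6d}` format spec zero-pads the version number to six digits, which gives the numbered version directory that TensorFlow Serving looks for under a model's base path. A minimal check of what the paths expand to, assuming a placeholder bucket name (`example-bucket` is not the real default bucket):

```python
MUSE_VERSION = 2
bucket = "example-bucket"  # placeholder; the notebook uses the session's default bucket

# ":0>6d" right-aligns the integer in a field of width 6, padded with zeros
version_dir = f"{MUSE_VERSION:0>6d}"
model_s3_path = f"s3://{bucket}/MUSE/large/{version_dir}/model.tar.gz"
local_model_path = f"../../models/MUSE/large/{version_dir}"

print(version_dir)
print(model_s3_path)
print(local_model_path)
```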

In [13]:
!rm -rf {local_model_path}
!mkdir -p {local_model_path}
!curl -L {muse_url} | tar -zxvC {local_model_path}
!tar -czf /tmp/model.tar.gz -C {"/".join(local_model_path.split("/")[:-1])} .
!ls -la /tmp/*.tar.gz
!aws s3 cp /tmp/model.tar.gz {model_s3_path}

./
./assets/
./variables/
./variables/variables.index
./variables/variables.data-00000-of-00001
./saved_model.pb
-rw-rw-r-- 1 ec2-user ec2-user 635790625 Jul  6 11:07 /tmp/model.tar.gz
upload: ../../../../../../tmp/model.tar.gz to s3://sagemaker-eu-west-1-113147044314/MUSE/large/000002/model.tar.gz
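The archive has to contain the numbered version directory itself, not just its contents, which is why the `tar` call above runs from the parent directory. The same packaging can be sketched in pure Python with the standard `tarfile` module. `make_model_archive` is a hypothetical helper, and unlike the shell command it packs only the one version directory it is given rather than everything in the parent:

```python
import os
import tarfile
import tempfile


def make_model_archive(version_dir, archive_path):
    """Pack `version_dir` so its own name is the top-level entry of the archive."""
    version_dir = os.path.abspath(version_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname drops the leading path, leaving e.g. "000002/saved_model.pb"
        tar.add(version_dir, arcname=os.path.basename(version_dir))
    return archive_path


# Demonstration on a throwaway directory that mimics the SavedModel layout
with tempfile.TemporaryDirectory() as tmp:
    model_dir = os.path.join(tmp, "000002")
    os.makedirs(os.path.join(model_dir, "variables"))
    open(os.path.join(model_dir, "saved_model.pb"), "wb").close()

    archive = make_model_archive(model_dir, os.path.join(tmp, "model.tar.gz"))
    with tarfile.open(archive) as tar:
        names = sorted(tar.getnames())

print(names)
```

Listing the archive members before uploading is a cheap way to catch a missing or misnamed version directory.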


In [10]:
%%writefile modelscript_tensorflow.py
import json

import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # imported for its side effect: registers the custom ops MUSE uses


def load_model(modelpath):
    """Load the TF Hub SavedModel stored at `modelpath`."""
    return hub.load(modelpath)


def predict(model, payload):
    """Embed the text(s) in `payload` and return the result as a JSON string.

    `payload` may be a JSON object with an `instances` list, a JSON list of
    strings, or a plain (non-JSON) string. Bytes are decoded first.
    """
    if not isinstance(payload, str):
        payload = payload.decode()
    try:
        try:
            parsed = json.loads(payload)  # parse once instead of once per check
        except json.JSONDecodeError:
            parsed = None  # not JSON: treat the payload as a single raw string
        if isinstance(parsed, dict):
            # No `instances` field: fall back to embedding the raw payload
            data = parsed.get('instances', [payload])
        elif isinstance(parsed, list):
            data = parsed  # the decoded list, not the raw JSON string
        else:
            data = [payload]
        result = model(data)['outputs'].numpy()
        out = result.tolist()
    except Exception as e:
        # Errors are reported in the response body rather than raised
        out = str(e)
    return json.dumps({'output': out})

Overwriting modelscript_tensorflow.py


In [11]:
inputs = ['The quick brown fox jumped over the lazy dog.', 'This is a test']
inputs_json = json.dumps({'instances': inputs})
inputs_json_list = json.dumps(inputs)
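These two payloads cover two of the three shapes `predict` is meant to accept: a JSON object with an `instances` field, a bare JSON list, and a plain string. The parsing step can be checked on its own, without loading the model. The sketch below uses a hypothetical helper, `extract_texts`, that mirrors only that step:

```python
import json


def extract_texts(payload):
    """Return the list of strings to embed (hypothetical stand-alone helper)."""
    if isinstance(payload, bytes):
        payload = payload.decode()
    try:
        parsed = json.loads(payload)
    except json.JSONDecodeError:
        return [payload]  # not JSON: a single raw string
    if isinstance(parsed, dict):
        return parsed.get("instances", [payload])
    if isinstance(parsed, list):
        return parsed
    return [payload]  # JSON scalar: fall back to the raw payload


from_object = extract_texts('{"instances": ["a", "b"]}')
from_list = extract_texts('["a", "b"]')
from_text = extract_texts(b"just some text")

print(from_object)
print(from_list)
print(from_text)
```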

## Testing Local Inference

In [12]:
%load_ext autoreload
%autoreload 2
from modelscript_tensorflow import load_model, predict
local_model = load_model(local_model_path)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [13]:
local_model(['The quick brown fox jumped over the lazy dog.', 'This is a test'])['outputs'].numpy().shape

(2, 512)

In [14]:
predict(local_model, inputs_json)

'{"output": [[-0.011378916911780834, 0.004917477257549763, 0.0777159109711647, 0.012036170810461044, -0.08073007315397263, -0.04827744886279106, -0.020259153097867966, -0.04201997071504593, 0.06365488469600677, -0.03135908395051956, 0.025256173685193062, 0.06291830539703369, 0.009275468997657299, 0.07565078884363174, -0.01695311814546585, -0.03825325146317482, -0.036574117839336395, -0.0279514379799366, -0.10248785465955734, 0.00045519130071625113, 0.034608956426382065, -0.07623744755983353, 0.03754916787147522, 0.001743260771036148, 0.05025285854935646, 0.07515142858028412, 0.0037855070549994707, -0.036492969840765, 0.01126859337091446, -0.006898592226207256, 0.06939531862735748, -0.0020057924557477236, 0.0697748139500618, 0.03602251037955284, -0.07868615537881851, 0.04386170208454132, 0.06253548711538315, -0.09464975446462631, 0.0235211830586195, -0.01700165495276451, -0.011433626525104046, -0.038941990584135056, 0.07634219527244568, -0.02611914835870266, -0.02709721028804779, -0.017

In [35]:
messages = json.dumps({'instances':[
    "Brian Cosgrove's classic introduction to the world of microlight flying has endeared itself to several generations of pilots.",
    "BECAUSE NOT ALL KRAV MAGA IS THE SAME(R) This book is designed for krav maga trainees, security-conscious civilians, law enforcement officers, security professionals, and military personnel alike who wish to refine their essential krav maga combatives, improve their chances of surviving a hostile attack and prevail without serious injury. Combatives are the foundation of krav maga counter-attacks. These are the combatives of the original Israeli Krav Maga Association (Grandmaster Gidon). It is irrefutable that you need only learn a few core combatives to be an effective fighter. Simple is easy. Easy is effective. Effective is what is required to end a violent encounter quickly, decisively, and on your terms. This book stresses doing the right things and doing them in the right way. Right technique + Correct execution = Maximum Effect. Contents include Key strategies for achieving maximum combative effects Krav maga's 12 most effective combatives Developing power and balance Combatives for the upper and lower body Combative combinations and retzev (continuous combat motion) Combatives for takedowns and throws Combatives for armbars, leglocks, and chokes Whatever your martial arts or defensive tactics background or if you have no self-defense background at all, this book can add defensive combatives and combinations to your defensive repertoire. Our aim is to build a strong self-defense foundation through the ability to optimally counter-attack.",
    """-AWESOME FACTS ABOUT THE RUGBY WORLD CUP: I have intentionally selected a specific range of "Rugby World Cup" facts that I feel will not only help children to learn new information but more importantly, remember it. -FUN LEARNING TOOL FOR ALL AGES: This book is designed to capture the imagination of everyone through the use of "WoW" trivia, cool photos and memory recall quiz. -COOL & COLORFUL PICTURES: Each page contains a quality image relating to the subject in question. This helps the reader to match and recall the content. -SHORT QUIZ GAME - POSITIVE REINFORCEMENT: No matter what the score is, everyone's a WINNER! The purpose of the short quiz at the end is to help check understanding, to cement the information and to provide a positive conclusion, regardless of the outcome. Your search for the best "Rugby Union" book is finally over. When you purchase from me today, here are just some of the things you can look forward to..... Amazing and extraordinary "Rugby World Cup" facts. This kind of trivia seems to be one of the few things my memory can actually recall. I'm not sure if it's to do with the shock or the "WoW" factor but for some reason my brain seems to store at least some of it for a later date. A fun way of learning. I've always been a great believer in that whatever the subject, if a good teacher can inspire you and hold your attention, then you'll learn! Now I'm not a teacher but the system I've used in previous publications on Kindle seems to work well, particularly with children. A specific selection of those "WoW" facts combined with some pretty awesome pictures, if I say so myself! Words and images combined to stimulate the brain and absorb the reader using an interactive formula. At the end there is a short "True or False" quiz to check memory recall. Don't worry though, it's a bit of fun but at the same time, it helps to check understanding. Remember, "Everyone's a Winner!" Enjoy ......... Matt."""
]})

## Generating a SageMaker Model

In [15]:
exec_role = sm.get_execution_role()

In [17]:
model = TFSModel(
    model_data=model_s3_path, role=exec_role, 
    entry_point='modelscript_tensorflow.py',
    source_dir='server-src',
    name=f'muse-large-{MUSE_VERSION:0>6d}',
    framework_version='2.1.0'
)

In [18]:
predictor = model.deploy(
    initial_instance_count=1, instance_type='ml.p3.2xlarge',
    accelerator_type=None, endpoint_name=f'muse-large-{MUSE_VERSION:0>6d}-v1',
    update_endpoint=False, tags=None, kms_key=None, wait=True, data_capture_config=None)

-------------!

According to the [TensorFlow Serving REST API](https://www.tensorflow.org/tfx/serving/api_rest), the request body should be a JSON object with an `instances` field, which in our case is a list of the text strings we want to embed.
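As a rough illustration of that row format, the request body wraps the inputs in `instances`, and a successful reply carries the results under `predictions`. The helper names below are hypothetical, and the response is a hand-written stand-in rather than real endpoint output:

```python
import json


def build_predict_request(texts):
    """Serialize texts into a TF Serving row-format request body."""
    return json.dumps({"instances": list(texts)})


def parse_predict_response(body):
    """Extract the per-instance results from a row-format response body."""
    return json.loads(body)["predictions"]


request_body = build_predict_request(
    ["The quick brown fox jumped over the lazy dog.", "This is a test"]
)
print(request_body)

# Stand-in reply with 2 values per text; real MUSE embeddings have 512 values each
fake_response = '{"predictions": [[0.1, 0.2], [0.3, 0.4]]}'
predictions = parse_predict_response(fake_response)
print(len(predictions), len(predictions[0]))
```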

In [19]:
predictor.predict(inputs_json)

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (404) from model with message "{ "error": "[_Derived_]{{function_node __inference_signature_wrapper_227768}} {{function_node __inference_signature_wrapper_227768}} Op type not registered \'SentencepieceOp\' in binary running on model.aws.local. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.\n\t [[{{node StatefulPartitionedCall}}]]\n\t [[StatefulPartitionedCall]]\n\t [[StatefulPartitionedCall/_3461]]" }". See https://eu-west-1.console.aws.amazon.com/cloudwatch/home?region=eu-west-1#logEventViewer:group=/aws/sagemaker/Endpoints/muse-large-000002-v1 in account 113147044314 for more information.

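Inference fails because MUSE depends on custom ops from `tensorflow-text` (here `SentencepieceOp`), and those ops are not registered in the TF Serving 2.1 binary that runs in the SageMaker container. At the time of writing, SageMaker did not yet support TF Serving 2.2, so we could not test whether the newer version fixes the problem.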
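When a deployment fails like this, the name of the missing op is buried in a long error string. A small diagnostic sketch, assuming the message keeps the `Op type not registered '<name>'` wording seen above (`unregistered_ops` is a hypothetical helper, and the sample text is a shortened copy of that error):

```python
import re


def unregistered_ops(error_text):
    """Return the sorted, de-duplicated op names reported as not registered."""
    return sorted(set(re.findall(r"Op type not registered '([^']+)'", error_text)))


sample_error = (
    "{{function_node __inference_signature_wrapper_227768}} "
    "Op type not registered 'SentencepieceOp' in binary running on model.aws.local. "
    "Make sure the Op and Kernel are registered in the binary running in this process."
)

missing = unregistered_ops(sample_error)
print(missing)
```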

We'll proceed with alternative approaches for deploying the model. But first, let's delete the endpoint.

In [None]:
predictor.delete_endpoint()