# Amazon SageMaker Multi-Model Endpoints using TensorFlow

## Prerequisites (Please Read Cell Below)

With Amazon SageMaker [multi-model endpoints](https://docs.aws.amazon.com/sagemaker/latest/dg/multi-model-endpoints.html), customers can create an endpoint that seamlessly hosts up to thousands of models. These endpoints are well suited to use cases where any one of a large number of models, which can be served from a common inference container, needs to be invokable on-demand and where it is acceptable for infrequently invoked models to incur some additional latency. For applications which require consistently low inference latency, a traditional endpoint is still the best choice.
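
To make the latency trade-off concrete, here is a purely conceptual sketch. It is not SageMaker's actual implementation: it is a toy host that loads a model the first time it is invoked and evicts the least recently used one once a memory limit is reached. The class name `ToyMultiModelHost`, the `capacity` parameter, and the "cold"/"warm" labels are all assumptions made for illustration.

```python
from collections import OrderedDict


class ToyMultiModelHost:
    """Toy stand-in for a multi-model host (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity      # max number of models kept in memory
        self.loaded = OrderedDict()   # model name -> placeholder for a loaded model

    def invoke(self, target_model):
        """Return 'warm' if the model was already loaded, otherwise 'cold'."""
        if target_model in self.loaded:
            self.loaded.move_to_end(target_model)  # mark as most recently used
            return "warm"
        if len(self.loaded) >= self.capacity:
            self.loaded.popitem(last=False)        # evict the least recently used model
        self.loaded[target_model] = object()       # pretend to download and load it
        return "cold"


host = ToyMultiModelHost(capacity=1)
calls = ["boston.tar.gz", "boston.tar.gz", "petrol.tar.gz", "boston.tar.gz"]
results = [host.invoke(name) for name in calls]
print(results)
```

A "cold" call stands for the extra latency an infrequently used model may incur, while a "warm" call is served from memory.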

To demonstrate how multi-model endpoints can be created and used, this notebook provides an example using models trained with the [SageMaker TensorFlow framework container](https://github.com/aws/sagemaker-tensorflow-serving-container/tree/dc1ccd1cb19114a0b357862aa2177e9d2a67fdf5). 

We'll train and deploy two different TensorFlow artificial neural network (ANN) models, one for the Boston Housing dataset and one for the [Petrol Consumption](https://www.kaggle.com/harinir/petrol-consumption) dataset. The first part of the notebook covers the Boston Housing steps; the second part repeats the same procedure for the Petrol dataset with its own training script and model.

For other MME use cases, you can also refer to:

- Segmented home value modelling examples with the [PyTorch framework](https://github.com/aws/amazon-sagemaker-examples/blob/master/advanced_functionality/multi_model_pytorch/pytorch_multi_model_endpoint.ipynb), [Scikit-Learn framework](https://github.com/aws/amazon-sagemaker-examples/tree/master/advanced_functionality/multi_model_sklearn_home_value), the [XGBoost pre-built algorithm](https://github.com/aws/amazon-sagemaker-examples/tree/master/advanced_functionality/multi_model_xgboost_home_value), and the [Linear Learner algorithm](https://github.com/aws/amazon-sagemaker-examples/tree/master/advanced_functionality/multi_model_linear_learner_home_value).
- An example with [MXNet](https://github.com/aws/amazon-sagemaker-examples/tree/master/advanced_functionality/multi_model_bring_your_own) and the corresponding [documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/build-multi-model-build-container.html) on how to use MME with your own custom containers.

**Kernel**: conda_tensorflow2_p36

## Imports

In [3]:
import tensorflow as tf
import pandas as pd
import numpy as np
import boto3
import sagemaker
import os
from sagemaker.tensorflow import TensorFlow
from sagemaker.inputs import TrainingInput
from sagemaker import get_execution_role
from sagemaker.tensorflow.serving import TensorFlowModel
from sagemaker.multidatamodel import MultiDataModel

## Role and S3 Buckets

In [4]:
role = get_execution_role()
session = boto3.Session()
sagemaker_session = sagemaker.Session()

s3 = session.resource('s3')
TF_FRAMEWORK_VERSION = '2.3.0'
BUCKET = sagemaker_session.default_bucket()  # reuse the session created above
PREFIX = 'regression-models'

# Boston Housing Model Training

## Boston Dataset Creation

In [5]:
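from sklearn import datasets  # Boston Housing
# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older release.
boston = datasets.load_boston()
X = pd.DataFrame(boston.data, columns=boston.feature_names)
y = pd.DataFrame(boston.target)
y.columns = ['TARGET']
df = pd.concat([X, y], axis=1)

# Split into train and test sets to write to local disk.
# Both slices share the boundary index 450 so that no row is dropped.
bostonTrain = df.iloc[:450, :]
bostonTest = df.iloc[450:, :]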
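
When splitting by position, both slices should share the same boundary index so that no row falls between them. The following is a minimal sketch of that check, using a plain Python list as a stand-in for the 506-row DataFrame; the helper `split_at` is hypothetical.

```python
def split_at(rows, boundary):
    """Split a sequence into two contiguous parts at `boundary`."""
    return rows[:boundary], rows[boundary:]


rows = list(range(506))  # stand-in for the 506 rows of the Boston dataset
train, test = split_at(rows, 450)

# Every row lands in exactly one of the two parts.
print(len(train), len(test), len(train) + len(test) == len(rows))
```

The same pattern carries over to pandas as `df.iloc[:k]` and `df.iloc[k:]`.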

In [6]:
DATASET_PATH = './Data/Boston'
os.makedirs(DATASET_PATH, exist_ok=True)

In [7]:
bostonTrain.to_csv('Data/Boston/train.csv', index=False)
bostonTest.to_csv('Data/Boston/test.csv', index=False)

In [8]:
!aws s3 cp {DATASET_PATH}/train.csv s3://{BUCKET}/{PREFIX}/BostonHousing/train/

upload: Data/Boston/train.csv to s3://sagemaker-us-east-1-474422712127/regression-models/BostonHousing/train/train.csv


## Create Training Inputs Boston Model

In [9]:
train_input = TrainingInput(s3_data=f's3://{BUCKET}/{PREFIX}/BostonHousing/train',content_type='csv')

In [10]:
inputs = {'train': train_input}
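
The key of the `inputs` dictionary names the data channel. Inside the training container, a channel called `train` is exposed through the `SM_CHANNEL_TRAIN` environment variable, and hyperparameters arrive as command-line arguments. The training scripts themselves are not shown in this notebook, so the following is only an assumed sketch of how such a script might parse its arguments; the function name `parse_args` and the default values are illustrative.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the arguments a script-mode training job typically receives."""
    parser = argparse.ArgumentParser()
    # Hyperparameters are passed on the command line, e.g. --epochs 50
    parser.add_argument("--epochs", type=int, default=10)
    # Directory where the trained model should be saved
    parser.add_argument("--model_dir", type=str,
                        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"))
    # Local path of the 'train' channel inside the container
    parser.add_argument("--train", type=str,
                        default=os.environ.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train"))
    args, _ = parser.parse_known_args(argv)  # ignore any extra arguments
    return args


args = parse_args(["--epochs", "50", "--model_dir", "/opt/ml/model"])
print(args.epochs, args.model_dir, args.train)
```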

## Boston Model Training

In [11]:
model_name = 'bostonhousing-ann'
hyperparameters = {'epochs': 50}
estimator_parameters = {'source_dir':"Scripts",
                        'entry_point':'boston.py',
                        'instance_type': 'ml.m5.2xlarge',
                        'instance_count': 1,
                        'model_dir': '/opt/ml/model',
                        'role': role,
                        'hyperparameters': hyperparameters,
                        'output_path': f's3://{BUCKET}/{PREFIX}/BostonHousing/out',
                        'base_job_name': f'mme-cv-{model_name}',
                        'framework_version': TF_FRAMEWORK_VERSION,
                        'py_version': 'py37',
                        'script_mode': True}
estimator_1 = TensorFlow(**estimator_parameters)
estimator_1.fit(inputs)

2021-08-23 01:53:23 Starting - Starting the training job...
2021-08-23 01:53:25 Starting - Launching requested ML instancesProfilerReport-1629683602: InProgress
...
2021-08-23 01:54:18 Starting - Preparing the instances for training......
2021-08-23 01:55:23 Downloading - Downloading input data...
2021-08-23 01:55:47 Training - Downloading the training image.....
2021-08-23 01:56:30,242 sagemaker-training-toolkit INFO     Imported framework sagemaker_tensorflow_container.training
2021-08-23 01:56:30,249 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
2021-08-23 01:56:30,590 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
2021-08-23 01:56:30,605 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
2021-08-23 01:56:30,619 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
2021-08-23 01:56:30,629 sagemaker-trai

## Create Boston Model

In [12]:
model_1 = estimator_1.create_model(role=role, source_dir="Scripts", entry_point="inference.py")
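
The `inference.py` script referenced here is not included in the notebook. Assuming it follows the `input_handler` / `output_handler` convention of the SageMaker TensorFlow Serving container, a minimal JSON pass-through version might look like the sketch below. The stand-in objects at the bottom exist only so the handlers can be exercised offline; in the container, the real request, response, and context objects are supplied by the serving stack.

```python
import io
from types import SimpleNamespace


def input_handler(data, context):
    """Pre-process the request before it is forwarded to TensorFlow Serving."""
    if context.request_content_type == "application/json":
        body = data.read().decode("utf-8")
        return body if len(body) else ""
    raise ValueError(f"Unsupported content type: {context.request_content_type}")


def output_handler(response, context):
    """Post-process the TensorFlow Serving response before it is returned."""
    if response.status_code != 200:
        raise ValueError(response.content.decode("utf-8"))
    return response.content, context.accept_header


# Offline check with stand-in objects (illustration only)
ctx = SimpleNamespace(request_content_type="application/json",
                      accept_header="application/json")
request_body = input_handler(io.BytesIO(b'{"inputs": [[1.0, 2.0]]}'), ctx)
fake_response = SimpleNamespace(status_code=200, content=b'{"outputs": [[3.0]]}')
result = output_handler(fake_response, ctx)
print(request_body, result)
```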

## Create Boston Endpoint

In [13]:
predictorA = model_1.deploy(
    initial_instance_count=1,
    instance_type="ml.c5.xlarge",
)

update_endpoint is a no-op in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.


----!

## Test Boston Endpoint Individually

Test each single-model endpoint individually first, to confirm that both models serve predictions before they are combined behind a multi-model endpoint.

In [14]:
predictorA.serializer = sagemaker.serializers.JSONSerializer()
predictorA.deserializer = sagemaker.deserializers.JSONDeserializer()

In [15]:
# Grab a sample row to send to the endpoint (read from the training CSV)
test = pd.read_csv('/home/ec2-user/SageMaker/TensorFlow-MME/Data/Boston/train.csv')
test[:1]
testX = test.drop("TARGET", axis=1)
testX = testX[:1].values.tolist()
sampInput = {"inputs": testX}
sampInput

{'inputs': [[0.00632,
   18.0,
   2.31,
   0.0,
   0.5379999999999999,
   6.575,
   65.2,
   4.09,
   1.0,
   296.0,
   15.3,
   396.9,
   4.98]]}

In [16]:
predicted_value = predictorA.predict(sampInput)
predicted_value

{'outputs': [[27.7003841]]}
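
The request body is a JSON object whose `inputs` key holds a list of feature rows, and the response carries one prediction per row under `outputs`. The following is a small sketch of building and unpacking these payloads with the standard library. The helper names are hypothetical, the feature row is shortened for illustration, and the response string is copied from the output above.

```python
import json


def build_payload(rows):
    """Serialize a list of feature rows into the {"inputs": ...} request body."""
    return json.dumps({"inputs": rows})


def parse_predictions(response_body):
    """Flatten {"outputs": [[y1], [y2], ...]} into [y1, y2, ...]."""
    outputs = json.loads(response_body)["outputs"]
    return [row[0] for row in outputs]


payload = build_payload([[0.00632, 18.0, 2.31]])  # shortened row, for illustration
predictions = parse_predictions('{"outputs": [[27.7003841]]}')
print(payload)
print(predictions)
```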

## Boto3 Invocation

In [269]:
import boto3
import json
from sagemaker.serializers import JSONSerializer

# Invoke the endpoint through the low-level SageMaker runtime client
runtime_sm_client = boto3.client(service_name='sagemaker-runtime')
endpoint_name = predictorA.endpoint_name
jsons = JSONSerializer()
payload = jsons.serialize(sampInput)
response = runtime_sm_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='application/json',
        Body=payload)
# The response body is a stream; read and decode it before parsing the JSON
result = json.loads(response['Body'].read().decode())['outputs']
result

[[26.1025639], [26.1895218]]

## Delete Endpoint

In [None]:
predictorA.delete_endpoint(delete_endpoint_config=True)

# Petrol Consumption Dataset Training

Repeat the same process used for the Boston Housing model.
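
Because the second training job reuses the Boston configuration with only the script name, model name, and S3 paths changed, the shared settings could be factored into a helper. The sketch below assumes a hypothetical function `make_estimator_params`; the bucket and role values are placeholders, not real resources.

```python
def make_estimator_params(model_name, entry_point, dataset_dir, bucket, prefix,
                          role, framework_version="2.3.0", epochs=50):
    """Build the keyword arguments passed to the TensorFlow estimator."""
    return {
        "source_dir": "Scripts",
        "entry_point": entry_point,
        "instance_type": "ml.m5.2xlarge",
        "instance_count": 1,
        "model_dir": "/opt/ml/model",
        "role": role,
        "hyperparameters": {"epochs": epochs},
        "output_path": f"s3://{bucket}/{prefix}/{dataset_dir}/out",
        "base_job_name": f"mme-cv-{model_name}",
        "framework_version": framework_version,
        "py_version": "py37",
        "script_mode": True,
    }


# Placeholder values for illustration only
common = {"bucket": "example-bucket", "prefix": "regression-models", "role": "example-role"}
boston_params = make_estimator_params("bostonhousing-ann", "boston.py", "BostonHousing", **common)
petrol_params = make_estimator_params("petrol-ann", "petrol.py", "Petrol", **common)
print(petrol_params["output_path"])
```

Each dictionary could then be unpacked into the estimator, as in `TensorFlow(**petrol_params)`.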

In [18]:
petrolDF = pd.read_csv("petrol_consumption.csv")
# Both slices share the boundary index 35 so that no row is dropped
petrolTrain = petrolDF.iloc[:35, :]
petrolTest = petrolDF.iloc[35:, :]
DATASET_PATH = './Data/Petrol'
os.makedirs(DATASET_PATH, exist_ok=True)
petrolTrain.to_csv('Data/Petrol/train.csv', index=False)
petrolTest.to_csv('Data/Petrol/test.csv', index=False)

In [19]:
!aws s3 cp {DATASET_PATH}/train.csv s3://{BUCKET}/{PREFIX}/Petrol/train/

upload: Data/Petrol/train.csv to s3://sagemaker-us-east-1-474422712127/regression-models/Petrol/train/train.csv


In [20]:
train_input = TrainingInput(s3_data=f's3://{BUCKET}/{PREFIX}/Petrol/train',content_type='csv')

In [21]:
inputs = {'train': train_input}

## Petrol Model Training

In [22]:
model_name = 'petrol-ann'
hyperparameters = {'epochs': 50}
estimator_parameters = {'source_dir':"Scripts",
                        'entry_point':'petrol.py',
                        'instance_type': 'ml.m5.2xlarge',
                        'instance_count': 1,
                        'model_dir': '/opt/ml/model',
                        'role': role,
                        'hyperparameters': hyperparameters,
                        'output_path': f's3://{BUCKET}/{PREFIX}/Petrol/out',
                        'base_job_name': f'mme-cv-{model_name}',
                        'framework_version': TF_FRAMEWORK_VERSION,
                        'py_version': 'py37',
                        'script_mode': True}
estimator_2 = TensorFlow(**estimator_parameters)
estimator_2.fit(inputs)

2021-08-23 02:04:00 Starting - Starting the training job...
2021-08-23 02:04:23 Starting - Launching requested ML instancesProfilerReport-1629684240: InProgress
...
2021-08-23 02:04:56 Starting - Preparing the instances for training.........
2021-08-23 02:06:23 Downloading - Downloading input data...
2021-08-23 02:06:54 Training - Downloading the training image...
2021-08-23 02:07:25 Uploading - Uploading generated training model
2021-08-23 02:07:16,313 sagemaker-training-toolkit INFO     Imported framework sagemaker_tensorflow_container.training
2021-08-23 02:07:16,321 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
2021-08-23 02:07:16,903 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
2021-08-23 02:07:16,919 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
2021-08-23 02:07:16,934 sagemaker-training-toolkit INFO     No GPUs detected (normal i

## Create Petrol Model

In [25]:
model_2 = estimator_2.create_model(role=role, source_dir="Scripts", entry_point="inference.py")

## Petrol Endpoint Creation

In [26]:
predictorB = model_2.deploy(
    initial_instance_count=1,
    instance_type="ml.c5.xlarge",
)

update_endpoint is a no-op in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.


-----------!

## Test Petrol Endpoint

In [27]:
predictorB.serializer = sagemaker.serializers.JSONSerializer()
predictorB.deserializer = sagemaker.deserializers.JSONDeserializer()

In [28]:
test = pd.read_csv('/home/ec2-user/SageMaker/TensorFlow-MME/Data/Petrol/train.csv')
test[:1]

|   | Petrol_tax | Average_income | Paved_Highways | Population_Driver_licence(%) | Petrol_Consumption |
|---|---|---|---|---|---|
| 0 | 9.0 | 3571 | 1976 | 0.525 | 541 |


In [29]:
testX = test.drop("Petrol_Consumption", axis=1)
testX = testX[:1].values.tolist()
sampInput = {"inputs": testX}
sampInput

{'inputs': [[9.0, 3571.0, 1976.0, 0.525]]}

In [30]:
predicted_value = predictorB.predict(sampInput)
predicted_value

{'outputs': [[417.224182]]}

## Boto3 Petrol Endpoint Test

In [31]:
import boto3
import json
from sagemaker.serializers import JSONSerializer

# Invoke the Petrol endpoint through the low-level SageMaker runtime client
runtime_sm_client = boto3.client(service_name='sagemaker-runtime')
endpoint_name = predictorB.endpoint_name
jsons = JSONSerializer()
payload = jsons.serialize(sampInput)
response = runtime_sm_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='application/json',
        Body=payload)
# The response body is a stream; read and decode it before parsing the JSON
result = json.loads(response['Body'].read().decode())['outputs']
result

[[417.224182]]

## Delete Petrol Endpoint

In [32]:
predictorB.delete_endpoint(delete_endpoint_config=True)

# Multi Model Endpoint Creation

## MME S3 Model Path

In [33]:
tf_model_1 = estimator_1.model_data
output_1 = f's3://{BUCKET}/{PREFIX}/mme/boston.tar.gz'

tf_model_2 = estimator_2.model_data
output_2 = f's3://{BUCKET}/{PREFIX}/mme/petrol.tar.gz'

In [34]:
!aws s3 cp {tf_model_1} {output_1}
!aws s3 cp {tf_model_2} {output_2}

copy: s3://sagemaker-us-east-1-474422712127/regression-models/BostonHousing/out/mme-cv-bostonhousing-ann-2021-08-23-01-53-22-813/output/model.tar.gz to s3://sagemaker-us-east-1-474422712127/regression-models/mme/boston.tar.gz
copy: s3://sagemaker-us-east-1-474422712127/regression-models/Petrol/out/mme-cv-petrol-ann-2021-08-23-02-04-00-101/output/model.tar.gz to s3://sagemaker-us-east-1-474422712127/regression-models/mme/petrol.tar.gz


In [37]:
from datetime import datetime

# Timestamp used to give the MME model and endpoint unique names
current_time = datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
# S3 prefix under which the multi-model endpoint looks for model artifacts
model_data_prefix = f's3://{BUCKET}/{PREFIX}/mme/'

## Create Multi Data Model

Either trained model can be passed as the `model` argument (here `model_1`; `model_2` would work equally well), because all models behind a multi-model endpoint are served from the same shared container.

In [38]:
mme = MultiDataModel(name=f'mme-tensorflow-{current_time}',
                     model_data_prefix=model_data_prefix,
                     model=model_1,
                     sagemaker_session=sagemaker_session)

## Check for Model Artifacts in MME Model Location

In [39]:
list(mme.list_models())

['boston.tar.gz', 'petrol.tar.gz']
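
The names listed here are the artifact keys relative to `model_data_prefix`, and they are the values later passed as `TargetModel` when invoking the endpoint. The sketch below derives such a name from a full S3 URI; the helper `target_model_name` is hypothetical and the bucket is a placeholder.

```python
def target_model_name(artifact_uri, model_data_prefix):
    """Return the artifact's key relative to the multi-model prefix."""
    if not model_data_prefix.endswith("/"):
        model_data_prefix += "/"
    if not artifact_uri.startswith(model_data_prefix):
        raise ValueError(f"{artifact_uri} is not under {model_data_prefix}")
    return artifact_uri[len(model_data_prefix):]


prefix = "s3://example-bucket/regression-models/mme/"  # placeholder bucket
names = [target_model_name(prefix + key, prefix)
         for key in ("boston.tar.gz", "petrol.tar.gz")]
print(names)
```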

## Create MME

In [40]:
predictor = mme.deploy(initial_instance_count=1,
                       instance_type='ml.m5.2xlarge',
                       endpoint_name=f'mme-tensorflow-{current_time}')

-----------!

## Test MME Boston Model

In [41]:
test = pd.read_csv('/home/ec2-user/SageMaker/TensorFlow-MME/Data/Boston/train.csv')
test[:1]
testX = test.drop("TARGET", axis=1)
testX = testX[:2].values.tolist()
sampInput = {"inputs": testX}
sampInput

{'inputs': [[0.00632,
   18.0,
   2.31,
   0.0,
   0.5379999999999999,
   6.575,
   65.2,
   4.09,
   1.0,
   296.0,
   15.3,
   396.9,
   4.98],
  [0.02731,
   0.0,
   7.07,
   0.0,
   0.469,
   6.421,
   78.9,
   4.9671,
   2.0,
   242.0,
   17.8,
   396.9,
   9.14]]}

In [42]:
y_pred = predictor.predict(data=sampInput, initial_args={'TargetModel': 'boston.tar.gz'})
y_pred

{'outputs': [[27.7003841], [25.4000874]]}

## Test MME Petrol Model

In [43]:
test = pd.read_csv('/home/ec2-user/SageMaker/TensorFlow-MME/Data/Petrol/train.csv')
testX = test.drop("Petrol_Consumption", axis=1)
testX = testX[:2].values.tolist()
sampInput = {"inputs": testX}
sampInput

{'inputs': [[9.0, 3571.0, 1976.0, 0.525],
  [9.0, 4092.0, 1250.0, 0.5720000000000001]]}

In [44]:
y_pred = predictor.predict(data=sampInput, initial_args={'TargetModel': 'petrol.tar.gz'})
y_pred

{'outputs': [[417.224182], [375.122925]]}
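
Because both models sit behind one endpoint, the caller chooses the model on every request. One way to guard against sending rows to the wrong model is to check the feature count before invoking. In this sketch the registry `MODEL_SPECS` and the function `route_request` are hypothetical; the feature counts (13 for Boston, 4 for Petrol) match the sample inputs shown above.

```python
MODEL_SPECS = {
    "boston": {"target_model": "boston.tar.gz", "n_features": 13},
    "petrol": {"target_model": "petrol.tar.gz", "n_features": 4},
}


def route_request(dataset, rows):
    """Validate the rows for `dataset` and return (TargetModel, payload)."""
    spec = MODEL_SPECS[dataset]
    for row in rows:
        if len(row) != spec["n_features"]:
            raise ValueError(
                f"{dataset} expects {spec['n_features']} features, got {len(row)}"
            )
    return spec["target_model"], {"inputs": rows}


target_model, payload = route_request("petrol", [[9.0, 3571.0, 1976.0, 0.525]])
print(target_model, payload)
```

The returned pair could then be passed on as `predictor.predict(data=payload, initial_args={'TargetModel': target_model})`.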

## Boto3 MME Test

In [45]:
import boto3
import json
from sagemaker.serializers import JSONSerializer

runtime_sm_client = boto3.client(service_name='sagemaker-runtime')
endpoint_name = predictor.endpoint_name

# Petrol model: TargetModel selects which artifact under the MME prefix to invoke
target_model = "petrol.tar.gz"
jsons = JSONSerializer()
payload = jsons.serialize(sampInput)
response = runtime_sm_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='application/json',
        TargetModel=target_model,
        Body=payload)
result = json.loads(response['Body'].read().decode())['outputs']
result

[[417.224182], [375.122925]]
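
The boto3 cells in this notebook differ only in the endpoint name and the optional `TargetModel`, so the call can be wrapped in one function. This is a sketch under assumptions: `invoke_json` is a hypothetical helper, and `FakeRuntimeClient` is an offline stand-in that returns zeros so the example runs without AWS credentials. With a real client from `boto3.client('sagemaker-runtime')`, the same function would send the request to the endpoint.

```python
import io
import json


def invoke_json(client, endpoint_name, payload, target_model=None):
    """Send a JSON payload to an endpoint and return the decoded 'outputs'."""
    kwargs = {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "Body": json.dumps(payload),
    }
    if target_model is not None:
        kwargs["TargetModel"] = target_model  # only used by multi-model endpoints
    response = client.invoke_endpoint(**kwargs)
    return json.loads(response["Body"].read().decode())["outputs"]


class FakeRuntimeClient:
    """Offline stand-in for the sagemaker-runtime client (illustration only)."""

    def __init__(self):
        self.calls = []

    def invoke_endpoint(self, **kwargs):
        self.calls.append(kwargs)
        n_rows = len(json.loads(kwargs["Body"])["inputs"])
        body = json.dumps({"outputs": [[0.0]] * n_rows}).encode()
        return {"Body": io.BytesIO(body)}


fake = FakeRuntimeClient()
out = invoke_json(fake, "example-endpoint",
                  {"inputs": [[9.0, 3571.0, 1976.0, 0.525]]},
                  target_model="petrol.tar.gz")
print(out)
```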

## Delete Endpoint

In [46]:
predictor.delete_endpoint(delete_endpoint_config=True)