# Amazon SageMaker Multi-Model Endpoints using TensorFlow

This is a cut-down version of a notebook created by my colleague Ram Vegiraju (rvegira@amazon.com). To see his full notebook, check out his [GitHub page](https://github.com/RamVegiraju/SageMaker-Deployment/tree/master/RealTime/Multi-Model-Endpoint/TensorFlow).

With Amazon SageMaker [multi-model endpoints](https://docs.aws.amazon.com/sagemaker/latest/dg/multi-model-endpoints.html), customers can create an endpoint that seamlessly hosts up to thousands of models. These endpoints are well suited to use cases where any one of a large number of models, which can be served from a common inference container, needs to be invokable on-demand and where it is acceptable for infrequently invoked models to incur some additional latency. For applications which require consistently low inference latency, a traditional endpoint is still the best choice.

To demonstrate how multi-model endpoints can be created and used, this notebook provides an example using models trained with the [SageMaker TensorFlow framework container](https://github.com/aws/sagemaker-tensorflow-serving-container/tree/dc1ccd1cb19114a0b357862aa2177e9d2a67fdf5). 

We'll train and deploy two different TensorFlow ANN models, one for the Boston Housing dataset and one for the [Petrol Consumption](https://www.kaggle.com/harinir/petrol-consumption) dataset. The first part covers the Boston Housing steps; the same procedure is then repeated for the Petrol dataset with its own training script and model.

For other MME use cases, you can also refer to:

- Segmented home value modelling examples with the [PyTorch framework](https://github.com/aws/amazon-sagemaker-examples/blob/master/advanced_functionality/multi_model_pytorch/pytorch_multi_model_endpoint.ipynb), [Scikit-Learn framework](https://github.com/aws/amazon-sagemaker-examples/tree/master/advanced_functionality/multi_model_sklearn_home_value), the [XGBoost pre-built algorithm](https://github.com/aws/amazon-sagemaker-examples/tree/master/advanced_functionality/multi_model_xgboost_home_value), and the [Linear Learner algorithm](https://github.com/aws/amazon-sagemaker-examples/tree/master/advanced_functionality/multi_model_linear_learner_home_value).
- An example with [MXNet](https://github.com/aws/amazon-sagemaker-examples/tree/master/advanced_functionality/multi_model_bring_your_own) and corresponding [documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/build-multi-model-build-container.html) on how to use MME with your own custom containers.

**Kernel**: conda_tensorflow2_p36

## Pre-requisites
1. Copy the scripts from the Scripts folder in my repo into a Scripts folder in the base directory of your notebook.
2. Copy the data file petrol_consumption.csv from my repo into the base directory of your notebook.

## Imports

In [None]:
import tensorflow as tf
import pandas as pd
import numpy as np
import boto3
import sagemaker
import os
from sagemaker.tensorflow import TensorFlow
from sagemaker.inputs import TrainingInput
from sagemaker import get_execution_role
from sagemaker.tensorflow.serving import TensorFlowModel
from sagemaker.multidatamodel import MultiDataModel

## Role and S3 Buckets

In [None]:
role = get_execution_role()
session = boto3.Session()
sagemaker_session = sagemaker.Session()

s3 = session.resource('s3')
TF_FRAMEWORK_VERSION = '2.3.0'
BUCKET = sagemaker_session.default_bucket()  # reuse the session created above
PREFIX = 'regression-models'

# Boston Housing Model Training

## Boston Dataset Creation

In [None]:
from sklearn import datasets  # Boston Housing; load_boston was removed in scikit-learn 1.2
boston = datasets.load_boston()
X = pd.DataFrame(boston.data, columns=boston.feature_names)
y = pd.DataFrame(boston.target)
y.columns=['TARGET']
df = pd.concat([X,y], axis=1)

# Split into train and test sets to save locally; both slices share the
# boundary index 450 so that no row is dropped between them
bostonTrain = df.iloc[:450,:]
bostonTest = df.iloc[450:,:]

In [None]:
DATASET_PATH = './Data/Boston'
os.makedirs(DATASET_PATH, exist_ok=True)

In [None]:
bostonTrain.to_csv('Data/Boston/train.csv', index=False)
bostonTest.to_csv('Data/Boston/test.csv', index=False)

In [None]:
!aws s3 cp {DATASET_PATH}/train.csv s3://{BUCKET}/{PREFIX}/BostonHousing/train/
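
The same upload can be done from Python instead of the AWS CLI. The sketch below is one possible way to do it, assuming a boto3 S3 client is passed in; the helper name `upload_training_file` is hypothetical and not part of the original notebook.

```python
import os


def upload_training_file(s3_client, local_path, bucket, prefix):
    """Upload one local file under s3://bucket/prefix/ and return its S3 URI.

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    key = f"{prefix.strip('/')}/{os.path.basename(local_path)}"
    # upload_file(Filename, Bucket, Key) is the managed transfer call on a boto3 S3 client.
    s3_client.upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"
```

With the variables defined earlier, a call might look like `upload_training_file(boto3.client("s3"), "Data/Boston/train.csv", BUCKET, f"{PREFIX}/BostonHousing/train")`.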

## Create Training Inputs Boston Model

In [None]:
train_input = TrainingInput(s3_data=f's3://{BUCKET}/{PREFIX}/BostonHousing/train',content_type='csv')

In [None]:
inputs = {'train': train_input}

## Boston Model Training

In [None]:
model_name = 'bostonhousing-ann'
hyperparameters = {'epochs': 50}
estimator_parameters = {'source_dir':"Scripts",
                        'entry_point':'boston.py',
                        'instance_type': 'ml.m5.2xlarge',
                        'instance_count': 1,
                        'model_dir': '/opt/ml/model',
                        'role': role,
                        'hyperparameters': hyperparameters,
                        'output_path': f's3://{BUCKET}/{PREFIX}/BostonHousing/out',
                        'base_job_name': f'mme-cv-{model_name}',
                        'framework_version': TF_FRAMEWORK_VERSION,
                        'py_version': 'py37',
                        'script_mode': True}
estimator_boston = TensorFlow(**estimator_parameters)
estimator_boston.fit(inputs)
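
The training script itself (boston.py) lives in the Scripts folder and is not shown here. As a rough illustration of how a script-mode entry point can pick up the settings passed above, here is a minimal sketch of the argument parsing only. It is not the author's script: the function name and the default values are assumptions, while `SM_MODEL_DIR` and `SM_CHANNEL_TRAIN` are the environment variables SageMaker sets inside the training container.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the arguments a script-mode training job would receive."""
    parser = argparse.ArgumentParser()
    # Hyperparameters arrive as command-line flags, e.g. --epochs 50.
    parser.add_argument("--epochs", type=int, default=10)
    # The TensorFlow estimator forwards its model_dir setting as --model_dir.
    parser.add_argument(
        "--model_dir",
        type=str,
        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"),
    )
    # Each input channel is mounted locally; this points at the "train" channel.
    parser.add_argument(
        "--train",
        type=str,
        default=os.environ.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train"),
    )
    # Tolerate any extra flags that are passed to the script.
    args, _unknown = parser.parse_known_args(argv)
    return args
```

A script built this way would typically read `train.csv` from `args.train` and save the trained model under a numbered version folder inside `args.model_dir`, since TensorFlow Serving loads models from versioned subdirectories.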

## Create Boston Model in Sagemaker

In [None]:
boston_model = estimator_boston.create_model(role=role, source_dir="Scripts", entry_point="inference.py")
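
The `inference.py` entry point is also taken from the Scripts folder. For orientation, the SageMaker TensorFlow Serving container looks for an `input_handler` / `output_handler` pair in that file. The sketch below shows what a minimal JSON-only pair could look like; it is an assumption for illustration, not the contents of the author's script.

```python
import json


def input_handler(data, context):
    """Convert the incoming request into a TensorFlow Serving REST payload."""
    if context.request_content_type == "application/json":
        payload = json.loads(data.read().decode("utf-8"))
        # Pass through a ready-made {"inputs": ...} or {"instances": ...} body,
        # and wrap a bare list of rows in the "inputs" form.
        if isinstance(payload, dict):
            return json.dumps(payload)
        return json.dumps({"inputs": payload})
    raise ValueError(f"Unsupported content type: {context.request_content_type}")


def output_handler(response, context):
    """Return the TensorFlow Serving response body and its content type."""
    if response.status_code != 200:
        raise ValueError(response.content.decode("utf-8"))
    return response.content, context.accept_header
```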

# Petrol Consumption Model Training

This section repeats the Boston Housing process for the Petrol Consumption dataset.

In [None]:
petrolDF = pd.read_csv("petrol_consumption.csv")
# Both slices share the boundary index 35 so that no row is dropped between them
petrolTrain = petrolDF.iloc[:35,:]
petrolTest = petrolDF.iloc[35:,:]
DATASET_PATH = './Data/Petrol'
os.makedirs(DATASET_PATH, exist_ok=True)
petrolTrain.to_csv('Data/Petrol/train.csv', index=False)
petrolTest.to_csv('Data/Petrol/test.csv', index=False)

In [None]:
!aws s3 cp {DATASET_PATH}/train.csv s3://{BUCKET}/{PREFIX}/Petrol/train/

In [None]:
train_input = TrainingInput(s3_data=f's3://{BUCKET}/{PREFIX}/Petrol/train',content_type='csv')

In [None]:
inputs = {'train': train_input}

## Petrol Model Training

In [None]:
model_name = 'petrol-ann'
hyperparameters = {'epochs': 50}
estimator_parameters = {'source_dir':"Scripts",
                        'entry_point':'petrol.py',
                        'instance_type': 'ml.m5.2xlarge',
                        'instance_count': 1,
                        'model_dir': '/opt/ml/model',
                        'role': role,
                        'hyperparameters': hyperparameters,
                        'output_path': f's3://{BUCKET}/{PREFIX}/Petrol/out',
                        'base_job_name': f'mme-cv-{model_name}',
                        'framework_version': TF_FRAMEWORK_VERSION,
                        'py_version': 'py37',
                        'script_mode': True}
estimator_petrol = TensorFlow(**estimator_parameters)
estimator_petrol.fit(inputs)

## Create Petrol Model

In [None]:
petrol_model = estimator_petrol.create_model(role=role, source_dir="Scripts", entry_point="inference.py")

# Multi Model Endpoint Creation

## Upload Boston model artifact to MME S3 model path

In [None]:
from datetime import datetime

# Timestamp suffix keeps the model and endpoint names unique across runs
current_time = datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
# All model artifacts served by the endpoint live under this S3 prefix
mme_model_artifacts = f's3://{BUCKET}/{PREFIX}/mme/'

In [None]:
boston_model_artifact = estimator_boston.model_data
output_boston = f's3://{BUCKET}/{PREFIX}/mme/boston.tar.gz'
!aws s3 cp {boston_model_artifact} {output_boston}
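
The copy into the MME prefix can also be scripted. The following is a minimal sketch assuming a boto3 S3 client is passed in; `split_s3_uri` and `copy_artifact` are hypothetical helper names. It uses the single-request `copy_object` call, which is subject to S3's size limit for single copy operations, so very large artifacts would need a multipart copy instead.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")


def copy_artifact(s3_client, source_uri, dest_uri):
    """Copy one model artifact between S3 locations and return the destination key."""
    src_bucket, src_key = split_s3_uri(source_uri)
    dst_bucket, dst_key = split_s3_uri(dest_uri)
    s3_client.copy_object(
        Bucket=dst_bucket,
        Key=dst_key,
        CopySource={"Bucket": src_bucket, "Key": src_key},
    )
    return dst_key
```

For example, `copy_artifact(boto3.client("s3"), estimator_boston.model_data, output_boston)` would mirror the CLI command above.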

## Create Multi Data Model

You can pass `boston_model` or the model created from any of the other estimators (in this case there are only two), because all models behind a multi-model endpoint are served from the same shared container.

In [None]:
mme = MultiDataModel(name=f'mme-tensorflow-{current_time}',
                     model_data_prefix=mme_model_artifacts,
                     model=boston_model,
                     sagemaker_session=sagemaker_session)

## List the model artifacts in the MME model location

In [None]:
list(mme.list_models())
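
Conceptually, the models available to the endpoint are just the artifacts stored under the MME S3 prefix. As a rough sketch of that idea (not the SDK's actual implementation), the listing can be reproduced with a boto3 S3 client. The helper name and the `.tar.gz` filter are assumptions made for this example.

```python
def list_model_artifacts(s3_client, bucket, prefix):
    """Return artifact names (relative to the prefix) found under s3://bucket/prefix/."""
    prefix = prefix.strip("/") + "/"
    # Paginate so that prefixes holding more than one page of objects are fully listed.
    paginator = s3_client.get_paginator("list_objects_v2")
    names = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            relative = obj["Key"][len(prefix):]
            # Assumption: only gzipped tarballs are treated as model artifacts.
            if relative.endswith(".tar.gz"):
                names.append(relative)
    return names
```

The relative names returned here are the values to pass as `TargetModel` when invoking the endpoint.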

## Deploy MME Endpoint

In [None]:
predictor = mme.deploy(initial_instance_count=1,
                       instance_type='ml.m5.2xlarge',
                       endpoint_name=f'mme-tensorflow-{current_time}')

## Test MME Boston Model

In [None]:
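# Use held-out rows from the test split for the sample request
test = pd.read_csv('Data/Boston/test.csv')
testX = test.drop("TARGET", axis=1)
# Send the first two rows as a JSON body of the form {"inputs": [[...], [...]]}
testX = testX[:2].values.tolist()
sampInput = {"inputs": testX}
sampInput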

In [None]:
y_pred = predictor.predict(data=sampInput, initial_args={'TargetModel': 'boston.tar.gz'})
y_pred
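
Applications that do not use the SageMaker Python SDK can reach the same model through the low-level runtime API, where the `TargetModel` parameter of `invoke_endpoint` selects the artifact to serve the request. A minimal sketch, assuming a boto3 `sagemaker-runtime` client is passed in and that the endpoint accepts the same `{"inputs": ...}` JSON body used above (the helper name is hypothetical):

```python
import json


def invoke_target_model(runtime_client, endpoint_name, target_model, rows):
    """Invoke one model behind a multi-model endpoint and return the parsed JSON reply."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        # Path of the artifact relative to the endpoint's S3 model prefix.
        TargetModel=target_model,
        Body=json.dumps({"inputs": rows}),
    )
    # The response body is a stream; read it fully before parsing.
    return json.loads(response["Body"].read())
```

A call might look like `invoke_target_model(boto3.client("sagemaker-runtime"), predictor.endpoint_name, "boston.tar.gz", testX)`.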

## Upload Petrol model artifact to MME S3 model path

In [None]:
petrol_model_artifact = estimator_petrol.model_data
output_petrol = f's3://{BUCKET}/{PREFIX}/mme/petrol.tar.gz'
!aws s3 cp {petrol_model_artifact} {output_petrol}

#### Notice how new models can be added to a running MME endpoint simply by copying their artifacts to the S3 model location

In [None]:
list(mme.list_models())

## Test MME Petrol Model

In [None]:
# Use held-out rows from the test split for the sample request
test = pd.read_csv('Data/Petrol/test.csv')
testX = test.drop("Petrol_Consumption", axis=1)
testX = testX[:2].values.tolist()
sampInput = {"inputs": testX}
sampInput

In [None]:
y_pred = predictor.predict(data=sampInput, initial_args={'TargetModel': 'petrol.tar.gz'})
y_pred
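
Because a multi-model endpoint loads a model on demand the first time it is invoked, the first request to a newly added model can be noticeably slower than later ones. One simple way to observe this is to time a few consecutive calls. The helper below is a generic sketch, not part of the original notebook.

```python
import time


def time_call(fn, repeats=3):
    """Call fn() several times and return each call's duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations
```

For example, `time_call(lambda: predictor.predict(data=sampInput, initial_args={'TargetModel': 'petrol.tar.gz'}))` returns one duration per request, which makes any gap between the first and the subsequent invocations easy to see.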

## Cleanup - Delete Endpoint

In [None]:
predictor.delete_endpoint(delete_endpoint_config=True)
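
The call above removes the endpoint and its endpoint configuration. The SageMaker model resource and the artifacts in S3 are separate resources. As a sketch of a fuller cleanup using the low-level API, assuming a boto3 `sagemaker` client is passed in (the helper name is hypothetical), the endpoint, its configuration, and the models it references could be deleted like this:

```python
def delete_endpoint_resources(sm_client, endpoint_name):
    """Delete an endpoint, its endpoint config, and the models the config references."""
    # Look up the dependent resources before deleting anything.
    endpoint = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint["EndpointConfigName"]
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [variant["ModelName"] for variant in config["ProductionVariants"]]

    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    for name in model_names:
        sm_client.delete_model(ModelName=name)
    return model_names
```

This would be used instead of, not after, `predictor.delete_endpoint`, since it needs the endpoint to still exist in order to look up its configuration. The artifacts under the MME S3 prefix are left in place and can be removed separately if they are no longer needed.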