# Amazon SageMaker Multi-Model Endpoints using Linear Learner
With [Amazon SageMaker multi-model endpoints](https://docs.aws.amazon.com/sagemaker/latest/dg/multi-model-endpoints.html), customers can create an endpoint that seamlessly hosts up to thousands of models. These endpoints are well suited to use cases where any one of a large number of models, which can be served from a common inference container, needs to be invokable on-demand and where it is acceptable for infrequently invoked models to incur some additional latency. For applications which require consistently low inference latency, a traditional endpoint is still the best choice.

At a high level, Amazon SageMaker manages the loading and unloading of models for a multi-model endpoint, as they are needed. When an invocation request is made for a particular model, Amazon SageMaker routes the request to an instance assigned to that model, downloads the model artifacts from S3 onto that instance, and initiates loading of the model into the memory of the container. As soon as the loading is complete, Amazon SageMaker performs the requested invocation and returns the result. If the model is already loaded in memory on the selected instance, the downloading and loading steps are skipped and the invocation is performed immediately.

Amazon SageMaker inference pipeline model consists of a sequence of containers that serve inference requests by combining preprocessing, predictions and post-processing data science tasks.  An inference pipeline allows you to apply the same preprocessing code used during model training, to process the inference request data used for predictions.

To demonstrate how multi-model endpoints are created and used with inference pipeline, this notebook provides an example using a set of Linear Learner models that each predict housing prices for a single location. This domain is used as a simple example to easily experiment with multi-model endpoints.  

This notebook showcases three MME capabilities: 
* Native MME support with Amazon SageMaker Linear Learner algorithm.  Because of the native support there is no need for you to create a custom container.  
* Native MME support with Amazon SageMaker Inference Pipelines.
* Granual InvokeModel access to multiple models hosted on the MME using IAM condition key.

To demonstrate these capabilities, the notebook discusses the use case of predicting house prices in multiple cities using linear regression.  House prices are predicted based on features like number of bedrooms, number of garages, square footage etc.  Depending on the city, the features effect the house price differently.  For example, small changes in the square footage cause a drastic change in house prices in NewYork when compared to price changes in Houston.  For accurate house price predictions, we will train multiple linear regression models, a unique location specific model per city.  


### Contents

1. [Generate synthetic data for housing models](#Generate-synthetic-data-for-housing-models)
1. [Preprocess the raw housing data using Scikit Learn model](#Preprocess-synthetic-housing-data-using-scikit-learn)
1. [Train multiple house value prediction models for multiple cities](#Train-multiple-house-value-prediction-models)
1. [Create model entity with multi model support](#Create-sagemaker-multi-model-support)
1. [Create an inference pipeline with sklearn model and MME linear learner model](#Create-inference-pipeline)
1. [Exercise the inference pipeline - Get predictions from the different  linear learner models](#Exercise-inference-pipeline)
1. [Update Multi Model Endpoint with new models](#update-models)
1. [Fine grained control for model invocation using IAM conditional keys.](#Finegrain-control-invoke-models)
1. [Latency analysis of Linear Learner MME](#Latency-analysis)
1. [Clean up](#CleanUp)


## Section 1 - Generate synthetic data for housing models <a id='Generate-synthetic-data-for-housing-models'></a>

In this section, you will generate synthetic data that will be used to train the linear learner models.  The data generated consists of 6 numerical features - the year the house was built in, house size in square feet, number of bedrooms, number of bathroom, the lot size and number of garages and two categorial features - deck and front_porch.  

In [531]:
import numpy as np
import pandas as pd
import json
import datetime
import time
import boto3
import sagemaker
import os

from time import gmtime, strftime
from random import choice

from sagemaker import get_execution_role
from sagemaker.predictor import csv_serializer

from sagemaker.multidatamodel import MULTI_MODEL_CONTAINER_MODE
from sagemaker.multidatamodel import MultiDataModel

from sklearn.model_selection import train_test_split

In [532]:
NUM_HOUSES_PER_LOCATION = 1000
LOCATIONS  = ['NewYork_NY',    'LosAngeles_CA',   'Chicago_IL',    'Houston_TX',   'Dallas_TX',
              'Phoenix_AZ',    'Philadelphia_PA', 'SanAntonio_TX', 'SanDiego_CA',  'SanFrancisco_CA']
MAX_YEAR = 2019

In [533]:
def gen_price(house):
    """Generate price based on features of the house"""
    
    base_price = int(house['SQUARE_FEET'] * 150)
    if house['FRONT_PORCH'] == 'y':
        garage = 1
    else:
        garage = 0
        
    if house['FRONT_PORCH'] == 'y':
        front_porch = 1
    else:
        front_porch = 0
        
    price = int(base_price + 10000 * house['NUM_BEDROOMS'] + \
                15000 * house['NUM_BATHROOMS'] + \
                15000 * house['LOT_ACRES'] + \
                10000 * garage + \
                10000 * front_porch + \
                15000 * house['GARAGE_SPACES'] - \
                5000 * (MAX_YEAR - house['YEAR_BUILT']))
    return price

In [534]:
def gen_yes_no():
    """Generate values (y/n) for categorical features"""
    answer = choice(['y', 'n'])
    return answer

In [535]:
def gen_random_house():
    """Generate a row of data (single house information)"""
    house = {'SQUARE_FEET':    np.random.normal(3000, 750),
              'NUM_BEDROOMS':  np.random.randint(2, 7),
              'NUM_BATHROOMS': np.random.randint(2, 7) / 2,
              'LOT_ACRES':     round(np.random.normal(1.0, 0.25), 2),
              'GARAGE_SPACES': np.random.randint(0, 4),
              'YEAR_BUILT':    min(MAX_YEAR, int(np.random.normal(1995, 10))),
              'FRONT_PORCH':   gen_yes_no(),
              'DECK':          gen_yes_no()
             }
    
    price = gen_price(house)
    
    return [house['YEAR_BUILT'],   
            house['SQUARE_FEET'], 
            house['NUM_BEDROOMS'], 
            house['NUM_BATHROOMS'], 
            house['LOT_ACRES'],    
            house['GARAGE_SPACES'],
            house['FRONT_PORCH'],    
            house['DECK'], 
            price]

In [536]:
def gen_houses(num_houses):
    """Generate housing dataset"""
    house_list = []
    
    for i in range(num_houses):
        house_list.append(gen_random_house())
        
    df = pd.DataFrame(
        house_list, 
        columns=[
            'YEAR_BUILT',    
            'SQUARE_FEET',  
            'NUM_BEDROOMS',            
            'NUM_BATHROOMS',
            'LOT_ACRES',
            'GARAGE_SPACES',
            'FRONT_PORCH',
            'DECK', 
            'PRICE']
    )
    return df
    

In [537]:
def save_data_locally(location, train, test): 
    """Save the housing data locally"""
    os.makedirs('data/{0}/train'.format(location),exist_ok=True)
    train.to_csv('data/{0}/train/train.csv'.format(location), sep=',', header=False, index=False)
       
    os.makedirs('data/{0}/test'.format(location),exist_ok=True)
    test.to_csv('data/{0}/test/test.csv'.format(location), sep=',', header=False, index=False) 

In [538]:
#Generate housing data for multiple locations.
#Change "PARALLEL_TRAINING_JOBS " to a lower number to limit the number of training jobs and models. Or to a higher value to experiment with more models.

#PARALLEL_TRAINING_JOBS = 4

for loc in LOCATIONS[:PARALLEL_TRAINING_JOBS]:
    houses = gen_houses(NUM_HOUSES_PER_LOCATION)
    
    #Spliting data into train and test in 90:10 ratio
    #Not splitting the train data into train and val because its not preprocessed yet
    train, test = train_test_split(houses, test_size=0.1)
    save_data_locally(loc, train, test)


In [539]:
#Shows the first few lines of data.
houses.head()

Unnamed: 0,YEAR_BUILT,SQUARE_FEET,NUM_BEDROOMS,NUM_BATHROOMS,LOT_ACRES,GARAGE_SPACES,FRONT_PORCH,DECK,PRICE
0,1999,3578.956826,6,2.0,1.2,0,n,n,544843
1,1993,2011.989097,6,2.5,1.08,1,n,y,300498
2,2002,3245.068849,3,2.5,1.1,2,n,n,515760
3,1992,4000.608981,3,2.5,0.48,2,y,n,589791
4,2009,5991.931901,4,3.0,1.18,3,n,n,996489


## Section 2 - Preprocess the raw housing data using Scikit Learn <a id='Preprocess-synthetic-housing-data-using-scikit-learn'></a>

In this section, the categorical features of the data (deck and porch) are pre-processed using sklearn to convert them to one hot encoding representation.  

In [540]:
sm_client = boto3.client(service_name='sagemaker')
runtime_sm_client = boto3.client(service_name='sagemaker-runtime')
sagemaker_session = sagemaker.Session()

s3 = boto3.resource('s3')
s3_client = boto3.client('s3')

BUCKET  = sagemaker_session.default_bucket()
print("BUCKET : ", BUCKET)

role = get_execution_role()
print("ROLE : ", role)

ACCOUNT_ID = boto3.client('sts').get_caller_identity()['Account']
REGION = boto3.Session().region_name

DATA_PREFIX = 'DEMO_MME_LINEAR_LEARNER'
HOUSING_MODEL_NAME = 'housing'
MULTI_MODEL_ARTIFACTS = 'multi_model_artifacts'

BUCKET :  sagemaker-us-east-1-555360056434
ROLE :  arn:aws:iam::555360056434:role/service-role/AmazonSageMaker-ExecutionRole-20200630T162532


In [541]:
#Create the SKLearn estimator with the sklearn_preprocessor.py as the script
from sagemaker.sklearn.estimator import SKLearn

script_path = 'sklearn_preprocessor.py'

sklearn_preprocessor = SKLearn(
    entry_point=script_path,
    role=role,
    train_instance_type="ml.c4.xlarge",
    sagemaker_session=sagemaker_session)

In [542]:
#Upload the raw training data to S3 bucket, to be accessed by SKLearn
#prefix = 'housing-data'
train_inputs = []

for loc in LOCATIONS[:PARALLEL_TRAINING_JOBS]:
    #WORK_DIRECTORY = "data/" + loc 

    train_input = sagemaker_session.upload_data(
        #path='{}/{}'.format(WORK_DIRECTORY + "/train/", 'train.csv'), 
        path='data/{}/train/train.csv'.format(loc),
        bucket=BUCKET,
        #key_prefix='{}/{}/{}'.format(prefix, loc, 'train')
        key_prefix='housing-data/{}/train'.format(loc)
    )
    
    train_inputs.append(train_input)
    print("Raw training data uploaded to : ", train_input)

Raw training data uploaded to :  s3://sagemaker-us-east-1-555360056434/housing-data/NewYork_NY/train/train.csv
Raw training data uploaded to :  s3://sagemaker-us-east-1-555360056434/housing-data/LosAngeles_CA/train/train.csv
Raw training data uploaded to :  s3://sagemaker-us-east-1-555360056434/housing-data/Chicago_IL/train/train.csv
Raw training data uploaded to :  s3://sagemaker-us-east-1-555360056434/housing-data/Houston_TX/train/train.csv


In [543]:
##Launch multiple scikit learn training and batch transform jobs to process the raw synthetic data generated for multiple locations.
##Before executing this, take the training instance limits in your account and cost into consideration.

#sklearn_preprocessors = []
#preprocessing_jobs = []

preprocessor_transformers = []

for index, loc in enumerate(LOCATIONS[:PARALLEL_TRAINING_JOBS]):
    #for loc in LOCATIONS[:PARALLEL_TRAINING_JOBS]:
    #index = LOCATIONS.index(loc)
    print("preprocessing fit input data at ", index , " for loc ", loc)
    job_name='scikit-learn-preprocessor-'+strftime('%Y-%m-%d-%H-%M-%S', gmtime())
    
    #sklearn_preprocessor = SKLearn(
     #   entry_point=script_path,
      #  role=role,
       # train_instance_type="ml.c4.xlarge",
        #sagemaker_session=sagemaker_session)
    
    #sklearn_preprocessor.fit({'train': train_inputs[index]}, job_name=job_name, wait=False)
    sklearn_preprocessor.fit({'train': train_inputs[index]}, job_name=job_name, wait=True)
    
    ##Once the preprocessor is fit, use tranformer to preprocess the raw training data and store the transformed data
    ##right back into s3.
    
    transformer = sklearn_preprocessor.transformer(
        instance_count=1, 
        instance_type='ml.m4.xlarge',
        assemble_with = 'Line',
        accept = 'text/csv'
    )
    
    preprocessor_transformers.append(transformer)
    
    #sklearn_preprocessors.append(sklearn_preprocessor)
    #preprocessing_jobs.append(job_name)
    ##Wait a second to avoid throttling errors
    #time.sleep(1)

preprocessing fit input data at  0  for loc  NewYork_NY
2020-07-06 00:08:58 Starting - Starting the training job...
2020-07-06 00:09:01 Starting - Launching requested ML instances......
2020-07-06 00:10:16 Starting - Preparing the instances for training......
2020-07-06 00:11:08 Downloading - Downloading input data...
2020-07-06 00:11:57 Training - Training image download completed. Training in progress..[34m2020-07-06 00:11:57,857 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training[0m
[34m2020-07-06 00:11:57,859 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)[0m
[34m2020-07-06 00:11:57,869 sagemaker_sklearn_container.training INFO     Invoking user training script.[0m
[34m2020-07-06 00:12:58,516 sagemaker-containers INFO     Module sklearn_preprocessor does not provide a setup.py. [0m
[34mGenerating setup.py[0m
[34m2020-07-06 00:12:58,517 sagemaker-containers INFO     Generating setup.cfg[0m
[34m2020-07-06 00:

2020-07-06 00:13:41 Starting - Starting the training job...
2020-07-06 00:13:44 Starting - Launching requested ML instances.........
2020-07-06 00:15:27 Starting - Preparing the instances for training......
2020-07-06 00:16:26 Downloading - Downloading input data...
2020-07-06 00:16:52 Training - Downloading the training image.[34m2020-07-06 00:17:14,906 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training[0m
[34m2020-07-06 00:17:14,909 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)[0m
[34m2020-07-06 00:17:14,919 sagemaker_sklearn_container.training INFO     Invoking user training script.[0m
[34m2020-07-06 00:17:15,484 sagemaker-containers INFO     Module sklearn_preprocessor does not provide a setup.py. [0m
[34mGenerating setup.py[0m
[34m2020-07-06 00:17:15,485 sagemaker-containers INFO     Generating setup.cfg[0m
[34m2020-07-06 00:17:15,485 sagemaker-containers INFO     Generating MANIFEST.in[0m
[34m2020-07

2020-07-06 00:17:56 Starting - Launching requested ML instances......
2020-07-06 00:19:13 Starting - Preparing the instances for training......
2020-07-06 00:20:14 Downloading - Downloading input data...
2020-07-06 00:20:45 Training - Downloading the training image..[34m2020-07-06 00:21:07,308 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training[0m
[34m2020-07-06 00:21:07,310 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)[0m
[34m2020-07-06 00:21:07,320 sagemaker_sklearn_container.training INFO     Invoking user training script.[0m
[34m2020-07-06 00:21:07,828 sagemaker-containers INFO     Module sklearn_preprocessor does not provide a setup.py. [0m
[34mGenerating setup.py[0m
[34m2020-07-06 00:21:07,828 sagemaker-containers INFO     Generating setup.cfg[0m
[34m2020-07-06 00:21:07,828 sagemaker-containers INFO     Generating MANIFEST.in[0m
[34m2020-07-06 00:21:07,828 sagemaker-containers INFO     Installing modu

2020-07-06 00:23:22 Starting - Preparing the instances for training......
2020-07-06 00:24:20 Downloading - Downloading input data...
2020-07-06 00:25:07 Training - Training image download completed. Training in progress...[34m2020-07-06 00:25:09,025 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training[0m
[34m2020-07-06 00:25:09,027 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)[0m
[34m2020-07-06 00:25:09,037 sagemaker_sklearn_container.training INFO     Invoking user training script.[0m
[34m2020-07-06 00:25:09,476 sagemaker-containers INFO     Module sklearn_preprocessor does not provide a setup.py. [0m
[34mGenerating setup.py[0m
[34m2020-07-06 00:25:09,477 sagemaker-containers INFO     Generating setup.cfg[0m
[34m2020-07-06 00:25:09,477 sagemaker-containers INFO     Generating MANIFEST.in[0m
[34m2020-07-06 00:25:09,477 sagemaker-containers INFO     Installing module with the following command:[0m
[34m/min

In [544]:
def wait_for_training_job_to_complete(job_name):
    """ Wait for the training job to complete """
    print('Waiting for job {} to complete...'.format(job_name))
    
    waiter = sm_client.get_waiter('training_job_completed_or_stopped')
    waiter.wait(TrainingJobName=job_name)

In [545]:
#preprocessor_transformers = []

#for index, loc in enumerate(LOCATIONS[:PARALLEL_TRAINING_JOBS]):
 #   print("Transforming input data at ", index , " for loc ", loc)
    
    #ALREADY sklearn_preprocessor = sklearn_preprocessors[index]
    
    #ALREADY print ("Using the preprocessor ", sklearn_preprocessor)
    
    #transformer = sklearn_preprocessor.transformer(
     #   instance_count=1, 
      #  instance_type='ml.m4.xlarge',
       # assemble_with = 'Line',
        #accept = 'text/csv'
    #)
    
    #preprocessor_transformers.append(transformer)

In [546]:
def wait_for_batch_transform_job_to_complete(job_name):
    """Wait for the batch transform job to complete"""
    print('Waiting for job {} to complete...'.format(job_name))
    
    waiter = sm_client.get_waiter('transform_job_completed_or_stopped')
    waiter.wait(TransformJobName=job_name)

In [547]:
# Preprocess training input
preprocessed_train_data_path = []

for transformer in preprocessor_transformers: 
    index = preprocessor_transformers.index(transformer)
    transformer.transform(train_inputs[index], content_type='text/csv')
    print('Launching batch transform job: ' + transformer.latest_transform_job.job_name)
    preprocessed_train_data_path.append(transformer.output_path)

Launching batch transform job: sagemaker-scikit-learn-2020-07-06-00-25-49-730
Launching batch transform job: sagemaker-scikit-learn-2020-07-06-00-25-50-048
Launching batch transform job: sagemaker-scikit-learn-2020-07-06-00-25-52-568
Launching batch transform job: sagemaker-scikit-learn-2020-07-06-00-25-56-165


In [548]:
#Wait for all the batch transform jobs to finish
for transformer in preprocessor_transformers: 
    job_name=transformer.latest_transform_job.job_name
    wait_for_batch_transform_job_to_complete(job_name)

Waiting for job sagemaker-scikit-learn-2020-07-06-00-25-49-730 to complete...
Waiting for job sagemaker-scikit-learn-2020-07-06-00-25-50-048 to complete...
Waiting for job sagemaker-scikit-learn-2020-07-06-00-25-52-568 to complete...
Waiting for job sagemaker-scikit-learn-2020-07-06-00-25-56-165 to complete...


In [549]:
##Download the preprocessed data, split into train and val, upload back to S3 in the same directory as tranformer output path
for transformer in preprocessor_transformers: 
    index = preprocessor_transformers.index(transformer)
    transformer_output_key='{}/{}'.format(transformer.latest_transform_job.job_name, 'train.csv.out') 
    
    preprocessed_data_download_dir = '{}/'.format("preprocessed-data/"+LOCATIONS[index])
    
    sagemaker_session.download_data(
        path=preprocessed_data_download_dir, 
        bucket=BUCKET,
        key_prefix=transformer_output_key
    )
    
    print("transformer_output_key:", transformer_output_key )
    print("Download directory:", preprocessed_data_download_dir )
    
    train_df = pd.read_csv(preprocessed_data_download_dir+"/train.csv.out")
    
    #Spliting data into train and test in 70:30 ratio
    _train, _val = train_test_split(train_df, test_size=0.3)
    
    _train.to_csv(preprocessed_data_download_dir+'train.csv', sep=',', header=False, index=False)
    _val.to_csv(preprocessed_data_download_dir+'val.csv', sep=',', header=False, index=False)
    
    
    train_input = sagemaker_session.upload_data(
        path='{}/{}'.format(preprocessed_data_download_dir, 'train.csv'), 
        bucket=BUCKET,
        key_prefix='{}'.format(transformer.latest_transform_job.job_name, 'train.csv'))
    
    val_input = sagemaker_session.upload_data(
        path='{}/{}'.format(preprocessed_data_download_dir, 'val.csv'), 
        bucket=BUCKET,
        key_prefix='{}'.format(transformer.latest_transform_job.job_name, 'val.csv'))

transformer_output_key: sagemaker-scikit-learn-2020-07-06-00-25-49-730/train.csv.out
Download directory: preprocessed-data/NewYork_NY/
transformer_output_key: sagemaker-scikit-learn-2020-07-06-00-25-50-048/train.csv.out
Download directory: preprocessed-data/LosAngeles_CA/
transformer_output_key: sagemaker-scikit-learn-2020-07-06-00-25-52-568/train.csv.out
Download directory: preprocessed-data/Chicago_IL/
transformer_output_key: sagemaker-scikit-learn-2020-07-06-00-25-56-165/train.csv.out
Download directory: preprocessed-data/Houston_TX/


In [550]:
##S3 location of the preprocessed data
for preprocessed_train_data in preprocessed_train_data_path : 
    print(preprocessed_train_data)

s3://sagemaker-us-east-1-555360056434/sagemaker-scikit-learn-2020-07-06-00-25-49-730
s3://sagemaker-us-east-1-555360056434/sagemaker-scikit-learn-2020-07-06-00-25-50-048
s3://sagemaker-us-east-1-555360056434/sagemaker-scikit-learn-2020-07-06-00-25-52-568
s3://sagemaker-us-east-1-555360056434/sagemaker-scikit-learn-2020-07-06-00-25-56-165


In [551]:
for index, loc in enumerate(LOCATIONS[:PARALLEL_TRAINING_JOBS]):
    preprocessed_data_download_dir = '{}/'.format("preprocessed-data/"+LOCATIONS[index])
    path='{}/{}'.format(preprocessed_data_download_dir, 'train.csv')

## Section 3 : Train house value prediction models for multiple cities <a id='Train-multiple-house-value-prediction-models'></a>

In this section, you will use the preprocessed housing data to train multiple linear learner models.

In [552]:
from sagemaker.amazon.amazon_estimator import get_image_uri
container = get_image_uri(boto3.Session().region_name, 'linear-learner')

### Launch a single training job for a given housing location
There is nothing specific to multi-model endpoints in terms of the models it will host. They are trained in the same way as all other SageMaker models. Here we are using the Linear Learner estimator and not waiting for the job to complete.

In [553]:
def launch_training_job(location, transformer):
    """Launch a linear learner traing job"""
    # clear out old versions of the data
    #s3_bucket = s3.Bucket(BUCKET)
    #full_input_prefix = '{}/model_prep/{}'.format(DATA_PREFIX, location)
    #s3_bucket.objects.filter(Prefix=full_input_prefix + '/').delete()
    
    train_inputs = transformer.output_path+"/train.csv"
    val_inputs = transformer.output_path+"/val.csv"
    
    print("train_inputs:", train_inputs)
    print("val_inputs:", val_inputs)
     
    #_job = 'll-{}'.format(location.replace('_', '-'))  ##CHECK
    full_output_prefix = '{}/model_artifacts/{}'.format(DATA_PREFIX, location)
    s3_output_path = 's3://{}/{}'.format(BUCKET, full_output_prefix)
    
    linear_estimator = sagemaker.estimator.Estimator(
                            container,
                            role, 
                            train_instance_count=1, 
                            train_instance_type='ml.c4.xlarge',
                            output_path=s3_output_path,
                            sagemaker_session=sagemaker_session)
    
    linear_estimator.set_hyperparameters(feature_dim=10,
                           mini_batch_size=100,
                           predictor_type='regressor',
                           epochs=10,
                           num_models=32,
                           loss='absolute_loss')
    
    DISTRIBUTION_MODE = 'FullyReplicated'
    train_input = sagemaker.s3_input(s3_data=train_inputs, 
                                     distribution=DISTRIBUTION_MODE, content_type='text/csv;label_size=1')
    val_input   = sagemaker.s3_input(s3_data=val_inputs,
                                     distribution=DISTRIBUTION_MODE, content_type='text/csv;label_size=1')
    
    remote_inputs = {'train': train_input, 'validation': val_input}
     
    linear_estimator.fit(remote_inputs, wait=False)
   
    #return linear_estimator, linear_estimator.latest_training_job.name
    return linear_estimator.latest_training_job.name

### Kick off a model training job for each housing location

In [555]:
training_jobs = []
#linear_estimators = []

#PARALLEL_TRAINING_JOBS = 4 
    
for transformer,loc in zip(preprocessor_transformers, LOCATIONS[:PARALLEL_TRAINING_JOBS]): 
    job = launch_training_job(loc, transformer)
    training_jobs.append(job)
    #linear_estimators.append(estimator)
    
print('{} training jobs launched: {}'.format(len(training_jobs), training_jobs))

train_inputs: s3://sagemaker-us-east-1-555360056434/sagemaker-scikit-learn-2020-07-06-00-25-49-730/train.csv
val_inputs: s3://sagemaker-us-east-1-555360056434/sagemaker-scikit-learn-2020-07-06-00-25-49-730/val.csv
train_inputs: s3://sagemaker-us-east-1-555360056434/sagemaker-scikit-learn-2020-07-06-00-25-50-048/train.csv
val_inputs: s3://sagemaker-us-east-1-555360056434/sagemaker-scikit-learn-2020-07-06-00-25-50-048/val.csv
train_inputs: s3://sagemaker-us-east-1-555360056434/sagemaker-scikit-learn-2020-07-06-00-25-52-568/train.csv
val_inputs: s3://sagemaker-us-east-1-555360056434/sagemaker-scikit-learn-2020-07-06-00-25-52-568/val.csv
train_inputs: s3://sagemaker-us-east-1-555360056434/sagemaker-scikit-learn-2020-07-06-00-25-56-165/train.csv
val_inputs: s3://sagemaker-us-east-1-555360056434/sagemaker-scikit-learn-2020-07-06-00-25-56-165/val.csv
4 training jobs launched: ['linear-learner-2020-07-06-00-32-26-786', 'linear-learner-2020-07-06-00-32-26-939', 'linear-learner-2020-07-06-00-32-

### Wait for all  training jobs to finish

In [556]:
#Wait for the jobs to finish
for job_name in training_jobs:
    wait_for_training_job_to_complete(job_name)

Waiting for job linear-learner-2020-07-06-00-32-26-786 to complete...
Waiting for job linear-learner-2020-07-06-00-32-26-939 to complete...
Waiting for job linear-learner-2020-07-06-00-32-27-451 to complete...
Waiting for job linear-learner-2020-07-06-00-32-31-352 to complete...


## Section 4 - Create Sagemaker model with multi model support <a id='Create-sagemaker-multi-model-support'></a>

In [557]:
import re
def parse_model_artifacts(model_data_url):
    # extract the s3 key from the full url to the model artifacts
    _s3_key = model_data_url.split('s3://{}/'.format(BUCKET))[1]
    # get the part of the key that identifies the model within the model artifacts folder
    _model_name_plus = _s3_key[_s3_key.find('model_artifacts') + len('model_artifacts') + 1:]
    # finally, get the unique model name (e.g., "NewYork_NY")
    _model_name = re.findall('^(.*?)/', _model_name_plus)[0]
    return _s3_key, _model_name 

In [558]:
# make a copy of the model artifacts from the original output of the training job to the place in
# s3 where the multi model endpoint will dynamically load individual models
#MULTI_MODEL_ARTIFACTS='model_artifacts'
def deploy_artifacts_to_mme(job_name):
    print("job_name :", job_name)
    _resp = sm_client.describe_training_job(TrainingJobName=job_name)
    _source_s3_key, _model_name = parse_model_artifacts(_resp['ModelArtifacts']['S3ModelArtifacts'])
    _copy_source = {'Bucket': BUCKET, 'Key': _source_s3_key}
    _key = '{}/{}/{}/{}.tar.gz'.format(DATA_PREFIX, MULTI_MODEL_ARTIFACTS, _model_name,_model_name)
    
    print('Copying {} model\n   from: {}\n     to: {}...'.format(_model_name, _source_s3_key, _key))
    s3_client.copy_object(Bucket=BUCKET, CopySource=_copy_source, Key=_key)
    #return _key

In [559]:
# First, clear out old versions of the model artifacts from previous runs of this notebook
#s3 = boto3.resource('s3')
#s3_bucket = s3.Bucket(BUCKET)
#print(BUCKET)
full_input_prefix = '{}/multi_model_artifacts'.format(DATA_PREFIX)
print('Removing old model artifacts from {}'.format(full_input_prefix))
s3_bucket.objects.filter(Prefix=full_input_prefix + '/').delete()

Removing old model artifacts from DEMO_MME_LINEAR_LEARNER/multi_model_artifacts


[{'ResponseMetadata': {'RequestId': '3D690B7EDC723B9E',
   'HostId': 'a+yb61YYYCfMML9AFWpjNz6gX71SQ6sdwXiEQNDfprQYGUkTumJNL8pDr/71Y8PIM45pgoFbsNc=',
   'HTTPStatusCode': 200,
   'HTTPHeaders': {'x-amz-id-2': 'a+yb61YYYCfMML9AFWpjNz6gX71SQ6sdwXiEQNDfprQYGUkTumJNL8pDr/71Y8PIM45pgoFbsNc=',
    'x-amz-request-id': '3D690B7EDC723B9E',
    'date': 'Mon, 06 Jul 2020 00:41:41 GMT',
    'connection': 'close',
    'content-type': 'application/xml',
    'transfer-encoding': 'chunked',
    'server': 'AmazonS3'},
   'RetryAttempts': 0},
  'Deleted': [{'Key': 'DEMO_MME_LINEAR_LEARNER/multi_model_artifacts/Chicago_IL/Chicago_IL.tar.gz'},
   {'Key': 'DEMO_MME_LINEAR_LEARNER/multi_model_artifacts/LosAngeles_CA/LosAngeles_CA.tar.gz'},
   {'Key': 'DEMO_MME_LINEAR_LEARNER/multi_model_artifacts/NewYork_NY/NewYork_NY.tar.gz'}]}]

In [560]:
## Deploy all but the last model trained to MME
for job_name in training_jobs[:-1]:
    deploy_artifacts_to_mme(job_name)

job_name : linear-learner-2020-07-06-00-32-26-786
Copying NewYork_NY model
   from: DEMO_MME_LINEAR_LEARNER/model_artifacts/NewYork_NY/linear-learner-2020-07-06-00-32-26-786/output/model.tar.gz
     to: DEMO_MME_LINEAR_LEARNER/multi_model_artifacts/NewYork_NY/NewYork_NY.tar.gz...
job_name : linear-learner-2020-07-06-00-32-26-939
Copying LosAngeles_CA model
   from: DEMO_MME_LINEAR_LEARNER/model_artifacts/LosAngeles_CA/linear-learner-2020-07-06-00-32-26-939/output/model.tar.gz
     to: DEMO_MME_LINEAR_LEARNER/multi_model_artifacts/LosAngeles_CA/LosAngeles_CA.tar.gz...
job_name : linear-learner-2020-07-06-00-32-27-451
Copying Chicago_IL model
   from: DEMO_MME_LINEAR_LEARNER/model_artifacts/Chicago_IL/linear-learner-2020-07-06-00-32-27-451/output/model.tar.gz
     to: DEMO_MME_LINEAR_LEARNER/multi_model_artifacts/Chicago_IL/Chicago_IL.tar.gz...


In [561]:
MODEL_NAME = '{}-{}'.format(HOUSING_MODEL_NAME, strftime('%Y-%m-%d-%H-%M-%S', gmtime()))

_model_url  = 's3://{}/{}/{}/'.format(BUCKET, DATA_PREFIX, MULTI_MODEL_ARTIFACTS)

ll_multi_model = MultiDataModel(
        name=MODEL_NAME,
        model_data_prefix=_model_url,
        image=container,
        role=role,
        sagemaker_session=sagemaker_session
    )

## Section 5 : Create an inference pipeline with sklearn model and MME linear learner model <a id='Create-inference-pipeline'></a>

Set up the inference pipeline using the Pipeline Model API.  This sets up a list of models in a single endpoint; In this example, we configure our pipeline model with the fitted Scikit-learn inference model and the fitted Linear Learner model.

In [562]:
from sagemaker.model import Model
from sagemaker.pipeline import PipelineModel
import boto3
from time import gmtime, strftime

timestamp_prefix = strftime("%Y-%m-%d-%H-%M-%S", gmtime())

scikit_learn_inference_model = sklearn_preprocessor.create_model()

model_name = 'inference-pipeline-' + timestamp_prefix
endpoint_name = 'inference-pipeline-ep-' + timestamp_prefix
sm_model = PipelineModel(
    name=model_name, 
    role=role, 
    sagemaker_session=sagemaker_session,
    models=[
        scikit_learn_inference_model, 
        ll_multi_model])

sm_model.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge', endpoint_name=endpoint_name)

-------------------!

In [None]:
##DELETE AFTER VALIDATING THE FINE GRAINED ACCESS
##endpoint_name = 'inference-pipeline-ep-2020-06-30-20-29-20'

In [None]:
#endpoint_name

## Section 6 :  Exercise the inference pipeline - Get predictions from  different  linear learner models. <a id='Exercise-inference-pipeline'></a>

In [563]:
#Create RealTimePredictor
from sagemaker.predictor import json_serializer, csv_serializer, json_deserializer, RealTimePredictor
from sagemaker.content_types import CONTENT_TYPE_CSV, CONTENT_TYPE_JSON

predictor = RealTimePredictor(
    endpoint=endpoint_name,
    sagemaker_session=sagemaker_session,
    serializer=csv_serializer,
    content_type=CONTENT_TYPE_CSV,
    accept=CONTENT_TYPE_JSON)

In [564]:
def predict_one_house_value(features, model_name, predictor_to_use):
    print('Using model {} to predict price of this house: {}'.format(model_name,
                                                                     features))
    body = ','.join(map(str, features)) + '\n'
    start_time = time.time()
     
    response = predictor_to_use.predict(features, target_model=model_name)
    
    response_json = json.loads(response)
        
    predicted_value = response_json['predictions'][0]['score']    
    
    duration = time.time() - start_time
    
    print('${:,.2f}, took {:,d} ms\n'.format(predicted_value, int(duration * 1000)))

In [565]:
for i in range(10):
    #model_name = LOCATIONS[np.random.randint(1, len(LOCATIONS[:PARALLEL_TRAINING_JOBS - 1]))]
    model_name = LOCATIONS[np.random.randint(1, PARALLEL_TRAINING_JOBS - 1)]
    full_model_name = '{}/{}.tar.gz'.format(model_name,model_name)
    predict_one_house_value(gen_random_house()[:-1], full_model_name,predictor)

Using model LosAngeles_CA/LosAngeles_CA.tar.gz to predict price of this house: [2001, 2965.889936279411, 6, 2.0, 0.94, 1, 'y', 'n']
$491,033.69, took 1,605 ms

Using model Chicago_IL/Chicago_IL.tar.gz to predict price of this house: [1993, 3918.5819638165285, 5, 3.0, 0.83, 3, 'y', 'y']
$630,143.50, took 1,097 ms

Using model Chicago_IL/Chicago_IL.tar.gz to predict price of this house: [2002, 3390.27810269848, 6, 1.5, 0.7, 0, 'n', 'y']
$522,261.44, took 50 ms

Using model LosAngeles_CA/LosAngeles_CA.tar.gz to predict price of this house: [2009, 4729.440394975367, 4, 2.5, 0.61, 0, 'n', 'n']
$761,033.50, took 43 ms

Using model Chicago_IL/Chicago_IL.tar.gz to predict price of this house: [2008, 3617.635669845823, 5, 1.5, 0.6, 2, 'y', 'n']
$618,076.25, took 59 ms

Using model LosAngeles_CA/LosAngeles_CA.tar.gz to predict price of this house: [1987, 3465.345261682198, 5, 2.0, 1.31, 2, 'n', 'n']
$490,928.34, took 46 ms

Using model LosAngeles_CA/LosAngeles_CA.tar.gz to predict price of this 

## Section 7 - Add new model to the endpoint, simply by copying the model artifact to the S3 location
<a id='update-models'></a>

In [566]:
## Copy the last model
last_training_job=training_jobs[PARALLEL_TRAINING_JOBS-1]
deploy_artifacts_to_mme(last_training_job)

job_name : linear-learner-2020-07-06-00-32-31-352
Copying Houston_TX model
   from: DEMO_MME_LINEAR_LEARNER/model_artifacts/Houston_TX/linear-learner-2020-07-06-00-32-31-352/output/model.tar.gz
     to: DEMO_MME_LINEAR_LEARNER/multi_model_artifacts/Houston_TX/Houston_TX.tar.gz...


In [567]:
model_name = LOCATIONS[PARALLEL_TRAINING_JOBS-1]
full_model_name = '{}/{}.tar.gz'.format(model_name,model_name)
predict_one_house_value(gen_random_house()[:-1], full_model_name,predictor)

Using model Houston_TX/Houston_TX.tar.gz to predict price of this house: [1985, 3457.8311086653835, 4, 2.5, 1.1, 2, 'y', 'n']
$499,028.28, took 1,185 ms



## Section 8 - Latency Analysis <a id='Latency-Analysis'></a>

With MME, the models are dynamically loaded into the container’s memory of the instance hosting the endpoint when invoked.  Therefore, the model invocation may longer when it is invoked for the first time. And after the model is already in the instance container’s memory, the subsequent invocations will be faster. If an instance memory utilization is high and a new model needs to be loaded then unused models are unloaded.  The unloaded models will remain in the instance’s storage volume and can be loaded into container’s memory later without being downloaded from the S3 bucket again.  If the instance’s storage volume if full, unused models are deleted from storage volume.    
Managing the loading/unloading of the models is completely handled by Amazon SageMaker behind the scenes without you having to take any specific actions.  However, it is important to understand this behavior because it has implications on the model invocation latency.

Amazon SageMaker provides CloudWatch metrics for multi-model endpoints so you can determine the endpoint usage and the cache hit rate and optimize your endpoint.  To analyze the endpoint and the container behavior, you will invoke multiple models in this sequence :

    a. Create 200 copies of the original model and save with different names.
    b. Starting with no models loaded into the container, Invoke the first 100 models
    c. Invoke the same 100 models again
    d. Invoke all 200 models

We use this sequence to observe the behavior of the CloudWatch metrics - LoadedModelCount,MemoryUtilization and ModelCacheHit.  You are encouraged to experiment with loading varying number of models to use the CloudWatch charts to help make ongoing decisions on the optimal choice of instance type, instance count, and number of models that a given endpoint should host.



In [579]:
# Make a copy of the model artifacts in S3 bucket with new names so we have multiple models to understand the latency behavior.
def copy_additional_artifacts_to_mme(num_copies):
    
    #_model_prefix = "Houston_TX"
    #_model_name = "Houston_TX"
    #source_s3_model_key = '{}/{}/{}/{}.tar.gz'.format(DATA_PREFIX, MULTI_MODEL_ARTIFACTS, _model_prefix,_model_name)
    source_s3_model_key = '{}/{}/{}/{}.tar.gz'.format(DATA_PREFIX, MULTI_MODEL_ARTIFACTS, model_name,model_name)
    _copy_source = {'Bucket': BUCKET, 'Key': source_s3_model_key}
    for i in range(num_copies):
        copy_num = str(i)
        new_model_name="{}_{}".format(i, model_name)
        #_new_model_name=copy_num + "_" + _model_name
        dest_s3_model_key = '{}/{}/{}/{}.tar.gz'.format(DATA_PREFIX, MULTI_MODEL_ARTIFACTS, _model_prefix,new_model_name)
        #print('Copying {} model\n   from: {}\n     to: {}...'.format(_model_name, source_s3_model_key, dest_s3_model_key))
        s3_client.copy_object(Bucket=BUCKET, CopySource=_copy_source, Key=dest_s3_model_key)
    return

In [580]:
##Create 200 copies of the original model and save with different names.
copy_additional_artifacts_to_mme(200)

In [581]:
##Invoke multiple models in a loop
#_model_prefix = "Houston_TX"
#_model_name = "Houston_TX.tar.gz"
def invoke_multiple_models_mme(model_range_low,model_range_high):
    for i in range(model_range_low,model_range_high):
        copy_num = str(i)
        _new_model_name=copy_num + "_" + _model_name
        full_model_name = _model_prefix + "/" + _new_model_name
        predict_one_house_value(gen_random_house()[:-1], full_model_name,predictor)
        time.sleep(5)


In [582]:
##Starting with no models loaded into the container
##Invoke the first 100 models
invoke_multiple_models_mme(0,100)

Using model Houston_TX/0_Houston_TX.tar.gz to predict price of this house: [1997, 2565.0242683195997, 5, 2.0, 0.89, 2, 'y', 'y']
$419,119.69, took 1,185 ms

Using model Houston_TX/1_Houston_TX.tar.gz to predict price of this house: [2001, 2124.638834058075, 2, 1.5, 0.89, 1, 'n', 'n']
$299,461.91, took 1,177 ms

Using model Houston_TX/2_Houston_TX.tar.gz to predict price of this house: [1999, 3846.298691252199, 4, 2.5, 0.89, 3, 'y', 'n']
$636,314.62, took 1,085 ms

Using model Houston_TX/3_Houston_TX.tar.gz to predict price of this house: [2001, 1534.997026116037, 2, 2.5, 0.82, 1, 'n', 'y']
$222,500.02, took 1,123 ms

Using model Houston_TX/4_Houston_TX.tar.gz to predict price of this house: [1979, 3820.2379746770575, 6, 1.0, 0.83, 1, 'n', 'n']
$482,779.06, took 1,059 ms

Using model Houston_TX/5_Houston_TX.tar.gz to predict price of this house: [1997, 3477.045397952993, 2, 1.0, 1.19, 1, 'n', 'n']
$483,016.19, took 1,064 ms

Using model Houston_TX/6_Houston_TX.tar.gz to predict price of

$370,235.91, took 1,054 ms

Using model Houston_TX/53_Houston_TX.tar.gz to predict price of this house: [1976, 3435.711022762174, 4, 2.0, 0.81, 3, 'n', 'n']
$433,348.91, took 1,073 ms

Using model Houston_TX/54_Houston_TX.tar.gz to predict price of this house: [1994, 3057.4187892410355, 3, 2.0, 1.27, 0, 'n', 'y']
$415,964.09, took 1,094 ms

Using model Houston_TX/55_Houston_TX.tar.gz to predict price of this house: [1997, 3619.9813391122443, 6, 3.0, 1.15, 0, 'y', 'y']
$580,307.00, took 1,078 ms

Using model Houston_TX/56_Houston_TX.tar.gz to predict price of this house: [1998, 1373.4094471228366, 6, 1.0, 1.38, 1, 'y', 'y']
$233,503.53, took 1,066 ms

Using model Houston_TX/57_Houston_TX.tar.gz to predict price of this house: [2012, 2374.4588553443946, 4, 1.0, 0.88, 0, 'n', 'n']
$389,029.50, took 1,033 ms

Using model Houston_TX/58_Houston_TX.tar.gz to predict price of this house: [1974, 2638.7830126166587, 2, 2.0, 0.73, 0, 'n', 'y']
$234,718.50, took 1,062 ms

Using model Houston_TX/59

In [583]:
##Invoke the same 100 models again
invoke_multiple_models_mme(0,100)

Using model Houston_TX/0_Houston_TX.tar.gz to predict price of this house: [2004, 2794.484558800993, 6, 1.0, 0.57, 0, 'y', 'n']
$449,006.00, took 44 ms

Using model Houston_TX/1_Houston_TX.tar.gz to predict price of this house: [1976, 2192.398435096421, 5, 2.0, 0.89, 3, 'n', 'y']
$255,495.86, took 36 ms

Using model Houston_TX/2_Houston_TX.tar.gz to predict price of this house: [1993, 3384.7252863609297, 5, 3.0, 0.52, 1, 'n', 'y']
$497,157.50, took 36 ms

Using model Houston_TX/3_Houston_TX.tar.gz to predict price of this house: [1979, 3577.556959770925, 2, 1.5, 1.23, 1, 'n', 'y']
$417,818.12, took 36 ms

Using model Houston_TX/4_Houston_TX.tar.gz to predict price of this house: [1972, 2129.4818467259574, 4, 3.0, 1.2, 0, 'y', 'n']
$215,097.62, took 35 ms

Using model Houston_TX/5_Houston_TX.tar.gz to predict price of this house: [1988, 4667.519348033415, 3, 1.0, 1.13, 3, 'n', 'y']
$656,766.62, took 38 ms

Using model Houston_TX/6_Houston_TX.tar.gz to predict price of this house: [2003,

Using model Houston_TX/54_Houston_TX.tar.gz to predict price of this house: [1999, 4006.755219480705, 4, 1.5, 1.36, 3, 'y', 'y']
$652,765.44, took 35 ms

Using model Houston_TX/55_Houston_TX.tar.gz to predict price of this house: [2006, 3302.0395404577816, 4, 1.5, 0.95, 2, 'y', 'n']
$558,970.75, took 36 ms

Using model Houston_TX/56_Houston_TX.tar.gz to predict price of this house: [2006, 3702.8977963297243, 4, 2.5, 0.73, 2, 'n', 'n']
$610,570.81, took 36 ms

Using model Houston_TX/57_Houston_TX.tar.gz to predict price of this house: [2002, 1715.2535238946057, 5, 1.0, 0.6, 0, 'y', 'y']
$264,349.12, took 35 ms

Using model Houston_TX/58_Houston_TX.tar.gz to predict price of this house: [1999, 2185.608272580093, 5, 1.5, 1.37, 3, 'n', 'y']
$367,545.78, took 35 ms

Using model Houston_TX/59_Houston_TX.tar.gz to predict price of this house: [1988, 3962.879224134585, 4, 2.0, 0.9, 2, 'y', 'y']
$576,875.00, took 37 ms

Using model Houston_TX/60_Houston_TX.tar.gz to predict price of this house:

In [584]:
##This itme invoke all 200 models to observe behavior
invoke_multiple_models_mme(0,200)

Using model Houston_TX/0_Houston_TX.tar.gz to predict price of this house: [2000, 3568.0627454483993, 5, 3.0, 0.96, 2, 'y', 'y']
$602,203.12, took 34 ms

Using model Houston_TX/1_Houston_TX.tar.gz to predict price of this house: [1998, 2827.705072295027, 5, 3.0, 0.9, 0, 'y', 'y']
$449,849.25, took 34 ms

Using model Houston_TX/2_Houston_TX.tar.gz to predict price of this house: [1994, 3995.3203388096554, 2, 2.5, 1.0, 0, 'n', 'n']
$551,548.56, took 35 ms

Using model Houston_TX/3_Houston_TX.tar.gz to predict price of this house: [2015, 2380.6663201950896, 2, 1.0, 1.16, 2, 'y', 'y']
$436,848.09, took 35 ms

Using model Houston_TX/4_Houston_TX.tar.gz to predict price of this house: [1985, 2406.790788985759, 6, 1.0, 0.86, 1, 'n', 'n']
$298,406.50, took 35 ms

Using model Houston_TX/5_Houston_TX.tar.gz to predict price of this house: [1996, 4779.7331536301945, 2, 1.0, 0.86, 2, 'y', 'n']
$703,882.44, took 37 ms

Using model Houston_TX/6_Houston_TX.tar.gz to predict price of this house: [2009

Using model Houston_TX/54_Houston_TX.tar.gz to predict price of this house: [2009, 3119.340980111573, 6, 1.5, 1.03, 3, 'n', 'y']
$561,295.38, took 39 ms

Using model Houston_TX/55_Houston_TX.tar.gz to predict price of this house: [2006, 4098.558783815178, 4, 2.0, 0.95, 0, 'n', 'y']
$636,054.94, took 36 ms

Using model Houston_TX/56_Houston_TX.tar.gz to predict price of this house: [1981, 3588.4696347421045, 4, 2.0, 1.06, 1, 'n', 'y']
$454,515.53, took 36 ms

Using model Houston_TX/57_Houston_TX.tar.gz to predict price of this house: [1987, 2416.3885544447107, 4, 2.5, 1.22, 3, 'n', 'n']
$347,895.06, took 35 ms

Using model Houston_TX/58_Houston_TX.tar.gz to predict price of this house: [1983, 2755.0087500935538, 6, 1.0, 0.81, 2, 'y', 'n']
$375,323.31, took 36 ms

Using model Houston_TX/59_Houston_TX.tar.gz to predict price of this house: [1994, 3097.5411074680205, 4, 2.0, 1.14, 2, 'y', 'y']
$479,563.69, took 35 ms

Using model Houston_TX/60_Houston_TX.tar.gz to predict price of this hou

Using model Houston_TX/107_Houston_TX.tar.gz to predict price of this house: [1994, 3428.296601149494, 5, 3.0, 0.92, 3, 'n', 'n']
$547,376.94, took 1,098 ms

Using model Houston_TX/108_Houston_TX.tar.gz to predict price of this house: [2006, 2278.9185222287347, 5, 2.5, 1.18, 0, 'n', 'y']
$383,129.38, took 1,129 ms

Using model Houston_TX/109_Houston_TX.tar.gz to predict price of this house: [2001, 2370.5332753950756, 3, 3.0, 0.77, 3, 'n', 'y']
$395,596.66, took 1,603 ms

Using model Houston_TX/110_Houston_TX.tar.gz to predict price of this house: [1991, 2740.176776668868, 5, 2.5, 0.77, 0, 'y', 'n']
$394,140.69, took 1,066 ms

Using model Houston_TX/111_Houston_TX.tar.gz to predict price of this house: [2004, 2321.1918414062934, 3, 1.0, 1.37, 1, 'y', 'n']
$376,043.47, took 1,164 ms

Using model Houston_TX/112_Houston_TX.tar.gz to predict price of this house: [1994, 1543.6141118970802, 3, 1.5, 0.93, 0, 'n', 'y']
$172,503.56, took 1,105 ms

Using model Houston_TX/113_Houston_TX.tar.gz to 

$440,087.19, took 1,077 ms

Using model Houston_TX/159_Houston_TX.tar.gz to predict price of this house: [2003, 2092.5380512405945, 3, 1.0, 0.88, 2, 'n', 'y']
$320,000.81, took 1,084 ms

Using model Houston_TX/160_Houston_TX.tar.gz to predict price of this house: [2004, 2976.420693765037, 5, 2.0, 1.11, 2, 'y', 'n']
$521,224.50, took 1,126 ms

Using model Houston_TX/161_Houston_TX.tar.gz to predict price of this house: [2005, 2253.2190916737027, 5, 2.0, 0.59, 2, 'n', 'y']
$384,708.50, took 1,075 ms

Using model Houston_TX/162_Houston_TX.tar.gz to predict price of this house: [1987, 2743.220406909226, 6, 2.5, 1.16, 2, 'n', 'y']
$400,879.88, took 1,075 ms

Using model Houston_TX/163_Houston_TX.tar.gz to predict price of this house: [1983, 4044.6155346332425, 6, 2.0, 1.29, 1, 'n', 'n']
$560,354.50, took 1,099 ms

Using model Houston_TX/164_Houston_TX.tar.gz to predict price of this house: [2007, 3056.150301809477, 6, 1.5, 1.01, 0, 'y', 'y']
$517,704.56, took 1,073 ms

Using model Houston_T

#### CloudWatch charts for LoadedModelCount,MemoryUtilization and ModelCacheHit metrics will be similar to charts below.

![](cw_charts/ModelCountMemUtilization.png)

“LoadedModelCount” continuously increases, as more models are invocated, till it levels off at 121.  “MemoryUtilization” of the container also increased correspondingly to around 79%.  This shows that the instance chosen to host the endpoint, could only maintain 121 models in memory, when 200 model invocations are made.  

![](cw_charts/ModelCountMemUtilizationCacheHit.png)

As the number of models loaded to the container memory increase, the ModelCacheHit improves.  When the same 100 models are invoked the second time, the ModelCacheHit reaches 1.  When new models, not yet loaded are invoked the ModelCacheHit decreases again. 

## Section 9 - Explore granular access to the target models of MME <a id='Finegrain-control-invoke-models'></a>

If the role attached to this notebook instance allows invoking SageMaker endpoints, it is able to invoke all models hosted on the MME.  Using IAM conditional keys, you can restrict this model invocation access to specific models.  To explore this, you will create a new IAM role and IAM policy with conditional key to restrict access to a single model.  Assume this new role and verify that only a single target model can be invoked.

Note that to execute this section, the role attached to the notebook instance should allow the following actions :
    "iam:CreateRole",
    "iam:CreatePolicy",
    "iam:AttachRolePolicy",
    "iam:UpdateAssumeRolePolicy"
    
If this is not the case, please work the Administrator of this AWS account to ensure this.  

In [585]:
iam_client = boto3.client('iam')

In [None]:
#Get the role assumed by this notebook instance.
#role

In [586]:
#Create a new role that can be assumed by this notebook.  The roles should allow access to only a single model.

#job_name='scikit-learn-preprocessor-'+strftime('%Y-%m-%d-%H-%M-%S', gmtime())
path='/'
role_name='allow_invoke_ny_model_role'+strftime('%Y-%m-%d-%H-%M-%S', gmtime())
description='Role that allows invoking a single model'

action_string = "sts:AssumeRole"
    
trust_policy={
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "statement1",
      "Effect": "Allow",
      "Principal": {
        "AWS": role
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

In [587]:
response = iam_client.create_role(
    Path=path,
    RoleName=role_name,
    AssumeRolePolicyDocument=json.dumps(trust_policy),
    Description=description,
    MaxSessionDuration=3600
)

In [588]:
role_arn=response['Role']['Arn']
print("Role arn is :", role_arn)

Role arn is : arn:aws:iam::555360056434:role/allow_invoke_ny_model_role2020-07-06-01-47-13


In [589]:
endpoint_resource_arn="arn:aws:sagemaker:" + REGION + ":" + ACCOUNT_ID + ":endpoint/" + endpoint_name
print("Endpoint arn is :", endpoint_resource_arn)

Endpoint arn is : arn:aws:sagemaker:us-east-1:555360056434:endpoint/inference-pipeline-ep-2020-07-06-00-42-02


In [590]:
##Create the IAM policy with the IAM condition key
policy_name= 'allow_invoke_ny_model_policy'+strftime('%Y-%m-%d-%H-%M-%S', gmtime())
managed_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "SageMakerAccess",
            "Action": "sagemaker:InvokeEndpoint",
            "Effect": "Allow",
            "Resource":endpoint_resource_arn,
            "Condition": {
                "StringLike": {
                    "sagemaker:TargetModel": ["NewYork_NY/*"]
                }
            }
        }
    ]
}

response = iam_client.create_policy(
  PolicyName=policy_name,
  PolicyDocument=json.dumps(managed_policy)
)

In [591]:
policy_arn=response['Policy']['Arn']

In [592]:
##Attach policy to role
iam_client.attach_role_policy(
    PolicyArn=policy_arn,
    RoleName=role_name
)

{'ResponseMetadata': {'RequestId': '030e9369-7bc4-49b1-8272-eb237154109f',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '030e9369-7bc4-49b1-8272-eb237154109f',
   'content-type': 'text/xml',
   'content-length': '212',
   'date': 'Mon, 06 Jul 2020 01:47:28 GMT'},
  'RetryAttempts': 0}}

In [593]:
## Invoke with the role that has access to only NY model
sts_connection = boto3.client('sts')
assumed_role_limited_access = sts_connection.assume_role(
    RoleArn=role_arn,
    RoleSessionName="MME_Invoke_NY_Model"
)
assumed_role_limited_access['AssumedRoleUser']['Arn']


'arn:aws:sts::555360056434:assumed-role/allow_invoke_ny_model_role2020-07-06-01-47-13/MME_Invoke_NY_Model'

In [594]:
trust_policy={
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "statement1",
      "Effect": "Allow",
      "Principal": {
        "AWS": role
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Sid": "statement2",
      "Effect": "Allow",
      "Principal": {
          "AWS": assumed_role_limited_access['AssumedRoleUser']['Arn']
      },
      "Action": "sts:AssumeRole"
    }  
  ]
}

In [595]:
iam_client.update_assume_role_policy(
    RoleName=role_name,
    PolicyDocument=json.dumps(trust_policy)
)

{'ResponseMetadata': {'RequestId': '194a73d1-c091-4d9e-9304-c372aa339514',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '194a73d1-c091-4d9e-9304-c372aa339514',
   'content-type': 'text/xml',
   'content-length': '224',
   'date': 'Mon, 06 Jul 2020 01:47:32 GMT'},
  'RetryAttempts': 0}}

In [596]:
ACCESS_KEY = assumed_role_limited_access['Credentials']['AccessKeyId']
SECRET_KEY = assumed_role_limited_access['Credentials']['SecretAccessKey']
SESSION_TOKEN = assumed_role_limited_access['Credentials']['SessionToken']

runtime_sm_client_with_assumed_role = boto3.client(
    service_name='sagemaker-runtime', 
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
    aws_session_token=SESSION_TOKEN,
)

In [597]:
 sagemakerSessionAssumedRole = sagemaker.Session(sagemaker_runtime_client=runtime_sm_client_with_assumed_role)

In [598]:
predictorAssumedRole = RealTimePredictor(
    endpoint=endpoint_name,
    sagemaker_session=sagemakerSessionAssumedRole,
    serializer=csv_serializer,
    content_type=CONTENT_TYPE_CSV,
    accept=CONTENT_TYPE_JSON)

In [599]:
full_model_name = 'NewYork_NY/NewYork_NY.tar.gz'
predict_one_house_value(gen_random_house()[:-1], full_model_name,predictorAssumedRole)

Using model NewYork_NY/NewYork_NY.tar.gz to predict price of this house: [1983, 4157.329826676647, 2, 2.5, 0.56, 0, 'y', 'y']
$533,722.12, took 1,123 ms



In [600]:
##This should fail with "AccessDeniedException" since the assumed role does not have access to Chicago model
full_model_name = 'Chicago_IL/Chicago_IL.tar.gz'
predict_one_house_value(gen_random_house()[:-1], full_model_name,predictorAssumedRole)

Using model Chicago_IL/Chicago_IL.tar.gz to predict price of this house: [2002, 4111.711154533768, 2, 2.0, 0.72, 2, 'y', 'y']


ClientError: An error occurred (AccessDeniedException) when calling the InvokeEndpoint operation: User: arn:aws:sts::555360056434:assumed-role/allow_invoke_ny_model_role2020-07-06-01-47-13/MME_Invoke_NY_Model is not authorized to perform: sagemaker:InvokeEndpoint on resource: arn:aws:sagemaker:us-east-1:555360056434:endpoint/inference-pipeline-ep-2020-07-06-00-42-02

## Clean up<a id='CleanUp'></a>
Clean up the endpoint to avoid unneccessary costs.



In [None]:
# shut down the endpoint
#sm_client.delete_endpoint(EndpointName=endpoint_name)