# Amazon SageMaker Multi-Model Endpoints using Scikit Learn
This notebook shows an example of using multi-model endpoints for **predicting housing prices** at different locations. The predicitve models are trainined in Scikit Learn and will be deployed behind a Sagemaker multi-model endpoint.

In [1]:
# Assumption: I already have the list of trained model for 4 cities in the US
model_list = ['LosAngeles_CA', 'Chicago_IL', 'Houston_TX', 'NewYork_NY']

In [2]:
# First I install the dependencies
!pip install -qU awscli boto3 sagemaker

[33mYou are using pip version 10.0.1, however version 20.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.[0m


### Set names for S3 Prefix, Model uploading artifact, ECR Image name, Endpoint Instance Type

First get the role and session.
The S3 bucket name and the S3 data prefix associated with the multi-model endpoint is declared here. If the bucket does not exist it is also created. Note that a default bucket is created using sagemaker sessions to ensure the right roles are attached.
The script_file has the function declaration for loading the model in the container and the the user_code_artifact is the tar.gz file packages this code and uploads to the S3 bucket later.
Also declared is the name of the ECR image for deploying the model.
The instance type of the endpoint will be an ml.m4.xlarge instance

In [3]:
import sagemaker
from sagemaker import get_execution_role
from sagemaker.predictor import csv_serializer
import boto3

sm_client = boto3.client(service_name='sagemaker')
runtime_sm_client = boto3.client(service_name='sagemaker-runtime')

s3 = boto3.resource('s3')
s3_client = boto3.client('s3')

sagemaker_session = sagemaker.Session()
role = get_execution_role()

# S3 Prefix
BUCKET      = sagemaker_session.default_bucket()
DATA_PREFIX            = 'DEMO_MME_SCIKIT'
MULTI_MODEL_ARTIFACTS  = 'multi_model_artifacts'

# Additional API: Artifact for uploading model from S3 to Container
SCRIPT_FILENAME     = 'inference.py'
USER_CODE_ARTIFACTS = 'user_code.tar.gz'

# ECR Image name
ALGORITHM_NAME = 'multi-model-sklearn'
ACCOUNT_ID  = boto3.client('sts').get_caller_identity()['Account']
REGION      = boto3.Session().region_name
MULTI_MODEL_SKLEARN_IMAGE = '{}.dkr.ecr.{}.amazonaws.com/{}:latest'.format(ACCOUNT_ID, REGION, 
                                                                           ALGORITHM_NAME)
# ECR Image and Model Object Name
HOUSING_MODEL_NAME     = 'ushousing'
ENDPOINT_INSTANCE_TYPE = 'ml.m4.xlarge'

### Build and register a Scikit Learn container that can serve multiple models

This is the script for creating the image that will serve the multi-model endpoint.
This image if it not in ECR will be created and pushed to ECR.
For an inference container to serve multiple models in a multi-model endpoint, we must implement [additional APIs](https://docs.aws.amazon.com/sagemaker/latest/dg/build-multi-model-build-container.html) in order to load, list, get, unload and invoke specific models.

We refer to the 'mme' branch of the [SageMaker Scikit Learn Container repository](https://github.com/aws/sagemaker-scikit-learn-container/tree/mme) which shows an example implementation on how to adapt SageMaker's Scikit Learn framework container to use [Multi Model Server](https://github.com/awslabs/multi-model-server). MMS is a framework that provides an HTTP frontend that implements the additional container APIs required by multi-model endpoints, and also provides a pluggable backend handler for serving models using a custom framework, in this case the Scikit Learn framework.

In [4]:
%%sh -s $ALGORITHM_NAME

algorithm_name=$1

account=$(aws sts get-caller-identity --query Account --output text)

# Get the region defined in the current configuration (default to us-west-2 if none defined)
region=$(aws configure get region)

ecr_image="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:latest"

# If the repository doesn't exist in ECR, create it.
aws ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1

if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi

# Get the login command from ECR and execute it directly
$(aws ecr get-login --region ${region} --no-include-email --registry-ids ${account})

# Build the docker image locally with the image name and then push it to ECR
# with the full image name.

# First clear out any prior version of the cloned repo
rm -rf sagemaker-scikit-learn-container/

# Clone the sklearn container repo
git clone --single-branch --branch mme https://github.com/aws/sagemaker-scikit-learn-container.git
cd sagemaker-scikit-learn-container/

# Build the "base" container image that encompasses the installation of the
# scikit-learn framework and all of the dependencies needed.
docker build -q -t sklearn-base:0.20-2-cpu-py3 -f docker/0.20-2/base/Dockerfile.cpu --build-arg py_version=3 .

# Create the SageMaker Scikit-learn Container Python package.
python setup.py bdist_wheel --universal

# Build the "final" container image that encompasses the installation of the
# code that implements the SageMaker multi-model container requirements.
docker build -q -t ${algorithm_name} -f docker/0.20-2/final/Dockerfile.cpu .

docker tag ${algorithm_name} ${ecr_image}

docker push ${ecr_image}

Login Succeeded
sha256:33fac7063dad5e17683ecce178e56dac4c6f09ead19ff4f53453b6e2167a2903
running bdist_wheel
running build
running build_py
creating build
creating build/lib
creating build/lib/sagemaker_sklearn_container
copying src/sagemaker_sklearn_container/__init__.py -> build/lib/sagemaker_sklearn_container
copying src/sagemaker_sklearn_container/training.py -> build/lib/sagemaker_sklearn_container
copying src/sagemaker_sklearn_container/serving.py -> build/lib/sagemaker_sklearn_container
copying src/sagemaker_sklearn_container/handler_service.py -> build/lib/sagemaker_sklearn_container
creating build/lib/sagemaker_sklearn_container/mms_patch
copying src/sagemaker_sklearn_container/mms_patch/mms_transformer.py -> build/lib/sagemaker_sklearn_container/mms_patch
copying src/sagemaker_sklearn_container/mms_patch/__init__.py -> build/lib/sagemaker_sklearn_container/mms_patch
copying src/sagemaker_sklearn_container/mms_patch/model_server.py -> build/lib/sagemaker_sklearn_container/mms_p

https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Cloning into 'sagemaker-scikit-learn-container'...


### The code used to train the model

Here is a script that helps load the model to the container. This script is both for training the model and loading the model.

In [5]:
%%writefile $SCRIPT_FILENAME

import argparse
import os
import glob

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.externals import joblib

# inference functions ---------------
def model_fn(model_dir):
    print('loading model.joblib from: {}'.format(model_dir))
    _loaded_model = joblib.load(os.path.join(model_dir, 'model.joblib'))
    return _loaded_model


if __name__ =='__main__':

    print('extracting arguments')
    parser = argparse.ArgumentParser()

    # hyperparameters sent by the client are passed as command-line arguments to the script.
    # to simplify the demo we don't use all sklearn RandomForest hyperparameters
    parser.add_argument('--n-estimators', type=int, default=10)
    parser.add_argument('--min-samples-leaf', type=int, default=3)

    # Data, model, and output directories
    parser.add_argument('--model-dir', type=str, default=os.environ.get('SM_MODEL_DIR'))
    parser.add_argument('--train', type=str, default=os.environ.get('SM_CHANNEL_TRAIN'))
    parser.add_argument('--validation', type=str, default=os.environ.get('SM_CHANNEL_VALIDATION'))
    parser.add_argument('--model-name', type=str)

    args, _ = parser.parse_known_args()

    print('reading data')
    print('model_name: {}'.format(args.model_name))

    train_file = os.path.join(args.train, args.model_name + '_train.csv')    
    train_df = pd.read_csv(train_file)

    val_file = os.path.join(args.validation, args.model_name + '_val.csv')
    test_df = pd.read_csv(os.path.join(val_file))

    print('building training and testing datasets')
    X_train = train_df[train_df.columns[1:train_df.shape[1]]] 
    X_test = test_df[test_df.columns[1:test_df.shape[1]]]
    y_train = train_df[train_df.columns[0]]
    y_test = test_df[test_df.columns[0]]

    # train
    print('training model')
    model = RandomForestRegressor(
        n_estimators=args.n_estimators,
        min_samples_leaf=args.min_samples_leaf,
        n_jobs=-1)
    
    model.fit(X_train, y_train)

    # print abs error
    print('validating model')
    abs_err = np.abs(model.predict(X_test) - y_test)

    # print couple perf metrics
    for q in [10, 50, 90]:
        print('AE-at-' + str(q) + 'th-percentile: '
              + str(np.percentile(a=abs_err, q=q)))
        
    # persist model
    path = os.path.join(args.model_dir, 'model.joblib')
    joblib.dump(model, path)
    print('model persisted at ' + path)

Overwriting inference.py


### Package Inference entry point code

In [6]:
# When using multi-model endpoints with the Scikit Learn container, we need to provide an entry point for
# inference, a function that will load the saved model in S3 into the container. This function uploads a 
# the user code artifact containing such a script. This tar.gz file will be fed to the SageMaker multi-model 
# creation and pointed to by theSAGEMAKER_SUBMIT_DIRECTORY environment variable.

def upload_inference_code(script_file_name, prefix):
    _tmp_folder = 'inference-code'
    if not os.path.exists(_tmp_folder):
        os.makedirs(_tmp_folder)
    !tar -czvf $_tmp_folder/$USER_CODE_ARTIFACTS $script_file_name > /dev/null
    _loc = sagemaker_session.upload_data(_tmp_folder, 
                                         key_prefix='{}/{}'.format(prefix, _tmp_folder))
    return _loc + '/' + USER_CODE_ARTIFACTS

### Create Model for multi-model endpoint

In [7]:
# Thus function creates the model object with the path to the S3 prefix, the ECR image that was created previously
# and the path to the code that loads the model into the container
def create_multi_model_entity(multi_model_name, role):
    # establish the place in S3 from which the endpoint will pull individual models
    _model_url  = 's3://{}/{}/{}/'.format(BUCKET, DATA_PREFIX, MULTI_MODEL_ARTIFACTS)
    _container = {
        'Image':        MULTI_MODEL_SKLEARN_IMAGE,
        'ModelDataUrl': _model_url,
        'Mode':         'MultiModel',
        'Environment': {
            'SAGEMAKER_PROGRAM' : SCRIPT_FILENAME,
            'SAGEMAKER_SUBMIT_DIRECTORY' : upload_inference_code(SCRIPT_FILENAME, DATA_PREFIX)
        }
    }
    create_model_response = sm_client.create_model(
        ModelName = multi_model_name,
        ExecutionRoleArn = role,
        Containers = [_container])
    
    return _model_url

In [8]:
from time import gmtime, strftime
import os

multi_model_name = '{}-{}'.format(HOUSING_MODEL_NAME, strftime('%Y-%m-%d-%H-%M-%S', gmtime()))
model_url = create_multi_model_entity(multi_model_name, role)
print('Multi model name: {}'.format(multi_model_name))

Multi model name: ushousing-2020-01-23-04-58-23


### Create Endpoint Config

Next we create an endpoint config that defines the instance type, count variant weight and the name of the model that was created in the previous step

In [9]:
endpoint_config_name = multi_model_name
print('Endpoint config name: ' + endpoint_config_name)

create_endpoint_config_response = sm_client.create_endpoint_config(
    EndpointConfigName = endpoint_config_name,
    ProductionVariants=[{
        'InstanceType': ENDPOINT_INSTANCE_TYPE,
        'InitialInstanceCount': 1,
        'InitialVariantWeight': 1,
        'ModelName'   : multi_model_name,
        'VariantName' : 'AllTraffic'}])

endpoint_name = multi_model_name
print('Endpoint name: ' + endpoint_name)

Endpoint config name: ushousing-2020-01-23-04-58-23
Endpoint name: ushousing-2020-01-23-04-58-23


### Create Endpoint

Next the model is deployed by creating the endpoint. The endpoint takes the endpoint config that was created previously.

In [10]:
create_endpoint_response = sm_client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name)
print('Endpoint Arn: ' + create_endpoint_response['EndpointArn'])

Endpoint Arn: arn:aws:sagemaker:us-east-1:963992372437:endpoint/ushousing-2020-01-23-04-58-23


In [11]:
print('Waiting for {} endpoint to be in service...'.format(endpoint_name))
waiter = sm_client.get_waiter('endpoint_in_service')
waiter.wait(EndpointName=endpoint_name)

Waiting for ushousing-2020-01-23-04-58-23 endpoint to be in service...


### Generate synthetic test data

The models for house price prediction were built using synthetically generated data. We use the same process to generate some test data or inference payload.
Each test sample has six attributes namely area of house, num of bedroom, bathrooms, lot acres, garage space and year built. These are linearly combined and added to a base price to derive the price. Goal to predict the price for a new data sample.


In [12]:
NUM_HOUSES_PER_LOCATION = 1000
LOCATIONS  = ['NewYork_NY',    'LosAngeles_CA',   'Chicago_IL',    'Houston_TX',   'Dallas_TX',
              'Phoenix_AZ',    'Philadelphia_PA', 'SanAntonio_TX', 'SanDiego_CA',  'SanFrancisco_CA']
COLUMNS = ['PRICE', 'YEAR_BUILT', 'SQUARE_FEET', 'NUM_BEDROOMS',
           'NUM_BATHROOMS', 'LOT_ACRES', 'GARAGE_SPACES']
MAX_YEAR = 2019

In [13]:
def gen_price(house):
    _base_price = int(house['SQUARE_FEET'] * 150)
    _price = int(_base_price + (10000 * house['NUM_BEDROOMS']) + \
                               (15000 * house['NUM_BATHROOMS']) + \
                               (15000 * house['LOT_ACRES']) + \
                               (15000 * house['GARAGE_SPACES']) - \
                               (5000 * (MAX_YEAR - house['YEAR_BUILT'])))
    return _price

In [14]:
def gen_random_house():
    _house = {'SQUARE_FEET':   int(np.random.normal(3000, 750)),
              'NUM_BEDROOMS':  np.random.randint(2, 7),
              'NUM_BATHROOMS': np.random.randint(2, 7) / 2,
              'LOT_ACRES':     round(np.random.normal(1.0, 0.25), 2),
              'GARAGE_SPACES': np.random.randint(0, 4),
              'YEAR_BUILT':    min(MAX_YEAR, int(np.random.normal(1995, 10)))}
    _price = gen_price(_house)
    return [_price, _house['YEAR_BUILT'],   _house['SQUARE_FEET'], 
                    _house['NUM_BEDROOMS'], _house['NUM_BATHROOMS'], 
                    _house['LOT_ACRES'],    _house['GARAGE_SPACES']]

### Uploading 2 Models from Local to S3 Bucket

Two of the models from the model list are uploaded to the S3 bucket with the fixed prefix.

In [15]:
!aws s3 cp LosAngeles_CA.tar.gz s3://$BUCKET/$DATA_PREFIX/$MULTI_MODEL_ARTIFACTS/
!aws s3 cp Chicago_IL.tar.gz s3://$BUCKET/$DATA_PREFIX/$MULTI_MODEL_ARTIFACTS/

upload: ./LosAngeles_CA.tar.gz to s3://sagemaker-us-east-1-963992372437/DEMO_MME_SCIKIT/multi_model_artifacts/LosAngeles_CA.tar.gz
upload: ./Chicago_IL.tar.gz to s3://sagemaker-us-east-1-963992372437/DEMO_MME_SCIKIT/multi_model_artifacts/Chicago_IL.tar.gz


### List of models the MME endpoint will have access to

This is the list of model in the S3 bucket

In [16]:
print('Here are the models that the endpoint has at its disposal:')
!aws s3 ls --human-readable --summarize $model_url

Here are the models that the endpoint has at its disposal:
2020-01-23 05:15:03  386.9 KiB Chicago_IL.tar.gz
2020-01-23 05:15:02  384.3 KiB LosAngeles_CA.tar.gz

Total Objects: 2
   Total Size: 771.1 KiB


### Invoke multiple individual models hosted behind a single endpoint

Next we show how to invoke the endpoint using this test data using the InvokeEndpoint function that we have seen before. The predict_one house function is predicting the price for 1 house given the model name and payload.

Some of the model names are randomly picked and prediction made. Note that we get an error when we are trying to call a model that does not existing in the S3 bucket.

In [17]:
import json
import datetime
import time

In [18]:
def predict_one_house_value(features, model_name):
    print('Using model {} to predict price of this house: {}'.format(full_model_name,
                                                                     features))

    _float_features = [float(i) for i in features]
    _body = ','.join(map(str, _float_features)) + '\n'
    
    _start_time = time.time()

    _response = runtime_sm_client.invoke_endpoint(
                        EndpointName=endpoint_name,
                        ContentType='text/csv',
                        TargetModel=full_model_name,
                        Body=_body)
    _predicted_value = json.loads(_response['Body'].read())[0]

    _duration = time.time() - _start_time
    
    print('${:,.2f}, took {:,d} ms\n'.format(_predicted_value, int(_duration * 1000)))

In [20]:
model_list2 = ['Chicago_IL', 'LosAngeles_CA']
for i in range(10):
    model_name = model_list2[np.random.randint(1, len(model_list2))]
    full_model_name = '{}.tar.gz'.format(model_name)
    predict_one_house_value(gen_random_house()[1:], full_model_name)

Using model LosAngeles_CA.tar.gz to predict price of this house: [2008, 4057, 4, 1.0, 0.99, 1]
$642,295.23, took 1,037 ms

Using model LosAngeles_CA.tar.gz to predict price of this house: [2005, 4062, 5, 2.0, 0.63, 1]
$643,161.18, took 123 ms

Using model LosAngeles_CA.tar.gz to predict price of this house: [2004, 4550, 2, 2.5, 0.98, 3]
$723,025.77, took 114 ms

Using model LosAngeles_CA.tar.gz to predict price of this house: [1996, 2470, 4, 1.0, 1.02, 3]
$375,575.37, took 115 ms

Using model LosAngeles_CA.tar.gz to predict price of this house: [1983, 3064, 6, 2.0, 1.04, 2]
$416,198.69, took 114 ms

Using model LosAngeles_CA.tar.gz to predict price of this house: [1991, 3235, 4, 1.0, 0.97, 2]
$470,995.81, took 118 ms

Using model LosAngeles_CA.tar.gz to predict price of this house: [1997, 1924, 6, 1.5, 0.68, 0]
$280,092.33, took 115 ms

Using model LosAngeles_CA.tar.gz to predict price of this house: [1994, 3294, 5, 1.0, 0.98, 3]
$491,854.85, took 116 ms

Using model LosAngeles_CA.tar.

In [19]:
# iterate through invocations with random inputs against a random model showing results and latency
import numpy as np

for i in range(10):
    model_name = model_list[np.random.randint(1, len(model_list))]
    full_model_name = '{}.tar.gz'.format(model_name)
    predict_one_house_value(gen_random_house()[1:], full_model_name)

Using model Houston_TX.tar.gz to predict price of this house: [2000, 3050, 4, 1.5, 1.12, 1]


ValidationError: An error occurred (ValidationError) when calling the InvokeEndpoint operation: "Failed to download model data(bucket: sagemaker-us-east-1-963992372437, key: DEMO_MME_SCIKIT/multi_model_artifacts/Houston_TX.tar.gz). Please ensure that there is an object located at the URL and that the role passed to CreateModel has permissions to download the model.
"

### Dynamically deploy another model

Nex twe upload the other two models. Note there is no code change for the new models. Note that it takes more time to respond as it is uploading the model the first time. After that latency is much less. So Any one of the models can be invoked using the same nul-model endpoint.

In [21]:
!aws s3 cp Houston_TX.tar.gz s3://$BUCKET/$DATA_PREFIX/$MULTI_MODEL_ARTIFACTS/
!aws s3 cp NewYork_NY.tar.gz s3://$BUCKET/$DATA_PREFIX/$MULTI_MODEL_ARTIFACTS/

upload: ./Houston_TX.tar.gz to s3://sagemaker-us-east-1-963992372437/DEMO_MME_SCIKIT/multi_model_artifacts/Houston_TX.tar.gz
upload: ./NewYork_NY.tar.gz to s3://sagemaker-us-east-1-963992372437/DEMO_MME_SCIKIT/multi_model_artifacts/NewYork_NY.tar.gz


In [22]:
print('Here are the models that the endpoint has at its disposal:')
!aws s3 ls $model_url

Here are the models that the endpoint has at its disposal:
2020-01-23 05:15:03     396157 Chicago_IL.tar.gz
2020-01-23 05:17:30     396349 Houston_TX.tar.gz
2020-01-23 05:15:02     393494 LosAngeles_CA.tar.gz
2020-01-23 05:17:31     395990 NewYork_NY.tar.gz


In [23]:
#
model_name = 'NewYork_NY'
full_model_name = '{}.tar.gz'.format(model_name)
for i in range(5):
    features = gen_random_house()
    predict_one_house_value(gen_random_house()[1:], full_model_name)

Using model NewYork_NY.tar.gz to predict price of this house: [1983, 4122, 6, 2.0, 0.88, 3]
$544,632.56, took 877 ms

Using model NewYork_NY.tar.gz to predict price of this house: [1992, 3794, 5, 3.0, 1.32, 2]
$566,414.70, took 120 ms

Using model NewYork_NY.tar.gz to predict price of this house: [1996, 1636, 3, 2.0, 0.92, 0]
$230,567.33, took 115 ms

Using model NewYork_NY.tar.gz to predict price of this house: [2001, 2524, 5, 1.0, 1.2, 1]
$413,050.06, took 119 ms

Using model NewYork_NY.tar.gz to predict price of this house: [1984, 3486, 4, 2.0, 0.75, 3]
$451,767.35, took 121 ms



### Clean Up

In [None]:
# shut down the endpoint
sm_client.delete_endpoint(EndpointName=endpoint_name)
# and the endpoint config
sm_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
# delete model too
sm_client.delete_model(ModelName=multi_model_name)