# Amazon SageMaker Multi-Model Endpoints using XGBoost

*This notebook was originally published as a Korean translation of [Amazon SageMaker Multi-Model Endpoints using XGBoost (English original)](https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/multi_model_xgboost_home_value/xgboost_multi_model_endpoint_home_value.ipynb).*

With [Amazon SageMaker multi-model endpoints](https://docs.aws.amazon.com/sagemaker/latest/dg/multi-model-endpoints.html), customers can create an endpoint that seamlessly hosts up to thousands of models. These endpoints are well suited to use cases where any one of a large number of models, all of which can be served from a common inference container, needs to be invokable on demand, and where it is acceptable for infrequently invoked models to incur some additional latency. For applications that require consistently low inference latency, a traditional endpoint is still the best choice.

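At a high level, Amazon SageMaker manages the loading and unloading of models for a multi-model endpoint as they are needed. When an invocation request is made for a particular model, Amazon SageMaker routes the request to an instance assigned to that model, downloads the model artifacts from S3 onto that instance, and starts loading the model into the memory of the container. As soon as loading is complete, Amazon SageMaker performs the requested invocation and returns the result. If the model is already loaded in memory on the selected instance, the downloading and loading steps are skipped and the invocation is performed immediately.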
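
The load-on-demand behavior can be pictured with a toy, in-process simulation. This is only a sketch of the idea, not SageMaker's implementation: `ToyMultiModelHost`, its `fetch` callback, and the dictionary standing in for S3 are hypothetical names introduced for illustration.

```python
class ToyMultiModelHost:
    """Toy stand-in for one endpoint instance (hypothetical, for illustration only)."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable: model name -> model object ("download + load")
        self._loaded = {}        # models currently held in memory
        self.cold_starts = 0     # how many times a model had to be fetched

    def invoke(self, target_model, features):
        # Fetch and load the model only if it is not already in memory.
        if target_model not in self._loaded:
            self._loaded[target_model] = self._fetch(target_model)
            self.cold_starts += 1
        return self._loaded[target_model](features)


# A dict of plain functions plays the role of the S3 prefix holding model artifacts.
fake_s3 = {
    'Chicago_IL.tar.gz': lambda features: 100.0 * sum(features),
    'Houston_TX.tar.gz': lambda features: 90.0 * sum(features),
}

host = ToyMultiModelHost(fetch=fake_s3.__getitem__)
first = host.invoke('Chicago_IL.tar.gz', [1, 2, 3])   # cold: model is fetched first
second = host.invoke('Chicago_IL.tar.gz', [1, 2, 3])  # warm: served from memory
third = host.invoke('Houston_TX.tar.gz', [1, 2, 3])   # cold again, different model
print(first, second, third, host.cold_starts)
```

Only the first request for each model pays the fetch step; repeated requests for the same model reuse the in-memory copy.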

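To demonstrate how multi-model endpoints are created and used, this notebook provides an example using XGBoost models that each predict housing prices for a single location. This domain serves as a simple example for experimenting with multi-model endpoints.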

The Amazon SageMaker multi-model endpoint capability is designed to work across all machine learning frameworks and algorithms, including those where you bring your own container.

### Contents

1. [Build and register an XGBoost container that can serve multiple models](#Build-and-register-an-XGBoost-container-that-can-serve-multiple-models)
1. [Generate synthetic data for housing models](#Generate-synthetic-data-for-housing-models)
1. [Train multiple house value prediction models](#Train-multiple-house-value-prediction-models)
1. [Import models into hosting](#Import-models-into-hosting)
  1. [Deploy model artifacts to be found by the endpoint](#Deploy-model-artifacts-to-be-found-by-the-endpoint)
  1. [Create the Amazon SageMaker model entity](#Create-the-Amazon-SageMaker-model-entity)
  1. [Create the multi-model endpoint](#Create-the-multi-model-endpoint)
1. [Exercise the multi-model endpoint](#Exercise-the-multi-model-endpoint)
  1. [Dynamically deploy another model](#Dynamically-deploy-another-model)
  1. [Invoke the newly deployed model](#Invoke-the-newly-deployed-model)
  1. [Updating a model](#Updating-a-model)
1. [Clean up](#Clean-up)

## Build and register an XGBoost container that can serve multiple models

In [1]:
!pip install -qU awscli boto3 sagemaker

For an inference container to serve multiple models in a multi-model endpoint, it must implement [additional APIs](https://docs.aws.amazon.com/sagemaker/latest/dg/build-multi-model-build-container.html) to load, list, get, unload, and invoke specific models.

The ['mme' branch of the SageMaker XGBoost Container repository](https://github.com/aws/sagemaker-xgboost-container/tree/mme) is an example implementation of how to adapt SageMaker's XGBoost framework container to use [Multi Model Server](https://github.com/awslabs/multi-model-server), a framework that provides an HTTP frontend implementing the additional container APIs required by multi-model endpoints. It also provides a pluggable backend handler for serving models using a custom framework, in this case the XGBoost framework.

Using this branch, we build an XGBoost container that fulfills all of the multi-model endpoint container requirements, and then upload that image to Amazon Elastic Container Registry (ECR). Because uploading the image to ECR may create a new ECR repository, this notebook requires permissions in addition to the regular `SageMakerFullAccess` permissions. The easiest way to add these permissions is to attach the managed policy `AmazonEC2ContainerRegistryFullAccess` to the role that you used to start your notebook instance. There is no need to restart the notebook instance when you do this; the new permissions are available immediately.

In [2]:
ALGORITHM_NAME = 'multi-model-xgboost'

In [3]:
%%sh -s $ALGORITHM_NAME

algorithm_name=$1

account=$(aws sts get-caller-identity --query Account --output text)

# Get the region defined in the current configuration
region=$(aws configure get region)

ecr_image="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:latest"

# If the repository doesn't exist in ECR, create it.
aws ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1

if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi

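# Get the login command from ECR and execute it directly.
# Note: `aws ecr get-login` exists only in AWS CLI v1; AWS CLI v2 replaced it
# with `aws ecr get-login-password`, piped to `docker login --password-stdin`.
$(aws ecr get-login --region ${region} --no-include-email --registry-ids ${account})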

# Build the docker image locally with the image name and then push it to ECR
# with the full name.

# First clear out any prior version of the cloned repo
rm -rf sagemaker-xgboost-container/

# Clone the xgboost container repo
git clone --single-branch --branch mme https://github.com/aws/sagemaker-xgboost-container.git
cd sagemaker-xgboost-container/

# Build the "base" container image that encompasses the installation of the
# XGBoost framework and all of the dependencies needed.
docker build -q -t xgboost-container-base:0.90-2-cpu-py3 -f docker/0.90-2/base/Dockerfile.cpu .

# Create the SageMaker XGBoost Container Python package.
python setup.py bdist_wheel --universal

# Build the "final" container image that encompasses the installation of the
# code that implements the SageMaker multi-model container requirements.
docker build -q -t ${algorithm_name} -f docker/0.90-2/final/Dockerfile.cpu .

docker tag ${algorithm_name} ${ecr_image}

docker push ${ecr_image}

Login Succeeded
sha256:043284b5094346869c82aedf4248f70279624c8e79e469e06dc8a816bdd71ddf
running bdist_wheel
running build
running build_py
creating build
creating build/lib
creating build/lib/sagemaker_algorithm_toolkit
copying src/sagemaker_algorithm_toolkit/hyperparameter_validation.py -> build/lib/sagemaker_algorithm_toolkit
copying src/sagemaker_algorithm_toolkit/__init__.py -> build/lib/sagemaker_algorithm_toolkit
copying src/sagemaker_algorithm_toolkit/metrics.py -> build/lib/sagemaker_algorithm_toolkit
copying src/sagemaker_algorithm_toolkit/exceptions.py -> build/lib/sagemaker_algorithm_toolkit
copying src/sagemaker_algorithm_toolkit/channel_validation.py -> build/lib/sagemaker_algorithm_toolkit
copying src/sagemaker_algorithm_toolkit/metadata.py -> build/lib/sagemaker_algorithm_toolkit
creating build/lib/sagemaker_xgboost_container
copying src/sagemaker_xgboost_container/__init__.py -> build/lib/sagemaker_xgboost_container
copying src/sagemaker_xgboost_container/training.py ->

https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Cloning into 'sagemaker-xgboost-container'...


## Generate synthetic data for housing models

In [4]:
import numpy as np
import pandas as pd
import json
import datetime
import time
from time import gmtime, strftime
import matplotlib.pyplot as plt

In [5]:
NUM_HOUSES_PER_LOCATION = 1000
LOCATIONS  = ['NewYork_NY',    'LosAngeles_CA',   'Chicago_IL',    'Houston_TX',   'Dallas_TX',
              'Phoenix_AZ',    'Philadelphia_PA', 'SanAntonio_TX', 'SanDiego_CA',  'SanFrancisco_CA']
PARALLEL_TRAINING_JOBS = 4 # len(LOCATIONS) if your account limits can handle it
MAX_YEAR = 2019

In [6]:
def gen_price(house):
    _base_price = int(house['SQUARE_FEET'] * 150)
    _price = int(_base_price + (10000 * house['NUM_BEDROOMS']) + \
                               (15000 * house['NUM_BATHROOMS']) + \
                               (15000 * house['LOT_ACRES']) + \
                               (15000 * house['GARAGE_SPACES']) - \
                               (5000 * (MAX_YEAR - house['YEAR_BUILT'])))
    return _price
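
As a quick sanity check of the pricing formula, here is a worked example. The sample house values are made up for illustration, and the function restates `gen_price` so that the snippet runs on its own.

```python
MAX_YEAR = 2019

def gen_price(house):
    # Same formula as above, restated so this snippet is self-contained.
    base_price = int(house['SQUARE_FEET'] * 150)
    return int(base_price
               + 10000 * house['NUM_BEDROOMS']
               + 15000 * house['NUM_BATHROOMS']
               + 15000 * house['LOT_ACRES']
               + 15000 * house['GARAGE_SPACES']
               - 5000 * (MAX_YEAR - house['YEAR_BUILT']))

# Hypothetical house used only to illustrate the arithmetic.
sample_house = {'SQUARE_FEET': 3000, 'NUM_BEDROOMS': 4, 'NUM_BATHROOMS': 2.5,
                'LOT_ACRES': 1.0, 'GARAGE_SPACES': 2, 'YEAR_BUILT': 1999}

# 450,000 base + 40,000 + 37,500 + 15,000 + 30,000 - 100,000 (20 years old)
sample_price = gen_price(sample_house)
print(sample_price)  # 472500
```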

In [7]:
def gen_random_house():
    _house = {'SQUARE_FEET':   int(np.random.normal(3000, 750)),
              'NUM_BEDROOMS':  np.random.randint(2, 7),
              'NUM_BATHROOMS': np.random.randint(2, 7) / 2,
              'LOT_ACRES':     round(np.random.normal(1.0, 0.25), 2),
              'GARAGE_SPACES': np.random.randint(0, 4),
              'YEAR_BUILT':    min(MAX_YEAR, int(np.random.normal(1995, 10)))}
    _price = gen_price(_house)
    return [_price, _house['YEAR_BUILT'],   _house['SQUARE_FEET'], 
                    _house['NUM_BEDROOMS'], _house['NUM_BATHROOMS'], 
                    _house['LOT_ACRES'],    _house['GARAGE_SPACES']]

In [8]:
def gen_houses(num_houses):
    _house_list = []
    for i in range(num_houses):
        _house_list.append(gen_random_house())
    _df = pd.DataFrame(_house_list, 
                       columns=['PRICE',        'YEAR_BUILT',    'SQUARE_FEET',  'NUM_BEDROOMS',
                                'NUM_BATHROOMS','LOT_ACRES',     'GARAGE_SPACES'])
    return _df

## Train multiple house value prediction models

In [9]:
import sagemaker
from sagemaker import get_execution_role
from sagemaker.predictor import csv_serializer
import boto3

sm_client = boto3.client(service_name='sagemaker')
runtime_sm_client = boto3.client(service_name='sagemaker-runtime')

s3 = boto3.resource('s3')
s3_client = boto3.client('s3')

sagemaker_session = sagemaker.Session()
role = get_execution_role()

ACCOUNT_ID = boto3.client('sts').get_caller_identity()['Account']
REGION     = boto3.Session().region_name
BUCKET     = sagemaker_session.default_bucket()

MULTI_MODEL_XGBOOST_IMAGE = '{}.dkr.ecr.{}.amazonaws.com/{}:latest'.format(ACCOUNT_ID, REGION, 
                                                                           ALGORITHM_NAME)

DATA_PREFIX            = 'DEMO_MME_XGBOOST'
HOUSING_MODEL_NAME     = 'housing'
MULTI_MODEL_ARTIFACTS  = 'multi_model_artifacts'

TRAIN_INSTANCE_TYPE    = 'ml.m4.xlarge'
ENDPOINT_INSTANCE_TYPE = 'ml.m4.xlarge'

### Split a given dataset into train, validation, and test

In [10]:
from sklearn.model_selection import train_test_split
SEED = 7
SPLIT_RATIOS = [0.6, 0.3, 0.1]

def split_data(df):
    # split data into train and test sets
    seed      = SEED
    val_size  = SPLIT_RATIOS[1]
    test_size = SPLIT_RATIOS[2]
    
    num_samples = df.shape[0]
    X1 = df.values[:num_samples, 1:] # keep only the features, skip the target, all rows
    Y1 = df.values[:num_samples, :1] # keep only the target, all rows

    # Use split ratios to divide up into train/val/test
    X_train, X_val, y_train, y_val = \
        train_test_split(X1, Y1, test_size=(test_size + val_size), random_state=seed)
    # Of the remaining non-training samples, give proper ratio to validation and to test.
    # The validation set is narrowed here so that it no longer overlaps with the test set.
    X_val, X_test, y_val, y_test = \
        train_test_split(X_val, y_val, test_size=(test_size / (test_size + val_size)), 
                         random_state=seed)
    # reassemble the datasets with target in first column and features after that
    _train = np.concatenate([y_train, X_train], axis=1)
    _val   = np.concatenate([y_val,   X_val],   axis=1)
    _test  = np.concatenate([y_test,  X_test],  axis=1)

    return _train, _val, _test
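
To make the ratio arithmetic concrete: the first split holds out `val + test` (0.3 + 0.1 = 0.4) of the rows, and the second split is meant to give `0.1 / 0.4 = 0.25` of that held-out portion to test and the rest to validation. The sketch below reproduces the same 60/30/10 partition using only the standard library. It is an illustration with a hypothetical helper, `split_indices`, and does not mirror how `train_test_split` works internally.

```python
import random

def split_indices(num_samples, ratios=(0.6, 0.3, 0.1), seed=7):
    """Shuffle row indices and cut them into train/val/test (hypothetical helper)."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(num_samples * ratios[0])
    n_val = round(num_samples * ratios[1])
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]      # whatever remains goes to test
    return train, val, test

train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

The three index lists are disjoint by construction, which is the property the validation and test sets need.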

### Launch a single training job for a given housing location

When training the models, there is nothing specific to multi-model endpoints that you need to implement, because training works the same way as for any other SageMaker model.

In [11]:
def launch_training_job(location):
    # clear out old versions of the data
    s3_bucket = s3.Bucket(BUCKET)
    full_input_prefix = '{}/model_prep/{}'.format(DATA_PREFIX, location)
    s3_bucket.objects.filter(Prefix=full_input_prefix + '/').delete()

    # upload the entire set of data for all three channels
    local_folder = 'data/{}'.format(location)
    inputs = sagemaker_session.upload_data(path=local_folder, key_prefix=full_input_prefix)
    print('Training data uploaded: {}'.format(inputs))
    
    _job = 'xgb-{}'.format(location.replace('_', '-'))
    full_output_prefix = '{}/model_artifacts/{}'.format(DATA_PREFIX, location)
    s3_output_path = 's3://{}/{}'.format(BUCKET, full_output_prefix)

    xgb = sagemaker.estimator.Estimator(MULTI_MODEL_XGBOOST_IMAGE, role, 
                                        train_instance_count=1, train_instance_type=TRAIN_INSTANCE_TYPE,
                                        output_path=s3_output_path, base_job_name=_job,
                                        sagemaker_session=sagemaker_session)
    xgb.set_hyperparameters(max_depth=5, eta=0.2, gamma=4, min_child_weight=6, subsample=0.8, silent=0, 
                            early_stopping_rounds=5, objective='reg:linear', num_round=25) 
    
    DISTRIBUTION_MODE = 'FullyReplicated'
    train_input = sagemaker.s3_input(s3_data=inputs+'/train', 
                                     distribution=DISTRIBUTION_MODE, content_type='csv')
    val_input   = sagemaker.s3_input(s3_data=inputs+'/val', 
                                     distribution=DISTRIBUTION_MODE, content_type='csv')
    remote_inputs = {'train': train_input, 'validation': val_input}

    xgb.fit(remote_inputs, wait=False)
    
    return xgb.latest_training_job.name

### Kick off a model training job for each housing location

In [12]:
import os

def save_data_locally(location, train, val, test):
    os.makedirs('data/{}/train'.format(location))
    np.savetxt( 'data/{0}/train/{0}_train.csv'.format(location), train, delimiter=',', fmt='%.2f')
    
    os.makedirs('data/{}/val'.format(location))
    np.savetxt( 'data/{0}/val/{0}_val.csv'.format(location),     val, delimiter=',', fmt='%.2f')
    
    os.makedirs('data/{}/test'.format(location))
    np.savetxt( 'data/{0}/test/{0}_test.csv'.format(location),   test, delimiter=',', fmt='%.2f')

In [13]:
import shutil
import os

training_jobs = []

shutil.rmtree('data', ignore_errors=True)

for loc in LOCATIONS[:PARALLEL_TRAINING_JOBS]:
    _houses = gen_houses(NUM_HOUSES_PER_LOCATION)
    _train, _val, _test = split_data(_houses)
    save_data_locally(loc, _train, _val, _test)
    _job = launch_training_job(loc)
    training_jobs.append(_job)
print('{} training jobs launched: {}'.format(len(training_jobs), training_jobs))

Training data uploaded: s3://sagemaker-ap-northeast-2-143656149352/DEMO_MME_XGBOOST/model_prep/NewYork_NY
Training data uploaded: s3://sagemaker-ap-northeast-2-143656149352/DEMO_MME_XGBOOST/model_prep/LosAngeles_CA
Training data uploaded: s3://sagemaker-ap-northeast-2-143656149352/DEMO_MME_XGBOOST/model_prep/Chicago_IL
Training data uploaded: s3://sagemaker-ap-northeast-2-143656149352/DEMO_MME_XGBOOST/model_prep/Houston_TX
4 training jobs launched: ['xgb-NewYork-NY-2019-11-27-06-15-54-555', 'xgb-LosAngeles-CA-2019-11-27-06-15-55-036', 'xgb-Chicago-IL-2019-11-27-06-15-56-286', 'xgb-Houston-TX-2019-11-27-06-15-57-402']


### Wait for all model training to finish

In [23]:
def wait_for_training_job_to_complete(job_name):
    print('Waiting for job {} to complete...'.format(job_name))
    resp = sm_client.describe_training_job(TrainingJobName=job_name)
    status = resp['TrainingJobStatus']
    while status=='InProgress':
        time.sleep(60)
        resp = sm_client.describe_training_job(TrainingJobName=job_name)
        status = resp['TrainingJobStatus']
        if status == 'InProgress':
            print('{} job status: {}'.format(job_name, status))
    print('DONE. Status for {} is {}\n'.format(job_name, status))

In [24]:
# wait for the jobs to finish
for j in training_jobs:
    wait_for_training_job_to_complete(j)

Waiting for job xgb-NewYork-NY-2019-11-27-06-15-54-555 to complete...
DONE. Status for xgb-NewYork-NY-2019-11-27-06-15-54-555 is Completed

Waiting for job xgb-LosAngeles-CA-2019-11-27-06-15-55-036 to complete...
DONE. Status for xgb-LosAngeles-CA-2019-11-27-06-15-55-036 is Completed

Waiting for job xgb-Chicago-IL-2019-11-27-06-15-56-286 to complete...
DONE. Status for xgb-Chicago-IL-2019-11-27-06-15-56-286 is Completed

Waiting for job xgb-Houston-TX-2019-11-27-06-15-57-402 to complete...
DONE. Status for xgb-Houston-TX-2019-11-27-06-15-57-402 is Completed



## Import models into hosting
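The most significant difference with multi-model endpoints is that, when you create the Model entity, the container's `ModelDataUrl` is the S3 prefix under which the model artifacts invokable by the endpoint are located. The rest of the S3 path is specified when you actually invoke a model. Remember to close the location with a trailing slash.

The container's `Mode` is set to `MultiModel` to signify that the container will host multiple models.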

### Deploy model artifacts to be found by the endpoint
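As described above, the multi-model endpoint is configured to find its model artifacts in a specific location in S3. For each trained model, we copy its model artifacts into that location.

In this example, we store all the models in a single folder. The implementation of multi-model endpoints is flexible enough to permit an arbitrary folder structure. For a set of housing models, for example, you could have a top-level folder for each region, with the model artifacts copied into those regional folders. The target model referenced when invoking such models would then include the folder path, for example `northeast/Boston_MA.tar.gz`.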
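
Conceptually, the artifact location for a request is the endpoint's S3 prefix followed by the `TargetModel` value. The sketch below illustrates only that path arithmetic, including the trailing-slash requirement; it is not SageMaker's internal code, and `resolve_model_uri` and the bucket name `my-bucket` are hypothetical.

```python
def resolve_model_uri(model_data_url, target_model):
    """Join the endpoint's S3 prefix with the per-request TargetModel value."""
    if not model_data_url.endswith('/'):
        raise ValueError('ModelDataUrl prefix must end with a trailing slash')
    return model_data_url + target_model

prefix = 's3://my-bucket/DEMO_MME_XGBOOST/multi_model_artifacts/'  # hypothetical bucket
flat_uri = resolve_model_uri(prefix, 'Chicago_IL.tar.gz')
nested_uri = resolve_model_uri(prefix, 'northeast/Boston_MA.tar.gz')
print(flat_uri)
print(nested_uri)
```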

In [26]:
import re
def parse_model_artifacts(model_data_url):
    # extract the s3 key from the full url to the model artifacts
    _s3_key = model_data_url.split('s3://{}/'.format(BUCKET))[1]
    # get the part of the key that identifies the model within the model artifacts folder
    _model_name_plus = _s3_key[_s3_key.find('model_artifacts') + len('model_artifacts') + 1:]
    # finally, get the unique model name (e.g., "NewYork_NY")
    _model_name = re.findall('^(.*?)/', _model_name_plus)[0]
    return _s3_key, _model_name 
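
To see what the parsing produces, here is a standalone variant that takes the bucket as an argument instead of reading the global `BUCKET`. The function name `parse_key_and_model_name` and the bucket `example-bucket` are hypothetical; the key layout follows the training output paths used in this notebook.

```python
import re

def parse_key_and_model_name(model_data_url, bucket):
    """Standalone variant of parse_model_artifacts that takes the bucket as an argument."""
    # Strip the 's3://<bucket>/' part to get the object key.
    s3_key = model_data_url.split('s3://{}/'.format(bucket))[1]
    # Keep everything after the 'model_artifacts/' folder.
    after_folder = s3_key[s3_key.find('model_artifacts') + len('model_artifacts') + 1:]
    # The model name is the first path component that follows.
    model_name = re.findall('^(.*?)/', after_folder)[0]
    return s3_key, model_name

example_url = ('s3://example-bucket/DEMO_MME_XGBOOST/model_artifacts/'
               'LosAngeles_CA/xgb-LosAngeles-CA-2019-11-27-06-15-55-036/output/model.tar.gz')
key, name = parse_key_and_model_name(example_url, 'example-bucket')
print(key)
print(name)
```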

In [27]:
# make a copy of the model artifacts from the original output of the training job to the place in
# s3 where the multi model endpoint will dynamically load individual models
def deploy_artifacts_to_mme(job_name):
    _resp = sm_client.describe_training_job(TrainingJobName=job_name)
    _source_s3_key, _model_name = parse_model_artifacts(_resp['ModelArtifacts']['S3ModelArtifacts'])
    _copy_source = {'Bucket': BUCKET, 'Key': _source_s3_key}
    _key = '{}/{}/{}.tar.gz'.format(DATA_PREFIX, MULTI_MODEL_ARTIFACTS, _model_name)
    
    print('Copying {} model\n   from: {}\n     to: {}...'.format(_model_name, _source_s3_key, _key))
    s3_client.copy_object(Bucket=BUCKET, CopySource=_copy_source, Key=_key)
    return _key

*Note that we are purposely not copying the first model.* It will be copied later in the notebook to demonstrate how to dynamically add a new model to an endpoint that is already running.

In [28]:
# First, clear out old versions of the model artifacts from previous runs of this notebook
s3 = boto3.resource('s3')
s3_bucket = s3.Bucket(BUCKET)
full_input_prefix = '{}/multi_model_artifacts'.format(DATA_PREFIX)
print('Removing old model artifacts from {}'.format(full_input_prefix))
s3_bucket.objects.filter(Prefix=full_input_prefix + '/').delete()

Removing old model artifacts from DEMO_MME_XGBOOST/multi_model_artifacts


[]

In [44]:
# copy every model except the first one
for job in training_jobs[1:]:
    deploy_artifacts_to_mme(job)

Copying LosAngeles_CA model
   from: DEMO_MME_XGBOOST/model_artifacts/LosAngeles_CA/xgb-LosAngeles-CA-2019-11-27-06-15-55-036/output/model.tar.gz
     to: DEMO_MME_XGBOOST/multi_model_artifacts/LosAngeles_CA.tar.gz...
Copying Chicago_IL model
   from: DEMO_MME_XGBOOST/model_artifacts/Chicago_IL/xgb-Chicago-IL-2019-11-27-06-15-56-286/output/model.tar.gz
     to: DEMO_MME_XGBOOST/multi_model_artifacts/Chicago_IL.tar.gz...
Copying Houston_TX model
   from: DEMO_MME_XGBOOST/model_artifacts/Houston_TX/xgb-Houston-TX-2019-11-27-06-15-57-402/output/model.tar.gz
     to: DEMO_MME_XGBOOST/multi_model_artifacts/Houston_TX.tar.gz...


### Create the Amazon SageMaker model entity
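Here we use `boto3` to create the model entity. Instead of describing a single model, it indicates the use of multi-model semantics and identifies the source location of all the specific model artifacts.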

In [45]:
def create_multi_model_entity(multi_model_name, role):
    # establish the place in S3 from which the endpoint will pull individual models
    _model_url  = 's3://{}/{}/{}/'.format(BUCKET, DATA_PREFIX, MULTI_MODEL_ARTIFACTS)
    print(_model_url)
    _container = {
        'Image':        MULTI_MODEL_XGBOOST_IMAGE,
        'ModelDataUrl': _model_url,
        'Mode':         'MultiModel'
    }
    create_model_response = sm_client.create_model(
        ModelName = multi_model_name,
        ExecutionRoleArn = role,
        Containers = [_container])
    
    return _model_url

In [46]:
multi_model_name = '{}-{}'.format(HOUSING_MODEL_NAME, strftime('%Y-%m-%d-%H-%M-%S', gmtime()))
model_url = create_multi_model_entity(multi_model_name, role)
print('Multi model name: {}'.format(multi_model_name))

s3://sagemaker-ap-northeast-2-143656149352/DEMO_MME_XGBOOST/multi_model_artifacts/
Multi model name: housing-2019-11-27-06-47-21


### Create the multi-model endpoint

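There is nothing special about the SageMaker endpoint config for multi-model endpoints. You need to consider the appropriate instance type and number of instances for the projected prediction workload. The number and size of the individual models drive the memory requirements.

Once the endpoint config is in place, creating the endpoint is straightforward.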

In [47]:
endpoint_config_name = multi_model_name
print('Endpoint config name: ' + endpoint_config_name)

create_endpoint_config_response = sm_client.create_endpoint_config(
    EndpointConfigName = endpoint_config_name,
    ProductionVariants=[{
        'InstanceType': ENDPOINT_INSTANCE_TYPE,
        'InitialInstanceCount': 1,
        'InitialVariantWeight': 1,
        'ModelName': multi_model_name,
        'VariantName': 'AllTraffic'}])

endpoint_name = multi_model_name
print('Endpoint name: ' + endpoint_name)

Endpoint config name: housing-2019-11-27-06-47-21
Endpoint name: housing-2019-11-27-06-47-21


In [48]:
create_endpoint_response = sm_client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name)
print('Endpoint Arn: ' + create_endpoint_response['EndpointArn'])

Endpoint Arn: arn:aws:sagemaker:ap-northeast-2:143656149352:endpoint/housing-2019-11-27-06-47-21


In [49]:
print('Waiting for {} endpoint to be in service...'.format(endpoint_name))
waiter = sm_client.get_waiter('endpoint_in_service')
waiter.wait(EndpointName=endpoint_name)

Waiting for housing-2019-11-27-06-47-21 endpoint to be in service...


## Exercise the multi-model endpoint

### Invoke multiple individual models hosted behind a single endpoint

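Here we iterate through invocations against randomly selected location-specific housing models. Note the cold-start latency incurred on the first invocation of any given model. Subsequent invocations of the same model take advantage of the model already being loaded into memory.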

In [50]:
def predict_one_house_value(features, model_name):
    # model_name is the artifact name relative to the S3 prefix, e.g. 'Chicago_IL.tar.gz'
    print('Using model {} to predict price of this house: {}'.format(model_name,
                                                                     features))
    body = ','.join(map(str, features)) + '\n'
    start_time = time.time()

    response = runtime_sm_client.invoke_endpoint(
                        EndpointName=endpoint_name,
                        ContentType='text/csv',
                        TargetModel=model_name,
                        Body=body)
    predicted_value = json.loads(response['Body'].read())[0]

    duration = time.time() - start_time
    
    print('${:,.2f}, took {:,d} ms\n'.format(predicted_value, int(duration * 1000)))
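
The request and response handling in the function above can be tried out without an endpoint. This sketch assumes, as the parsing code above implies, that the container replies with a JSON list of predictions; `fake_response_bytes` is a hypothetical stand-in for `response['Body'].read()`.

```python
import json

# Feature order: YEAR_BUILT, SQUARE_FEET, NUM_BEDROOMS, NUM_BATHROOMS, LOT_ACRES, GARAGE_SPACES
features = [1982, 2682, 6, 2.5, 0.89, 2]

# Serialize the features as a single CSV row, as sent with ContentType='text/csv'.
body = ','.join(map(str, features)) + '\n'

# Stand-in for response['Body'].read(); the real bytes come from the container.
fake_response_bytes = b'[347410.53]'
predicted_value = json.loads(fake_response_bytes)[0]

formatted = '${:,.2f}'.format(predicted_value)
print(repr(body), formatted)
```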

In [56]:
print('Here are the models that the endpoint has at its disposal:')
!aws s3 ls --human-readable --summarize $model_url

Here are the models that the endpoint has at its disposal:
2019-11-27 06:46:19   11.0 KiB Chicago_IL.tar.gz
2019-11-27 06:46:19   10.9 KiB Houston_TX.tar.gz
2019-11-27 06:46:19   10.5 KiB LosAngeles_CA.tar.gz

Total Objects: 3
   Total Size: 32.4 KiB


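The first request against a given model takes longer because it includes the extra latency (a cold start) of downloading the model from S3 and loading it into memory. Subsequent calls complete without that overhead because the model is already loaded.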

In [58]:
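# iterate through invocations with random inputs against a random model showing results and latency.
# The index starts at 1 so that LOCATIONS[0] is skipped: its model has intentionally
# not been copied to the endpoint's S3 prefix yet.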
for i in range(10):
    model_name = LOCATIONS[np.random.randint(1, len(LOCATIONS[:PARALLEL_TRAINING_JOBS]))]
    print(model_name)
    full_model_name = '{}.tar.gz'.format(model_name)
    predict_one_house_value(gen_random_house()[1:], full_model_name)

Houston_TX
Using model Houston_TX.tar.gz to predict price of this house: [1982, 2682, 6, 2.5, 0.89, 2]
$347,410.53, took 1,450 ms

Chicago_IL
Using model Chicago_IL.tar.gz to predict price of this house: [1983, 2635, 4, 1.0, 0.95, 0]
$312,624.59, took 909 ms

Houston_TX
Using model Houston_TX.tar.gz to predict price of this house: [1964, 2775, 2, 3.0, 1.29, 0]
$290,677.59, took 17 ms

Chicago_IL
Using model Chicago_IL.tar.gz to predict price of this house: [1995, 2551, 5, 2.5, 1.53, 3]
$412,950.59, took 16 ms

Houston_TX
Using model Houston_TX.tar.gz to predict price of this house: [1988, 2117, 2, 2.5, 1.07, 1]
$275,964.59, took 18 ms

LosAngeles_CA
Using model LosAngeles_CA.tar.gz to predict price of this house: [2012, 4253, 3, 2.0, 0.86, 1]
$659,443.25, took 894 ms

Houston_TX
Using model Houston_TX.tar.gz to predict price of this house: [1988, 2759, 2, 2.5, 0.42, 2]
$346,578.16, took 21 ms

LosAngeles_CA
Using model LosAngeles_CA.tar.gz to predict price of this house: [1980, 2940, 2
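
The cold-start effect can be summarized from the complete output lines above. This sketch assumes that the first request for each model in this listing was also its first request on the endpoint, which is plausible for a freshly created endpoint but is not verified here.

```python
# (model, latency in ms) pairs copied from the complete output lines above
observations = [
    ('Houston_TX', 1450), ('Chicago_IL', 909), ('Houston_TX', 17),
    ('Chicago_IL', 16), ('Houston_TX', 18), ('LosAngeles_CA', 894),
    ('Houston_TX', 21),
]

seen = set()
cold, warm = [], []
for model, latency_ms in observations:
    # The first request for a model in this run is treated as its cold start.
    (warm if model in seen else cold).append(latency_ms)
    seen.add(model)

cold_mean = sum(cold) / len(cold)
warm_mean = sum(warm) / len(warm)
print('cold: {} (mean {:.0f} ms)'.format(cold, cold_mean))
print('warm: {} (mean {:.0f} ms)'.format(warm, warm_mean))
```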

### Dynamically deploy another model

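Here we see the power of dynamically loading new models. We purposely did not copy the first model when deploying models earlier. Now we deploy an additional model and can immediately invoke it through the multi-model endpoint. As with the earlier models, keep in mind that the first invocation of the new model takes somewhat longer, because the endpoint needs time to download the model and load it into memory.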

In [60]:
# add another model to the endpoint and exercise it
deploy_artifacts_to_mme(training_jobs[0])

Copying NewYork_NY model
   from: DEMO_MME_XGBOOST/model_artifacts/NewYork_NY/xgb-NewYork-NY-2019-11-27-06-15-54-555/output/model.tar.gz
     to: DEMO_MME_XGBOOST/multi_model_artifacts/NewYork_NY.tar.gz...


'DEMO_MME_XGBOOST/multi_model_artifacts/NewYork_NY.tar.gz'

### Invoke the newly deployed model

Exercise the newly deployed model without any endpoint update or restart.

In [61]:
print('Here are the models that the endpoint has at its disposal:')
!aws s3 ls $model_url

Here are the models that the endpoint has at its disposal:
2019-11-27 06:46:19      11279 Chicago_IL.tar.gz
2019-11-27 06:46:19      11184 Houston_TX.tar.gz
2019-11-27 06:46:19      10751 LosAngeles_CA.tar.gz
2019-11-27 07:00:39      10631 NewYork_NY.tar.gz


In [62]:
model_name = LOCATIONS[0]
full_model_name = '{}.tar.gz'.format(model_name)
for i in range(5):
    features = gen_random_house()[1:]  # drop the price, keep only the features
    predict_one_house_value(features, full_model_name)

Using model NewYork_NY.tar.gz to predict price of this house: [1996, 2258, 2, 2.0, 0.72, 0]
$304,361.25, took 1,042 ms

Using model NewYork_NY.tar.gz to predict price of this house: [2002, 2545, 2, 2.0, 1.13, 0]
$376,476.25, took 21 ms

Using model NewYork_NY.tar.gz to predict price of this house: [1989, 3741, 3, 1.5, 1.27, 1]
$502,773.16, took 15 ms

Using model NewYork_NY.tar.gz to predict price of this house: [1982, 2364, 2, 1.5, 1.36, 0]
$265,714.44, took 16 ms

Using model NewYork_NY.tar.gz to predict price of this house: [2003, 3056, 3, 2.5, 1.35, 0]
$466,382.69, took 14 ms



### Updating a model

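To update a model, follow the same approach as above and add it as a new model. For example, if you have retrained the `NewYork_NY.tar.gz` model and want to start invoking it, upload the updated model artifacts behind the S3 prefix with a new name such as `NewYork_NY_v2.tar.gz`, and then change the `TargetModel` field to invoke `NewYork_NY_v2.tar.gz` instead of `NewYork_NY.tar.gz`. Avoid overwriting the model artifacts in Amazon S3, because the old version of the model might still be loaded in the containers or on the storage volume of the instances behind the endpoint. In that case, invocations of the new model could end up invoking the old version.

Alternatively, you can stop the endpoint and redeploy a fresh set of models.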
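
One way to keep such names consistent is a small helper that derives the next artifact name. The `_vN` suffix scheme follows the `NewYork_NY_v2.tar.gz` example above, but it is only a naming convention assumed for this sketch, and `next_version_name` is a hypothetical helper, not part of any SageMaker API.

```python
import re

def next_version_name(artifact_name):
    """Return the next versioned artifact name (hypothetical `_vN` naming scheme)."""
    if not artifact_name.endswith('.tar.gz'):
        raise ValueError('expected a .tar.gz artifact name')
    stem = artifact_name[:-len('.tar.gz')]
    match = re.fullmatch(r'(.*)_v(\d+)', stem)
    if match:
        base, version = match.group(1), int(match.group(2)) + 1
    else:
        base, version = stem, 2      # an unversioned name counts as version 1
    return '{}_v{}.tar.gz'.format(base, version)

v2_name = next_version_name('NewYork_NY.tar.gz')
v3_name = next_version_name(v2_name)
print(v2_name, v3_name)
```

The returned name would be used both as the S3 object name under the prefix and as the `TargetModel` value in later invocations.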

## Clean up

Clean up the resources so that you are not charged for an endpoint you are no longer using.

In [63]:
# shut down the endpoint
sm_client.delete_endpoint(EndpointName=endpoint_name)

{'ResponseMetadata': {'RequestId': 'da8bc294-61f9-443d-b2d4-fd29d0173509',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'da8bc294-61f9-443d-b2d4-fd29d0173509',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '0',
   'date': 'Wed, 27 Nov 2019 07:52:03 GMT'},
  'RetryAttempts': 0}}

In [64]:
# and the endpoint config
sm_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)

{'ResponseMetadata': {'RequestId': '79e06a1b-25cf-4651-8230-f9937ddcd64e',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '79e06a1b-25cf-4651-8230-f9937ddcd64e',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '0',
   'date': 'Wed, 27 Nov 2019 07:52:03 GMT'},
  'RetryAttempts': 0}}

In [65]:
# delete model too
sm_client.delete_model(ModelName=multi_model_name)

{'ResponseMetadata': {'RequestId': 'f2953525-b07f-4508-ae1c-018c02b10ee5',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'f2953525-b07f-4508-ae1c-018c02b10ee5',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '0',
   'date': 'Wed, 27 Nov 2019 07:52:04 GMT'},
  'RetryAttempts': 0}}