# Amazon SageMaker XGBoost Bring Your Own Model
_**Hosting a Pre-Trained scikit-learn Model in Amazon SageMaker XGBoost Algorithm Container**_

---


## Contents

1. [Background](#Background)
1. [Setup](#Setup)
1. [Optionally, train a scikit learn XGBoost model](#Optionally,-train-a-scikit-learn-XGBoost-model)
1. [Upload the pre-trained model to S3](#Upload-the-pre-trained-model-to-S3)
1. [Set up hosting for the model in your VPC/Subnet with Security Groups](#Set-up-hosting-for-the-model-in-your-VPC/Subnet-with-Security-Groups)
1. [Validate the model for use](#Validate-the-model-for-use)




---
## Background

Amazon SageMaker includes functionality to support a hosted notebook environment, distributed, serverless training, and real-time hosting. We think it works best when all three of these services are used together, but they can also be used independently. Some use cases may only require hosting, for example when the model was trained in a different service before Amazon SageMaker existed.

This notebook shows how to use a pre-existing XGBoost model trained through the scikit-learn API with the Amazon SageMaker XGBoost Algorithm container to quickly create a hosted endpoint for that model. Note that a scikit-learn XGBoost model is compatible with the SageMaker XGBoost container, whereas other gradient boosted tree models (such as one trained in SparkML) are not.

---
## Setup

Let's start by specifying:

* AWS region.
* The IAM role ARN used to give learning and hosting access to your data. See the documentation for how to specify these.
* The S3 bucket that you want to use for training and model data.

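To invoke the endpoint through a VPC interface endpoint, the runtime client can be created with an explicit `endpoint_url` (the default client is shown commented out):

runtime_client = boto3.client('runtime.sagemaker', endpoint_url='https://vpce-0803dc74776014d46-iava3t25-us-west-2a.vpce-svc-00a9fd01107f2b64d.us-west-2.vpce.amazonaws.com')

#runtime_client = boto3.client('runtime.sagemaker')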


In [1]:
%%time

import os
import boto3
import re
import json
from sagemaker import get_execution_role
import sagemaker
import time

sess = sagemaker.Session()
region = boto3.Session().region_name
role = get_execution_role()
bucket=sess.default_bucket()
#bucket='<s3 bucket>' # put your s3 bucket name here, and create s3 bucket

CPU times: user 1.01 s, sys: 129 ms, total: 1.13 s
Wall time: 8.92 s


In [2]:
prefix = 'sagemaker/DEMO-xgboost-byo'
bucket_path = 'https://s3-{}.amazonaws.com/{}'.format(region,bucket)
# customize to your bucket where you have stored the data

## Optionally, train a scikit learn XGBoost model

These steps are optional and are needed only to generate the scikit-learn model that will eventually be hosted using the SageMaker Algorithm container.

### Install XGBoost
Note that for a conda-based installation, you'll need to change the Notebook kernel to the environment with conda and Python3.

In [3]:
!pip install xgboost

Collecting xgboost
  Downloading https://files.pythonhosted.org/packages/70/91/551d37ba472bcbd70a25e667acc65a18a9d053657b13afcf0f87aa24d7bb/xgboost-1.0.2-py3-none-manylinux1_x86_64.whl (109.7MB)
    100% |████████████████████████████████| 109.8MB 429kB/s eta 0:00:01
Installing collected packages: xgboost
Successfully installed xgboost-1.0.2
You are using pip version 10.0.1, however version 20.0.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


### Fetch the dataset

In [4]:
%%time
import pickle, gzip, numpy, urllib.request, json

# Load the dataset
urllib.request.urlretrieve("http://deeplearning.net/data/mnist/mnist.pkl.gz", "mnist.pkl.gz")
f = gzip.open('mnist.pkl.gz', 'rb')
train_set, valid_set, test_set = pickle.load(f, encoding='latin1')
f.close()

CPU times: user 900 ms, sys: 258 ms, total: 1.16 s
Wall time: 2.48 s


### Prepare the dataset for training

In [5]:
%%time

import struct
import io
import boto3

def get_dataset():
  import pickle
  import gzip
  with gzip.open('mnist.pkl.gz', 'rb') as f:
      u = pickle._Unpickler(f)
      u.encoding = 'latin1'
      return u.load()

CPU times: user 8 µs, sys: 0 ns, total: 8 µs
Wall time: 12.4 µs


In [6]:
train_set, valid_set, test_set = get_dataset()

train_X = train_set[0]
train_y = train_set[1]

valid_X = valid_set[0]
valid_y = valid_set[1]

test_X = test_set[0]
test_y = test_set[1]

### Train the XGBClassifier

In [7]:
import xgboost as xgb
import sklearn as sk 

bt = xgb.XGBClassifier(max_depth=5,
                       learning_rate=0.2,
                       n_estimators=10,
                       objective='multi:softmax')   # Setup xgboost model
#900597767885.dkr.ecr.us-west-2.amazonaws.com/sagemaker-xgboost:beta_low_lat_inf


bt.fit(train_X, train_y, # Train it to our data
       eval_set=[(valid_X, valid_y)], 
       verbose=False)


XGBClassifier(base_score=0.5, booster=None, colsample_bylevel=1,
       colsample_bynode=1, colsample_bytree=1, gamma=0, gpu_id=-1,
       importance_type='gain', interaction_constraints=None,
       learning_rate=0.2, max_delta_step=0, max_depth=5,
       min_child_weight=1, missing=nan, monotone_constraints=None,
       n_estimators=10, n_jobs=0, num_parallel_tree=1,
       objective='multi:softprob', random_state=0, reg_alpha=0,
       reg_lambda=1, scale_pos_weight=None, subsample=1, tree_method=None,
       validate_parameters=False, verbosity=None)

In [8]:
bt.get_xgb_params()

{'objective': 'multi:softprob',
 'base_score': 0.5,
 'booster': None,
 'colsample_bylevel': 1,
 'colsample_bynode': 1,
 'colsample_bytree': 1,
 'gamma': 0,
 'gpu_id': -1,
 'importance_type': 'gain',
 'interaction_constraints': None,
 'learning_rate': 0.2,
 'max_delta_step': 0,
 'max_depth': 5,
 'min_child_weight': 1,
 'missing': nan,
 'monotone_constraints': None,
 'n_estimators': 10,
 'n_jobs': 0,
 'num_parallel_tree': 1,
 'random_state': 0,
 'reg_alpha': 0,
 'reg_lambda': 1,
 'scale_pos_weight': None,
 'subsample': 1,
 'tree_method': None,
 'validate_parameters': False,
 'verbosity': None}

### Save the trained model file
Note that the model file name must satisfy the regular expression pattern `^[a-zA-Z0-9](-*[a-zA-Z0-9])*`. The model file also needs to be packaged as a gzipped tarball (`model.tar.gz`).
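The naming rule and the packaging step can also be checked in Python. This is a minimal sketch, assuming the whole name has to match the pattern quoted above; `is_valid_name` and `make_model_archive` are hypothetical helpers introduced here for illustration, not part of the SageMaker SDK.

```python
import os
import re
import tarfile

# Pattern quoted in the text. Using fullmatch is an assumption that the
# entire name, not just a prefix, has to conform.
NAME_PATTERN = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9])*")


def is_valid_name(name):
    """Return True if the whole name matches the pattern."""
    return NAME_PATTERN.fullmatch(name) is not None


def make_model_archive(model_path, archive_path="model.tar.gz"):
    """Pack one model file into a gzipped tarball, stored at the archive root."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(model_path, arcname=os.path.basename(model_path))
    return archive_path


# A name ending in a hyphen fails the check; appending a timestamp fixes it.
print(is_valid_name("xgboost-model-"))                     # False
print(is_valid_name("xgboost-model-2020-03-19-01-50-07"))  # True
```

Under this reading, a name with a trailing hyphen only becomes valid once a suffix such as a timestamp is appended to it.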

In [9]:
model_file_name = "xgboost-model-"
# get_booster() is the public accessor for the underlying Booster object
bt.get_booster().save_model(model_file_name)

In [10]:
!tar czvf model.tar.gz $model_file_name

xgboost-model-


## Upload the pre-trained model to S3

In [11]:
key = os.path.join(prefix, model_file_name, 'model.tar.gz')
# Open the archive in a context manager so the file handle is closed after the upload
with open("model.tar.gz", 'rb') as fObj:
    boto3.Session().resource('s3').Bucket(bucket).Object(key).upload_fileobj(fObj)
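S3 keys always use forward slashes, so building the key with `os.path.join` is only safe on POSIX systems such as the notebook instance. The sketch below shows an OS-independent alternative. `model_s3_key` and `model_https_url` are hypothetical helper names, and `example-bucket` is a placeholder; the URL format mirrors the one this notebook uses for `model_url`.

```python
import posixpath


def model_s3_key(prefix, model_file_name, archive_name="model.tar.gz"):
    """Join key parts with forward slashes regardless of the local OS."""
    return posixpath.join(prefix, model_file_name, archive_name)


def model_https_url(region, bucket, key):
    """Build a region-specific HTTPS URL in the form used by this notebook."""
    return "https://s3-{}.amazonaws.com/{}/{}".format(region, bucket, key)


key = model_s3_key("sagemaker/DEMO-xgboost-byo", "xgboost-model-")
print(key)
print(model_https_url("us-east-2", "example-bucket", key))
```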

## Set up hosting for the model in your VPC/Subnet with Security Groups

### Import model into hosting
This involves creating a SageMaker model from the model file previously uploaded to S3.

In [12]:
from sagemaker.amazon.amazon_estimator import get_image_uri
#container = get_image_uri(boto3.Session().region_name, 'xgboost')
container = get_image_uri(boto3.Session().region_name, 'xgboost', '0.90-1')
#container = '900597767885.dkr.ecr.us-west-2.amazonaws.com/sagemaker-xgboost:beta_low_lat_inf'
print('container image: {}'.format(container))

container image: 257758044811.dkr.ecr.us-east-2.amazonaws.com/sagemaker-xgboost:0.90-1-cpu-py3


In [13]:
%%time
from time import gmtime, strftime

model_name = model_file_name + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
model_url = 'https://s3-{}.amazonaws.com/{}/{}'.format(region,bucket,key)
sm_client = boto3.client('sagemaker')

print ('model_url: {}'.format(model_url))

primary_container = {
    'Image': container,
    'ModelDataUrl': model_url
}

#vpc = 'vpc-0e3dc7a6ecb4b94c0'
#subnet_a = 'subnet-018d48710058a32fb'
#subnet_b = 'subnet-06d5f220180ba3862'
#security_group = 'sg-01ac05ba3308465ac'

create_model_response = sm_client.create_model(
    ModelName = model_name,
    ExecutionRoleArn = role,
    PrimaryContainer = primary_container
#    ,
#    VpcConfig={
#        'SecurityGroupIds': [security_group],
#        'Subnets': [subnet_a, subnet_b],
#    },
#    EnableNetworkIsolation=True
    )

print(create_model_response['ModelArn'])
print(model_name)

model_url: https://s3-us-east-2.amazonaws.com/sagemaker-us-east-2-328296961357/sagemaker/DEMO-xgboost-byo/xgboost-model-/model.tar.gz
arn:aws:sagemaker:us-east-2:328296961357:model/xgboost-model-2020-03-19-01-50-07
xgboost-model-2020-03-19-01-50-07
CPU times: user 52.8 ms, sys: 7.89 ms, total: 60.7 ms
Wall time: 434 ms
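The VPC settings in the cell above are commented out. One way to toggle them without editing the call is to assemble the `create_model` arguments in a small function first. This is a sketch only: `build_create_model_kwargs` is a hypothetical helper, and all IDs, ARNs, and URLs in the example are placeholders.

```python
def build_create_model_kwargs(model_name, role_arn, image, model_data_url,
                              security_group_ids=None, subnet_ids=None,
                              network_isolation=False):
    """Assemble keyword arguments for create_model, adding VPC settings only when given."""
    kwargs = {
        "ModelName": model_name,
        "ExecutionRoleArn": role_arn,
        "PrimaryContainer": {"Image": image, "ModelDataUrl": model_data_url},
    }
    # VpcConfig needs both security groups and subnets, so add it only when both are set.
    if security_group_ids and subnet_ids:
        kwargs["VpcConfig"] = {
            "SecurityGroupIds": list(security_group_ids),
            "Subnets": list(subnet_ids),
        }
    if network_isolation:
        kwargs["EnableNetworkIsolation"] = True
    return kwargs


# Placeholder values for illustration only.
example = build_create_model_kwargs(
    model_name="xgboost-model-example",
    role_arn="arn:aws:iam::123456789012:role/ExampleRole",
    image="example-image-uri",
    model_data_url="https://s3-us-east-2.amazonaws.com/example-bucket/model.tar.gz",
    security_group_ids=["sg-example"],
    subnet_ids=["subnet-example-a", "subnet-example-b"],
)
print(sorted(example))
```

The resulting dictionary could then be passed as `sm_client.create_model(**example)`.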


### Create endpoint configuration for a single model

SageMaker supports configuring REST endpoints in hosting with multiple models, e.g. for A/B testing purposes. To support this, you create an endpoint configuration that describes how traffic is distributed across the models, whether split, shadowed, or sampled in some way. The endpoint configuration also specifies the instance type required for model deployment.

In [14]:
from time import gmtime, strftime

endpoint_config_name_prod = 'XGBoostEP-Prod-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print(endpoint_config_name_prod)
create_endpoint_config_response = sm_client.create_endpoint_config(
    EndpointConfigName = endpoint_config_name_prod,
    ProductionVariants=[{
        'InstanceType':'ml.m4.xlarge',
        'InitialInstanceCount':1,
        'InitialVariantWeight':1,
        'ModelName':model_name,
        'VariantName':'model-Blue'}])

print("Endpoint Config Arn: " + create_endpoint_config_response['EndpointConfigArn'])

XGBoostEP-Prod-2020-03-19-01-50-07
Endpoint Config Arn: arn:aws:sagemaker:us-east-2:328296961357:endpoint-config/xgboostep-prod-2020-03-19-01-50-07


### Create endpoint
Create the endpoint that serves the model by specifying its name and the endpoint configuration defined above. The end result is an endpoint that can be validated and incorporated into production applications. This takes 8-10 minutes to complete.

#### First, let us create a helper function to poll for endpoint status

In [15]:
def poll_endpoint(endpoint_name, endpoint_ARN, poll_period=60):
    resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
    status = resp['EndpointStatus']
    print("Status: " + status)

    while (status=='Creating' or status=='Updating'):
        time.sleep(poll_period)
        resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = resp['EndpointStatus']
        print("Status: " + status)

    print("Arn: " + resp['EndpointArn'])
    print("Status: " + status)  
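The helper above keeps looping while the status is `Creating` or `Updating` and prints whatever status it ends on. A possible variant, sketched below, adds a timeout and raises if the endpoint settles in any status other than `InService`. `wait_for_endpoint` is a hypothetical name, the `timeout` default is an arbitrary assumption, and the client is passed in so the function can be exercised without AWS access.

```python
import time


def wait_for_endpoint(client, endpoint_name, poll_period=60, timeout=1800,
                      sleep=time.sleep):
    """Poll describe_endpoint until InService; raise on failure or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        resp = client.describe_endpoint(EndpointName=endpoint_name)
        status = resp["EndpointStatus"]
        if status == "InService":
            return resp
        # Same transitional statuses as poll_endpoint; anything else is treated as an error.
        if status not in ("Creating", "Updating"):
            reason = resp.get("FailureReason", "no reason reported")
            raise RuntimeError(
                "Endpoint {} ended in status {}: {}".format(endpoint_name, status, reason))
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "Endpoint {} still {} after {} seconds".format(endpoint_name, status, timeout))
        sleep(poll_period)
```

Usage would look like `wait_for_endpoint(sm_client, endpoint_name)`. boto3 also provides a built-in waiter for this case, obtained with `sm_client.get_waiter('endpoint_in_service')`.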

#### Now, let us create the endpoint and poll until its status changes to "InService"

In [16]:
%%time
import time

endpoint_name = 'XGBoostEP-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print(endpoint_name)
create_endpoint_response = sm_client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name_prod)
endpoint_ARN = create_endpoint_response['EndpointArn']
print(endpoint_ARN)

poll_endpoint(endpoint_name=endpoint_name, endpoint_ARN=endpoint_ARN)

XGBoostEP-2020-03-19-01-50-07
arn:aws:sagemaker:us-east-2:328296961357:endpoint/xgboostep-2020-03-19-01-50-07
Status: Creating
Status: Creating
Status: Creating
Status: Creating
Status: Creating
Status: Creating
Status: InService
Arn: arn:aws:sagemaker:us-east-2:328296961357:endpoint/xgboostep-2020-03-19-01-50-07
Status: InService
CPU times: user 97 ms, sys: 4.19 ms, total: 101 ms
Wall time: 6min


### Now let us seamlessly deploy a new model to the same endpoint without taking the production 'model-Blue' offline.
First we create a new endpoint configuration with both 'Blue' and 'Green' models. To save time, we are going to use the same model 'model_name' for both, but in a real Blue/Green deployment these would be two different models.

In [17]:
from time import gmtime, strftime

endpoint_config_name_blue_green = 'XGBoostEP-BlueGreen2-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print(endpoint_config_name_blue_green)
create_endpoint_config_response = sm_client.create_endpoint_config(
    EndpointConfigName = endpoint_config_name_blue_green,
    ProductionVariants=[{
        'InstanceType':'ml.m4.xlarge',
        'InitialInstanceCount':1,
        'InitialVariantWeight':0.1,
        'ModelName':model_name,
        'VariantName':'model-Blue'},
        {'InstanceType':'ml.m4.xlarge',
        'InitialInstanceCount':1,
        'InitialVariantWeight':0.2,
        'ModelName':model_name,
        'VariantName':'model-Green'}])

print("Endpoint Config Arn: " + create_endpoint_config_response['EndpointConfigArn'])
#note that the weights do not need to add up to 1. In the above configuration, 
#Green model is going to be invoked twice as often as Blue one. 

XGBoostEP-BlueGreen2-2020-03-19-01-56-08
Endpoint Config Arn: arn:aws:sagemaker:us-east-2:328296961357:endpoint-config/xgboostep-bluegreen2-2020-03-19-01-56-08
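The variant weights in the configuration above are relative: each variant receives its weight divided by the sum of all weights. A small sketch of that arithmetic follows; `traffic_shares` is a hypothetical helper, not a SageMaker API.

```python
def traffic_shares(variant_weights):
    """Convert relative variant weights into fractions of total traffic."""
    total = sum(variant_weights.values())
    if total <= 0:
        raise ValueError("at least one variant needs a positive weight")
    return {name: weight / total for name, weight in variant_weights.items()}


shares = traffic_shares({"model-Blue": 0.1, "model-Green": 0.2})
for name, share in shares.items():
    print("{}: {:.1%}".format(name, share))
# model-Blue: 33.3%
# model-Green: 66.7%
```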


### Update endpoint with Blue and Green models
While this operation is in progress, the endpoint can still receive and respond to inference requests using the previously deployed model.

In [18]:
%%time
import time

#endpoint_name = 'XGBoostEP-BlueGreen-' + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print(endpoint_name)
update_endpoint_response = sm_client.update_endpoint(
    EndpointConfigName = endpoint_config_name_blue_green,
    EndpointName = endpoint_name)
endpoint_ARN = update_endpoint_response['EndpointArn']
print(endpoint_ARN)

poll_endpoint(endpoint_name=endpoint_name, endpoint_ARN=endpoint_ARN)

XGBoostEP-2020-03-19-01-50-07
arn:aws:sagemaker:us-east-2:328296961357:endpoint/xgboostep-2020-03-19-01-50-07
Status: Updating
Status: Updating
Status: Updating
Status: Updating
Status: Updating
Status: Updating
Status: Updating
Status: InService
Arn: arn:aws:sagemaker:us-east-2:328296961357:endpoint/xgboostep-2020-03-19-01-50-07
Status: InService
CPU times: user 214 ms, sys: 4.75 ms, total: 219 ms
Wall time: 7min 1s


### Update Endpoint's ProductionVariants Weights 
Assuming we are happy with the "Green" model's performance, let us switch all the traffic to it.

In [19]:
%%time
print(endpoint_name)
update_weights_response = sm_client.update_endpoint_weights_and_capacities(    
    EndpointName=endpoint_name,
    DesiredWeightsAndCapacities=[
        {
            'VariantName': 'model-Blue',
            'DesiredWeight': 0
        },
        {
            'VariantName': 'model-Green',
            'DesiredWeight': 1
        }        
    ]
)

endpoint_ARN = update_weights_response['EndpointArn']
print(endpoint_ARN)

poll_endpoint(endpoint_name=endpoint_name, endpoint_ARN=endpoint_ARN)

XGBoostEP-2020-03-19-01-50-07
arn:aws:sagemaker:us-east-2:328296961357:endpoint/xgboostep-2020-03-19-01-50-07
Status: Updating
Status: Updating
Status: InService
Arn: arn:aws:sagemaker:us-east-2:328296961357:endpoint/xgboostep-2020-03-19-01-50-07
Status: InService
CPU times: user 67.6 ms, sys: 436 µs, total: 68.1 ms
Wall time: 2min
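The cell above moves all traffic to Green in one step. For a more gradual cutover, the same call could be issued several times with intermediate weights. The sketch below only builds the `DesiredWeightsAndCapacities` payloads; `desired_weights` and the step values are assumptions chosen for illustration.

```python
def desired_weights(green_fraction, blue="model-Blue", green="model-Green"):
    """Build a DesiredWeightsAndCapacities list that sends green_fraction of traffic to Green."""
    if not 0.0 <= green_fraction <= 1.0:
        raise ValueError("green_fraction must be between 0 and 1")
    return [
        {"VariantName": blue, "DesiredWeight": 1.0 - green_fraction},
        {"VariantName": green, "DesiredWeight": green_fraction},
    ]


# Hypothetical rollout schedule: 25%, then 50%, then all traffic to Green.
for step in (0.25, 0.5, 1.0):
    print(desired_weights(step))
```

Each payload would be passed to `sm_client.update_endpoint_weights_and_capacities(EndpointName=endpoint_name, DesiredWeightsAndCapacities=...)`, polling for `InService` before the next step.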


## Validate the model for use
Now you can obtain the endpoint from the client library using the result from previous operations and generate classifications from the model using that endpoint. You can use the code below as a basis for endpoint invocation from a utility EC2 machine or a different SageMaker notebook.

In [20]:
runtime_client = boto3.client('runtime.sagemaker')

Let's generate the prediction for a single data point. We'll pick one from the test data generated earlier.

In [21]:
import numpy as np
point_X = test_X[0]
point_X = np.expand_dims(point_X, axis=0)
point_y = test_y[0]
np.savetxt("test_point.csv", point_X, delimiter=",")
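Writing the point to a file is convenient for reuse, but the `text/csv` payload can also be built in memory. This is a minimal sketch using only the standard library; `to_csv_payload` is a hypothetical helper, it takes a one-dimensional feature vector such as `test_X[0]`, and its number formatting differs from the default `np.savetxt` format.

```python
def to_csv_payload(features):
    """Serialize one feature vector as a single CSV line without a trailing newline."""
    return ",".join(repr(float(x)) for x in features)


print(to_csv_payload([0, 0.5, 1]))  # 0.0,0.5,1.0
```

The returned string could be passed directly as `Body` to `invoke_endpoint` with `ContentType='text/csv'`.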

In [22]:
%%time
import json


file_name = 'test_point.csv'  # customize to your test file; 'test_point.csv' is the single point saved in the previous cell

with open(file_name, 'r') as f:
    payload = f.read().strip()

response = runtime_client.invoke_endpoint(EndpointName=endpoint_name, 
                                   ContentType='text/csv', 
                                   Body=payload)
print (response)
result = response['Body'].read().decode('ascii')
print('Predicted Class Probabilities: {}.'.format(result))

{'ResponseMetadata': {'RequestId': '39460dc9-b903-4832-bf01-31586c296727', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '39460dc9-b903-4832-bf01-31586c296727', 'x-amzn-invoked-production-variant': 'model-Green', 'date': 'Thu, 19 Mar 2020 02:05:11 GMT', 'content-type': 'text/csv; charset=utf-8', 'content-length': '2'}, 'RetryAttempts': 0}, 'ContentType': 'text/csv; charset=utf-8', 'InvokedProductionVariant': 'model-Green', 'Body': <botocore.response.StreamingBody object at 0x7fbe9020be48>}
Predicted Class Probabilities: [].
CPU times: user 13.1 ms, sys: 7 µs, total: 13.1 ms
Wall time: 27.6 ms


### (Optional) Delete the Endpoint

If you're ready to be done with this notebook, please run the delete_endpoint line in the cell below.  This will remove the hosted endpoint you created and avoid any charges from a stray instance being left on.

In [23]:
#sm_client.delete_endpoint(EndpointName=endpoint_name)
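Deleting the endpoint stops the hosting instances, but the endpoint configurations and the model created earlier remain registered in the account. A fuller cleanup might look like the sketch below; `delete_hosting_resources` is a hypothetical helper that wraps the corresponding boto3 SageMaker client calls.

```python
def delete_hosting_resources(client, endpoint_name, endpoint_config_names, model_name):
    """Delete the endpoint first, then its configurations, then the model."""
    client.delete_endpoint(EndpointName=endpoint_name)
    for config_name in endpoint_config_names:
        client.delete_endpoint_config(EndpointConfigName=config_name)
    client.delete_model(ModelName=model_name)


# Example call with the names defined in this notebook (left commented out on purpose):
# delete_hosting_resources(
#     sm_client,
#     endpoint_name,
#     [endpoint_config_name_prod, endpoint_config_name_blue_green],
#     model_name,
# )
```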