# Hosting your model



## Contents

1. [Setup](#Setup)
1. [Optionally, train a scikit learn XGBoost model](#Optionally,-train-a-scikit-learn-XGBoost-model)
1. [Upload the pre-trained model to S3](#Upload-the-pre-trained-model-to-S3)
1. [Set up hosting for the model](#Set-up-hosting-for-the-model)
1. [Validate the model for use](#Validate-the-model-for-use)


## Setup

Let's start by specifying:

* AWS region.
* The IAM role ARN used to give training and hosting access to your data. See the documentation for how to specify these.
* The S3 bucket that you want to use for training and model data.

In [2]:
%%time

import os
import boto3
import re
import json
import sagemaker
from sagemaker import get_execution_role

region = boto3.Session().region_name

role = get_execution_role()

bucket = sagemaker.Session().default_bucket()

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /root/.config/sagemaker/config.yaml
CPU times: user 1.6 s, sys: 236 ms, total: 1.83 s
Wall time: 2.57 s


In [3]:
key = "sagemaker/multiregressor-xgboost-byo/multi-regressor-xgboost-model/model.tar.gz"
bucket_path = "https://s3-{}.amazonaws.com/{}".format(region, bucket)

print(bucket_path)

https://s3-us-west-2.amazonaws.com/sagemaker-us-west-2-442342299380
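
The location above is written as a region-specific HTTPS URL. The same object can also be addressed with the `s3://bucket/key` form commonly used for model artifacts. The sketch below builds both forms from the same parts; the helper names `s3_uri` and `s3_https_url` are hypothetical and not part of boto3 or the SageMaker SDK.

```python
def s3_uri(bucket: str, key: str) -> str:
    """Return the s3:// form of an object location (hypothetical helper)."""
    return "s3://{}/{}".format(bucket, key.lstrip("/"))


def s3_https_url(region: str, bucket: str, key: str) -> str:
    """Return the region-specific HTTPS form used elsewhere in this notebook."""
    return "https://s3-{}.amazonaws.com/{}/{}".format(region, bucket, key.lstrip("/"))


if __name__ == "__main__":
    # Placeholder bucket name, for illustration only.
    example_key = "sagemaker/multiregressor-xgboost-byo/multi-regressor-xgboost-model/model.tar.gz"
    print(s3_uri("my-example-bucket", example_key))
    print(s3_https_url("us-west-2", "my-example-bucket", example_key))
```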


## Set up hosting for the model

### Import model into hosting
This involves creating a SageMaker model from the model file previously uploaded to S3.

In [4]:
from sagemaker import image_uris

# Look up the SageMaker-managed XGBoost container image for the current region.
# image_uris.retrieve replaces get_image_uri in sagemaker>=2.
container = image_uris.retrieve(
    framework="xgboost", region=boto3.Session().region_name, version="0.90-2"
)
print(container)


246618743249.dkr.ecr.us-west-2.amazonaws.com/sagemaker-xgboost:0.90-2-cpu-py3


In [5]:
%%time
from time import gmtime, strftime

model_file_name = "multi-regressor-xgboost-model"
model_name = model_file_name + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
model_url = "https://s3-{}.amazonaws.com/{}/{}".format(region, bucket, key)
sm_client = boto3.client("sagemaker")

print(model_url)

primary_container = {
    "Image": container,
    "ModelDataUrl": model_url,
}

create_model_response2 = sm_client.create_model(
    ModelName=model_name, ExecutionRoleArn=role, PrimaryContainer=primary_container
)

print(create_model_response2["ModelArn"])

https://s3-us-west-2.amazonaws.com/sagemaker-us-west-2-442342299380/sagemaker/multiregressor-xgboost-byo/multi-regressor-xgboost-model/model.tar.gz
arn:aws:sagemaker:us-west-2:442342299380:model/multi-regressor-xgboost-model2023-09-28-09-31-27
CPU times: user 87.3 ms, sys: 9.82 ms, total: 97.1 ms
Wall time: 809 ms
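
Before moving on, it can be useful to confirm that the model was registered with the expected container image and artifact location. The sketch below wraps the boto3 `describe_model` call; the helper name `summarize_model` is hypothetical, and the client is passed in so the function can be exercised without AWS access.

```python
def summarize_model(client, model_name):
    """Return the ARN, image, and artifact URL recorded for a SageMaker model.

    `client` is expected to behave like boto3.client("sagemaker").
    """
    desc = client.describe_model(ModelName=model_name)
    # Single-container models report their settings under "PrimaryContainer".
    container = desc.get("PrimaryContainer", {})
    return {
        "arn": desc["ModelArn"],
        "image": container.get("Image"),
        "model_data_url": container.get("ModelDataUrl"),
    }
```

In this notebook it could be called as `summarize_model(sm_client, model_name)`.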


### Create endpoint configuration

SageMaker supports configuring REST endpoints in hosting with multiple models, e.g. for A/B testing purposes. To support this, you create an endpoint configuration that describes how traffic is distributed across the models, whether split, shadowed, or sampled in some way. The endpoint configuration also specifies the instance type and instance count used to deploy each model.

In [6]:
from time import gmtime, strftime

endpoint_config_name = "MultiRegressor-XGBoostEndpointConfig-" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print(endpoint_config_name)
create_endpoint_config_response = sm_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            "InstanceType": "ml.m4.xlarge",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 1,
            "ModelName": model_name,
            "VariantName": "AllTraffic",
        }
    ],
)

print("Endpoint Config Arn: " + create_endpoint_config_response["EndpointConfigArn"])

MultiRegressor-XGBoostEndpointConfig-2023-09-28-09-31-28
Endpoint Config Arn: arn:aws:sagemaker:us-west-2:442342299380:endpoint-config/multiregressor-xgboostendpointconfig-2023-09-28-09-31-28
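
The configuration above sends all traffic to a single variant. For an A/B split, each variant receives a share of traffic equal to its weight divided by the sum of all variant weights. The sketch below builds a `ProductionVariants` list for several models and computes those shares; the helper names, the model names `model-a` and `model-b`, and the `Variant1`, `Variant2` naming scheme are hypothetical.

```python
def build_variants(model_weights, instance_type="ml.m4.xlarge", instance_count=1):
    """Build a ProductionVariants list from a {model_name: weight} mapping."""
    variants = []
    for index, (model_name, weight) in enumerate(model_weights.items(), start=1):
        variants.append(
            {
                "InstanceType": instance_type,
                "InitialInstanceCount": instance_count,
                "InitialVariantWeight": weight,
                "ModelName": model_name,
                "VariantName": "Variant{}".format(index),
            }
        )
    return variants


def traffic_shares(variants):
    """Return the fraction of traffic each variant receives (weight / total weight)."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


if __name__ == "__main__":
    variants = build_variants({"model-a": 3, "model-b": 1})
    print(traffic_shares(variants))
```

The resulting list could be passed as `ProductionVariants=variants` to `create_endpoint_config`, provided both models already exist.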


### Create endpoint
Lastly, create the endpoint that serves the model by specifying the endpoint name and the configuration defined above. The result is an endpoint that can be validated and incorporated into production applications. Creation takes several minutes; the run shown here finished in about 4 minutes.

In [7]:
%%time
import time

endpoint_name = "MultiRegressor-XGBoostEndpoint-" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print(endpoint_name)
create_endpoint_response = sm_client.create_endpoint(
    EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name
)
print(create_endpoint_response["EndpointArn"])

resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
status = resp["EndpointStatus"]
print("Status: " + status)

while status == "Creating":
    time.sleep(60)
    resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
    status = resp["EndpointStatus"]
    print("Status: " + status)

print("Arn: " + resp["EndpointArn"])
print("Status: " + status)

MultiRegressor-XGBoostEndpoint-2023-09-28-09-31-28
arn:aws:sagemaker:us-west-2:442342299380:endpoint/multiregressor-xgboostendpoint-2023-09-28-09-31-28
Status: Creating
Status: Creating
Status: Creating
Status: Creating
Status: InService
Arn: arn:aws:sagemaker:us-west-2:442342299380:endpoint/multiregressor-xgboostendpoint-2023-09-28-09-31-28
Status: InService
CPU times: user 64.6 ms, sys: 12.7 ms, total: 77.3 ms
Wall time: 4min 1s
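
The polling loop above exits on any status other than `Creating`, so a failed deployment is only visible in the final printed status. The sketch below is one way to make failures and long waits explicit; the function name `wait_for_endpoint` and its parameters are hypothetical. boto3 also provides a built-in waiter, `sm_client.get_waiter("endpoint_in_service")`, which may be simpler when custom handling is not needed.

```python
import time


def wait_for_endpoint(describe, endpoint_name, poll_seconds=60,
                      timeout_seconds=1800, sleep=time.sleep):
    """Poll until the endpoint is InService; raise on failure or timeout.

    `describe` is expected to behave like sm_client.describe_endpoint.
    `sleep` is injectable so the loop can be tested without waiting.
    """
    waited = 0
    while True:
        resp = describe(EndpointName=endpoint_name)
        status = resp["EndpointStatus"]
        if status == "InService":
            return resp
        if status != "Creating":
            # Any other status (for example "Failed") means creation did not succeed.
            reason = resp.get("FailureReason", "no reason given")
            raise RuntimeError("Endpoint status {}: {}".format(status, reason))
        if waited >= timeout_seconds:
            raise TimeoutError("Endpoint still Creating after {} s".format(waited))
        sleep(poll_seconds)
        waited += poll_seconds
```

In this notebook it could be called as `wait_for_endpoint(sm_client.describe_endpoint, endpoint_name)`.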


## Validate the model for use
Now you can call the endpoint through the client library, using the endpoint name from the previous steps, and generate predictions from the model.

In [8]:
import boto3
runtime_client = boto3.client("runtime.sagemaker")

Let's generate the prediction for a single data point. We'll pick one from the test data generated earlier.

In [9]:
runtime = boto3.Session().client('sagemaker-runtime')
 
csv_text = '312,103,4,4,5,8.43,0'
# Send CSV text via InvokeEndpoint API
response = runtime.invoke_endpoint(EndpointName=endpoint_name, ContentType='text/csv', Body=csv_text)
print(response)
# Unpack response
result = json.loads(response['Body'].read().decode('ascii'))
print(result)

{'ResponseMetadata': {'RequestId': '2a66c6da-d068-49ab-9bd4-2c8527cb7bf6', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '2a66c6da-d068-49ab-9bd4-2c8527cb7bf6', 'x-amzn-invoked-production-variant': 'AllTraffic', 'date': 'Thu, 28 Sep 2023 09:35:29 GMT', 'content-type': 'text/csv; charset=utf-8', 'content-length': '18', 'connection': 'keep-alive'}, 'RetryAttempts': 0}, 'ContentType': 'text/csv; charset=utf-8', 'InvokedProductionVariant': 'AllTraffic', 'Body': <botocore.response.StreamingBody object at 0x7f2537eeb340>}
0.7298557162284851
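
The request above sends one row and parses a single number. To score several rows in one call, the rows can be joined with newlines into one CSV body. The sketch below shows hypothetical helpers for building that body and parsing the reply. It assumes the container returns predictions as plain numbers separated by commas or newlines, which should be verified against the container version in use. The second example row is made up for illustration.

```python
def rows_to_csv(rows):
    """Join rows of feature values into a newline-separated CSV payload."""
    return "\n".join(",".join(str(value) for value in row) for row in rows)


def parse_predictions(body_text):
    """Parse a reply of numbers separated by commas and/or newlines into floats."""
    tokens = body_text.replace("\n", ",").split(",")
    return [float(token) for token in tokens if token.strip()]


if __name__ == "__main__":
    payload = rows_to_csv([
        [312, 103, 4, 4, 5, 8.43, 0],
        [320, 110, 3, 3.5, 4, 8.9, 1],  # hypothetical second row
    ])
    print(payload)
```

With the runtime client above, the payload could be sent as `Body=payload` and the reply parsed with `parse_predictions(response['Body'].read().decode('ascii'))`.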


### (Optional) Delete the Endpoint

If you are done with this notebook, run the delete_endpoint line in the cell below. This removes the hosted endpoint you created and avoids charges from an instance left running.

In [11]:
sm_client.delete_endpoint(EndpointName=endpoint_name)

{'ResponseMetadata': {'RequestId': 'bfc369f3-a78d-4547-8b64-4169a78eea0f',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'bfc369f3-a78d-4547-8b64-4169a78eea0f',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '0',
   'date': 'Thu, 28 Sep 2023 09:50:05 GMT'},
  'RetryAttempts': 0}}
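
Deleting the endpoint stops the hosting instance, but the endpoint configuration and the model created earlier remain in the account. The sketch below removes all three with the boto3 calls `delete_endpoint`, `delete_endpoint_config`, and `delete_model`; the wrapper name `delete_hosting_resources` is hypothetical.

```python
def delete_hosting_resources(client, endpoint_name, endpoint_config_name, model_name):
    """Delete the endpoint, then its configuration, then the model.

    `client` is expected to behave like boto3.client("sagemaker").
    """
    client.delete_endpoint(EndpointName=endpoint_name)
    client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
    client.delete_model(ModelName=model_name)
```

In this notebook it could be called as `delete_hosting_resources(sm_client, endpoint_name, endpoint_config_name, model_name)` in place of the single `delete_endpoint` call.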