## XGBoost Containerized Model Deployment (BYOC) in Amazon SageMaker

#### This notebook should be run in an Amazon SageMaker notebook instance.


#### Before running this notebook:

1. Train the model locally on your laptop and save it.
2. Containerize the model by building a Docker image (refer to the Dockerfile and requirements.txt).
3. Log in to AWS, open the ECR service, create a repository, and push the container image to it.
   Make a note of the image URL.
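
For step 2, SageMaker hosting expects the container to listen on port 8080 and to answer `GET /ping` (health check) and `POST /invocations` (inference). The sketch below shows only that routing logic in plain Python, independent of any web framework. `handle_request`, `predict_fn`, and `fake_predict` are hypothetical names, and the request and response shapes are assumed to match the ones used later in this notebook (`{"data": [...]}` in, a JSON-encoded label out). The actual image may be wired differently.

```python
import json


def handle_request(method, path, body, predict_fn):
    """Return (status_code, response_body) for a single request.

    predict_fn is a hypothetical callable: list of features -> class label.
    """
    if method == "GET" and path == "/ping":
        # Health check: a 200 response tells SageMaker the container is up.
        return 200, ""
    if method == "POST" and path == "/invocations":
        try:
            features = json.loads(body)["data"]
        except (ValueError, KeyError, TypeError):
            return 400, json.dumps({"error": "expected JSON with a 'data' list"})
        return 200, json.dumps(predict_fn(features))
    return 404, json.dumps({"error": "not found"})


# Stand-in for the trained model, used only to exercise the routing.
def fake_predict(features):
    return "Versicolor"


print(handle_request("GET", "/ping", "", fake_predict))
print(handle_request("POST", "/invocations", '{"data": [5.4, 3, 4.5, 1.5]}', fake_predict))
```

Keeping the routing separate from the model call makes it possible to check the container contract locally before building and pushing the image.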

#### Please remember not to run the last "Delete the Endpoint" cell if you want to test the deployed model from a client. 


#### After the exercise is over, clean up the SageMaker resources as described in https://docs.aws.amazon.com/sagemaker/latest/dg/ex1-cleanup.html to avoid charges for resources left behind.




### Import libraries

In [1]:
%%time

import os
import boto3
import sagemaker

from sagemaker import get_execution_role

region = boto3.Session().region_name

role = get_execution_role()


account_id = boto3.client('sts').get_caller_identity()['Account']

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/sagemaker-user/.config/sagemaker/config.yaml
CPU times: user 1.39 s, sys: 244 ms, total: 1.63 s
Wall time: 2.53 s


In [2]:
print(role)
print(region)
print(account_id)

arn:aws:iam::180494263813:role/service-role/AmazonSageMaker-ExecutionRole-20240703T092754
us-east-1
180494263813


### Create a SageMaker model

In [3]:
model_file_name = "DEMO-local-xgboost-model"

In [4]:
# Make sure you have pushed the container image to Amazon ECR. The image URL will look like this:
# <aws_account_id>.dkr.ecr.us-east-1.amazonaws.com/iris_xgb_fastapi:latest

In [5]:
%%time
from time import gmtime, strftime

# Set the model name
model_name = model_file_name + strftime("%Y-%m-%d-%H-%M-%S", gmtime())

# Set the container image URI
image_uri = '{}.dkr.ecr.{}.amazonaws.com/iris_xgb_fastapi:latest'.format(account_id, region)

# Create a SageMaker client
sm_client = boto3.client("sagemaker")

print('Model name: ' + model_name)
print('Container image: ' + image_uri)

# Container definition passed to create_model
container = {
    "Image": image_uri
}

# Create the SageMaker model object.
create_model_response = sm_client.create_model(
    ModelName = model_name,
    ExecutionRoleArn = role,
    Containers = [container])

print(create_model_response["ModelArn"])

Model name: DEMO-local-xgboost-model2024-09-04-09-38-17
Container image: 180494263813.dkr.ecr.us-east-1.amazonaws.com/iris_xgb_fastapi:latest
arn:aws:sagemaker:us-east-1:180494263813:model/DEMO-local-xgboost-model2024-09-04-09-38-17
CPU times: user 42.6 ms, sys: 7.72 ms, total: 50.3 ms
Wall time: 672 ms
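
The model name printed above runs the prefix and the timestamp together with no separator. As a minimal sketch, the hypothetical helper `timestamped_name` below inserts a hyphen and checks the result against the naming constraint assumed here (at most 63 characters, letters and digits with interior hyphens). Verify the exact limits against the current API reference before relying on it.

```python
import re
from time import gmtime, strftime

# Assumed constraint: at most 63 characters, alphanumeric with interior hyphens.
_NAME_PATTERN = re.compile(r"[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])?")


def timestamped_name(prefix, when=None):
    """Return '<prefix>-<UTC timestamp>' after validating the resulting name."""
    stamp = strftime("%Y-%m-%d-%H-%M-%S", when if when is not None else gmtime())
    name = "{}-{}".format(prefix.rstrip("-"), stamp)
    if len(name) > 63 or not _NAME_PATTERN.fullmatch(name):
        raise ValueError("invalid SageMaker resource name: {!r}".format(name))
    return name


print(timestamped_name("DEMO-local-xgboost-model"))
```

The same helper could be used for the endpoint configuration and endpoint names, since they follow the same prefix-plus-timestamp scheme.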



### Create endpoint configuration

Create an endpoint configuration that describes how traffic is distributed across the models, whether split, shadowed, or sampled in some way. The endpoint configuration also specifies the instance type used for model deployment.

In [7]:
# Set endpoint config name
endpoint_config_name = "DEMO-XGBoostEndpointConfig-" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())

print(endpoint_config_name)

# Create endpoint config
create_endpoint_config_response = sm_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            "InstanceType": "ml.m4.xlarge",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 1,
            "ModelName": model_name,
            "VariantName": "AllTraffic",
        }
    ],
)

print("Endpoint Config Arn: " + create_endpoint_config_response["EndpointConfigArn"])

DEMO-XGBoostEndpointConfig-2024-09-04-09-38-32
Endpoint Config Arn: arn:aws:sagemaker:us-east-1:180494263813:endpoint-config/DEMO-XGBoostEndpointConfig-2024-09-04-09-38-32
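
The configuration above sends all traffic to a single variant. When several variants are listed, each one receives a share of traffic equal to its weight divided by the sum of all weights. The sketch below computes those shares for a hypothetical two-variant configuration. `traffic_shares`, the variant names, and the model names `model-a` and `model-b` are illustrative placeholders, not resources created in this notebook.

```python
def traffic_shares(production_variants):
    """Map each VariantName to its fraction of traffic (weight / total weight)."""
    total = sum(v["InitialVariantWeight"] for v in production_variants)
    return {
        v["VariantName"]: v["InitialVariantWeight"] / total
        for v in production_variants
    }


# Hypothetical two-variant split; the model names are placeholders.
variants = [
    {"VariantName": "Current", "ModelName": "model-a",
     "InstanceType": "ml.m4.xlarge", "InitialInstanceCount": 1,
     "InitialVariantWeight": 9},
    {"VariantName": "Candidate", "ModelName": "model-b",
     "InstanceType": "ml.m4.xlarge", "InitialInstanceCount": 1,
     "InitialVariantWeight": 1},
]

print(traffic_shares(variants))  # {'Current': 0.9, 'Candidate': 0.1}
```

A list shaped like `variants` is what would be passed as `ProductionVariants` to `create_endpoint_config`.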


### Create endpoint
Finally, create the endpoint that serves the model by specifying its name and the configuration defined above. The result is an endpoint that can be validated and incorporated into production applications. This can take up to 9-11 minutes to complete, although the run recorded below finished in about 3 minutes.

In [8]:
%%time
import time

# Set endpoint name
endpoint_name = "DEMO-XGBoostEndpoint-" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print(endpoint_name)

# Create endpoint
create_endpoint_response = sm_client.create_endpoint(
    EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name
)
print(create_endpoint_response["EndpointArn"])

resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
status = resp["EndpointStatus"]
print("Status: " + status)

print('Waiting for {} endpoint to be in service...'.format(endpoint_name))
waiter = sm_client.get_waiter('endpoint_in_service')
waiter.wait(EndpointName=endpoint_name)


DEMO-XGBoostEndpoint-2024-09-04-09-38-41
arn:aws:sagemaker:us-east-1:180494263813:endpoint/DEMO-XGBoostEndpoint-2024-09-04-09-38-41
Status: Creating
Waiting for DEMO-XGBoostEndpoint-2024-09-04-09-38-41 endpoint to be in service...
CPU times: user 42 ms, sys: 10.6 ms, total: 52.6 ms
Wall time: 3min 1s


### Validate the model for use
You can now call the endpoint through the client library, using the endpoint name from the previous step, and generate classifications from the model.

In [11]:
# Create a SageMaker runtime client for inference.
runtime_client = boto3.client("sagemaker-runtime")

Let's generate a prediction using sample Iris flower measurements.

In [27]:
%%time
import json

content_type = "application/json"

# Flower measurements (sepal length, width, etc.)
request_body = {"data": [5.4, 3, 4.5, 1.5]}

# Serialize data for the endpoint
payload = json.dumps(request_body)

# Invoke the endpoint
response = runtime_client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType=content_type,
    Body=payload,
)

result = response["Body"].read().decode("utf-8")
print("\nPredicted Class : {}.".format(result))


Predicted Class : "Versicolor".
CPU times: user 4.32 ms, sys: 489 µs, total: 4.81 ms
Wall time: 14.2 ms
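
The predicted class above is printed with its surrounding quotes, which suggests the container returns a JSON-encoded string. Assuming that is the case, the sketch below builds the request payload and decodes the response into a plain Python value. `build_payload` and `parse_prediction` are hypothetical helpers, and the feature order (sepal length, sepal width, petal length, petal width) is an assumption based on the usual Iris column order.

```python
import json


def build_payload(sepal_length, sepal_width, petal_length, petal_width):
    """Serialize one flower's measurements in the {"data": [...]} shape."""
    features = [float(sepal_length), float(sepal_width),
                float(petal_length), float(petal_width)]
    return json.dumps({"data": features})


def parse_prediction(raw_body):
    """Decode the response bytes and JSON-decode the predicted label."""
    return json.loads(raw_body.decode("utf-8"))


print(build_payload(5.4, 3, 4.5, 1.5))
print(parse_prediction(b'"Versicolor"'))  # Versicolor
```

With the real endpoint, `parse_prediction(response["Body"].read())` would replace the manual `decode` call.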


### (Optional) Delete the Endpoint

When you are done with this notebook, run the delete_endpoint call in the cell below. This removes the hosted endpoint you created and avoids charges from an instance left running.

In [28]:
sm_client.delete_endpoint(EndpointName=endpoint_name)

{'ResponseMetadata': {'RequestId': '72625f68-c442-4331-a558-ab99e0d02edd',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '72625f68-c442-4331-a558-ab99e0d02edd',
   'content-type': 'application/x-amz-json-1.1',
   'date': 'Wed, 04 Sep 2024 10:12:32 GMT',
   'content-length': '0'},
  'RetryAttempts': 0}}
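
Deleting the endpoint leaves the endpoint configuration and the model object in the account. As a sketch of the remaining cleanup, the hypothetical helper below removes both through the `delete_endpoint_config` and `delete_model` calls. It takes the client as a parameter, so nothing here touches AWS until it is called with the notebook's `sm_client`.

```python
def delete_config_and_model(sm_client, endpoint_config_name, model_name):
    """Delete the endpoint configuration, then the model it referenced."""
    sm_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
    sm_client.delete_model(ModelName=model_name)


# In the notebook this would be called as:
# delete_config_and_model(sm_client, endpoint_config_name, model_name)
```

The container image pushed to ECR is separate from these resources and would need to be removed from the repository on its own.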