## XGBoost model deployment in Amazon SageMaker

#### This notebook should be run in an Amazon SageMaker notebook instance.


#### Before running this notebook
Upload the pre-trained model and test_point.csv from your laptop to the same folder as this notebook file. test_point.csv contains a few sample test records in CSV format.


This notebook loads the pre-trained XGBoost model and saves it to an S3 bucket in the .tar.gz format that SageMaker requires.
It then creates a SageMaker model from the model file stored in S3.
Finally, it configures and creates an endpoint to deploy the model, and tests the endpoint by invoking it to get predictions.

#### Please remember not to run the last "Delete the Endpoint" cell if you want to test the deployed model from a client. 


##### After the exercise is over, clean up the SageMaker resources as described in https://docs.aws.amazon.com/sagemaker/latest/dg/ex1-cleanup.html to avoid charges for resources left behind.




### Import libraries

In [2]:
%%time

import os
import boto3
import sagemaker

from sagemaker import get_execution_role

region = boto3.Session().region_name

role = get_execution_role()

CPU times: user 259 ms, sys: 20.2 ms, total: 279 ms
Wall time: 355 ms


### Create S3 bucket

In [3]:
# This creates a default S3 bucket where we will upload our model.
bucket = sagemaker.Session().default_bucket()

In [4]:
bucket_path = "https://s3-{}.amazonaws.com/{}".format(region, bucket)

In [5]:
print(role)
print(region)
print(bucket)
print(bucket_path)

arn:aws:iam::180494263813:role/service-role/AmazonSageMaker-ExecutionRole-20240703T092754
us-east-1
sagemaker-us-east-1-180494263813
https://s3-us-east-1.amazonaws.com/sagemaker-us-east-1-180494263813
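The HTTPS path printed above depends on a region-specific hostname. SageMaker APIs such as `create_model` also accept model locations written as `s3://bucket/key` URIs. The following is a minimal sketch of building one; the helper name `s3_uri` and the example bucket are hypothetical and not part of the original notebook.

```python
def s3_uri(bucket: str, key: str) -> str:
    """Build an s3:// URI from a bucket name and an object key (hypothetical helper)."""
    # Strip a leading slash so the URI never contains a double slash after the bucket.
    return "s3://{}/{}".format(bucket, key.lstrip("/"))


# Placeholder values for illustration only; substitute your own bucket and key.
example = s3_uri("my-example-bucket", "sagemaker/DEMO-xgboost-byo/model.tar.gz")
print(example)
```

In this notebook the call would look like `s3_uri(bucket, key)` once `key` has been defined in the upload step.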


#### Install xgboost, which is needed to load the model from the joblib dump file and test it before deployment.
#### Note that the XGBoost version should be the same as the version used to train the model locally on the laptop.

In [6]:
# Uncomment to install; pin the version used for training (this run used 1.7.6).
#!conda install -y -c conda-forge xgboost==1.7.6

In [7]:
model_file_name = "DEMO-local-xgboost-model"

### Load the pre-trained model and test it before deployment

In [8]:
import joblib
import xgboost

mymodel = joblib.load(model_file_name)


In [34]:
xgboost.__version__

'1.7.6'
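Since the installed version should match the training version, a rough programmatic check can catch a mismatch early. This is only a sketch: the helper name `same_major_minor` is hypothetical, and comparing just the major and minor components is an assumption about what counts as "the same version" here.

```python
def same_major_minor(installed: str, expected: str) -> bool:
    """Return True when two version strings share their major and minor components."""
    return installed.split(".")[:2] == expected.split(".")[:2]


# "1.7.6" is the version printed above; "1.7" stands in for the training version.
print(same_major_minor("1.7.6", "1.7"))
```

In the notebook, the first argument would be `xgboost.__version__` and the second the version recorded when the model was trained.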

In [9]:
import numpy as np

file_name = "test_point.csv"  # customize to your test file

with open(file_name, "r") as f:
    mypayload = np.loadtxt(f, delimiter=",")
    
print(mypayload)    

[[5.4 3.  4.5 1.5]
 [5.6 3.  4.1 1.3]
 [6.3 2.8 5.1 1.5]
 [6.  3.  4.8 1.8]
 [5.1 3.3 1.7 0.5]]


In [10]:
mymodel.predict(mypayload)

array([1, 1, 1, 2, 0], dtype=int32)

#### Create a tar.gz model file, as this is the format required by SageMaker for deployment.

In [22]:
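# Save the underlying Booster in XGBoost's native format before creating the tar.gz.
# Without this step, prediction on the deployed endpoint ran into issues.
# Note: this overwrites the joblib dump, because it writes to the same file name.

mymodel.get_booster().save_model(model_file_name)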

In [23]:
!tar czvf model.tar.gz $model_file_name

DEMO-local-xgboost-model
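The `tar` shell command above can also be written with Python's standard `tarfile` module, which makes it easy to confirm that the model file sits at the root of the archive. This is a sketch with a hypothetical helper name (`make_model_archive`) and a throwaway placeholder file instead of the real model.

```python
import os
import tarfile
import tempfile


def make_model_archive(model_path: str, archive_path: str = "model.tar.gz") -> list:
    """Pack one model file at the root of a gzip-compressed tar archive.

    Returns the member names found when the archive is read back.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname drops any directory prefix so the file sits at the archive root.
        tar.add(model_path, arcname=os.path.basename(model_path))
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


# Demonstration with a placeholder file standing in for the real model.
with tempfile.TemporaryDirectory() as tmp:
    dummy = os.path.join(tmp, "DEMO-local-xgboost-model")
    with open(dummy, "wb") as f:
        f.write(b"placeholder bytes, not a real model")
    names = make_model_archive(dummy, os.path.join(tmp, "model.tar.gz"))
    print(names)
```

In the notebook this would be called as `make_model_archive(model_file_name)`.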


### Upload the pre-trained model to S3

In [24]:
#### prefix in S3
prefix = "sagemaker/DEMO-xgboost-byo"

key = os.path.join(prefix, model_file_name, "model.tar.gz")
print(key)
# Open the archive in a context manager so the file handle is closed after the upload.
with open("model.tar.gz", "rb") as fObj:
    boto3.Session().resource("s3").Bucket(bucket).Object(key).upload_fileobj(fObj)

sagemaker/DEMO-xgboost-byo/DEMO-local-xgboost-model/model.tar.gz
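To confirm that the archive reached S3, you can upload with the S3 client and then ask S3 for the stored object's size. The sketch below uses the boto3 client calls `upload_file` and `head_object`; the helper name `upload_and_check` is hypothetical, and the client is passed in as a parameter so the function itself makes no assumption about credentials or region.

```python
def upload_and_check(s3_client, local_path, bucket, key):
    """Upload a local file to S3 and return the size in bytes that S3 reports for it."""
    s3_client.upload_file(local_path, bucket, key)
    # head_object fetches only the object's metadata, not its contents.
    head = s3_client.head_object(Bucket=bucket, Key=key)
    return head["ContentLength"]


# Example call inside the notebook (needs AWS credentials, so it is left commented out):
# size = upload_and_check(boto3.client("s3"), "model.tar.gz", bucket, key)
```

Comparing the returned size with `os.path.getsize("model.tar.gz")` gives a quick sanity check on the upload.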


### Set up hosting for the model
#### Import model into hosting
This involves creating a SageMaker model from the model file previously uploaded to S3.

#### Create a SageMaker model

In [42]:
from sagemaker.amazon.amazon_estimator import get_image_uri

#### Get the built-in xgboost container image in Sagemaker to host our model
#container = get_image_uri(boto3.Session().region_name, "xgboost", "0.90-1")
container = get_image_uri(boto3.Session().region_name, "xgboost", "1.7-1")

The method get_image_uri has been renamed in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.
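As the deprecation notice above says, version 2 of the SageMaker Python SDK renamed this lookup. The SDK v2 equivalent is `sagemaker.image_uris.retrieve`. The wrapper name `xgboost_image_uri` below is hypothetical; the default version `"1.7-1"` is the one used in the cell above.

```python
def xgboost_image_uri(region: str, version: str = "1.7-1") -> str:
    """Look up the built-in XGBoost container image using the SageMaker SDK v2 API."""
    # Imported lazily so that defining this helper does not itself require the SDK.
    from sagemaker import image_uris

    return image_uris.retrieve(framework="xgboost", region=region, version=version)


# Example call inside the notebook (left commented out; requires the sagemaker package):
# container = xgboost_image_uri(boto3.Session().region_name)
```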


In [43]:
%%time
from time import gmtime, strftime

model_name = model_file_name + strftime("%Y-%m-%d-%H-%M-%S", gmtime())

model_url = "https://s3-{}.amazonaws.com/{}/{}".format(region, bucket, key)

sm_client = boto3.client("sagemaker")

print(model_url)

primary_container = {
    "Image": container,
    "ModelDataUrl": model_url,
}

create_model_response2 = sm_client.create_model(
    ModelName=model_name, ExecutionRoleArn=role, PrimaryContainer=primary_container
)

print(create_model_response2["ModelArn"])

https://s3-us-east-1.amazonaws.com/sagemaker-us-east-1-180494263813/sagemaker/DEMO-xgboost-byo/DEMO-local-xgboost-model/model.tar.gz
arn:aws:sagemaker:us-east-1:180494263813:model/DEMO-local-xgboost-model2024-07-04-05-10-10
CPU times: user 27.7 ms, sys: 1.34 ms, total: 29 ms
Wall time: 1.15 s


### Create endpoint configuration

Create an endpoint configuration that describes how traffic is distributed across the models (whether split, shadowed, or sampled in some way). The endpoint configuration also specifies the instance type required for model deployment.

In [44]:
from time import gmtime, strftime

endpoint_config_name = "DEMO-XGBoostEndpointConfig-" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())

print(endpoint_config_name)

create_endpoint_config_response = sm_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            "InstanceType": "ml.m4.xlarge",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 1,
            "ModelName": model_name,
            "VariantName": "AllTraffic",
        }
    ],
)

print("Endpoint Config Arn: " + create_endpoint_config_response["EndpointConfigArn"])

DEMO-XGBoostEndpointConfig-2024-07-04-05-10-11
Endpoint Config Arn: arn:aws:sagemaker:us-east-1:180494263813:endpoint-config/DEMO-XGBoostEndpointConfig-2024-07-04-05-10-11
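The traffic distribution mentioned above is controlled by `InitialVariantWeight`: each variant receives a share of requests equal to its weight divided by the sum of all variant weights. The sketch below shows that arithmetic for a two-variant split. The variant and model names are hypothetical; the notebook itself deploys a single variant that receives all traffic.

```python
def traffic_shares(production_variants):
    """Map each variant name to its weight divided by the sum of all weights."""
    total = sum(v["InitialVariantWeight"] for v in production_variants)
    return {
        v["VariantName"]: v["InitialVariantWeight"] / total
        for v in production_variants
    }


# Hypothetical two-variant configuration; all names are illustrative only.
variants = [
    {
        "VariantName": "VariantA",
        "ModelName": "example-model-a",
        "InstanceType": "ml.m4.xlarge",
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 3,
    },
    {
        "VariantName": "VariantB",
        "ModelName": "example-model-b",
        "InstanceType": "ml.m4.xlarge",
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 1,
    },
]
shares = traffic_shares(variants)
print(shares)
```

A list shaped like `variants` is what `create_endpoint_config` expects for `ProductionVariants`.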


### Create endpoint
Lastly, create the endpoint that serves the model by specifying the name and the configuration defined above. The end result is an endpoint that can be validated and incorporated into production applications. Endpoint creation takes several minutes; the run recorded below took about 4 minutes.

In [45]:
%%time
import time

endpoint_name = "DEMO-XGBoostEndpoint-" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print(endpoint_name)
create_endpoint_response = sm_client.create_endpoint(
    EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name
)
print(create_endpoint_response["EndpointArn"])

resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
status = resp["EndpointStatus"]
print("Status: " + status)

while status == "Creating":
    time.sleep(60)
    resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
    status = resp["EndpointStatus"]
    print("Status: " + status)

print("Arn: " + resp["EndpointArn"])
print("Status: " + status)

DEMO-XGBoostEndpoint-2024-07-04-05-10-12
arn:aws:sagemaker:us-east-1:180494263813:endpoint/DEMO-XGBoostEndpoint-2024-07-04-05-10-12
Status: Creating
Status: Creating
Status: Creating
Status: Creating
Status: InService
Arn: arn:aws:sagemaker:us-east-1:180494263813:endpoint/DEMO-XGBoostEndpoint-2024-07-04-05-10-12
Status: InService
CPU times: user 91.1 ms, sys: 5.15 ms, total: 96.3 ms
Wall time: 4min 1s
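The polling loop above stops as soon as the status is no longer `Creating`, which also happens when creation fails. As an alternative, boto3 provides a built-in `endpoint_in_service` waiter that polls `describe_endpoint` and raises an error if the endpoint does not reach `InService`. A minimal sketch, with a hypothetical helper name and assumed polling settings:

```python
def wait_until_in_service(sm_client, endpoint_name, delay_seconds=30, max_attempts=60):
    """Block until the endpoint is InService, using the built-in boto3 waiter."""
    waiter = sm_client.get_waiter("endpoint_in_service")
    # The waiter polls describe_endpoint and raises WaiterError if creation
    # fails or the maximum number of attempts is exhausted.
    waiter.wait(
        EndpointName=endpoint_name,
        WaiterConfig={"Delay": delay_seconds, "MaxAttempts": max_attempts},
    )
    return sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]


# Example call inside the notebook (left commented out; requires AWS credentials):
# status = wait_until_in_service(sm_client, endpoint_name)
```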


### Validate the model for use
Now you can invoke the endpoint created above through the SageMaker runtime client and generate classifications from the model.

In [46]:
# "sagemaker-runtime" is the documented boto3 service name for invoking endpoints.
runtime_client = boto3.client("sagemaker-runtime")

Let's generate predictions. We'll read the CSV data from the test data file.

In [51]:
%%time
import json


file_name = "test_point.csv"  # customize to your test file

with open(file_name, "r") as f:
    payload = f.read().strip()
    
    
print("Payload :\n")

print(payload)
print()

#payload = json.loads(payload)

response = runtime_client.invoke_endpoint(
    EndpointName=endpoint_name, ContentType="text/csv", Body=payload
)

print("Results :\n")
print()
# The response body holds one value per input row; here the values are class labels.
result = response["Body"].read().decode("ascii")
print("\nPredicted class labels: {}.".format(result))

Payload :

5.400000000000000355e+00,3.000000000000000000e+00,4.500000000000000000e+00,1.500000000000000000e+00
5.599999999999999645e+00,3.000000000000000000e+00,4.099999999999999645e+00,1.300000000000000044e+00
6.299999999999999822e+00,2.799999999999999822e+00,5.099999999999999645e+00,1.500000000000000000e+00
6.000000000000000000e+00,3.000000000000000000e+00,4.799999999999999822e+00,1.800000000000000044e+00
5.099999999999999645e+00,3.299999999999999822e+00,1.699999999999999956e+00,5.000000000000000000e-01

Results :



Predicted class labels: 1.0
1.0
1.0
2.0
0.0
.
CPU times: user 23.3 ms, sys: 0 ns, total: 23.3 ms
Wall time: 99 ms
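The endpoint output can be compared with the local predictions from `mymodel.predict` shown earlier. The sketch below parses the response text into integer labels and also shows how a client could serialize feature rows into a `text/csv` body. Both helper names are hypothetical. The sample values are copied from the outputs in this notebook.

```python
def rows_to_csv(rows):
    """Serialize rows of numbers into a newline-separated CSV request body."""
    return "\n".join(",".join(str(value) for value in row) for row in rows)


def parse_labels(body_text):
    """Convert the response text into a list of integer class labels.

    Accepts values separated by newlines or commas.
    """
    return [int(float(token)) for token in body_text.replace(",", " ").split()]


# Values copied from the outputs shown in this notebook.
local_predictions = [1, 1, 1, 2, 0]
endpoint_text = "1.0\n1.0\n1.0\n2.0\n0.0\n"

endpoint_labels = parse_labels(endpoint_text)
print(endpoint_labels == local_predictions)
print(rows_to_csv([[5.4, 3.0, 4.5, 1.5], [5.1, 3.3, 1.7, 0.5]]))
```

A client outside the notebook could pass the string returned by `rows_to_csv` as the `Body` argument of `invoke_endpoint` with `ContentType="text/csv"`.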


### (Optional) Delete the Endpoint

If you're ready to be done with this notebook, please run the delete_endpoint line in the cell below.  This will remove the hosted endpoint you created and avoid any charges from a stray instance being left on.

In [52]:
sm_client.delete_endpoint(EndpointName=endpoint_name)

{'ResponseMetadata': {'RequestId': 'aaf9574a-a3c0-4f91-acb1-1cbb22ee839d',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'aaf9574a-a3c0-4f91-acb1-1cbb22ee839d',
   'content-type': 'application/x-amz-json-1.1',
   'date': 'Thu, 04 Jul 2024 05:19:15 GMT',
   'content-length': '0'},
  'RetryAttempts': 0}}
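Deleting the endpoint removes the hosted instance, but the endpoint configuration and the model created earlier remain in the account, as does the `model.tar.gz` object in S3. The sketch below removes the SageMaker resources with the boto3 calls `delete_endpoint`, `delete_endpoint_config`, and `delete_model`. The helper name and the `endpoint_already_deleted` flag are hypothetical.

```python
def delete_hosting_resources(
    sm_client,
    endpoint_name,
    endpoint_config_name,
    model_name,
    endpoint_already_deleted=False,
):
    """Delete the endpoint (unless already deleted), its configuration, and the model."""
    if not endpoint_already_deleted:
        sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
    sm_client.delete_model(ModelName=model_name)


# Example call after running the cell above (left commented out; requires AWS credentials):
# delete_hosting_resources(
#     sm_client, endpoint_name, endpoint_config_name, model_name,
#     endpoint_already_deleted=True,
# )
```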