## XGBoost Model Deployment in Amazon SageMaker

#### This notebook should be run in an Amazon SageMaker notebook instance.


#### Before running this notebook
Upload the pre-trained model and test_point.csv from your laptop to the same folder as this notebook file. test_point.csv contains a few sample test records in CSV format.


This notebook loads the pre-trained XGBoost model and saves it to an S3 bucket in the .tar.gz format required by SageMaker.
It then creates a SageMaker model from the model file stored in S3.
Finally, it configures and creates an endpoint to deploy the model, and tests the endpoint by invoking it to get predictions.

#### Please remember not to run the last "Delete the Endpoint" cell if you want to test the deployed model from a client. 


##### After the exercise is over, clean up the SageMaker resources as described in https://docs.aws.amazon.com/sagemaker/latest/dg/ex1-cleanup.html to avoid charges for resources left behind.




# Set up Env

In [1]:
%%time

import sys
import os

import sagemaker
from sagemaker import get_execution_role

# Add the parent directory to the sys.path
sys.path.insert(0, os.path.abspath(".."))

# Define IAM role
role = get_execution_role()

# Establish S3 bucket connection
import boto3

s3 = boto3.client("s3")
bucket = "capstone-bucket-4-friends"
region = boto3.Session().region_name
bucket_path = "https://s3-{}.amazonaws.com/{}".format(region, bucket)

# Take a look at current dir
print(os.getcwd())

from file_utilities import s3_download

/home/sagemaker-user/capstone-2024-summer/src/rachel/try_deploy
CPU times: user 164 ms, sys: 23.8 ms, total: 188 ms
Wall time: 187 ms


In [2]:
print(role)
print(region)
print(bucket)
print(bucket_path)

arn:aws:iam::975050273734:role/service-role/SageMaker-ExecutionRole-20240612T130383
us-east-1
capstone-bucket-4-friends
https://s3-us-east-1.amazonaws.com/capstone-bucket-4-friends


#### Install xgboost, which is needed to load the model from the joblib dump file and test it before deployment.
#### Note that the XGBoost version should match the version used to train the model locally on the laptop.

In [8]:
!conda install -y -c conda-forge xgboost==1.7.6 -q

Channels:
 - conda-forge
 - defaults
Platform: linux-64
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.
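
The version requirement above can be checked explicitly before loading the model. The following is a minimal sketch, assuming the model was trained with the version installed above (1.7.6); `EXPECTED_XGBOOST_VERSION`, `installed_version` and `versions_match` are illustrative names that are not part of the original notebook.

```python
from importlib import metadata

# Assumption: the model was trained with the same version installed above.
EXPECTED_XGBOOST_VERSION = "1.7.6"


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def versions_match(installed, expected):
    """Compare major.minor.patch only, ignoring any local build suffix."""
    if installed is None:
        return False
    return installed.split("+")[0].split(".")[:3] == expected.split(".")[:3]


found = installed_version("xgboost")
if not versions_match(found, EXPECTED_XGBOOST_VERSION):
    print(f"Warning: xgboost {found} found, expected {EXPECTED_XGBOOST_VERSION}")
```

A mismatch only prints a warning here; whether a mismatched version can still load a given model depends on the versions involved.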



### Load the pre-trained model and test it before deployment

In [9]:
model_file_name = "DEMO-local-xgboost-model"

In [10]:
import joblib
import xgboost

mymodel = joblib.load(model_file_name)

In [11]:
import numpy as np

file_name = "test_point.csv"

with open(file_name, "r") as f:
    mypayload = np.loadtxt(f, delimiter=",")
    
print(mypayload)    

[[5.4 3.  4.5 1.5]
 [5.6 3.  4.1 1.3]
 [6.3 2.8 5.1 1.5]
 [6.  3.  4.8 1.8]
 [5.1 3.3 1.7 0.5]]


In [12]:
mymodel.predict(mypayload)

array([1, 1, 1, 2, 0], dtype=int32)

#### Create a tar.gz model file, as this is the format required by SageMaker for deployment.

In [13]:
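# Save the underlying Booster in XGBoost's own model format.
# Note: this overwrites the joblib dump file of the same name.
mymodel.get_booster().save_model(model_file_name)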

In [14]:
!tar czvf model.tar.gz $model_file_name

DEMO-local-xgboost-model
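
The `tar` command above stores the model file at the top level of the archive. The same archive can be built and inspected from Python, which may help when the model file lives in another directory. This is a sketch using the standard `tarfile` module; `make_model_archive` and `list_archive` are hypothetical helper names, and the demonstration uses a placeholder file rather than a real model.

```python
import os
import tarfile
import tempfile


def make_model_archive(model_path, archive_path="model.tar.gz"):
    """Write a gzip-compressed tar with the model file at the archive root."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname drops any directory prefix so the file sits at the top level.
        tar.add(model_path, arcname=os.path.basename(model_path))
    return archive_path


def list_archive(archive_path):
    """Return the member names stored in the archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


# Demonstration with a placeholder file standing in for the real model.
with tempfile.TemporaryDirectory() as tmp:
    demo_model = os.path.join(tmp, "DEMO-local-xgboost-model")
    with open(demo_model, "wb") as f:
        f.write(b"placeholder bytes, not a real model")
    demo_archive = make_model_archive(demo_model, os.path.join(tmp, "model.tar.gz"))
    demo_members = list_archive(demo_archive)
    print(demo_members)
```

In the notebook, `list_archive("model.tar.gz")` gives a quick check that the archive contains only the expected model file.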


### Upload the pre-trained model to S3

In [17]:
s3_path = "try_deploy/model.tar.gz"

In [19]:
s3.upload_file(
    f"/home/sagemaker-user/capstone-2024-summer/src/rachel/{s3_path}",
    bucket,
    s3_path,
)
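
`upload_file` returns nothing on success, so it can be useful to confirm that the object landed where expected. The sketch below compares the size reported by S3 with the local file size. It assumes an S3 client like the `s3` object created earlier and uses its `list_objects_v2` call; `uploaded_size` and `upload_matches` are hypothetical helper names.

```python
import os


def uploaded_size(s3_client, bucket, key):
    """Return the size in bytes of the object at bucket/key, or None if absent."""
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=key)
    for obj in response.get("Contents", []):
        if obj["Key"] == key:
            return obj["Size"]
    return None


def upload_matches(s3_client, bucket, key, local_path):
    """True when the object exists and has the same size as the local file."""
    return uploaded_size(s3_client, bucket, key) == os.path.getsize(local_path)


# In the notebook, with `s3`, `bucket` and `s3_path` defined as above:
# upload_matches(s3, bucket, s3_path, "model.tar.gz")
```

A size comparison is only a coarse check; it does not verify the content of the object.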

### Set up hosting for the model
#### Import model into hosting
This involves creating a SageMaker model from the model file previously uploaded to S3.

#### Create a SageMaker model

In [24]:
from sagemaker.image_uris import retrieve

# ECR image paths for XGBoost in us-east-1:
# https://docs.aws.amazon.com/sagemaker/latest/dg-ecr-paths/ecr-us-east-1.html#xgboost-us-east-1
container = retrieve(framework='xgboost', region=region, version='1.7-1')

In [26]:
%%time
from time import gmtime, strftime

model_name = model_file_name + strftime("%Y-%m-%d-%H-%M-%S", gmtime())

model_url = "https://s3-{}.amazonaws.com/{}/{}".format(region, bucket, s3_path)

sm_client = boto3.client("sagemaker")

print(model_url)

primary_container = {
    "Image": container,
    "ModelDataUrl": model_url,
}

create_model_response2 = sm_client.create_model(
    ModelName=model_name, ExecutionRoleArn=role, PrimaryContainer=primary_container
)

print(create_model_response2["ModelArn"])

https://s3-us-east-1.amazonaws.com/capstone-bucket-4-friends/try_deploy/model.tar.gz
arn:aws:sagemaker:us-east-1:975050273734:model/DEMO-local-xgboost-model2024-07-01-16-24-33
CPU times: user 39.4 ms, sys: 4.51 ms, total: 44 ms
Wall time: 734 ms
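
The cell above passes an HTTPS-style URL as `ModelDataUrl`. The same parameter also accepts an `s3://` URI, which does not depend on the region-specific host name. A small sketch of building and splitting such a URI follows; `s3_uri` and `split_s3_uri` are illustrative helpers, not part of the notebook, and the bucket and key are the values used above.

```python
def s3_uri(bucket, key):
    """Build an s3:// URI from a bucket name and object key."""
    return "s3://{}/{}".format(bucket, key.lstrip("/"))


def split_s3_uri(uri):
    """Split an s3:// URI back into (bucket, key)."""
    if not uri.startswith("s3://"):
        raise ValueError("not an s3:// URI: {}".format(uri))
    bucket, _, key = uri[len("s3://"):].partition("/")
    return bucket, key


model_uri = s3_uri("capstone-bucket-4-friends", "try_deploy/model.tar.gz")
print(model_uri)
```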


### Create endpoint configuration

Create an endpoint configuration that describes how traffic is distributed across the models, whether split, shadowed, or sampled in some way. The endpoint configuration also specifies the instance type required for model deployment.

In [27]:
from time import gmtime, strftime

endpoint_config_name = "DEMO-XGBoostEndpointConfig-" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())

print(endpoint_config_name)

create_endpoint_config_response = sm_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            "InstanceType": "ml.m4.xlarge",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 1,
            "ModelName": model_name,
            "VariantName": "AllTraffic",
        }
    ],
)

print("Endpoint Config Arn: " + create_endpoint_config_response["EndpointConfigArn"])

DEMO-XGBoostEndpointConfig-2024-07-01-16-24-56
Endpoint Config Arn: arn:aws:sagemaker:us-east-1:975050273734:endpoint-config/DEMO-XGBoostEndpointConfig-2024-07-01-16-24-56
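
The configuration above has a single variant that receives all traffic. To illustrate how a split would work, the sketch below builds two variant entries and computes the share of requests each would receive, which is its weight divided by the sum of all weights. The model names `model-a` and `model-b` and the helpers `make_variant` and `traffic_shares` are hypothetical and used only for illustration.

```python
def make_variant(name, model_name, weight, instance_type="ml.m4.xlarge", count=1):
    """Build one ProductionVariants entry in the shape used in the cell above."""
    return {
        "VariantName": name,
        "ModelName": model_name,
        "InstanceType": instance_type,
        "InitialInstanceCount": count,
        "InitialVariantWeight": weight,
    }


def traffic_shares(variants):
    """Return the fraction of requests each variant receives.

    Each share is the variant's weight divided by the sum of all weights.
    """
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


# Hypothetical model names, for illustration only.
variants = [
    make_variant("Current", "model-a", weight=9),
    make_variant("Candidate", "model-b", weight=1),
]
shares = traffic_shares(variants)
print(shares)
```

Passing such a list as `ProductionVariants` would require both models to exist already; this notebook creates only one.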


### Create endpoint
Lastly, create the endpoint that serves the model by specifying the name and the configuration defined above. The result is an endpoint that can be validated and incorporated into production applications. Creation takes several minutes; the run recorded below took about 4 minutes.

In [28]:
%%time
import time

endpoint_name = "DEMO-XGBoostEndpoint-" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
print(endpoint_name)
create_endpoint_response = sm_client.create_endpoint(
    EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name
)
print(create_endpoint_response["EndpointArn"])

resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
status = resp["EndpointStatus"]
print("Status: " + status)

while status == "Creating":
    time.sleep(60)
    resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
    status = resp["EndpointStatus"]
    print("Status: " + status)

print("Arn: " + resp["EndpointArn"])
print("Status: " + status)

DEMO-XGBoostEndpoint-2024-07-01-16-25-05
arn:aws:sagemaker:us-east-1:975050273734:endpoint/DEMO-XGBoostEndpoint-2024-07-01-16-25-05
Status: Creating
Status: Creating
Status: Creating
Status: Creating
Status: InService
Arn: arn:aws:sagemaker:us-east-1:975050273734:endpoint/DEMO-XGBoostEndpoint-2024-07-01-16-25-05
Status: InService
CPU times: user 69.7 ms, sys: 6.37 ms, total: 76 ms
Wall time: 4min 1s
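
The polling loop above exits as soon as the status is no longer `Creating`, so a failed creation ends the loop silently, and there is no upper bound on the wait. A variant with a timeout and an explicit failure check could look like the sketch below. `wait_for_endpoint` is a hypothetical helper; it takes a callable so that it can wrap `sm_client.describe_endpoint`, and it assumes the response may carry a `FailureReason` field when creation fails.

```python
import time


def wait_for_endpoint(describe, timeout_s=1800, poll_s=60, sleep=time.sleep):
    """Poll until the endpoint is InService, or raise on failure or timeout.

    `describe` is a zero-argument callable returning a describe_endpoint-style
    dict, for example:
        lambda: sm_client.describe_endpoint(EndpointName=endpoint_name)
    """
    waited = 0
    while True:
        resp = describe()
        status = resp["EndpointStatus"]
        if status == "InService":
            return resp
        if status != "Creating":
            reason = resp.get("FailureReason", "no reason reported")
            raise RuntimeError("Endpoint status {}: {}".format(status, reason))
        if waited >= timeout_s:
            raise TimeoutError("Endpoint still Creating after {} s".format(waited))
        sleep(poll_s)
        waited += poll_s
```

The `sleep` parameter exists only so the function can be exercised without real waiting; in the notebook the default is sufficient.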


### Validate the model for use
Now create a runtime client and use the endpoint name from the previous step to generate classifications from the deployed model.

In [29]:
runtime_client = boto3.client("sagemaker-runtime")

Let's generate predictions. We'll use the CSV data from the test data file.

In [30]:
%%time
file_name = "test_point.csv"

with open(file_name, "r") as f:
    payload = f.read().strip()
    
    
print("Payload :\n")

print(payload)
print()

response = runtime_client.invoke_endpoint(
    EndpointName=endpoint_name, ContentType="text/csv", Body=payload
)

##print(response)

print("Results :\n")
print()

result = response["Body"].read().decode("ascii")

# Unpack response (the endpoint returns predicted class labels, one per row)
print("\nPredicted classes: {}.".format(result))

Payload :

5.400000000000000355e+00,3.000000000000000000e+00,4.500000000000000000e+00,1.500000000000000000e+00
5.599999999999999645e+00,3.000000000000000000e+00,4.099999999999999645e+00,1.300000000000000044e+00
6.299999999999999822e+00,2.799999999999999822e+00,5.099999999999999645e+00,1.500000000000000000e+00
6.000000000000000000e+00,3.000000000000000000e+00,4.799999999999999822e+00,1.800000000000000044e+00
5.099999999999999645e+00,3.299999999999999822e+00,1.699999999999999956e+00,5.000000000000000000e-01

Results :



Predicted classes: 1.0
1.0
1.0
2.0
0.0
.
CPU times: user 14.7 ms, sys: 0 ns, total: 14.7 ms
Wall time: 170 ms
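
The endpoint response is plain text, so it has to be parsed before it can be compared with the local `predict` output from earlier. A minimal sketch, assuming the response holds one numeric value per row separated by newlines (as in the output above) or by commas; `parse_predictions` is an illustrative helper name.

```python
def parse_predictions(body_text):
    """Convert a text response body into a list of integer class labels."""
    # Treat commas like whitespace so both separators are handled.
    tokens = body_text.replace(",", " ").split()
    return [int(float(token)) for token in tokens]


# The response text as printed by the cell above.
sample_body = "1.0\n1.0\n1.0\n2.0\n0.0\n"
labels = parse_predictions(sample_body)
print(labels)
```

In the notebook, `parse_predictions(result)` can be compared with `mymodel.predict(mypayload).tolist()` to confirm that the deployed model agrees with the local one.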


### (Optional) Delete the Endpoint

If you are done with this notebook, run the delete_endpoint call in the cell below. This removes the hosted endpoint you created and avoids charges from a stray instance being left running.

In [31]:
sm_client.delete_endpoint(EndpointName=endpoint_name)

{'ResponseMetadata': {'RequestId': 'f36f93f9-4fae-4296-8332-b4ef2a484d53',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'f36f93f9-4fae-4296-8332-b4ef2a484d53',
   'content-type': 'application/x-amz-json-1.1',
   'date': 'Mon, 01 Jul 2024 16:35:40 GMT',
   'content-length': '0'},
  'RetryAttempts': 0}}
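
The call above removes only the endpoint. The endpoint configuration and the model created earlier in this notebook remain registered, and the model archive remains in S3. The sketch below removes the configuration and the model using the `delete_endpoint_config` and `delete_model` calls of the SageMaker client; `delete_config_and_model` is a hypothetical helper name, and it expects a client like the `sm_client` object created earlier.

```python
def delete_config_and_model(sm_client, endpoint_config_name, model_name):
    """Delete the endpoint configuration, then the model, and report both."""
    sm_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
    sm_client.delete_model(ModelName=model_name)
    return {"endpoint_config": endpoint_config_name, "model": model_name}


# In the notebook, after the endpoint has been deleted:
# delete_config_and_model(sm_client, endpoint_config_name, model_name)
```

Deleting the model here does not remove `model.tar.gz` from the S3 bucket; that object has to be deleted separately if it is no longer needed.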