## Step 1: Set Up a SageMaker Studio Notebook

In [None]:
%pip install --upgrade -q aiobotocore

In [None]:
%pip install --upgrade botocore

In [None]:
%pip install --upgrade -q awscli

In [2]:
import boto3
import sagemaker
import time

from sagemaker.image_uris import retrieve

s3 = boto3.resource('s3')

sagemaker_session = sagemaker.Session()
default_bucket = sagemaker_session.default_bucket()
write_prefix = "housing-prices-prediction-mme-demo"

region = sagemaker_session.boto_region_name
s3_client = boto3.client("s3", region_name=region)
sm_client = boto3.client("sagemaker", region_name=region)
sm_runtime_client = boto3.client("sagemaker-runtime", region_name=region)
role = sagemaker.get_execution_role()

# S3 locations used for parameterizing the notebook run
read_bucket = "sagemaker-sample-files"
read_prefix = "models/house_price_prediction"
model_prefix = "models/xgb-hpp"

# S3 location of trained model artifact
model_artifacts = f"s3://{default_bucket}/{model_prefix}/"

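# Locations; one trained model artifact (<location>.tar.gz) exists per entry
location = ['Chicago_IL', 'Houston_TX', 'NewYork_NY', 'LosAngeles_CA']

# Single test record sent to each target model in Step 3
test_data = [1997, 2527, 6, 2.5, 0.57, 1]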


In [3]:
# Copy the model artifacts from the public S3 bucket to the session's default S3 bucket

bucket = s3.Bucket(default_bucket)
for loc in location:
    copy_source = {'Bucket': read_bucket, 'Key': f"{read_prefix}/{loc}.tar.gz"}
    bucket.copy(copy_source, f"{model_prefix}/{loc}.tar.gz")

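Before creating the endpoint, it can help to confirm that all four artifacts actually landed under the model prefix. The following is a minimal sketch, not part of the original notebook: `list_keys` and `missing_artifacts` are hypothetical helper names, and the only S3 call they rely on is `list_objects_v2` on a client such as `s3_client`.

```python
def list_keys(s3_client, bucket, prefix):
    """Return every object key under `prefix`, following pagination."""
    keys = []
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3_client.list_objects_v2(**kwargs)
        # "Contents" is absent from the response when no key matches
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
        if not page.get("IsTruncated"):
            return keys
        kwargs["ContinuationToken"] = page["NextContinuationToken"]


def missing_artifacts(s3_client, bucket, prefix, names):
    """Return the expected `<prefix>/<name>.tar.gz` keys that are not in the bucket."""
    present = set(list_keys(s3_client, bucket, prefix))
    expected = [f"{prefix}/{name}.tar.gz" for name in names]
    return [key for key in expected if key not in present]
```

In this notebook, `missing_artifacts(s3_client, default_bucket, model_prefix, location)` should return an empty list once the copy cell has finished.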
## Step 2: Create a Real-Time Inference endpoint

To deploy a model, we need to implement three steps:

* Create a SageMaker model from the model artifact
* Create an endpoint configuration to specify properties, including instance type and count
* Create the endpoint using the endpoint configuration

In [4]:
# Create a SageMaker model from the trained model artifacts stored in Amazon S3.

# The create_model method takes the container image (for this model, the SageMaker
# managed XGBoost image), the Amazon S3 location of the model artifacts, and the
# execution role as parameters. Setting "Mode" to "MultiModel" in primary_container
# lets the endpoint load any artifact under the ModelDataUrl prefix on demand.

# Retrieve the SageMaker managed XGBoost image
training_image = retrieve(framework="xgboost", region=region, version="1.3-1")

# Specify a unique model name that does not already exist
model_name = "housing-prices-prediction-mme-xgb"
primary_container = {
                     "Image": training_image,
                     "ModelDataUrl": model_artifacts,
                     "Mode": "MultiModel"
                    }

model_matches = sm_client.list_models(NameContains=model_name)["Models"]
if not model_matches:
    model = sm_client.create_model(ModelName=model_name,
                                   PrimaryContainer=primary_container,
                                   ExecutionRoleArn=role)
else:
    print(f"Model with name {model_name} already exists! Change model name to create new")

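The `NameContains` filter used in the cell above matches substrings, so an unrelated resource whose name merely contains `model_name` would also make the check report an existing model. A minimal sketch of an exact-name check follows; `has_exact_name` is a hypothetical helper, and the sample summaries (including the `-v2` name) are made up for illustration.

```python
def has_exact_name(summaries, key, name):
    """Return True if any summary dict has exactly `name` under `key`."""
    return any(item.get(key) == name for item in summaries)


# Made-up list_models-style summaries, for illustration only
sample_models = [
    {"ModelName": "housing-prices-prediction-mme-xgb-v2"},
    {"ModelName": "another-model"},
]

# A substring filter would match the "-v2" model; an exact comparison does not
exists = has_exact_name(sample_models, "ModelName", "housing-prices-prediction-mme-xgb")
```

The same helper applies to the later cells by passing `"EndpointConfigName"` or `"EndpointName"` as the key. Note that the list calls return results one page at a time, so a full check would also need to follow pagination.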

In [5]:
# Configure the endpoint by using the Boto3 create_endpoint_config method.
# The main inputs to the create_endpoint_config method are the endpoint configuration name and 
# variant information, such as inference instance type and count, the name of the model to be deployed,
# and the traffic share the endpoint should handle.

# Endpoint Config name
endpoint_config_name = f"{model_name}-endpoint-config"

# Create the endpoint config if one with the same name does not exist
endpoint_config_matches = sm_client.list_endpoint_configs(NameContains=endpoint_config_name)["EndpointConfigs"]
if not endpoint_config_matches:
    endpoint_config_response = sm_client.create_endpoint_config(
        EndpointConfigName=endpoint_config_name,
        ProductionVariants=[
            {
                "InstanceType": "ml.m5.xlarge",
                "InitialInstanceCount": 1,
                "InitialVariantWeight": 1,
                "ModelName": model_name,
                "VariantName": "AllTraffic",
            }
        ],
    )
else:
    print(f"Endpoint config with name {endpoint_config_name} already exists! Change endpoint config name to create new")


In [6]:
# Let's create the endpoint. 
# The create_endpoint method takes the endpoint configuration as a parameter, 
# and deploys the model specified in the endpoint configuration to a compute instance. 

# Endpoint name
endpoint_name = f"{model_name}-endpoint"

endpoint_matches = sm_client.list_endpoints(NameContains=endpoint_name)["Endpoints"]
if not endpoint_matches:
    endpoint_response = sm_client.create_endpoint(
                                                  EndpointName=endpoint_name,
                                                  EndpointConfigName=endpoint_config_name
                                                 )
else:
    print(f"Endpoint with name {endpoint_name} already exists! Change endpoint name to create new")

resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
status = resp["EndpointStatus"]
while status == "Creating":
    print(f"Endpoint Status: {status}...")
    time.sleep(60)
    resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
    status = resp["EndpointStatus"]
print(f"Endpoint Status: {status}")

Endpoint Status: Creating...
Endpoint Status: Creating...
Endpoint Status: Creating...
Endpoint Status: InService

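The polling loop above exits as soon as the status is no longer `Creating`, which includes the `Failed` state, and it has no upper time bound. A minimal sketch of a stricter wait follows; `wait_for_endpoint` and its default timings are assumptions made for illustration, not part of the original notebook. Boto3 also offers a built-in waiter through `sm_client.get_waiter("endpoint_in_service")`, which may be preferable when custom progress messages are not needed.

```python
import time


def wait_for_endpoint(sm_client, endpoint_name, poll_seconds=60,
                      timeout_seconds=1800, sleep=time.sleep):
    """Poll describe_endpoint until InService; raise on failure or timeout."""
    waited = 0
    while True:
        resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = resp["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            reason = resp.get("FailureReason", "no reason given")
            raise RuntimeError(f"Endpoint {endpoint_name} failed: {reason}")
        if waited >= timeout_seconds:
            raise TimeoutError(
                f"Endpoint {endpoint_name} still {status} after {waited} seconds"
            )
        print(f"Endpoint Status: {status}...")
        sleep(poll_seconds)
        waited += poll_seconds
```

The `sleep` parameter exists only so the function can be exercised without real delays; in the notebook, `wait_for_endpoint(sm_client, endpoint_name)` is enough.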

## Step 3: Invoke The Inference Endpoint

In [7]:
# Convert the elements in test_data to a single string payload
payload = ' '.join([str(elem) for elem in test_data])

for loc in location:
    predicted_value = sm_runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        TargetModel=f"{loc}.tar.gz",
        ContentType="text/csv",
        Body=payload,
    )
    print(f"Predicted Value for {loc} target model:\n ${predicted_value['Body'].read().decode('utf-8')}")


Predicted Value for Chicago_IL target model:
 $[392504.75]
Predicted Value for Houston_TX target model:
 $[387296.5625]
Predicted Value for NewYork_NY target model:
 $[390451.53125]
Predicted Value for LosAngeles_CA target model:
 $[379517.5]

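The recorded output shows each response body as a bracketed value such as `[392504.75]`. To use the prediction as a number, the body can be parsed. The sketch below assumes the body is either a one-element JSON-style list, as in the output above, or a bare number; `parse_prediction` is a hypothetical helper, and other container versions may return a different format.

```python
import json


def parse_prediction(body_text):
    """Convert a response body such as '[392504.75]' or '392504.75' to a float."""
    value = json.loads(body_text.strip())
    if isinstance(value, list):
        if len(value) != 1:
            raise ValueError(f"Expected one prediction, got {len(value)}")
        value = value[0]
    return float(value)


# Format the Chicago_IL prediction from the recorded output as currency
formatted = f"${parse_prediction('[392504.75]'):,.2f}"
print(formatted)  # prints $392,504.75
```

In the invocation loop, the argument would be the decoded body, that is, `predicted_value['Body'].read().decode('utf-8')`.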

## Step 4: Clean Up Resources

In [None]:
# Delete the endpoint first so that it stops incurring charges
sm_client.delete_endpoint(EndpointName=endpoint_name)

# Delete endpoint configuration
sm_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)

# Delete model
sm_client.delete_model(ModelName=model_name)


#### To delete the S3 bucket:

* Open the Amazon S3 console. On the navigation bar, choose Buckets, then sagemaker-<your-Region>-<your-account-id>, and then select the checkbox next to models/xgb-hpp. Then, choose Delete.
* In the Delete objects dialog box, verify that you have selected the proper object to delete, and then enter permanently delete in the Permanently delete objects confirmation box.
* Once this is complete and the bucket is empty, you can delete the sagemaker-<your-Region>-<your-account-id> bucket by following the same procedure again.

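The copied model artifacts can also be removed from the notebook instead of the console. The following is a minimal sketch under stated assumptions: `delete_prefix` is a hypothetical helper, the bucket is assumed to be unversioned, and the deletion is irreversible. It uses only the `list_objects_v2` and `delete_objects` client calls.

```python
def delete_prefix(s3_client, bucket, prefix):
    """Delete every object under `prefix`; return how many keys were deleted."""
    deleted = 0
    while True:
        # Re-list after each batch; a page never exceeds the delete_objects limit
        page = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
        contents = page.get("Contents", [])
        if not contents:
            return deleted
        resp = s3_client.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": obj["Key"]} for obj in contents], "Quiet": True},
        )
        if resp.get("Errors"):
            # Stop instead of looping forever on keys that cannot be deleted
            raise RuntimeError(f"Could not delete some objects: {resp['Errors']}")
        deleted += len(contents)
```

For this notebook the call would be `delete_prefix(s3_client, default_bucket, f"{model_prefix}/")`. The trailing slash keeps the prefix from matching unrelated keys that merely start with `models/xgb-hpp`.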
#### To delete apps:

* To delete the SageMaker Studio apps, do the following: On the SageMaker Studio console, choose studio-user, and then delete all the apps listed under Apps by choosing Delete app. Wait until the Status changes to Deleted.

* If you ran the CloudFormation template at the beginning of this tutorial to create a new SageMaker Studio domain, continue with the following steps to delete the domain, user, and the resources created by the CloudFormation template:

* Open the CloudFormation console. In the CloudFormation pane, choose Stacks. From the status dropdown list, select Active. Under Stack name, choose CFN-SM-IM-Lambda-catalog to open the stack details page. On the CFN-SM-IM-Lambda-catalog stack details page, choose Delete to delete the stack along with the resources it created in Step 1.
