# Deploying a Model Using SageMaker Endpoints

This notebook demonstrates how to register a model in the SageMaker Model Registry and then deploy it as a SageMaker Endpoint.

We'll walk through the following steps:
1. Set up necessary imports and configurations.
2. Register a model in the SageMaker Model Registry.
3. Retrieve the model from the SageMaker Model Registry.
4. Create an endpoint configuration.
5. Deploy the model as a SageMaker Endpoint.
6. Make predictions using the endpoint.

This example assumes a linear regression model, trained with the Linear Learner algorithm, that predicts house prices from input data in CSV format.

# Step 1: Import Libraries and Get Role
First, we import the SageMaker SDK for Python packages we are going to need and specify which S3 bucket we will be using.

In [None]:
# Step 1: Import necessary libraries
import boto3
import time
import sagemaker
from sagemaker import get_execution_role
from sagemaker.session import Session
from sagemaker import Model
from sagemaker import ModelPackage

# Define your SageMaker session and role
sagemaker_session = Session()
role = get_execution_role()

# Define S3 bucket and prefix
s3_bucket = sagemaker_session.default_bucket()
s3_prefix = 'model'  # No leading slash, so the S3 key does not start with an empty path segment

print(role)

# Step 2: Upload the Model Artifact and Create a SageMaker Model
The training process in lab 4 produced a model.tar.gz artifact, which is present here in the file system of our JupyterLab host. We want it in an accessible location, an S3 bucket, so that it can be registered in the model registry and used by SageMaker Endpoints.

In this lab we are in a new environment, and SageMaker doesn't know about our model artifact. So we need to wrap that artifact in a SageMaker model, an entity that pairs the S3 location of the artifact with an inference container image. This makes SageMaker aware of the model so that we can start to reference it.

SageMaker Model Registry is now our go-to location for storing all metadata about models, and we will be using it in this lab. However, it's important to understand why artifacts show up in specific locations in the UI. Remember that SageMaker Notebooks, Training Jobs and Inference existed long before SageMaker Studio. So first we are going to create a model so that SageMaker can reference our artifact and use it to create entities like endpoints for inference.

Once we have done that, we will use the newer model registry feature, which we find within SageMaker Studio.


In [None]:
# Upload the model artifact to S3
local_file_path = 'model.tar.gz'
s3_uri = sagemaker_session.upload_data(local_file_path, bucket=s3_bucket, key_prefix=s3_prefix)
print(f"File uploaded to: {s3_uri}")
# upload_data returns the S3 URI of the uploaded file, so reuse it instead of rebuilding the path
model_artifact = s3_uri

# Determine what AWS region we are in
region = sagemaker_session.boto_region_name

# Retrieve the Linear Learner image URI
container_image_uri = sagemaker.image_uris.retrieve(framework="linear-learner", region=region)

# Print path to container image
print(f"SageMaker Linear Learner Image URI: {container_image_uri}")

# Assign a meaningful name to the model
model_name = 'house-prices'  # Give it a name for identification

# Create the SageMaker model
model = Model(
    image_uri=container_image_uri,  # Use the appropriate image URI for your model's container
    model_data=model_artifact,      # S3 URI of the model artifact (model.tar.gz)
    role=role,                      # Execution role retrieved in Step 1
    name=model_name
)

# Create the model in SageMaker so that it shows up in the management console

model.create()


That might not have seemed exciting, but now navigate to the browser tab showing the AWS Management Console, which should still be open at SageMaker AI. In the left navigation bar, under the Inference heading, there is a menu option called Models. Click on it and you will find our model as a SageMaker entity! Click into the details and you will see the container image and the S3 location of the model artifact.

To see the same thing in the newer SageMaker Studio interface, switch to the browser tab that has Studio open. Below the applications panel in the left navigation bar, you will see an option called Models. Click into it and you will find an empty panel, because this view lists registered models, i.e. model packages registered in the SageMaker Model Registry.

But the model registry is optional, not mandatory. So to see the model we created (which points at our S3 model artifact), click on the deployable models tab and there it is!
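
The details shown in the console can also be read back through the API. The following is a minimal sketch, assuming a boto3 SageMaker client and the model name `house-prices` used in the cell above. `summarize_model` is a hypothetical helper name, not part of either SDK.

```python
def summarize_model(client, model_name):
    """Return the container image and artifact location of a SageMaker model.

    `client` is expected to behave like boto3.client('sagemaker').
    """
    description = client.describe_model(ModelName=model_name)
    # Single-container models report their container under 'PrimaryContainer'.
    container = description.get('PrimaryContainer', {})
    return {
        'name': description['ModelName'],
        'image': container.get('Image'),
        'model_data_url': container.get('ModelDataUrl'),
    }

# Example usage in the notebook (requires AWS credentials):
# import boto3
# print(summarize_model(boto3.client('sagemaker'), 'house-prices'))
```

The returned dictionary should show the same image URI and S3 path that the console displays.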


# Step 2: Create a Model Package Group in SageMaker Model Registry
Now we will start with the Model Registry. Remember, the model registry lets us track our models' lineage and attach an approval status to them. For example, a governance officer with the right permissions could approve models after inspecting their metrics. That approve action could then be a CI/CD trigger to deploy a model into a production AWS account!

For now, we are just going to see how to register a model into the model registry with a pending approval status, and we will approve it manually later on. Before we can register our model, however, we need to create a model package group in the model registry.

To create the model package group, we are going to use the AWS SDK for Python (boto3) rather than the SageMaker SDK for Python. It is common to switch between SDKs to perform specific steps. The SageMaker SDK is great for high-level ML actions, while boto3 is ideal for low-level configuration changes.


In [None]:
# Create a model package group
model_package_group_name = 'house-price-prediction-group'
model_package_group_description = 'KodeKloud Demystifying SageMaker model package group for house price prediction'
sagemaker_client = boto3.client('sagemaker')

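try:
    sagemaker_client.create_model_package_group(
        ModelPackageGroupName=model_package_group_name,
        ModelPackageGroupDescription=model_package_group_description
    )
    print(f'Created Model Package Group: {model_package_group_name}')

except sagemaker_client.exceptions.ResourceInUse:
    print(f'Model Package Group {model_package_group_name} already exists.')
except sagemaker_client.exceptions.ClientError as e:
    # An existing group may also be reported as a generic validation error
    if 'already exists' in e.response['Error']['Message']:
        print(f'Model Package Group {model_package_group_name} already exists.')
    else:
        raise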


# Step 3: Register Model into SageMaker Model Registry
OK, we now have a model package group in the model registry, but it's empty. You can switch browser tabs to SageMaker Studio, check under the Models item in the left navigation menu, and see the house-price-prediction-group.

Next we will register our model into this newly created model group. We will register it with a status of pending approval, so a manual approval is required before we use it.

In [None]:
# Register the model

model_approval_status = "PendingManualApproval"
customer_metadata_properties = {"ModelType": "HousePricePrediction"}

model_package = model.register(
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.m5.large"],
    transform_instances=["ml.m5.large"],
    model_package_group_name=model_package_group_name,
    approval_status=model_approval_status,
    customer_metadata_properties=customer_metadata_properties,
)

print(f"Model package version ARN: {model_package.model_package_arn}")


## Step 4: Retrieve the Model from the SageMaker Model Registry
In the previous step, we registered our model in the model registry. This helps us track versions of the model over time and manage approval for use. The model was registered in a pending approval state, which means it can't be used yet.

So before we run the next code cell, go to the SageMaker Studio tab of your browser and locate the model in your model registry.
It will be under Models -> Model Groups -> house-price-prediction-group -> Version 1

On the right side of the interface you will see a box marked Deploy and the approval status of the model showing as Pending Approval. Click on the status to see the drop-down list of options. Change the status from Pending Approval to Approved and accept the pop-up warning.
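
The same approval can be applied through the API, which is the kind of call a CI/CD pipeline would make. This is a minimal sketch, assuming a boto3 SageMaker client and the model package ARN printed by the registration cell. `approve_model_package` is a hypothetical helper name.

```python
def approve_model_package(client, model_package_arn, description=None):
    """Set a registered model version to 'Approved' via update_model_package.

    `client` is expected to behave like boto3.client('sagemaker').
    Returns the parameters that were sent, for logging.
    """
    params = {
        'ModelPackageArn': model_package_arn,
        'ModelApprovalStatus': 'Approved',
    }
    if description:
        # Optional free-text note stored alongside the approval decision.
        params['ApprovalDescription'] = description
    client.update_model_package(**params)
    return params

# Example usage in the notebook (requires AWS credentials):
# approve_model_package(sagemaker_client, model_package.model_package_arn)
```

Approving in the Studio UI and approving through this call should leave the model version in the same state, so use whichever suits your workflow.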


Once the model version is approved, we will retrieve the latest approved model package from the model package group.

In [None]:
# Fetch the latest approved model package from the model registry
response = sagemaker_client.list_model_packages(
    ModelPackageGroupName=model_package_group_name,
    ModelApprovalStatus='Approved',
    SortBy='CreationTime',
    SortOrder='Descending'
)

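# Fail with a clear message if no version has been approved yet
if not response['ModelPackageSummaryList']:
    raise RuntimeError(f'No approved model package found in {model_package_group_name}. Approve a version first.')

latest_model_package_arn = response['ModelPackageSummaryList'][0]['ModelPackageArn']
print(f'Latest Approved Model Package ARN: {latest_model_package_arn}')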
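
Note that the endpoint configuration in the next step refers to the model by `model_name`, which is the model created directly from the S3 artifact. If you would rather deploy the approved registry version, one option is to create a model whose container points at the model package ARN. The sketch below works under that assumption. The helper name and the example model name `house-prices-from-registry` are illustrative and do not appear elsewhere in this notebook.

```python
def create_model_from_package(client, model_name, model_package_arn, role_arn):
    """Create a deployable SageMaker model backed by a registry version.

    `client` is expected to behave like boto3.client('sagemaker').
    """
    response = client.create_model(
        ModelName=model_name,
        ExecutionRoleArn=role_arn,
        # The container is resolved from the approved model package.
        Containers=[{'ModelPackageName': model_package_arn}],
    )
    return response['ModelArn']

# Example usage in the notebook (requires AWS credentials):
# create_model_from_package(
#     sagemaker_client, 'house-prices-from-registry', latest_model_package_arn, role
# )
```

Passing that new model name as `ModelName` in the endpoint configuration would then serve the registry version.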

## Step 5: Create an Endpoint Configuration
At the end of lab 4, when we tested our model's inference capability after training, we used a high-level class from the SageMaker SDK for Python called the Predictor class. That approach abstracts away the detail of creating a SageMaker Endpoint config and then a SageMaker Endpoint from that config. Using the SageMaker SDK this way offers several advantages:

- Simplicity: It abstracts away many of the lower-level details that you'd need to manage with boto3.
- Consistency: It provides a consistent interface for various SageMaker operations.
- Integration: It integrates well with other SageMaker features and workflows.

However, we have seen that method already. So in this lab we are going to create the endpoint configuration and the endpoint separately, so you can see both methods. This means we will be using boto3 for this more explicit deployment.


In [None]:
# Create Endpoint Config

endpoint_config_name = "house-price-endpoint-config"

create_endpoint_config_response = sagemaker_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[{
        'InstanceType': 'ml.m5.large',
        'InitialVariantWeight': 1,
        'InitialInstanceCount': 1,
        'ModelName': model_name,
        'VariantName': 'AllTraffic'
    }]
)

print(f"Endpoint configuration created: {create_endpoint_config_response['EndpointConfigArn']}")

To check that we successfully created an endpoint configuration, let's switch browser tabs to the AWS Management Console, which should be open at SageMaker AI. In the navigation bar on the left, click on Inference and then Endpoint configurations, and you will see your newly created endpoint config.

At this point, we have only defined the configuration of an endpoint, not an endpoint itself. We still need to create an endpoint that references this endpoint configuration.

In the future you may wish to update an endpoint to swap it over to a different endpoint configuration.
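
Swapping an existing endpoint over to another configuration is a single `update_endpoint` call in boto3. This is a minimal sketch, assuming a boto3 SageMaker client, an endpoint that already exists, and a second endpoint configuration that has already been created. `switch_endpoint_config` and the configuration name in the usage comment are hypothetical.

```python
def switch_endpoint_config(client, endpoint_name, new_config_name):
    """Point an existing endpoint at a different endpoint configuration.

    `client` is expected to behave like boto3.client('sagemaker').
    """
    response = client.update_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=new_config_name,
    )
    return response['EndpointArn']

# Example usage in the notebook (requires AWS credentials):
# switch_endpoint_config(
#     sagemaker_client, 'house-price-prediction-endpoint', 'house-price-endpoint-config-v2'
# )
```

While the change is applied, the endpoint reports a status of Updating before returning to InService.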


## Step 6: Create the Endpoint
The endpoint is the actual service that serves predictions using your deployed model.
It can take up to 10 minutes to be created and reported as InService.

In [None]:
# Define endpoint name
endpoint_name = 'house-price-prediction-endpoint'

# Create the endpoint
sagemaker_client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name
)
print(f'Creating endpoint: {endpoint_name}')
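
`create_endpoint` returns immediately, while the endpoint itself is still being created. A small polling helper can block until it is ready. This is a sketch, assuming a boto3 SageMaker client. The function name, the 30 second poll interval and the 20 minute timeout are illustrative choices, not values from this lab.

```python
import time


def wait_for_endpoint(client, endpoint_name, poll_seconds=30,
                      timeout_seconds=1200, sleep=time.sleep):
    """Poll describe_endpoint until the endpoint is InService.

    Raises RuntimeError if the endpoint reports Failed, and TimeoutError
    if it is not InService after roughly timeout_seconds of waiting.
    """
    waited = 0
    while True:
        status = client.describe_endpoint(EndpointName=endpoint_name)['EndpointStatus']
        if status == 'InService':
            return status
        if status == 'Failed':
            raise RuntimeError(f'Endpoint {endpoint_name} failed to deploy')
        if waited >= timeout_seconds:
            raise TimeoutError(f'Endpoint {endpoint_name} still {status} after {waited}s')
        sleep(poll_seconds)
        waited += poll_seconds

# Example usage in the notebook (requires AWS credentials):
# wait_for_endpoint(sagemaker_client, endpoint_name)
```

boto3 also ships a built-in waiter for this, `sagemaker_client.get_waiter('endpoint_in_service')`, which performs similar polling if you prefer not to write the loop yourself.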

## Step 7: Test the Endpoint
Once the endpoint is reported as InService, we can send input data for inference and get predictions.
Note that we have some ready-made data which will generate a price prediction. The CSV schema is given in a comment in the code cell, so adjust the values and get different results from our model!


In [None]:
import json
import numpy as np

# Prepare sample input data
# lattitude, longtitude, bathrooms, bedrooms, floorsq, livingrooms, valuechange, percentagechange, historyprice, historypercent, historynumeric, postcode_encoded, country,fue,free,lease,shared, 19 types, 7 energy, 3 confidence 

input_data = '51.6215527,-0.2466031,2.0,6.0,517.0,3.0,289000.0,27.39336492890996,1055000.0,37.01298701298701,285000.0,1354000.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0'

# Create runtime client
runtime_client = boto3.client('sagemaker-runtime')

# Make the prediction
response = runtime_client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType='text/csv',
    Body=input_data
)

# Decode the response
prediction = response['Body'].read().decode('utf-8')
print(f'Predicted house price: {prediction}')
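
The cell above prints the raw response body as a string. To work with the prediction as a number, the body has to be parsed. The sketch below assumes the container answers either with JSON of the form `{"predictions": [{"score": ...}]}` or with plain comma or newline separated numbers. Check the actual output of your endpoint before relying on it. `extract_scores` is a hypothetical helper name.

```python
import json


def extract_scores(body_text):
    """Pull numeric predictions out of an endpoint response body.

    Assumes either the JSON shape {"predictions": [{"score": ...}, ...]}
    or plain comma/newline-separated numbers.
    """
    text = body_text.strip()
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        # Not valid JSON: treat the body as CSV-style text.
        return [float(v) for v in text.replace('\n', ',').split(',') if v.strip()]
    if isinstance(payload, dict):
        return [float(p['score']) for p in payload['predictions']]
    # A bare JSON number or a JSON list of numbers.
    values = payload if isinstance(payload, list) else [payload]
    return [float(v) for v in values]

# Example usage in the notebook:
# scores = extract_scores(prediction)
# print(f'Predicted house price: {scores[0]:,.2f}')
```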

# Clean Up
Let's now clean up our resources.
We need to delete our endpoint so that we stop incurring charges. We will also delete the endpoint config.

In [None]:
# Step 1: Delete the endpoint
print(f"Deleting endpoint: {endpoint_name}")
sagemaker_client.delete_endpoint(EndpointName=endpoint_name)

# Wait for the endpoint to be deleted
while True:
    try:
        sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
        print("Endpoint is still being deleted. Waiting...")
        time.sleep(30)
    except sagemaker_client.exceptions.ClientError as e:
        if e.response['Error']['Code'] == 'ValidationException':
            print("Endpoint deleted successfully.")
            break
        # Any other error is unexpected, so surface it instead of looping forever
        raise

# Step 2: Delete the endpoint configuration
try:
    print(f"Deleting endpoint configuration: {endpoint_config_name}")
    sagemaker_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
    print("Endpoint configuration deleted successfully.")
except sagemaker_client.exceptions.ClientError as e:
    if e.response['Error']['Code'] == 'ValidationException':
        print("Endpoint configuration not found or already deleted.")
    else:
        # Do not hide unexpected failures
        raise

print("Cleanup complete.")
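
The manual polling loop in the cleanup cell can also be expressed with boto3's built-in `endpoint_deleted` waiter. This is a sketch, assuming a boto3 SageMaker client. The helper name and the default delay and attempt counts are illustrative.

```python
def delete_endpoint_and_wait(client, endpoint_name, delay_seconds=30, max_attempts=60):
    """Delete an endpoint and block until SageMaker reports it gone.

    `client` is expected to behave like boto3.client('sagemaker').
    """
    client.delete_endpoint(EndpointName=endpoint_name)
    # The waiter polls describe_endpoint until the endpoint no longer exists.
    waiter = client.get_waiter('endpoint_deleted')
    waiter.wait(
        EndpointName=endpoint_name,
        WaiterConfig={'Delay': delay_seconds, 'MaxAttempts': max_attempts},
    )

# Example usage in the notebook (requires AWS credentials):
# delete_endpoint_and_wait(sagemaker_client, endpoint_name)
```

The waiter raises an exception if the endpoint has not disappeared within the allowed attempts, so a stuck deletion does not go unnoticed.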