# SageMaker Inference Recommender for Instant Recommendations

This notebook demonstrates how to get instant deployment recommendations from the DescribeModel API.

The DescribeModel API can be found here: https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeModel.html

**This sample is provided for demonstration purposes only. Conduct appropriate testing before adapting this code for your own use cases.**

## Set Up

In [1]:
!pip install boto3 --upgrade

Looking in indexes: https://pypi.org/simple, https://pip.repos.neuron.amazonaws.com
Collecting boto3
  Downloading boto3-1.26.146-py3-none-any.whl (135 kB)
Collecting botocore<1.30.0,>=1.29.146
  Downloading botocore-1.29.146-py3-none-any.whl (10.8 MB)
Installing collected packages: botocore, boto3
  Attempting uninstall: botocore
    Found existing installation: botocore 1.29.114
    Uninstalling botocore-1.29.114:
      Successfully uninstalled botocore-1.29.114
  Attempting uninstall: boto3
    Found existing installation: boto3 1.26.114
    Uninstalling boto3-1.26.114:
      Successfully uninstalled boto3-1.26.114
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. Thi

In [2]:
import time
import boto3
import sagemaker
import pandas as pd
from uuid import uuid4
from datetime import datetime, timedelta

In [3]:
sm_client = boto3.client(service_name="sagemaker")

## Step 1: Create the Model

For this sample notebook, we use a minimal model that specifies only a TensorFlow Deep Learning Container (DLC) image.

In [4]:
model_name = "instant-rec-demo-nb-model-" + str(uuid4())
print(model_name)

instant-rec-demo-nb-model-75b12f9c-e894-4971-8c35-cd5d80cfbb7c


In [5]:
region = boto3.Session().region_name
image = sagemaker.image_uris.retrieve(
    framework="tensorflow",
    region=region,
    version="1.12.0",
    image_scope="inference",
    instance_type="ml.c5.4xlarge",
)
print(image)

520713654638.dkr.ecr.us-west-2.amazonaws.com/sagemaker-tensorflow-serving:1.12.0-cpu


In [6]:
execution_role = sagemaker.get_execution_role()

create_model_response = sm_client.create_model(
    ModelName=model_name, ExecutionRoleArn=execution_role, Containers=[{"Image": image}]
)

print(create_model_response)

{'ModelArn': 'arn:aws:sagemaker:us-west-2:337323248444:model/instant-rec-demo-nb-model-75b12f9c-e894-4971-8c35-cd5d80cfbb7c', 'ResponseMetadata': {'RequestId': 'b82d0e6f-d815-49da-b416-83685bfbd5f9', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': 'b82d0e6f-d815-49da-b416-83685bfbd5f9', 'content-type': 'application/x-amz-json-1.1', 'content-length': '124', 'date': 'Mon, 05 Jun 2023 16:27:50 GMT'}, 'RetryAttempts': 0}}


## Step 2: Describe the Model

Call `describe_model` repeatedly until the `RecommendationStatus` of the `DeploymentRecommendation` field is `COMPLETED`. On average, you should expect the recommendation to be available within a second.

In [7]:
completed = timed_out = False

# Stop polling after one minute, even if no recommendation is available
start = datetime.today()
max_duration = start + timedelta(minutes=1)

while not completed and not timed_out:
    describe_model_response = sm_client.describe_model(ModelName=model_name)
    # The field is absent until SageMaker has started generating a recommendation
    deployment_recommendation = describe_model_response.get("DeploymentRecommendation")

    completed = (
        deployment_recommendation is not None
        and "COMPLETED" == deployment_recommendation.get("RecommendationStatus")
    )

    # Pause briefly between calls to avoid hammering the API
    time.sleep(0.5)
    timed_out = datetime.today() > max_duration

    print(datetime.today(), "DeploymentRecommendation:", deployment_recommendation)

2023-06-05 16:27:51.726610 DeploymentRecommendation: None
2023-06-05 16:27:52.332732 DeploymentRecommendation: {'RecommendationStatus': 'COMPLETED', 'RealTimeInferenceRecommendations': [{'RecommendationId': 'instant-rec-demo-nb-model-75b12f9c-e894-4971-8c35-cd5d80cfbb7c/mCDMqOzw', 'InstanceType': 'ml.g4dn.xlarge', 'Environment': {}}, {'RecommendationId': 'instant-rec-demo-nb-model-75b12f9c-e894-4971-8c35-cd5d80cfbb7c/DHSDuziQ', 'InstanceType': 'ml.g4dn.2xlarge', 'Environment': {}}, {'RecommendationId': 'instant-rec-demo-nb-model-75b12f9c-e894-4971-8c35-cd5d80cfbb7c/AAtnB24V', 'InstanceType': 'ml.c6i.2xlarge', 'Environment': {}}]}
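
The polling loop above only exits early on `COMPLETED`. As a minimal sketch, the same idea can be wrapped in a reusable helper that also stops on other terminal statuses. The function name `wait_for_recommendation`, its parameters, and the `describe` callable are hypothetical and not part of the original notebook; the `FAILED` and `NOT_APPLICABLE` status values are assumed from the DescribeModel API reference and should be checked against it.

```python
import time

# Statuses after which polling stops. "COMPLETED" appears in this notebook;
# the other two are assumptions taken from the DescribeModel API reference.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "NOT_APPLICABLE"}


def wait_for_recommendation(describe, timeout_s=60.0, interval_s=0.5):
    """Poll `describe()` until the recommendation reaches a terminal status.

    `describe` is any zero-argument callable returning a DescribeModel-style
    dict, for example: lambda: sm_client.describe_model(ModelName=model_name)
    Returns the DeploymentRecommendation dict, or None if the timeout expires.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        recommendation = describe().get("DeploymentRecommendation")
        if (
            recommendation is not None
            and recommendation.get("RecommendationStatus") in TERMINAL_STATUSES
        ):
            return recommendation
        if time.monotonic() >= deadline:
            return None
        # Sleep only when another attempt will follow
        time.sleep(interval_s)
```

Passing the API call in as a callable keeps the helper independent of the client object. `time.monotonic()` is used for the deadline because it is unaffected by system clock adjustments.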


## Step 3: Visualize the DeploymentRecommendations

In [8]:
pd.set_option("display.max_colwidth", None)
df = pd.DataFrame(
    deployment_recommendation.get("RealTimeInferenceRecommendations"),
    columns=["RecommendationId", "InstanceType", "Environment"],
)
display(df)

| | RecommendationId | InstanceType | Environment |
|---|---|---|---|
| 0 | instant-rec-demo-nb-model-75b12f9c-e894-4971-8c35-cd5d80cfbb7c/mCDMqOzw | ml.g4dn.xlarge | {} |
| 1 | instant-rec-demo-nb-model-75b12f9c-e894-4971-8c35-cd5d80cfbb7c/DHSDuziQ | ml.g4dn.2xlarge | {} |
| 2 | instant-rec-demo-nb-model-75b12f9c-e894-4971-8c35-cd5d80cfbb7c/AAtnB24V | ml.c6i.2xlarge | {} |
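
If only the instance types are needed, for example to pass them to later tooling, they can be pulled out of the response without pandas. The sketch below assumes the response shape shown in the Step 2 output. The helper name `recommended_instance_types` is hypothetical, and the sample dict reuses the values printed by this notebook.

```python
def recommended_instance_types(describe_model_response):
    """Return the recommended instance types, in the order the API listed them.

    Returns an empty list when no recommendation is present yet.
    """
    recommendation = describe_model_response.get("DeploymentRecommendation") or {}
    items = recommendation.get("RealTimeInferenceRecommendations") or []
    return [item["InstanceType"] for item in items]


# Sample response trimmed to the fields used here, copied from the Step 2 output
sample_response = {
    "DeploymentRecommendation": {
        "RecommendationStatus": "COMPLETED",
        "RealTimeInferenceRecommendations": [
            {"InstanceType": "ml.g4dn.xlarge", "Environment": {}},
            {"InstanceType": "ml.g4dn.2xlarge", "Environment": {}},
            {"InstanceType": "ml.c6i.2xlarge", "Environment": {}},
        ],
    }
}

instance_types = recommended_instance_types(sample_response)
print(instance_types)
```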


## Step 4: Cleanup

In [9]:
sm_client.delete_model(ModelName=model_name)

{'ResponseMetadata': {'RequestId': '7516bf77-576d-4008-9650-007629e1dc1d',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '7516bf77-576d-4008-9650-007629e1dc1d',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '0',
   'date': 'Mon, 05 Jun 2023 16:27:51 GMT'},
  'RetryAttempts': 0}}