# SageMaker Inference Recommender for Instant Recommendations

This notebook demonstrates how to get instant deployment recommendations from the DescribeModel API.

The DescribeModel API can be found here: https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeModel.html

**This sample is provided for demonstration purposes. Make sure to conduct appropriate testing if you adapt this code for your own use cases.**

## Set Up

In [1]:
!pip install "boto3>=1.26.140"

In [2]:
import time
import boto3
import sagemaker
import pandas as pd
from uuid import uuid4
from datetime import datetime, timedelta

In [3]:
sm_client = boto3.client(service_name="sagemaker")

## Step 1: Create the Model

For this sample notebook, we will use a simple model that specifies only a TensorFlow Deep Learning Container (DLC) image.

In [6]:
model_name = "instant-rec-demo-nb-model-" + str(uuid4())
print(model_name)

instant-rec-demo-nb-model-7c35c093-5224-4e03-b0f7-b1c1adffcf27


In [7]:
region = boto3.Session().region_name
image = sagemaker.image_uris.retrieve(framework='tensorflow',region=region,version='1.12.0',image_scope='inference',instance_type='ml.c5.4xlarge')
print(image)

520713654638.dkr.ecr.us-west-2.amazonaws.com/sagemaker-tensorflow-serving:1.12.0-cpu


In [8]:
execution_role = sagemaker.get_execution_role()

create_model_response = sm_client.create_model(
    ModelName = model_name,
    ExecutionRoleArn = execution_role,
    Containers = [
        {
            'Image': image
        }
    ]
)

print(create_model_response)

{'ModelArn': 'arn:aws:sagemaker:us-west-2:337323248444:model/instant-rec-demo-nb-model-7c35c093-5224-4e03-b0f7-b1c1adffcf27', 'ResponseMetadata': {'RequestId': '9804610c-bc04-4844-abcb-8f0658379ccd', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '9804610c-bc04-4844-abcb-8f0658379ccd', 'content-type': 'application/x-amz-json-1.1', 'content-length': '124', 'date': 'Wed, 24 May 2023 22:05:00 GMT'}, 'RetryAttempts': 0}}


## Step 2: Describe the Model

Call DescribeModel until the `RecommendationStatus` inside `DeploymentRecommendation` is `COMPLETED`. In general, you should expect the recommendations to be available within a second.

In [9]:
completed = timed_out = False

# Stop polling one minute after the first call.
start = datetime.today()
max_duration = start + timedelta(minutes=1)

while not completed and not timed_out:
    describe_model_response = sm_client.describe_model(ModelName=model_name)
    # DeploymentRecommendation may be missing from early responses, hence .get().
    deployment_recommendation = describe_model_response.get('DeploymentRecommendation')

    completed = (
        deployment_recommendation is not None
        and deployment_recommendation.get('RecommendationStatus') == 'COMPLETED'
    )
    print(datetime.today(), 'DeploymentRecommendation:', deployment_recommendation)

    # Only wait and check the deadline when another poll is needed.
    if not completed:
        time.sleep(0.5)
        timed_out = datetime.today() > max_duration


2023-05-24 22:05:03.097585 DeploymentRecommendation: {'RecommendationStatus': 'COMPLETED', 'RealTimeInferenceRecommendations': [{'RecommendationId': 'instant-rec-demo-nb-model-7c35c093-5224-4e03-b0f7-b1c1adffcf27/68L7dDRg', 'InstanceType': 'ml.g4dn.2xlarge', 'Environment': {}}, {'RecommendationId': 'instant-rec-demo-nb-model-7c35c093-5224-4e03-b0f7-b1c1adffcf27/tRuy7iM7', 'InstanceType': 'ml.c6i.large', 'Environment': {}}, {'RecommendationId': 'instant-rec-demo-nb-model-7c35c093-5224-4e03-b0f7-b1c1adffcf27/fyHIVeGx', 'InstanceType': 'ml.c5.2xlarge', 'Environment': {}}]}
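The polling loop above can also be factored into a small reusable helper. The following is a minimal sketch, not part of the original notebook: the name `wait_for_recommendation` and its parameters are hypothetical, and the DescribeModel call is passed in as a plain callable so the logic can be exercised without AWS credentials. The stand-in responses at the bottom are illustrative only.

```python
import time


def wait_for_recommendation(describe, timeout_s=60.0, poll_interval_s=0.5, sleep=time.sleep):
    """Poll `describe()` until the recommendation is COMPLETED or the timeout elapses.

    `describe` is any zero-argument callable returning a DescribeModel-style dict,
    for example: lambda: sm_client.describe_model(ModelName=model_name).
    Returns the DeploymentRecommendation dict, or None if the timeout is reached.
    """
    # A monotonic clock is not affected by system clock adjustments.
    deadline = time.monotonic() + timeout_s
    while True:
        recommendation = describe().get("DeploymentRecommendation")
        if recommendation and recommendation.get("RecommendationStatus") == "COMPLETED":
            return recommendation
        if time.monotonic() >= deadline:
            return None
        sleep(poll_interval_s)


# Offline demonstration with a stand-in for describe_model (no AWS call is made).
_responses = iter([
    {},  # field not present yet
    {"DeploymentRecommendation": {"RecommendationStatus": "IN_PROGRESS"}},
    {"DeploymentRecommendation": {"RecommendationStatus": "COMPLETED",
                                  "RealTimeInferenceRecommendations": []}},
])
result = wait_for_recommendation(lambda: next(_responses), poll_interval_s=0.0)
print(result["RecommendationStatus"])
```

In the notebook, the same helper would be called with `lambda: sm_client.describe_model(ModelName=model_name)` as the `describe` argument. Returning `None` on timeout leaves it to the caller to decide whether a missing recommendation is an error.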


## Step 3: Visualize the Deployment Recommendations

In [10]:
pd.set_option('display.max_colwidth', None)
df = pd.DataFrame(deployment_recommendation.get('RealTimeInferenceRecommendations'), columns = ['RecommendationId', 'InstanceType', 'Environment'])
display(df)

| | RecommendationId | InstanceType | Environment |
|---|---|---|---|
| 0 | instant-rec-demo-nb-model-7c35c093-5224-4e03-b0f7-b1c1adffcf27/68L7dDRg | ml.g4dn.2xlarge | {} |
| 1 | instant-rec-demo-nb-model-7c35c093-5224-4e03-b0f7-b1c1adffcf27/tRuy7iM7 | ml.c6i.large | {} |
| 2 | instant-rec-demo-nb-model-7c35c093-5224-4e03-b0f7-b1c1adffcf27/fyHIVeGx | ml.c5.2xlarge | {} |
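If only the recommended instance types are needed, for example to feed into a later deployment step, they can be pulled out of the `DeploymentRecommendation` dictionary directly. This is a small sketch under the assumption that the response has the shape printed in Step 2. The helper name `recommended_instance_types` is hypothetical, and the sample data is copied from the output above.

```python
def recommended_instance_types(deployment_recommendation):
    """Return the InstanceType of each real-time recommendation, in response order.

    Accepts None (for example, when polling timed out) and returns an empty list.
    """
    recommendations = (deployment_recommendation or {}).get("RealTimeInferenceRecommendations") or []
    return [rec["InstanceType"] for rec in recommendations]


# Sample data copied from the Step 2 output of this notebook.
sample = {
    "RecommendationStatus": "COMPLETED",
    "RealTimeInferenceRecommendations": [
        {"RecommendationId": "instant-rec-demo-nb-model-7c35c093-5224-4e03-b0f7-b1c1adffcf27/68L7dDRg",
         "InstanceType": "ml.g4dn.2xlarge", "Environment": {}},
        {"RecommendationId": "instant-rec-demo-nb-model-7c35c093-5224-4e03-b0f7-b1c1adffcf27/tRuy7iM7",
         "InstanceType": "ml.c6i.large", "Environment": {}},
        {"RecommendationId": "instant-rec-demo-nb-model-7c35c093-5224-4e03-b0f7-b1c1adffcf27/fyHIVeGx",
         "InstanceType": "ml.c5.2xlarge", "Environment": {}},
    ],
}

types = recommended_instance_types(sample)
print(types)  # ['ml.g4dn.2xlarge', 'ml.c6i.large', 'ml.c5.2xlarge']
```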


## Step 4: Cleanup

In [11]:
sm_client.delete_model(ModelName = model_name)

{'ResponseMetadata': {'RequestId': '0fab7b02-89e9-4610-ab64-5837d422b3bb',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '0fab7b02-89e9-4610-ab64-5837d422b3bb',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '0',
   'date': 'Wed, 24 May 2023 22:05:05 GMT'},
  'RetryAttempts': 0}}