
# <span style="color:DarkSeaGreen">SageMaker Lab 1</span>

This lab does the following:

- Provisions a Hugging Face model via SageMaker
- Creates a SageMaker endpoint
- Interacts with the model

# <span style="color:DarkSeaGreen">Prepare Your Environment</span>
### Requirements for this Jupyter Notebook Lab if running in VSCode or equivalent local IDE
##### Note these are macOS specific
- Credentials
  - You need credentials to your AWS account to execute this Jupyter Lab if running locally from your laptop
    - Locally: Credentials, and therefore permissions, associated with the IAM user (with CLI access enabled) are provided by the `aws configure` connection to your AWS account
    - Cloud: Permissions provided via logged in user
- Installers:
  - Pip
    - Python libraries
    - Works inside Python envs
  - homebrew (brew) (mac)
    - System software, tools, and dependencies
    - Works at OS level

- Run the commands of the cell below in a terminal window to create a virtual environment if you need one
  - Note: check your Python version first, then, if it is OK, copy the rest and run it in a terminal window
  - Note: if you copy and paste multiple lines and run them as one, you will get `zsh: command not found: #` errors because of the comments; you can ignore these
  - Remember to restart the kernel to pick up the new venv
  - The venv can be deleted via the last cell in this notebook if no longer needed
- If you already have a virtual environment, then just activate it as shown in the second cell below
  - Venv (can be created below) used by this notebook is *venv-stable-diffuser-lab1*

In [None]:
# Check your AWS identity to confirm you are using the right credentials; this can also be run in a terminal window if you don't have ipykernel (remove the !)
!aws sts get-caller-identity

In [None]:
### STOP ###
### IF USING THIS NOTEBOOK IN A SAGEMAKER JUPYTER NOTEBOOK INSTANCE, THEN SKIP TO THE NEXT CELL ###
### OTHERWISE, IF USING VSCODE OR EQUIVALENT LOCAL IDE, THEN CONTINUE BELOW ###
### This script is for setting up your environment for the SageMaker Lab 1 ###
# do you need to upgrade python first? Your available version of Python is used to create the virtual environment
python3 --version

### STOP ###
### DO YOU NEED TO UPGRADE PYTHON ###
# upgrade to the latest version of python if required
brew install python
# restart vscode to pickup new version of python
python3 --version

### STOP ###
### OK IF YOU HAVE THE CORRECT VERSION OF PYTHON, CONTINUE ###
# create a virtual environment
python3 -m venv venv-stable-diffuser-lab1
# activate the virtual environment
source venv-stable-diffuser-lab1/bin/activate
### COPY TO HERE ONLY IF RUNNING AS ONE COPY AND PASTE ###

### STOP ###
### MAKE SURE ABOVE VENV GETS ACTIVATED BEFORE RUNNING THE REST ###
# upgrade pip
pip install --upgrade pip
# jupyter kernel support
pip install ipykernel
# add the virtual environment to jupyter
python  -m ipykernel install --user --name=venv-stable-diffuser-lab1 --display-name "Python (venv-stable-diffuser-lab1)"
# install the required packages - may need to specify the path here if not in the correct folder in terminal window
pip install -r requirements_lab1.txt
# pip install -r Documents/github/labs-sagemaker/jumpstart/etc/requirements_lab1.txt
# verify the installation
pip list

### RESTART VSCODE TO PICKUP THE NEW VENV ###

In [None]:
### STOP ###
### This command activates an environment that already exists; it's for use in a terminal window if you need it ###
source venv-stable-diffuser-lab1/bin/activate
pip list

# use pip freeze if you prefer a requirements.txt-friendly format
### ALSO MAKE SURE YOU SELECT IT AS YOUR KERNEL FOR THIS JUPYTER NOTEBOOK ###

# Lab 1 Starts Here!

# <span style="color:DarkSeaGreen">Setup</span>

In [1]:
import random

# region
# for the purpose of this lab, us-east-1, us-west-2 and eu-west-1 have the broadest coverage of models and instance types
# if you provision in other regions, you may not have access to all the models or instance types, and may need to request quota increases for endpoint usage for some instance types
myRegion='us-east-1'

# iam
myRoleSageMakerExecution="venv-stable-diffuser-lab1-execution-role"
myRoleSageMakerExecutionARN='RETRIEVED FROM ROLE BELOW'

# parameter store
myParameterStoreChosenModel='venv-stable-diffuser-lab1-chosen-model'
myParameterStoreEndpointName='venv-stable-diffuser-lab1-endpoint-name'
myParameterStoreIAMARN='venv-stable-diffuser-lab1-iam-arn'

# bucket - MUST BE A UNIQUE NAME
myBucket='doit-sagemaker-model-bucket-' + str(random.randint(0, 1000)) + '-' + str(random.randint(0, 1000))

# AWS Elastic Container Registry (ECR) account that hosts official AWS SageMaker PyTorch containers
# NOTE
# SageMaker spins up a container from this image on your specified instance (NOT a JumpStart image container - can't use a diffuser pipeline with JumpStart)
# The container contains PyTorch + CUDA + Python runtime
# Then it loads your model artifact (model.tar.gz) into that container
# All inference happens inside that container on the GPU/CPU of your instance
# When you deploy a model via Model.deploy(), SageMaker pulls this container and runs your model inside it
# https://docs.aws.amazon.com/sagemaker/latest/dg/neo-deployment-hosting-services-container-images.html
# 785573368785 - AWS account ID that hosts this container (Amazon's official account)
# dkr.ecr - Docker Registry (ECR service)
# myRegion (typically us-east-1; if not us-east-1, the image has to be changed) - AWS region where the container is stored
# amazonaws.com - AWS domain
# sagemaker-inference-pytorch - Container name for PyTorch inference
# 2.0-gpu-py3 - Tag specifying version: PyTorch 2.0, GPU support, Python 3
aws_ecr_sagemaker_pytorch_container=f"785573368785.dkr.ecr.{myRegion}.amazonaws.com/sagemaker-inference-pytorch:2.0-gpu-py3"

# async endpoint
myEndpointConfig='venv-stable-diffuser-lab1-endpoint-config'
myEndpointAsync='venv-stable-diffuser-lab1-endpoint-async'
myEndpoint='venv-stable-diffuser-lab1-endpoint'

print ('Done! Move to the next cell ->')

Done! Move to the next cell ->
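
The comments in the setup cell break the container image URI into its parts. As a minimal sketch, the same composition can be written as a small helper. `build_image_uri` is a hypothetical name used only for illustration (it is not part of the SageMaker SDK), and the values are the ones from the cell above.

```python
def build_image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Assemble an ECR image URI from its parts (hypothetical helper, for illustration only)."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Values copied from the setup cell
uri = build_image_uri(
    account_id="785573368785",
    region="us-east-1",
    repository="sagemaker-inference-pytorch",
    tag="2.0-gpu-py3",
)
print(uri)
```

Keeping the parts separate makes it easier to see what needs attention when `myRegion` is not us-east-1, since the setup comments note that the image has to be changed in that case.
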


In [2]:
# local client path for resources
myLocalPathForResources='/Users/simondavies/Documents/GitHub/labs-sagemaker/jumpstart/image_upscale/'
# Jupyter notebook path if the notebook is used in AWS, for example
#myLocalPathForResources='/home/ec2-user/SageMaker/labs-sagemaker/image_upscale/'

print ('Done! Move to the next cell ->')

Done! Move to the next cell ->


In [3]:
import json
import boto3
import base64
import io
import urllib, time
from datetime import datetime
from certifi import where
from PIL import Image
from botocore.exceptions import ClientError
from sagemaker.session import get_execution_role
from sagemaker.session import Session
from sagemaker.model import Model
from sagemaker.predictor_async import AsyncPredictor
botoSession = boto3.Session(region_name=myRegion)

# Configure boto3 to use certifi's certificates - helps avoid SSL errors if your system’s certificate store is out of date or missing root certs
sts_client = boto3.client('sts', verify=where())
myAccountNumber = sts_client.get_caller_identity()["Account"]
print(myAccountNumber)
print(sts_client.get_caller_identity()["Arn"])

# create clients we can use later
# iam
iam = boto3.client('iam', region_name=myRegion, verify=where())
# ssm
ssm = boto3.client('ssm', region_name=myRegion, verify=where())
# s3
s3 = boto3.client('s3', region_name=myRegion, verify=where())
# sagemaker
sm = boto3.client("sagemaker", region_name=myRegion, verify=where())
# sagemaker runtime
smr = boto3.client("sagemaker-runtime", region_name=myRegion, verify=where())

print ('Done! Move to the next cell ->')

sagemaker.config INFO - Not applying SDK defaults from location: /Library/Application Support/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /Users/simondavies/Library/Application Support/sagemaker/config.yaml
546709318047
arn:aws:iam::546709318047:user/simon-davies-cli
Done! Move to the next cell ->


In [4]:
# define tags added to all services we create
myTags = [
    {"Key": "env", "Value": "non_prod"},
    {"Key": "owner", "Value": "doit-jumpstart"},
    {"Key": "project", "Value": "lab1"},
    {"Key": "author", "Value": "simon"},
]
myTagsDct = {
    "env": "non_prod",
    "owner": "doit-jumpstart",
    "project": "lab1",
    "author": "simon",
}

print ('Done! Move to the next cell ->')

Done! Move to the next cell ->
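
`myTags` and `myTagsDct` above hold the same tags in two shapes: a list of `Key`/`Value` pairs and a plain dictionary. If you would rather maintain only one of them, the other can be derived. This is a sketch under that assumption; `tags_to_dict` is a hypothetical helper name.

```python
def tags_to_dict(tags: list) -> dict:
    """Convert a list of {"Key": ..., "Value": ...} tags into a plain dict."""
    return {tag["Key"]: tag["Value"] for tag in tags}


# Same tags as the cell above
my_tags = [
    {"Key": "env", "Value": "non_prod"},
    {"Key": "owner", "Value": "doit-jumpstart"},
    {"Key": "project", "Value": "lab1"},
    {"Key": "author", "Value": "simon"},
]
my_tags_dct = tags_to_dict(my_tags)
print(my_tags_dct)
```
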


# <span style="color:DarkSeaGreen">IAM</span>

In [5]:
def getSageMakerExecutionRole():
    """
    Creates a role required for SageMaker to run jobs on your behalf
    Only needed if this is being run in a local IDE, not needed if in SageMaker Studio or SageMaker Notebook Instance

    Args:
        None

    Returns:
        An IAM execution role ARN
    """

    # trust policy for the role
    roleTrust = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Service": "sagemaker.amazonaws.com"
                },
                "Action": "sts:AssumeRole"
            }
        ]
    }

    # check if the role exists
    try:
        role = iam.get_role(RoleName=myRoleSageMakerExecution)
        print("Role already exists. Using the existing role.")
        return role['Role']['Arn']
    except iam.exceptions.NoSuchEntityException:
        print("Role does not exist. Creating a new role.")
        
    # create an execution role for SageMaker - allows SageMaker notebook instances, training jobs, and models to access S3, ECR, and CloudWatch on your behalf
    # this role is only created if we are running this notebook in a local IDE; in JupyterLab in SageMaker Studio we don't need it, as one is already created and available
    role = iam.create_role(
        RoleName=myRoleSageMakerExecution,
        AssumeRolePolicyDocument=json.dumps(roleTrust),
        Description="Service execution role for SageMaker AI use, including inside Jupyter notebooks",
        Tags=[
            *myTags,
        ],
    )

    # attach managed policy to the role AmazonSageMakerFullAccess
    iam.attach_role_policy(
        RoleName=myRoleSageMakerExecution,
        PolicyArn="arn:aws:iam::aws:policy/AmazonSageMakerFullAccess"
    )

    # store the role arn in parameter store for use in other notebooks
    ssm.put_parameter(
        Name=myParameterStoreIAMARN,
        Description='The ARN of the IAM role used by SageMaker for execution of jobs',
        Value=role['Role']['Arn'],
        Type='String',
        Tags=[
            *myTags,
        ],
    )   

    return role['Role']['Arn']

# <span style="color:DarkSeaGreen">Get Execution Role and Session</span>
- SageMaker requires an execution role to assume on your behalf

In [6]:
try:
    # if this is being run in a SageMaker AI JupyterLab Notebook
    myRoleSageMakerExecutionARN = get_execution_role()
except Exception:
    # if this is being run in a local IDE - we need to create our own role
    myRoleSageMakerExecutionARN = getSageMakerExecutionRole()

# make sure we get a session in the correct region (needed because it can otherwise use the aws configure region if running this locally)
sageMakerSession = Session(boto_session=botoSession)

print(myRoleSageMakerExecutionARN)
print(sageMakerSession)

print ('Done! Move to the next cell ->')

Couldn't call 'get_role' to get Role ARN from role name simon-davies-cli to get Role path.


Role does not exist. Creating a new role.
arn:aws:iam::546709318047:role/venv-stable-diffuser-lab1-execution-role
<sagemaker.session.Session object at 0x11b975090>
Done! Move to the next cell ->


# <span style="color:DarkSeaGreen">Provision a SageMaker Model</span>
- Provision a model via a custom inference container
  - The container's behaviour is defined in the inference.py file
  - It allows us to download a Hugging Face model directly and, when invoked, customise the use of the GPU via a diffuser pipeline
  - We create a custom one because JumpStart models have their own containers and do not allow customisation
### Example models to provision
- Stable Diffusion x4 upscaler FP16
  - https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler/blame/fp16/README.md
  - *model_id, model_version = "model-upscaling-stabilityai-stable-diffusion-x4-upscaler-fp16", "*"*
  - upscaling with Stable Diffusion (x4) is computationally expensive
    - FP16 means it uses half-precision floating point, so you want a GPU with good Tensor Core support
  - the x4 upscaler model itself is large
    - want ≥ 16 GB VRAM to run comfortably in FP16 for 512×512 → 2048×2048 upscales
    - p4d.24xlarge (enterprise-grade, overkill unless you’re batching lots of requests)
      - **needs an aws quota increase for this instance for endpoint usage**
    - ml.g5.4xlarge
      - good for a poc - widely supported, good memory, reasonably costed
      - anything smaller and you will likely get CUDA out of memory errors
        - you need plenty of GPU memory
      - **needs an aws quota increase for this instance for endpoint usage**
- see https://aws.amazon.com/sagemaker/ai/pricing/ for pricing, **larger instances can be very expensive per hour**
- If you deploy the model and get a quota error, you will need to visit Service Quotas via the console and request an increase
  - go to SageMaker service and search for the instance
  - select the *model* for endpoint usage
  - make sure your quota allows for auto scaling max
- DO NOT LEAVE LARGE INSTANCES RUNNING LONGER THAN YOU NEED TO $$$!


### Instance Size is Important
- We are using a model that upscales
- The larger the original image, the more GPU memory is taken when upscaling
- SageMaker typically uses one GPU to do this
  - SageMaker model endpoints don't automatically spread inference across multiple GPUs unless the container is written for it
- The Hugging Face diffusers library can reduce peak GPU memory use
  - Breaks the image into smaller patches, processes them sequentially, then stitches the results
  - Uses much less VRAM at the cost of a bit more time
  - We use that below
- p4d.24xlarge has more GPU memory, but may be overkill and expensive, and it still won't cope if source images are too large to upscale on one GPU
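
The bullets above describe splitting an image into patches to limit GPU memory. The sketch below shows only the arithmetic of that idea: how many overlapping tiles cover an image, and how much larger a x4 upscale is in pixels. It is not the diffusers library's implementation; `tile_boxes` is a hypothetical helper, and the tile size and overlap are assumed values chosen for illustration.

```python
def tile_boxes(width: int, height: int, tile: int = 128, overlap: int = 16) -> list:
    """Return (left, top, right, bottom) boxes that cover the image with overlapping tiles."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    step = tile - overlap
    boxes = []
    top = 0
    while True:
        bottom = min(top + tile, height)
        left = 0
        while True:
            right = min(left + tile, width)
            boxes.append((left, top, right, bottom))
            if right >= width:
                break
            left += step
        if bottom >= height:
            break
        top += step
    return boxes


boxes = tile_boxes(512, 512, tile=128, overlap=16)
print(f"{len(boxes)} tiles, first {boxes[0]}, last {boxes[-1]}")

# A x4 upscale multiplies each side by 4, so the pixel count grows 16-fold
in_pixels = 512 * 512
out_pixels = (512 * 4) * (512 * 4)
print(f"{in_pixels} -> {out_pixels} pixels ({out_pixels // in_pixels}x)")
```

Each tile is small enough to process on its own, which is why peak memory drops while total time goes up: the tiles are handled one after another and the overlapping borders are processed more than once.
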

In [7]:
# the instance we want to provision - THIS DISPLAYS AN INPUT BOX FOR YOU TO CHOOSE AN INSTANCE FOR THE MODEL INFERENCE PROVIDED
# https://aws.amazon.com/sagemaker/ai/pricing/
options = [
    "img2img|stabilityai/stable-diffusion-x4-upscaler|ml.p4d.24xlarge $$$$",
    "img2img|stabilityai/stable-diffusion-x4-upscaler|ml.g5.12xlarge $$$",
    "img2img|stabilityai/stable-diffusion-x4-upscaler|ml.p3.2xlarge $$",
]

print("Select an option:")
for i, opt in enumerate(options, 1):
    print(f"{i}. {opt}")

choice = int(input("Enter the number of the spec you want: "))
selected = options[choice - 1]

modelType = selected.split("|")[0]
modelID = selected.split("|")[1]
instanceType = selected.split("|")[2].split(" ")[0]

# store the model in a parameter store for use in other labs
ssm.put_parameter(
    Name=myParameterStoreChosenModel,
    Description='the model chosen in lab1',
    Value=selected,
    Type='String',
    Overwrite=True,
)

print(f"You selected: model type {modelType} {modelID} on {instanceType}")
print("Done! Move to the next cell ->")

Select an option:
1. img2img|stabilityai/stable-diffusion-x4-upscaler|ml.p4d.24xlarge $$$$
2. img2img|stabilityai/stable-diffusion-x4-upscaler|ml.g5.12xlarge $$$
3. img2img|stabilityai/stable-diffusion-x4-upscaler|ml.p3.2xlarge $$
You selected: model type img2img stabilityai/stable-diffusion-x4-upscaler on ml.g5.12xlarge
Done! Move to the next cell ->
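
The selection cell packs three values into one `|`-separated string and later splits it apart again. A minimal sketch of that parsing with some validation added, assuming the same `type|model id|instance $$$` layout used in `options`; `ModelChoice` and `parse_option` are hypothetical names.

```python
from typing import NamedTuple


class ModelChoice(NamedTuple):
    model_type: str
    model_id: str
    instance_type: str


def parse_option(option: str) -> ModelChoice:
    """Split 'type|model id|instance $$$' into its three fields."""
    parts = option.split("|")
    if len(parts) != 3:
        raise ValueError(f"expected 3 '|'-separated fields, got {len(parts)}: {option!r}")
    model_type, model_id, instance_field = parts
    # The instance field carries a trailing cost hint such as '$$$'; keep only the instance name
    instance_type = instance_field.split(" ")[0]
    return ModelChoice(model_type, model_id, instance_type)


choice = parse_option("img2img|stabilityai/stable-diffusion-x4-upscaler|ml.g5.12xlarge $$$")
print(choice)
```

Because the same string is stored in Parameter Store for the other labs, a parser like this could be reused there instead of repeating the `split` calls.
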


# <span style="color:DarkSeaGreen">Upload inference container to S3</span>
- Create an S3 bucket
- Upload the model archive, model.tar.gz (this is already provided in this lab, see inference.py for the code)

In [8]:
# create bucket
if myRegion=='us-east-1':
    s3.create_bucket(
        Bucket=myBucket
    )
else:
    s3.create_bucket(
        Bucket=myBucket, CreateBucketConfiguration={"LocationConstraint": myRegion}
    )

s3.put_bucket_tagging(Bucket=myBucket, Tagging={"TagSet": myTags})

# create a "folder" - really keys as S3 is flat
s3.put_object(Bucket=myBucket, Key="model/")
print (f'Created the bucket {myBucket}')

# Upload each file to the S3 bucket
files = [
    {
        's3key': 'model/model.tar.gz',
        'localpath': '{}upscaler/model.tar.gz'.format(myLocalPathForResources)
    }
]

for file in files:
    print ('uploading: {}'.format(file['localpath']))
    s3.upload_file(file['localpath'], myBucket, file['s3key'], ExtraArgs={'StorageClass': 'STANDARD'})
    print ('uploaded: {}'.format(file['s3key']))

print ('Done! Move to the next cell ->')

Created the bucket doit-sagemaker-model-bucket-466-342
uploading: /Users/simondavies/Documents/GitHub/labs-sagemaker/jumpstart/image_upscale/upscaler/model.tar.gz
uploaded: model/model.tar.gz
Done! Move to the next cell ->
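
The bucket cell branches on the region because us-east-1 is the one region where `create_bucket` is called without a `LocationConstraint`. One way to avoid the duplicated call is to build the keyword arguments first. This is a sketch only; `create_bucket_kwargs` is a hypothetical helper and the bucket name is an example value.

```python
def create_bucket_kwargs(bucket: str, region: str) -> dict:
    """Build the keyword arguments for s3.create_bucket for the given region."""
    kwargs = {"Bucket": bucket}
    if region != "us-east-1":
        # Every region other than us-east-1 needs an explicit location constraint
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


print(create_bucket_kwargs("example-bucket-123", "us-east-1"))
print(create_bucket_kwargs("example-bucket-123", "eu-west-1"))
```

In the notebook this would be used as `s3.create_bucket(**create_bucket_kwargs(myBucket, myRegion))`.
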


# <span style="color:DarkSeaGreen">Create Model and Endpoint</span>
- Create a model from the container
- Create an async endpoint config
- Create an endpoint

In [9]:
# this cell registers a model for the container and the model artifact you uploaded; the following cells create the endpoint config and the endpoint
# creating the endpoint will take a while (a few minutes), as it needs to download the model from Hugging Face
# https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler
model = Model(
    image_uri=aws_ecr_sagemaker_pytorch_container,
    model_data=f"s3://{myBucket}/model/model.tar.gz",
    #entry_point="inference.py",
    role=myRoleSageMakerExecutionARN,
    sagemaker_session=sageMakerSession,
)
variantName = "AllTraffic"

# if we error out but the model has been registered, delete it so we can try again; otherwise we accumulate redundant model registrations
try:
    # register the model with SageMaker
    # the endpoint itself is created in the following cells
    model.create()
    print("Done! Move to the next cell ->")
except Exception as e:
    print(e)
    model.delete_model()

Done! Move to the next cell ->


In [10]:
# create a SageMaker endpoint configuration we can use when creating an endpoint
endpointConfig = sm.create_endpoint_config(
    EndpointConfigName=myEndpointConfig,
    ProductionVariants=[
        {
            "VariantName": variantName,
            "ModelName": model.name,
            "InstanceType": instanceType,
            "InitialInstanceCount": 1,
        }
    ],
    AsyncInferenceConfig={
        "OutputConfig": {
            "S3OutputPath": f"s3://{myBucket}/upscaled-results",
            # Optionally specify Amazon SNS topics
            # "NotificationConfig": {
            # "SuccessTopic": "arn:aws:sns:<aws-region>:<account-id>:<topic-name>",
            # "ErrorTopic": "arn:aws:sns:<aws-region>:<account-id>:<topic-name>",
            # }
        },
        "ClientConfig": {"MaxConcurrentInvocationsPerInstance": 4},
    },
)

create_endpoint_response = sm.create_endpoint(
    EndpointName=myEndpoint, 
    EndpointConfigName=myEndpointConfig
)

print("Done! Move to the next cell ->")

Done! Move to the next cell ->


In [11]:
# let's wait for the endpoint to be available
print(f"At {datetime.now()}")
waiter = sm.get_waiter("endpoint_in_service")
print("Waiting for endpoint to create...")
waiter.wait(EndpointName=myEndpoint)
resp = sm.describe_endpoint(EndpointName=myEndpoint)
print(f"Endpoint Status: {resp['EndpointStatus']}")
print(f"At {datetime.now()}")

At 2025-10-07 18:54:28.789524
Waiting for endpoint to create...
Endpoint Status: InService
At 2025-10-07 19:01:36.688233


In [12]:
# store the endpoint name in a parameter store for use in other notebooks
ssm.put_parameter(
    Name=myParameterStoreEndpointName,
    Description='the name of the sagemaker endpoint created in lab1',
    Value=myEndpoint,
    Type='String',
    Overwrite=True,
)

print ('Done! Move to the next cell ->')

Done! Move to the next cell ->


# <span style="color:DarkSeaGreen">Sample Images</span>
- Create a function to display the images
- Upload the sample images to S3; asynchronous inference reads its input from S3

In [13]:
# required if an image model is being used
def decode_and_show(description, model_response) -> None:
    from PIL import Image
    import base64
    import io
    
    print (description)
    # Handle PIL Image objects
    if hasattr(model_response, 'save'):  # Check if it's a PIL Image
        display(model_response)
        return
    
    # Handle bytes (raw image data)
    elif isinstance(model_response, bytes):
        image = Image.open(io.BytesIO(model_response))
        display(image)
        image.close()
    
    # Handle base64 string (encoded image)
    elif isinstance(model_response, str):
        image = Image.open(io.BytesIO(base64.b64decode(model_response)))
        display(image)
        image.close()
    
    # Handle list of base64 strings (model response)
    elif isinstance(model_response, list):
        for i, img_data in enumerate(model_response):
            image = Image.open(io.BytesIO(base64.b64decode(img_data)))
            print(f"Image {i + 1}:")
            display(image)
            image.close()
    
    else:
        print(f"Can't handle the image. Unexpected response type: {type(model_response)}")

In [14]:
# Upload each file to the S3 bucket as a payload for async requests
files = [
    {
        "s3key": "originals/img1_original.json",
        "localpath": "{}/resources/img1_original.jpeg".format(myLocalPathForResources),
        "prompt": "Enhance this image to high-res",
    },
    {
        "s3key": "originals/img2_original.json",
        "localpath": "{}/resources/img2_original.jpeg".format(myLocalPathForResources),
        "prompt": "Enhance this image to high-res",
    },
]

for file in files:
    # use single quotes inside the f-string so this also parses on Python versions before 3.12
    print(f"Preparing payload for: {file['s3key']} from {file['localpath']}")

    # Read and base64 encode the local image
    with open(file["localpath"], "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    
    payload = {
        "prompt": file["prompt"],
        "image": image_b64
    }
    
    # Upload JSON payload to S3
    s3.put_object(
        Bucket=myBucket,
        Key=file["s3key"],
        Body=json.dumps(payload).encode("utf-8"),
        ContentType="application/json"
    )
    print("Uploaded payload to s3://{}/{}".format(myBucket, file["s3key"]))

print("Done! Move to the next cell ->")

Preparing payload for: originals/img1_original.json from /Users/simondavies/Documents/GitHub/labs-sagemaker/jumpstart/image_upscale//resources/img1_original.jpeg
Uploaded payload to s3://doit-sagemaker-model-bucket-466-342/originals/img1_original.json
Preparing payload for: originals/img2_original.json from /Users/simondavies/Documents/GitHub/labs-sagemaker/jumpstart/image_upscale//resources/img2_original.jpeg
Uploaded payload to s3://doit-sagemaker-model-bucket-466-342/originals/img2_original.json
Done! Move to the next cell ->
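
The upload cell wraps each image in a JSON document with a `prompt` field and a base64-encoded `image` field. The round trip below sketches how such a payload is built and how the image bytes can be recovered from it. `build_payload` and `read_image_from_payload` are hypothetical helpers, and the sample bytes stand in for a real JPEG file.

```python
import base64
import json


def build_payload(image_bytes: bytes, prompt: str) -> bytes:
    """Encode an image and a prompt as a UTF-8 JSON payload."""
    payload = {
        "prompt": prompt,
        "image": base64.b64encode(image_bytes).decode("utf-8"),
    }
    return json.dumps(payload).encode("utf-8")


def read_image_from_payload(body: bytes) -> bytes:
    """Recover the original image bytes from a JSON payload."""
    return base64.b64decode(json.loads(body)["image"])


# Stand-in bytes; in the notebook these come from reading the image file in "rb" mode
sample_image = b"\xff\xd8\xff\xe0" + bytes(range(256)) * 4
body = build_payload(sample_image, "Enhance this image to high-res")
restored = read_image_from_payload(body)
print(f"{len(sample_image)} raw bytes -> {len(body)} payload bytes")
```

Base64 makes the data roughly a third larger than the raw bytes, which is worth keeping in mind when checking payloads against size limits.
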


# <span style="color:DarkSeaGreen">Invoke the Endpoint Asynchronously</span>
- Submit an asynchronous request that points at a payload already uploaded to S3
- Poll S3 until the result object appears at the returned output location

In [15]:
# invoke asynchronously, this will return immediately with a job id
smr = boto3.client('sagemaker-runtime', region_name=myRegion, verify=where())

response = smr.invoke_endpoint_async(
    EndpointName=myEndpoint,
    InputLocation=f"s3://{myBucket}/originals/img1_original.json",
    ContentType="application/json",
    Accept="application/json",
)
print(f"OutputLocation: {response['OutputLocation']}")
print(f"Submitted async job: {response['InferenceId']}")
print(f"Submitted at {datetime.now()}")

OutputLocation: s3://doit-sagemaker-model-bucket-466-342/upscaled-results/544561d9-197c-41c1-affa-c904955280de.out
Submitted async job: b5aaa436-7c6f-4381-ab1f-52dae7d649a7
Submitted at 2025-10-07 19:03:38.695191
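
The `OutputLocation` returned above is an S3 URI, and the bucket and key are needed separately to check for the result. A small sketch using the standard library rather than manual string splitting; `split_s3_uri` is a hypothetical helper, and the URI is the one from the output above.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple:
    """Split 's3://bucket/some/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


bucket_name, object_key = split_s3_uri(
    "s3://doit-sagemaker-model-bucket-466-342/upscaled-results/544561d9-197c-41c1-affa-c904955280de.out"
)
print(bucket_name)
print(object_key)
```
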


In [16]:
# monitor for a result
obucket = response["OutputLocation"].split("/")[2]
okey = "/".join(response["OutputLocation"].split("/")[3:])

while True:
    try:
        s3.head_object(Bucket=obucket, Key=okey)
        print(f"Result is ready at s3://{obucket}/{okey}")
        break
    except s3.exceptions.ClientError as e:
        # 404 = Not ready yet
        if e.response['Error']['Code'] == '404':
            print("Still processing...")
            time.sleep(10)
        else:
            raise


Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
Still processing...
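
The polling loop above keeps going until the output object exists, so it never stops if the result is never written (for example, if the inference fails). A minimal sketch of the same loop with a timeout follows. `wait_until` is a hypothetical helper and the timeout and interval values are assumptions; the stand-in `fake_check` replaces the real S3 check so the example runs without AWS access.

```python
import time


def wait_until(check, timeout_s: float = 600.0, interval_s: float = 10.0) -> bool:
    """Call check() repeatedly until it returns True or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Stand-in check that reports "ready" on the third call
attempts = {"count": 0}


def fake_check() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


ready = wait_until(fake_check, timeout_s=5, interval_s=0.01)
print(f"ready={ready} after {attempts['count']} checks")
```

In the notebook, `check` could be a function that calls `s3.head_object` on the output key and returns `False` when the error code is `404`.
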


In [None]:
from diffusers import StableDiffusionUpscalePipeline
import torch

model_id = "stabilityai/stable-diffusion-x4-upscaler"
pipe = StableDiffusionUpscalePipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
print("Model loaded and moved to CUDA")


Keyword arguments {'dtype': torch.float16} are not expected by StableDiffusionUpscalePipeline and will be ignored.
Loading pipeline components...:  67%|██████▋   | 4/6 [00:00<00:00, 38.61it/s]


In [None]:
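# test the endpoint with some example payloads
# wrap the existing endpoint in a Predictor so that AsyncPredictor can use it
from sagemaker.predictor import Predictor
predictor = Predictor(endpoint_name=myEndpoint, sagemaker_session=sageMakerSession)
async_predictor = AsyncPredictor(predictor=predictor)

# Note that sending or receiving the payload with the raw RGB values may hit default limits for the input payload and the response size
# Therefore, we recommend using the base64 encoded image by setting:
# content_type = "application/json;jpeg" and accept = "application/json;jpeg"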
content_type = "application/json;jpeg"
accept = "application/json;jpeg"

images = [
    "resources/img1_original.jpeg",
    "resources/img2_original.jpeg",
]

for img in images:
    imgOrig = Image.open(img)
    # Resize while preserving aspect ratio
    max_dim = 512
    w, h = imgOrig.size
    if w > h:
        new_w = max_dim
        new_h = int(h * max_dim / w)
    else:
        new_h = max_dim
        new_w = int(w * max_dim / h)

    imgSmall = imgOrig.resize((new_w, new_h), Image.LANCZOS)
    print (f'Resized the image from (w{w},h{h}) to (w{new_w},h{new_h})')
        
    # Convert to base64
    buffered = io.BytesIO()
    imgSmall.save(buffered, format="PNG")
    image_b64 = base64.b64encode(buffered.getvalue()).decode("utf-8")

    payload = {
        "prompt": "highly detailed, realistic photo",
        "image": image_b64
    }
    payload_bytes = json.dumps(payload).encode("utf-8")

    # NOTE if you get a 413 error, your original image sizes are probably too large and should be < 6MB
    # HTTP 413 Content Too Large (Payload Too Large)
    # NOTE if you get a 400 CUDA out of memory error, this indicates that your GPU memory is full
    # HTTP 400 InternalServerException
    # If so, use a bigger instance with more memory per GPU, or use a diffuser pipeline to split the image into batches and restitch once done
    print('Starting the inference')
    # predict() uploads the payload to S3, invokes the endpoint asynchronously and waits for the result
    response = async_predictor.predict(
        data=payload_bytes,
        initial_args={
            "ContentType": content_type,
            "Accept": accept,
        },
    )
    # the default deserializer returns raw bytes, so parse the JSON body before reading the image
    result = json.loads(response)
    decode_and_show(f"Upscaled version of {img}", result["generated_image"])

print("Done! Move to the next cell ->")
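
The resize step in the cell above scales the longer side to `max_dim` and keeps the aspect ratio. The same arithmetic is shown here as a standalone function so that the resulting sizes can be checked without loading an image. `fit_within` is a hypothetical helper name, and the x4 factor reflects the upscaler used in this lab.

```python
def fit_within(width: int, height: int, max_dim: int = 512) -> tuple:
    """Scale (width, height) so the longer side equals max_dim, preserving aspect ratio."""
    if width > height:
        return max_dim, int(height * max_dim / width)
    return int(width * max_dim / height), max_dim


for size in [(1024, 512), (600, 800), (2048, 2048)]:
    small = fit_within(*size)
    upscaled = tuple(4 * d for d in small)
    print(f"{size} -> resized {small} -> x4 upscaled {upscaled}")
```

Note that, as in the cell above, an image smaller than `max_dim` is enlarged by this step rather than left alone.
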

# <span style="color:DarkSeaGreen">Move to Lab 2</span>
# <span style="color:DarkSeaGreen">OR...</span>
# <span style="color:DarkSeaGreen">Clean Up Architecture</span>
### <span style="color:Red">Only do this if you have finished with this lab and any labs that depend on it!</span>
##### It will delete all architecture created, make sure you no longer need any of it!!!

In [17]:
# when finished with the endpoint, delete it
# if you get an error, it may still be updating after scaling in from the lab 2 or lab 3 locust tests
# try again, or delete via the console if the config cannot be found
# let's check the endpoint status first to make sure it's not still changing due to scaling in
response = ssm.get_parameter(
    Name=myParameterStoreEndpointName
)
endpointName = response['Parameter']['Value']
response = sm.describe_endpoint(EndpointName=endpointName)
print(response["EndpointStatus"])

if response["EndpointStatus"] == "InService":
    print("Endpoint is in service. Proceeding with deletion.")
    sm.delete_endpoint(EndpointName=endpointName)
    print ('Done! Move to the next cell ->')
else:
    print("Endpoint is not in service. Cannot delete. Try again in a couple of minutes.")

InService
Endpoint is in service. Proceeding with deletion.
Done! Move to the next cell ->


In [18]:
# delete the model
response = sm.delete_model(ModelName=model.name)
print ('Done! Move to the next cell ->')

Done! Move to the next cell ->


In [19]:
# delete the endpoint config
response = sm.delete_endpoint_config(EndpointConfigName=myEndpointConfig)
print ('Done! Move to the next cell ->')

Done! Move to the next cell ->


In [20]:
# delete roles and policies
iam.detach_role_policy(
    RoleName=myRoleSageMakerExecution, PolicyArn='arn:aws:iam::aws:policy/AmazonSageMakerFullAccess'
)
iam.delete_role(RoleName=myRoleSageMakerExecution)
print ('Done! Move to the next cell ->')

Done! Move to the next cell ->


In [21]:
# delete the parameter store entries
ssm.delete_parameter(Name=myParameterStoreChosenModel)
ssm.delete_parameter(Name=myParameterStoreEndpointName)
ssm.delete_parameter(Name=myParameterStoreIAMARN)
print ('Done! Move to the next cell ->')

Done! Move to the next cell ->


In [22]:
# delete s3 bucket
# NOTE WARNING - this will delete all objects in the bucket with NO prompt or confirmation
s3r = boto3.resource('s3', region_name=myRegion, verify=where())
bucket = s3r.Bucket(myBucket)
bucket.objects.all().delete()

# delete the bucket
response = s3.delete_bucket(Bucket=myBucket)
print ('Done! Move to the next cell ->')

Done! Move to the next cell ->


# <span style="color:DarkSeaGreen">Clean Up venv</span>
### Clean up if finished with this lab and running in VSCode or equivalent local IDE
#### Note these are macOS specific
- Run the commands of the cell below in a terminal window if you need to clean up a local venv
  - Note: if you copy and paste the entire cell and run it as one, you will get `zsh: command not found: #` errors because of the comments; you can ignore these
  - Remember to restart the kernel to refresh what's available

In [None]:
# if you have local host in your terminal prompt
unset HOST
# deactivate the venv
deactivate 
# remove it and its contents if not needed
rm -rf venv-stable-diffuser-lab1 