
# <span style="color:DarkSeaGreen">JumpStart Lab 1</span>

This lab does the following:

- Provisions a model via JumpStart
- Creates a JumpStart endpoint
- Interacts with the model



# <span style="color:DarkSeaGreen">requirements_lab1.txt</span>
- Most of the requirements simply pull the latest version
- However, on Nov 19 2025 AWS released the SageMaker 3.0.1 SDK, which is compatible with Python versions up to 3.12
  - At the time of writing it does not yet support Python 3.13+ (the latest Python version is 3.14)
- SageMaker 3.0.1 is pinned in the requirements file. Otherwise it only gets 3.0.0, which fails due to dependency issues because its sub files are not included. AWS fixed this in 3.0.1, which was released immediately after 3.0.0 :)
- Therefore make sure your Python venv is created using Python 3.12 only
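
A quick guard can catch a mismatched interpreter before anything is installed. This is only a sketch: `is_supported_python` is a hypothetical helper, not part of the lab, and the 3.12 requirement is taken from the notes above.

```python
import sys


def is_supported_python(version_info=sys.version_info, required=(3, 12)):
    """Return True when the interpreter's major.minor matches `required`."""
    return tuple(version_info[:2]) == tuple(required)


if not is_supported_python():
    # Informational only; the lab notes ask for a venv built with Python 3.12.
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} detected; "
        "this lab expects 3.12. Recreate the venv with python3.12."
    )
```

Running this in the first cell of the notebook shows which interpreter the selected kernel is actually using.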

# <span style="color:DarkSeaGreen">Prepare Your Environment</span>
### Requirements for this Jupyter Notebook Lab if running in VSCode or equivalent local IDE
##### Note these are macOS specific
- Credentials
  - You need credentials to your AWS account to execute this Jupyter Lab if running locally from your laptop
    - Locally: Credentials, and therefore the permissions associated with the IAM user (with CLI access enabled), are provided by the `aws configure` connection to your AWS account
    - Cloud: Permissions provided via logged in user
- Installers:
  - Pip
    - Python libraries
    - Works inside Python envs
  - homebrew (brew) (mac)
    - System software, tools, and dependencies
    - Works at OS level

- Run the commands of the cell below in a terminal window to create a virtual environment if you need one
  - Note: check your Python version first, then, if it is OK, copy the rest and run it in a terminal window
  - Note: if you copy and paste multiple lines and run them as one, you will get `zsh: command not found: #` errors because of the comments, but you can ignore them
  - Remember to restart the kernel to pick up the new venv
  - The venv can be deleted via the last cell in this notebook if no longer needed
- If you already have a virtual environment, then just activate it as shown in the second cell below
  - Venv (can be created below) used by this notebook is *venv-jumpstart-stable-lab1*

#### SageMaker can release breaking changes to this code, see link for details if any cell fails with a SageMaker issue
- This Jupyter Notebook was written with SageMaker V3.0.1 released 20 Nov 2025
- At time of creating this notebook (Nov 26), SageMaker 3.0.1 only supports Python 3.12
  - https://github.com/aws/sagemaker-python-sdk/blob/master/CHANGELOG.md
  - https://github.com/aws/sagemaker-python-sdk

In [None]:
# Check your credentials (AWS identity) to confirm you are using the right credentials
# run in a terminal window 
aws sts get-caller-identity

In [None]:
### STOP ###
### IF USING THIS NOTEBOOK IN A SAGEMAKER JUPYTER NOTEBOOK INSTANCE, THEN SKIP TO THE NEXT CELL ###
### OTHERWISE, IF USING VSCODE OR EQUIVALENT LOCAL IDE, THEN CONTINUE BELOW ###
### This script is for setting up your environment for the JumpStart Lab 1 ###
# do you need to upgrade python first? Your available version of Python is used to create the virtual environment
python3 --version

### STOP ###
### DO YOU NEED TO UPGRADE PYTHON ###
# install python 3.12 if required (Homebrew uses versioned formulae, e.g. python@3.12)
brew install python@3.12
# restart vscode to pick up the new version of python
python3.12 --version

### STOP ###
### OK IF YOU HAVE THE CORRECT VERSION OF PYTHON, CONTINUE ###
# create in the folder of this notebook, eg Documents/github/labs-sagemaker/jumpstart/image_upscale
# create a virtual environment
python3.12 -m venv venv-jumpstart-stable-lab1
# activate the virtual environment
source venv-jumpstart-stable-lab1/bin/activate
### COPY TO HERE ONLY IF RUNNING AS ONE COPY AND PASTE ###

### STOP ###
### MAKE SURE ABOVE VENV GETS ACTIVATED BEFORE RUNNING THE REST ###
# upgrade pip
pip install --upgrade pip
# jupyter kernel support
pip install ipykernel
# add the virtual environment to jupyter
python -m ipykernel install --user --name=venv-jumpstart-stable-lab1 --display-name "Python (venv-jumpstart-stable-lab1)"
# install the required packages - may need to specify the path here if not in the correct folder in terminal window
pip install -r requirements_lab1.txt
# pip install -r Documents/github/labs-sagemaker/jumpstart/etc/requirements_lab1.txt
# verify the installation
pip list

### RESTART VSCODE TO PICKUP THE NEW VENV ###

In [None]:
### STOP ###
### This command is for activating an environment that already exists; it's for use in a terminal window if you need it ###
source venv-jumpstart-stable-lab1/bin/activate
pip list

# use pip freeze if you prefer a requirements.txt friendly format
### ALSO MAKE SURE YOU SELECT IT AS YOUR KERNEL FOR THIS JUPYTER NOTEBOOK ###

# Lab 1 Starts Here!

# <span style="color:DarkSeaGreen">Setup</span>

In [None]:
# region
# for the purpose of this lab, us-east-1, us-west-2 and eu-west-1 have the broadest coverage of JumpStart models and instance types
# if you provision in other regions, you may not have access to all the models or instance types,
# and may need to request a quota increase for endpoint usage for some instance types
myRegion='us-east-1'

# iam
myRoleSageMakerExecution="doit-jumpstart-sagemaker-execution-role"
myRoleSageMakerExecutionARN='RETRIEVED FROM ROLE BELOW'

# parameter store
myParameterStoreChosenModel='doit-jumpstart-sagemaker-chosen-model'
myParameterStoreEndpointName='doit-jumpstart-sagemaker-endpoint-name'
myParameterStoreIAMARN='doit-jumpstart-sagemaker-iam-arn'

# endpoint
myEndpoint='doit-jumpstart-endpoint'

print ('Done! Move to the next cell ->')

In [None]:
import json
import boto3
from certifi import where

botoSession = boto3.Session(region_name=myRegion)

# Configure boto3 to use certifi's certificates - helps avoid SSL errors if your system’s certificate store is out of date or missing root certs
sts_client = boto3.client('sts', verify=where())
myAccountNumber = sts_client.get_caller_identity()["Account"]
print(myAccountNumber)
print(sts_client.get_caller_identity()["Arn"])

# create clients we can use later
# iam
iam = boto3.client('iam', region_name=myRegion, verify=where())
# ssm
ssm = boto3.client('ssm', region_name=myRegion, verify=where())
# sagemaker
sm = boto3.client("sagemaker", region_name=myRegion, verify=where())
smr = boto3.client("sagemaker-runtime", region_name=myRegion, verify=where())

print ('Done! Move to the next cell ->')

In [None]:
# define tags added to all services we create
myTags = [
    {"Key": "env", "Value": "non_prod"},
    {"Key": "owner", "Value": "doit-jumpstart"},
    {"Key": "project", "Value": "lab1"},
    {"Key": "author", "Value": "simon"},
]
myTagsDct = {
    "env": "non_prod",
    "owner": "doit-jumpstart",
    "project": "lab1",
    "author": "simon",
}

print ('Done! Move to the next cell ->')
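
The cell above keeps the same tags in two shapes, a list of `Key`/`Value` pairs and a plain dict. One way to avoid maintaining both by hand is to derive the dict from the list. The helper name `tags_to_dict` is an assumption made for this sketch and is not used elsewhere in the lab.

```python
def tags_to_dict(tags):
    """Convert AWS-style [{"Key": k, "Value": v}, ...] tags into a plain dict."""
    return {tag["Key"]: tag["Value"] for tag in tags}


# Small stand-in list in the same shape as myTags.
example_tags = [
    {"Key": "env", "Value": "non_prod"},
    {"Key": "project", "Value": "lab1"},
]
example_tags_dct = tags_to_dict(example_tags)
print(example_tags_dct)
```

With this in place, `myTagsDct = tags_to_dict(myTags)` would keep the two structures in sync.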

# <span style="color:DarkSeaGreen">Get SageMaker Execution Role</span>
- We need to get the execution role SageMaker uses to execute its commands.
- We do this differently if 
  - running in a SageMaker Jupyter notebook
    - OR
  - running in a local IDE

https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-ex-role.html
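
The "managed role first, local role second" decision can be sketched as a small fallback function. This is illustrative only: `resolve_execution_role` and the stand-in callables are hypothetical, the ARN is a made-up placeholder, and the sketch assumes the SDK helper raises an exception when no notebook role is available.

```python
def resolve_execution_role(get_managed_role, create_local_role):
    """Try the SageMaker-provided role first; fall back to a locally created one."""
    try:
        return get_managed_role()
    except Exception as err:  # assumption: the lookup raises outside SageMaker
        print(f"No managed role found ({err}); falling back to a local role.")
        return create_local_role()


def _no_managed_role():
    # Stand-in for the SDK lookup failing in a local IDE.
    raise ValueError("not running inside SageMaker")


role_arn = resolve_execution_role(
    _no_managed_role,
    lambda: "arn:aws:iam::123456789012:role/example-role",  # placeholder ARN
)
print(role_arn)
```

Passing the two lookups in as callables keeps the decision testable without touching AWS.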

# <span style="color:DarkSeaGreen">IAM</span>
- The following method is only called if this notebook is being run in a local IDE
- It creates (or reuses) an IAM execution role used by SageMaker
- Running the following cell does NOT create the role; the role is only created when the method it defines is called further down

In [None]:
def getSageMakerExecutionRole():
    """
    Creates a role required for SageMaker to run jobs on your behalf
    Only needed if this is being run in a local IDE, not needed if in SageMaker Studio or SageMaker Notebook Instance

    Args:
        None

    Returns:
        An IAM execution role ARN
    """

    # trust policy for the role
    roleTrust = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Service": "sagemaker.amazonaws.com"
                },
                "Action": "sts:AssumeRole"
            }
        ]
    }

    # check if the role exists
    try:
        role = iam.get_role(RoleName=myRoleSageMakerExecution)
        print("Role already exists. Using the existing role.")
        return role['Role']['Arn']
    except iam.exceptions.NoSuchEntityException:
        print("Role does not exist. Creating a new role.")
        
    # create execution role for sagemaker - allows SageMaker notebook instances, training jobs, and models to access S3, ECR, and CloudWatch on your behalf
    # this role is only created if we are running this notebook in a local ide, if we are in a jupyterlab in sagemaker studio, we dont need it as already created and available
    role = iam.create_role(
        RoleName=myRoleSageMakerExecution,
        AssumeRolePolicyDocument=json.dumps(roleTrust),
        Description="Service execution role for sagemaker ai use including inside jupyter notebooks",
        Tags=[
            *myTags,
        ],
    )

    # attach managed policy to the role AmazonSageMakerFullAccess
    iam.attach_role_policy(
        RoleName=myRoleSageMakerExecution,
        PolicyArn="arn:aws:iam::aws:policy/AmazonSageMakerFullAccess"
    )

    # store the role arn in parameter store for use in other notebooks
    ssm.put_parameter(
        Name=myParameterStoreIAMARN,
        Description='The ARN of the IAM role used by SageMaker for execution of jobs',
        Value=role['Role']['Arn'],
        Type='String',
        Tags=[
            *myTags,
        ],
    )   

    return role['Role']['Arn']

# <span style="color:DarkSeaGreen">Get Execution Role and Session</span>
- SageMaker requires an execution role to assume on your behalf

In [None]:
from sagemaker.core.helper.session_helper import Session, get_execution_role

try:
    # if this is being run in a SageMaker AI JupyterLab Notebook
    myRoleSageMakerExecutionARN = get_execution_role()
except Exception:
    # if this is being run in a local IDE - we need to create our own role
    myRoleSageMakerExecutionARN = getSageMakerExecutionRole()

# make sure we get a session in the correct region (needed as it can use the aws configure region if running this locally)
sageMakerSession = Session(boto_session=botoSession)

print(myRoleSageMakerExecutionARN)
print(sageMakerSession)

print ('Done! Move to the next cell ->')

# <span style="color:DarkSeaGreen">Provision a JumpStart Model</span>
- Provision a model via JumpStart
- If you prefer, you can also do this via the JumpStart console, but you will have to bring in the endpoint that you create to continue with this code
### Example models to provision
- Stable Diffusion x4 upscaler FP16
  - https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler/blame/fp16/README.md
  - *model_id, model_version = "model-upscaling-stabilityai-stable-diffusion-x4-upscaler-fp16", "*"*
  - upscaling with Stable Diffusion (x4) is computationally expensive
    - FP16 means it uses half-precision floating point, so you want a GPU with good Tensor Core support
  - the x4 upscaler model itself is large
    - you want ≥ 16 GB VRAM to run comfortably in FP16 for 512×512 → 2048×2048 upscales
    - ml.p4d.24xlarge (enterprise-grade, overkill unless you’re batching lots of requests)
      - **needs an aws quota increase for this instance for endpoint usage**
    - ml.g5.4xlarge
      - good for a poc - widely supported, good memory, reasonably costed
      - anything smaller and you will likely get CUDA out of memory errors
        - you need plenty of GPU memory
      - **needs an aws quota increase for this instance for endpoint usage**
- see https://aws.amazon.com/sagemaker/ai/pricing/ for pricing, **larger instances can be very expensive per hour**
- If you deploy the model and get a quota error, you will need to visit Service Quotas via the console and request an increase
  - go to the SageMaker service and search for the instance
  - select the quota for that instance *for endpoint usage*
  - make sure your quota allows for your auto scaling maximum
- DO NOT LEAVE LARGE INSTANCES RUNNING LONGER THAN YOU NEED TO $$$!
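
The quota lookup described above can also be done in code rather than in the console. The sketch below only shows the filtering step on data shaped like a Service Quotas response. `find_endpoint_quota` is a hypothetical helper, the sample values are made up, and the exact quota naming (`<instance type> for endpoint usage`) is an assumption to confirm against your own account.

```python
def find_endpoint_quota(quotas, instance_type):
    """Return the quota value for '<instance_type> for endpoint usage', or None."""
    wanted = f"{instance_type} for endpoint usage".lower()
    for quota in quotas:
        if quota.get("QuotaName", "").lower() == wanted:
            return quota.get("Value")
    return None


# Made-up entries, shaped like items in a Service Quotas "Quotas" list.
sample_quotas = [
    {"QuotaName": "ml.g5.2xlarge for endpoint usage", "Value": 0.0},
    {"QuotaName": "ml.g5.12xlarge for endpoint usage", "Value": 1.0},
    {"QuotaName": "ml.g5.12xlarge for training job usage", "Value": 2.0},
]
print(find_endpoint_quota(sample_quotas, "ml.g5.12xlarge"))
```

With boto3, the input could come from the `service-quotas` client's `list_service_quotas(ServiceCode="sagemaker")` call, whose response carries a `Quotas` list. Results are paginated, so follow `NextToken` or use a paginator.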


### Instance Size is Important
- We are using a model that upscales
- The larger the original image, the more GPU memory is taken when upscaling
- SageMaker typically uses one GPU to do this
  - SageMaker model endpoints don’t automatically spread inference across multiple GPUs unless the container is written for it
- Stable Diffusion provides a diffuser library
  - Breaks the image into smaller patches, processes sequentially, then stitches
  - Uses much less VRAM at the cost of a bit more time
  - We don't use that in this lab
- ml.p4d.24xlarge has more GPU memory, but may be overkill: it is expensive and still won't help if source images are too large to upscale on one GPU
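
To make the "patches" idea concrete, the sketch below computes the crop boxes that would cover an image tile by tile. It is not the diffuser library's implementation and is not used in this lab. `tile_boxes` is a hypothetical helper, and real tiling pipelines typically overlap tiles and blend the seams, which this sketch leaves out.

```python
def tile_boxes(width, height, tile):
    """Return (left, upper, right, lower) boxes covering the image in row-major order."""
    if tile <= 0:
        raise ValueError("tile must be positive")
    boxes = []
    for upper in range(0, height, tile):
        for left in range(0, width, tile):
            # Clamp the last tile in each row/column to the image edge.
            boxes.append((left, upper, min(left + tile, width), min(upper + tile, height)))
    return boxes


boxes = tile_boxes(1024, 768, 512)
print(len(boxes), boxes[-1])
```

Each box could be passed to Pillow's `Image.crop`, upscaled on its own, and pasted back at four times its coordinates, so peak GPU memory depends on the tile size rather than the full image.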

In [None]:
# the model we want to provision - THIS DISPLAYS AN INPUT BOX FOR YOU TO CHOOSE A MODEL
# jump into the console, click on JumpStart, find a model you like and copy the model id from the details page
# https://aws.amazon.com/sagemaker/ai/pricing/
options = [
    "img2img|model-upscaling-stabilityai-stable-diffusion-x4-upscaler-fp16|ml.p4de.24xlarge $$$$$",
    "img2img|model-upscaling-stabilityai-stable-diffusion-x4-upscaler-fp16|ml.p4d.24xlarge $$$$",
    "img2img|model-upscaling-stabilityai-stable-diffusion-x4-upscaler-fp16|ml.g5.48xlarge $$$$$",
    "img2img|model-upscaling-stabilityai-stable-diffusion-x4-upscaler-fp16|ml.g5.24xlarge $$$$",
    "img2img|model-upscaling-stabilityai-stable-diffusion-x4-upscaler-fp16|ml.g5.12xlarge $$$",
    "img2img|model-upscaling-stabilityai-stable-diffusion-x4-upscaler-fp16|ml.g5.2xlarge $$",
]

print("Select an option:")
for i, opt in enumerate(options, 1):
    print(f"{i}. {opt}")

choice = int(input("Enter the number of the spec you want: "))
selected = options[choice - 1]

modelType = selected.split("|")[0]
modelID = selected.split("|")[1]
instanceType = selected.split("|")[2].split(" ")[0]
print(f"You selected: model type {modelType} {modelID} on {instanceType}")

# store the model in a parameter store for use in other labs
ssm.put_parameter(
    Name=myParameterStoreChosenModel,
    Description='the model chosen in lab1',
    Value=selected,
    Type='String',
    Overwrite=True,
)

print("Done! Move to the next cell ->")
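
The cell above trusts whatever is typed into the input box. A slightly more defensive version of the same parsing could look like the sketch below. `parse_option` is a hypothetical helper that assumes the `type|model_id|instance cost` layout used by the `options` list.

```python
def parse_option(options, choice_text):
    """Validate a 1-based menu choice and split 'type|model_id|instance cost' into parts."""
    try:
        index = int(choice_text)
    except ValueError:
        raise ValueError(f"'{choice_text}' is not a number") from None
    if not 1 <= index <= len(options):
        raise ValueError(f"Choose a number between 1 and {len(options)}")
    model_type, model_id, instance_field = options[index - 1].split("|")
    # The instance field carries a trailing cost hint such as "$$"; keep only the type.
    return model_type, model_id, instance_field.split(" ")[0]


demo_options = [
    "img2img|model-upscaling-stabilityai-stable-diffusion-x4-upscaler-fp16|ml.g5.2xlarge $$",
]
parsed = parse_option(demo_options, "1")
print(parsed)
```

Out-of-range or non-numeric input then fails with a readable message instead of an `IndexError` or a bare `ValueError` from `int()`.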

### Previous Errors using JumpStart Container
- JumpStart containers load the entire model onto a single GPU
- does not split the model across GPUs
- does not automatically use multiple GPUs

ml.g5.12xlarge:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) 
from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "CUDA out of memory. Tried to allocate 3.96 GiB (GPU 0; 22.20 GiB total capacity; 17.79 GiB already 
allocated; 2.57 GiB free; 17.81 GiB reserved in total by PyTorch) If reserved memory is \u003e\u003e allocated 
memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and 
PYTORCH_CUDA_ALLOC_CONF"
}

ml.g5.24xlarge:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) 
from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "CUDA out of memory. Tried to allocate 3.96 GiB (GPU 0; 22.20 GiB total capacity; 17.79 GiB already 
allocated; 2.59 GiB free; 17.79 GiB reserved in total by PyTorch) If reserved memory is \u003e\u003e allocated 
memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and 
PYTORCH_CUDA_ALLOC_CONF"
}
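
Both logs report the same 22.20 GiB total capacity for GPU 0, so moving from ml.g5.12xlarge to ml.g5.24xlarge did not give the model more memory on the GPU it was using. Pulling the numbers out of the message makes that easier to compare across runs. The regular expression below is a sketch written against the message format shown above; `parse_cuda_oom` is a hypothetical helper, and other PyTorch versions may word the message differently.

```python
import re

OOM_PATTERN = re.compile(
    r"Tried to allocate (?P<requested>[\d.]+) GiB "
    r"\(GPU \d+; (?P<total>[\d.]+) GiB total capacity; "
    r"(?P<allocated>[\d.]+) GiB already\s+allocated; "
    r"(?P<free>[\d.]+) GiB free"
)


def parse_cuda_oom(message):
    """Extract the GiB figures from a CUDA out-of-memory message, or return None."""
    match = OOM_PATTERN.search(message)
    if match is None:
        return None
    return {name: float(value) for name, value in match.groupdict().items()}


sample = (
    "CUDA out of memory. Tried to allocate 3.96 GiB (GPU 0; 22.20 GiB total capacity; "
    "17.79 GiB already allocated; 2.57 GiB free; 17.81 GiB reserved in total by PyTorch)"
)
stats = parse_cuda_oom(sample)
print(stats)
print("shortfall GiB:", round(stats["requested"] - stats["free"], 2))
```

The shortfall (requested minus free) gives a rough idea of how much smaller the input would need to be, or how much more memory per GPU the instance would need.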


In [None]:
# Create ModelBuilder from JumpStart Config using the model and instance you selected in the previous cell
# this will take a while (few seconds), as it needs to download the model from jumpstart

from sagemaker.serve.model_builder import ModelBuilder
from sagemaker.core.jumpstart.configs import JumpStartConfig
from sagemaker.core.resources import EndpointConfig
from sagemaker.train.configs import Compute

# https://github.com/aws/sagemaker-python-sdk/blob/master/v3-examples/inference-examples/jumpstart-example.ipynb
jumpstart_config = JumpStartConfig(model_id=modelID)
compute = Compute(instance_type=instanceType)
model_builder = ModelBuilder.from_jumpstart_config(
    jumpstart_config=jumpstart_config,
    compute=compute,
    role_arn=myRoleSageMakerExecutionARN,
    sagemaker_session=sageMakerSession
)

print("Done! Move to the next cell ->")

In [None]:
# build the model
core_model = model_builder.build(
    model_name=modelID,
    role_arn=myRoleSageMakerExecutionARN,
    sagemaker_session=sageMakerSession,
)

print("Done! Move to the next cell ->")

In [None]:
# deploy the model to an endpoint
core_endpoint = model_builder.deploy(
    endpoint_name=myEndpoint,
    role_arn=myRoleSageMakerExecutionARN,
    sagemaker_session=sageMakerSession,
    container_timeout_in_seconds=600,
)

print("Done! Move to the next cell ->")

In [None]:
# store the predictor name in a parameter store for use in other notebooks
ssm.put_parameter(
    Name=myParameterStoreEndpointName,
    Description='the name of the sagemaker endpoint created in lab1',
    Value=core_endpoint.endpoint_name,
    Type='String',
    Overwrite=True,
)

print ('Done! Move to the next cell ->')

In [None]:
# displays an image from the model response so it can be reviewed live rather than diving into S3
def decode_and_show(description, model_response) -> None:
    from PIL import Image
    import base64
    import io
    
    print (description)
    # Handle PIL Image objects
    if hasattr(model_response, 'save'):  # Check if it's a PIL Image
        display(model_response)
        return
    
    # Handle bytes (raw image data)
    elif isinstance(model_response, bytes):
        image = Image.open(io.BytesIO(model_response))
        display(image)
        image.close()
    
    # Handle base64 string (encoded image)
    elif isinstance(model_response, str):
        image = Image.open(io.BytesIO(base64.b64decode(model_response)))
        display(image)
        image.close()
    
    # Handle list of base64 strings (model response)
    elif isinstance(model_response, list):
        for i, img_data in enumerate(model_response):
            image = Image.open(io.BytesIO(base64.b64decode(img_data)))
            print(f"Image {i + 1}:")
            display(image)
            image.close()
    
    else:
        print(f"Can't handle the image. Unexpected response type: {type(model_response)}")

In [None]:
# resize the image if required - large images can cause CUDA out of memory errors due to the memory required to scale them
def resize_image(imgBytes, max_size=1024):
    from PIL import Image
    import io

    # Resize maintaining aspect ratio
    image = Image.open(io.BytesIO(imgBytes))
    decode_and_show("Original Image", image)
    
    image.thumbnail((max_size, max_size), Image.Resampling.LANCZOS)
    decode_and_show("Downsized Image", image)

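    # Convert back to bytes (JPEG cannot store an alpha channel, so convert to RGB first)
    if image.mode != "RGB":
        image = image.convert("RGB")
    img_byte_arr = io.BytesIO()
    image.save(img_byte_arr, format='JPEG', quality=95)  # Use JPEG to reduce size
    return img_byte_arr.getvalue()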

In [None]:
# test the endpoint with some example payloads
import base64
from PIL import Image

# Make sure your jpeg images are < 6MB otherwise you may get an HTTP 413 Content Too Large (Payload Too Large) error
# try your own payload
images = [
    "resources/img2_original_1024.jpeg",
    "resources/img1_original_1024.jpeg",
]

# Note that sending or receiving the payload with the raw RGB values may hit default limits for the input payload and the response size
# Therefore, we recommend using the base64 encoded image by setting:
# content_type = "application/json;jpeg" and accept = "application/json;jpeg"
# https://aws.amazon.com/blogs/machine-learning/upscale-images-with-stable-diffusion-in-amazon-sagemaker-jumpstart/
content_type = "application/json;jpeg"
accept = "application/json;jpeg"

for img in images:
    with open(img, "rb") as f:
        # named img_bytes to avoid shadowing the built-in `bytes`, which decode_and_show relies on
        img_bytes = f.read()
    #downsized_bytes = resize_image(img_bytes, 128)
    encoded_image = base64.b64encode(img_bytes).decode()

    # NOTE 
    # num_inference_steps
    # 10-25 steps: Fast, decent quality (good for testing/quick iterations)
    # 25-50 steps: Good balance of speed and quality (common for production)
    # 50-100 steps: High quality, slower (for final outputs)
    # 100+ steps: Diminishing returns, much slower
    #
    # guidance_scale explanation
    # Lower values (1-5): More creative, less faithful to prompt
    # Medium values (7-12): Good balance (7.5 is a common default)
    # Higher values (13-20+): Strictly follows prompt, less creative
    payload = {
        "image": encoded_image,
        "prompt": "Improve the photographic quality of the image",
        "num_inference_steps":25,
        "guidance_scale":7.5
    }

    # NOTE if you get a 413 error, your original image sizes are probably too large, try images < 6MB
    # HTTP 413 Content Too Large (Payload Too Large)
    # NOTE if you get a 400 CUDA out of memory error, this indicates that your GPU memory (VRAM) is full
    # HTTP 400 InternalServerException
    # If so, use an instance with more memory per GPU, or use a diffuser to split the image into patches and restitch once done
    # Or resize your images down to a smaller size
    # NOTE if you get a timeout error, could be your image is too large
    # If so, resize your images down to a smaller size
    # Or reduce the num_inference_steps

    # The following commented code does not work
    # Seems SageMaker V3 does not yet support the core_endpoint.invoke method as documented
    # https://github.com/aws/sagemaker-python-sdk/blob/master/v3-examples/inference-examples/jumpstart-example.ipynb
    # possibly because of a missing __dict__ attribute on the payload object as it has a bytes body
    # TypeError: vars() argument must have __dict__ attribute
    #response = core_endpoint.invoke(
    #    body = json.dumps(payload).encode('utf-8'),
    #    content_type = content_type,
    #    accept = accept,
    #)

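    response = smr.invoke_endpoint(
        EndpointName=core_endpoint.endpoint_name,
        Body=json.dumps(payload).encode('utf-8'),
        ContentType=content_type,
        Accept=accept,
    )

    # invoke_endpoint returns the payload as a streaming body, so read and parse the JSON first
    result = json.loads(response["Body"].read())
    decode_and_show("Scaled 4x Image", result["generated_images"])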

print("Done! Move to the next cell ->")
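
Base64 encoding grows the payload by roughly a third, so an image that is under the limit on disk can still be rejected once encoded. The sketch below estimates the encoded size up front. It assumes the 6 MB figure from the notes above applies to the encoded request body, and both helper names and the `overhead_bytes` allowance for the rest of the JSON are hypothetical.

```python
import math


def base64_size(raw_bytes):
    """Number of characters base64 produces for `raw_bytes` bytes (padding included)."""
    return 4 * math.ceil(raw_bytes / 3)


def fits_payload_limit(raw_bytes, limit_bytes=6 * 1024 * 1024, overhead_bytes=1024):
    """Rough check that the encoded image plus other JSON fields stays under the limit."""
    return base64_size(raw_bytes) + overhead_bytes <= limit_bytes


for size_mib in (4, 5):
    raw = size_mib * 1024 * 1024
    print(size_mib, "MiB image ->", base64_size(raw), "encoded bytes, fits:", fits_payload_limit(raw))
```

Calling `fits_payload_limit(len(img_bytes))` before building the payload gives an early warning that an image should be resized first.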

# <span style="color:DarkSeaGreen">Move to Lab 2</span>
# <span style="color:DarkSeaGreen">OR...</span>
# <span style="color:DarkSeaGreen">Clean Up Architecture</span>
### <span style="color:Red">Only do this if you have finished with this lab and any labs that depend on it!</span>
##### It will delete all architecture created, make sure you no longer need any of it!!!

In [None]:
# when finished with the endpoint, delete it
response = sm.describe_endpoint(EndpointName=myEndpoint)
print(response["EndpointStatus"])

if response["EndpointStatus"] == "InService":
    print("Endpoint is in service. Proceeding with deletion.")
    sm.delete_endpoint(EndpointName=myEndpoint)
    print ('Done! Move to the next cell ->')
else:
    print("Endpoint is not in service. Cannot delete. Try again in a couple of minutes.")

In [None]:
# delete the endpoint config
response = sm.delete_endpoint_config(EndpointConfigName=myEndpoint)
print ('Done! Move to the next cell ->')

In [None]:
# delete the model
response = sm.delete_model(ModelName=core_model.model_name)
print ('Done! Move to the next cell ->')

In [None]:
# delete roles and policies
iam.detach_role_policy(
    RoleName=myRoleSageMakerExecution, PolicyArn='arn:aws:iam::aws:policy/AmazonSageMakerFullAccess'
)
iam.delete_role(RoleName=myRoleSageMakerExecution)
print ('Done! Move to the next cell ->')

In [None]:
# delete the parameter store entry
ssm.delete_parameter(Name=myParameterStoreChosenModel)
ssm.delete_parameter(Name=myParameterStoreEndpointName)
ssm.delete_parameter(Name=myParameterStoreIAMARN)
print ('Done! Move to the next cell ->')

# <span style="color:DarkSeaGreen">Clean Up venv</span>
### Clean up if finished with this lab and running in VSCode or equivalent local IDE
#### Note these are macOS specific
- Run the commands of the cell below in a terminal window if you need to clean up a local venv
  - Note: if you copy and paste the entire cell and run it as one, you will get `zsh: command not found: #` errors because of the comments, but you can ignore them
  - Remember to restart the kernel to refresh what's available

In [None]:
# if you have local host in your terminal prompt
unset HOST
# deactivate the venv
deactivate 
# remove it and its contents if not needed
rm -rf venv-jumpstart-stable-lab1 