
Support for Amazon SageMaker AI endpoints as Model Provider #176


Open · wants to merge 29 commits into main

Conversation


@dgallitelli dgallitelli commented Jun 4, 2025

Description

Support for Amazon SageMaker AI endpoints as Model Provider

Related Issues

Issue #16

Documentation PR

[Link to the associated PR in the agent-docs repo]

Type of Change

New feature

Testing

Yes

Checklist

  • I have read the CONTRIBUTING document
  • I have added tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@dgallitelli dgallitelli requested a review from a team as a code owner June 4, 2025 13:38
@dgallitelli
Author

This is an updated version of PR #30. Please review and merge if appropriate 😄

@swami87aws

I hope this gets merged soon. It opens up access to AWS Marketplace models deployed as Amazon SageMaker endpoints and makes them accessible via Strands.

@brunopistone

Do we have an expected date for this PR? It is needed for customer workshops in the coming weeks.

@rvvittal

Do we have an expected date for this PR? This will help with your SageMaker AI GTM motions.

@dbschmigelski dbschmigelski requested review from mehtarac and pgrayy June 11, 2025 15:19
@mehtarac
Member

Hi all, thanks for the interest in and implementation of the model provider! The team will review the pull request this week and start the feedback process (leaving comments and questions, if any).

@swami87aws

@dgallitelli Please correct me if I'm wrong here: this handles only JSON-type content. Will it be able to handle multimedia content types?

@dgallitelli
Author

> @dgallitelli Please correct me if i m wrong here. This handles only JSON type content. Will it be able to handle a multimedia content type?

That's correct! It currently does not support multi-modal models.

@swami87aws

Tried extending this in a Fargate deployment and am hitting the error below:

Error during Story narration: 'async for' requires an object with __aiter__ method, got generator
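For context on that error: `async for` requires an object that implements `__aiter__`, which a plain (sync) generator does not provide; the stream method has to be an async generator (`async def` plus `yield`). A minimal illustration of the difference (not the PR's actual code):

```python
import asyncio

def sync_stream():
    # A plain generator: provides __iter__/__next__ but no __aiter__,
    # so `async for chunk in sync_stream()` raises the TypeError above.
    yield "chunk-1"
    yield "chunk-2"

async def async_stream():
    # An async generator: `async def` + `yield` provides
    # __aiter__/__anext__, so it works with `async for`.
    yield "chunk-1"
    yield "chunk-2"

async def collect():
    chunks = []
    async for chunk in async_stream():  # works
        chunks.append(chunk)
    return chunks

print(asyncio.run(collect()))  # ['chunk-1', 'chunk-2']
```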

@dgallitelli
Author

Code has been updated and tested successfully. Please review and merge :)

How to use it:

from strands import Agent
from strands_tools import calculator
from strands.models.sagemaker import SageMakerAIModel

model = SageMakerAIModel(
    endpoint_config=SageMakerAIModel.SageMakerAIEndpointConfig(
        endpoint_name=endpoint_name,  # name of your deployed SageMaker endpoint
        region_name="us-east-1"
    ),
    payload_config=SageMakerAIModel.SageMakerAIPayloadSchema(
        max_tokens=4096, stream=False
    )
)
agent = Agent(
    model=model, tools=[calculator], 
    system_prompt="You are a helpful assistant, master of tool calling. Do not use tools that are not in the provided list of tools."
)
agent("What is 999*123?")

@mehtarac
Member

mehtarac commented Jul 8, 2025

Hi Davide! Thank you for these changes. Do you happen to have a test script that you may have used to check the implementation of the SageMaker model provider? I am running into testing issues when trying to use the agent. Two issues:

  1. It looks like the stream method is async, which is different from all other model providers currently in Strands.
    I was able to get around this by adding await: response = await agent(test_query)
  2. There was an issue with the format_request method. The specific error was:
    TypeError: content_type=<H> | unsupported type
    This error occurred in the format_request_message_content method of the OpenAI model class that the SageMaker model provider extends. The error suggests that the content format being passed to the model provider isn't in the expected structure.
    I was able to get around this error by adding a messages array in a specific format:
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": message
            }
        ]
    }
]

However, it seems that there might still be an issue with how the content is being processed.
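For illustration, the workaround above normalizes a plain string into the OpenAI-style content list that `format_request_message_content` expects. A small hypothetical helper (`to_content_list` and `make_user_message` are not part of the SDK) showing that shape:

```python
def to_content_list(message):
    """Normalize a message payload into the OpenAI-style
    [{"type": "text", "text": ...}] content list."""
    if isinstance(message, str):
        return [{"type": "text", "text": message}]
    if isinstance(message, list):
        # Already a content list; pass through unchanged.
        return message
    raise TypeError(f"unsupported content type: {type(message).__name__}")

def make_user_message(message):
    # Build one entry of the `messages` array shown above.
    return {"role": "user", "content": to_content_list(message)}
```

For example, `make_user_message("hello")` produces `{"role": "user", "content": [{"type": "text", "text": "hello"}]}`, the same structure as the workaround's `messages` array.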

@dgallitelli
Author

dgallitelli commented Jul 8, 2025


Hey there! :D
I've just tested with the freshly released Qwen 3 from SageMaker JumpStart (announcement blog), as well as with my own fine-tuned function calling model, and it works like a charm ✨

Deployed the 4B model from here:

[screenshot of the JumpStart model listing]

Make sure to have strands-agents >= 0.2.0 when testing, then use this code:

model = SageMakerAIModel(
    endpoint_config=SageMakerAIModel.SageMakerAIEndpointConfig(
        endpoint_name=endpoint_name, region_name="us-east-1"
    ),
    payload_config=SageMakerAIModel.SageMakerAIPayloadSchema(
        max_tokens=4096, stream=True
    )
)
agent = Agent(
    model=model, tools=[calculator], 
    system_prompt="You are a helpful assistant, master of tool calling. Do not use tools that are not in the provided list of tools."
)
agent("What is 999*123?")

@mehtarac
Member

mehtarac commented Jul 8, 2025

Investigation summary:

  • Met with Davide and identified two issues:

    1. A rogue yield statement that had not been cleaned up
    2. A region_name initialized to a region other than the one where the model is deployed was not being caught properly

    Both issues were fixed during the meeting with Davide.

  • Continued testing the model provider with different JumpStart models (deploying each one to my account), but ran into issues with some of them:

  1. Mistral 8B (Mixtral-8x22B-Instruct-v0.1):
[INFO ] PyProcess - W-191-model-stdout: [1,0]<stdout>:jinja2.exceptions.TemplateError: Conversation roles must alternate user/assistant/user/assistant/...
  2. Qwen2.5 14B Instruct when streaming is False:
yield {"chunk_type": "content_delta", "data_type": "text", "data": message["content"] or ""}
                                                                    ~~~~~~~^^^^^^^^^^^
KeyError: 'content'
  3. Qwen2.5 14B Instruct when streaming is True:
if choice.get("usage", None):
       ^^^^^^
UnboundLocalError: cannot access local variable 'choice' where it is not associated with a value
  4. Qwen3:
1.typed-dict.text
  Field required [type=missing, input_value={'signature': '', 'thinki...\n', 'type': 'thinking'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.9/v/missing
1.typed-dict.type
  Input should be 'text' [type=literal_error, input_value='thinking', input_type=str]
    For further information visit https://errors.pydantic.dev/2.9/v/literal_error
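The UnboundLocalError on choice is the classic symptom of a name that is only bound inside a loop whose body never runs (for example, an empty response stream). A minimal sketch of the failure mode and the usual guard (illustrative, not the provider's actual code):

```python
def last_choice_unsafe(chunks):
    for choice in chunks:
        pass
    # If `chunks` is empty, `choice` was never assigned, and
    # referencing it raises UnboundLocalError.
    return choice

def last_choice_safe(chunks):
    choice = None  # bind the name before the loop so it always exists
    for choice in chunks:
        pass
    return choice

print(last_choice_safe([]))        # None
print(last_choice_safe([1, 2, 3])) # 3
```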

@mehtarac
Member

mehtarac commented Jul 9, 2025

Also it looks like one of the workflows failed: https://github.com/strands-agents/sdk-python/actions/runs/16147728499/job/45607320057?pr=176


5 participants