
Support for Amazon SageMaker AI endpoints as Model Provider #176


Open · wants to merge 29 commits into main

Conversation


@dgallitelli dgallitelli commented Jun 4, 2025

Description

Support for Amazon SageMaker AI endpoints as Model Provider

Related Issues

Issue #16

Documentation PR

[Link to the associated PR in the agent-docs repo]

Type of Change

New feature

Testing

Yes

Checklist

  • I have read the CONTRIBUTING document
  • I have added tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@dgallitelli dgallitelli requested a review from a team as a code owner June 4, 2025 13:38
@dgallitelli
Author

This is an updated version of PR #30. Please review and merge if appropriate 😄

@swami87aws

I hope this gets merged soon. It opens up access to AWS Marketplace models deployed as Amazon SageMaker endpoints and makes them accessible via Strands.

@brunopistone

Do we have an expected date for this PR? It is needed for customer workshops in the coming weeks.

@rvvittal

Do we have an expected date for this PR? This will help with your SageMaker AI GTM motions.

@dbschmigelski dbschmigelski requested review from mehtarac and pgrayy June 11, 2025 15:19
@mehtarac
Member

Hi all, thanks for the interest in and implementation of the model provider! The team will review the pull request this week and start the feedback process (leaving comments and questions, if any).

@swami87aws

@dgallitelli Please correct me if I'm wrong here: this handles only JSON-type content. Will it be able to handle multimedia content types?

@dgallitelli
Author

> @dgallitelli Please correct me if i m wrong here. This handles only JSON type content. Will it be able to handle a multimedia content type?

That's correct! It currently does not support multi-modal models.

@swami87aws

Tried extending this in a Fargate deployment and am hitting the error below:

Error during Story narration: 'async for' requires an object with __aiter__ method, got generator
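For context on that error: `async for` requires an object that implements `__aiter__`, which a plain (sync) generator does not provide; the stream method has to be an async generator (`async def` plus `yield`). A minimal illustration of the difference (not the PR's actual code):

```python
import asyncio

def sync_stream():
    # A plain generator: provides __iter__/__next__ but no __aiter__,
    # so `async for chunk in sync_stream()` raises the TypeError above.
    yield "chunk-1"
    yield "chunk-2"

async def async_stream():
    # An async generator: `async def` + `yield` provides
    # __aiter__/__anext__, so it works with `async for`.
    yield "chunk-1"
    yield "chunk-2"

async def collect():
    chunks = []
    async for chunk in async_stream():  # works
        chunks.append(chunk)
    return chunks

print(asyncio.run(collect()))  # ['chunk-1', 'chunk-2']
```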

@dgallitelli
Author

Code has been updated and tested successfully. Please review and merge :)

How to use it:

from strands import Agent
from strands_tools import calculator
from strands.models.sagemaker import SageMakerAIModel

model = SageMakerAIModel(
    endpoint_config=SageMakerAIModel.SageMakerAIEndpointConfig(
        endpoint_name=endpoint_name,  # name of your deployed SageMaker endpoint
        region_name="us-east-1"
    ),
    payload_config=SageMakerAIModel.SageMakerAIPayloadSchema(
        max_tokens=4096, stream=False
    )
)
agent = Agent(
    model=model, tools=[calculator], 
    system_prompt="You are a helpful assistant, master of tool calling. Do not use tools that are not in the provided list of tools."
)
agent("What is 999*123?")

@mehtarac
Member

mehtarac commented Jul 8, 2025

Hi Davide! Thank you for these changes. Do you happen to have a test script that you may have used to check the implementation of the SageMaker model provider? I am running into testing issues when trying to use the agent. Two issues:

  1. It looks like the stream method is async, which is different from all other model providers currently in Strands.
    I was able to get around this by adding await: response = await agent(test_query)
  2. There was an issue with the format_request method. The specific error was:
    TypeError: content_type=<H> | unsupported type
    This error occurred in the format_request_message_content method of the OpenAI model class that the SageMaker model provider extends. The error suggests that the content format being passed to the model provider isn't in the expected structure.
    I was able to get around this error by adding a messages array in a specific format:
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": message
            }
        ]
    }
]

However, it seems that there might still be an issue with how the content is being processed.
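For illustration, the workaround above normalizes a plain string into the OpenAI-style content list that `format_request_message_content` expects. A small hypothetical helper (`to_content_list` and `make_user_message` are not part of the SDK) showing that shape:

```python
def to_content_list(message):
    """Normalize a message payload into the OpenAI-style
    [{"type": "text", "text": ...}] content list."""
    if isinstance(message, str):
        return [{"type": "text", "text": message}]
    if isinstance(message, list):
        # Already a content list; pass through unchanged.
        return message
    raise TypeError(f"unsupported content type: {type(message).__name__}")

def make_user_message(message):
    # Build one entry of the `messages` array shown above.
    return {"role": "user", "content": to_content_list(message)}
```

For example, `make_user_message("hello")` produces `{"role": "user", "content": [{"type": "text", "text": "hello"}]}`, the same structure as the workaround's `messages` array.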

@dgallitelli
Author

dgallitelli commented Jul 8, 2025


Hey there! :D
I've just tested with the freshly released Qwen 3 from SageMaker JumpStart (announcement blog), as well as with my own fine-tuned function calling model, and it works like a charm ✨

Deployed the 4B model from here:

[screenshot of the JumpStart model listing]

Make sure to have strands-agents >= 0.2.0 when testing, then use this code:

model = SageMakerAIModel(
    endpoint_config=SageMakerAIModel.SageMakerAIEndpointConfig(
        endpoint_name=endpoint_name, region_name="us-east-1"
    ),
    payload_config=SageMakerAIModel.SageMakerAIPayloadSchema(
        max_tokens=4096, stream=True
    )
)
agent = Agent(
    model=model, tools=[calculator], 
    system_prompt="You are a helpful assistant, master of tool calling. Do not use tools that are not in the provided list of tools."
)
agent("What is 999*123?")

@mehtarac
Member

mehtarac commented Jul 8, 2025

Investigation summary:

  • Met with Davide and identified two issues:

    1. A rogue yield statement that had not been cleaned up
    2. A region_name initialized to a region other than the one where the model is deployed was not being caught properly

    Both issues were fixed during the meeting with Davide.

  • Continued testing the model provider with different JumpStart models (deploying each one to my account), but ran into issues with some of them:

  1. Mistral 8B (Mixtral-8x22B-Instruct-v0.1):
[INFO ] PyProcess - W-191-model-stdout: [1,0]<stdout>:jinja2.exceptions.TemplateError: Conversation roles must alternate user/assistant/user/assistant/...
  2. Qwen2.5 14B Instruct when streaming is False:
yield {"chunk_type": "content_delta", "data_type": "text", "data": message["content"] or ""}
                                                                    ~~~~~~~^^^^^^^^^^^
KeyError: 'content'
  3. Qwen2.5 14B Instruct when streaming is True:
if choice.get("usage", None):
       ^^^^^^
UnboundLocalError: cannot access local variable 'choice' where it is not associated with a value
  4. Qwen3:
1.typed-dict.text
  Field required [type=missing, input_value={'signature': '', 'thinki...\n', 'type': 'thinking'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.9/v/missing
1.typed-dict.type
  Input should be 'text' [type=literal_error, input_value='thinking', input_type=str]
    For further information visit https://errors.pydantic.dev/2.9/v/literal_error
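The UnboundLocalError on choice is the classic symptom of a name that is only bound inside a loop whose body never runs (for example, an empty response stream). A minimal sketch of the failure mode and the usual guard (illustrative, not the provider's actual code):

```python
def last_choice_unsafe(chunks):
    for choice in chunks:
        pass
    # If `chunks` is empty, `choice` was never assigned, and
    # referencing it raises UnboundLocalError.
    return choice

def last_choice_safe(chunks):
    choice = None  # bind the name before the loop so it always exists
    for choice in chunks:
        pass
    return choice

print(last_choice_safe([]))        # None
print(last_choice_safe([1, 2, 3])) # 3
```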

@mehtarac
Member

mehtarac commented Jul 9, 2025

Also it looks like one of the workflows failed: https://github.com/strands-agents/sdk-python/actions/runs/16147728499/job/45607320057?pr=176


5 participants