feat: Add docs for Sagemaker model provider #186

# Amazon SageMaker

[Amazon SageMaker](https://aws.amazon.com/sagemaker/) is a fully managed machine learning service that provides infrastructure and tools for building, training, and deploying ML models at scale. The Strands Agents SDK implements a SageMaker provider, allowing you to run agents against models deployed on SageMaker inference endpoints, including both pre-trained models from SageMaker JumpStart and custom fine-tuned models. The provider is designed to work with models that support OpenAI-compatible chat completion APIs.

For example, you can expose models like [Mistral-Small-24B-Instruct-2501](https://aws.amazon.com/blogs/machine-learning/mistral-small-24b-instruct-2501-is-now-available-on-sagemaker-jumpstart-and-amazon-bedrock-marketplace/) on SageMaker, which has demonstrated reliable performance for conversational AI and tool calling scenarios.

## Installation

SageMaker is configured as an optional dependency in Strands Agents. To install, run:

```bash
pip install 'strands-agents[sagemaker]'
```
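As a quick sanity check after installing, you can verify that the extra's dependencies resolved; the provider relies on `boto3`, so a missing `boto3` module usually means the extra was not installed. A minimal sketch (the helper name is illustrative):

```python
import importlib.util

def has_sagemaker_deps() -> bool:
    """Return True if boto3 (installed by the 'sagemaker' extra) is importable."""
    return importlib.util.find_spec("boto3") is not None

if not has_sagemaker_deps():
    print("Missing dependencies; run: pip install 'strands-agents[sagemaker]'")
```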

## Usage

After installing the SageMaker dependencies, you can import and initialize the Strands Agents SageMaker provider as follows:

```python
from strands import Agent
from strands.models.sagemaker import SageMakerAIModel
from strands_tools import calculator

model = SageMakerAIModel(
    endpoint_config={
        "endpoint_name": "my-llm-endpoint",
        "region_name": "us-west-2",
    },
    payload_config={
        "max_tokens": 1000,
        "temperature": 0.7,
        "stream": True,
    }
)

agent = Agent(model=model, tools=[calculator])
response = agent("What is the square root of 64?")
```

**Note**: Tool calling support varies by model. Models like [Mistral-Small-24B-Instruct-2501](https://aws.amazon.com/blogs/machine-learning/mistral-small-24b-instruct-2501-is-now-available-on-sagemaker-jumpstart-and-amazon-bedrock-marketplace/) have demonstrated reliable tool calling capabilities, but not all models deployed on SageMaker support this feature. Verify your model's capabilities before implementing tool-based workflows.

## Configuration

### Endpoint Configuration

The `endpoint_config` configures the SageMaker endpoint connection:

| Parameter | Description | Required | Example |
|-----------|-------------|----------|---------|
| `endpoint_name` | Name of the SageMaker endpoint | Yes | `"my-llm-endpoint"` |
| `region_name` | AWS region where the endpoint is deployed | Yes | `"us-west-2"` |
| `inference_component_name` | Name of the inference component | No | `"my-component"` |
| `target_model` | Specific model to invoke (multi-model endpoints) | No | `"model-a.tar.gz"` |
| `target_variant` | Production variant to invoke | No | `"variant-1"` |
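For illustration, an `endpoint_config` that targets an inference component might look like the following; all names here are placeholders, not real resources:

```python
# All values are placeholders; substitute your own endpoint details.
endpoint_config = {
    "endpoint_name": "my-llm-endpoint",          # required
    "region_name": "us-west-2",                  # required
    "inference_component_name": "my-component",  # optional, for inference component endpoints
}
```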

### Payload Configuration

The `payload_config` configures the model inference parameters:

| Parameter | Description | Required | Example |
|-----------|-------------|----------|---------|
| `max_tokens` | Maximum number of tokens to generate | Yes | `1000` |
| `stream` | Enable streaming responses | No (defaults to `True`) | `True` |
| `temperature` | Sampling temperature (0.0 to 2.0) | No | `0.7` |
| `top_p` | Nucleus sampling parameter (0.0 to 1.0) | No | `0.9` |
| `top_k` | Top-k sampling parameter | No | `50` |
| `stop` | List of stop sequences | No | `["Human:", "AI:"]` |
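As a sketch, a `payload_config` combining the sampling parameters above might look like this (the values are illustrative, not recommendations):

```python
# Illustrative values only; tune for your model and workload.
payload_config = {
    "max_tokens": 512,          # required
    "stream": True,             # matches the default
    "temperature": 0.2,         # lower values give more deterministic output
    "top_p": 0.9,
    "stop": ["Human:", "AI:"],
}
```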

## Model Compatibility

The SageMaker provider is designed to work with models that support OpenAI-compatible chat completion APIs. During development and testing, the provider has been validated with [Mistral-Small-24B-Instruct-2501](https://aws.amazon.com/blogs/machine-learning/mistral-small-24b-instruct-2501-is-now-available-on-sagemaker-jumpstart-and-amazon-bedrock-marketplace/), which demonstrated reliable performance across various conversational AI tasks.

### Important Considerations

- **Model Performance**: Results and capabilities vary significantly depending on the specific model deployed to your SageMaker endpoint.
- **Tool Calling Support**: Not all models deployed on SageMaker support function/tool calling. Verify your model's capabilities before implementing tool-based workflows.
- **API Compatibility**: Ensure your deployed model accepts and returns data in the OpenAI chat completion format.

For optimal results, we recommend testing your specific model deployment against your use case requirements before production deployment.
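One way to check compatibility before wiring up an agent is to call the endpoint directly with an OpenAI-style chat payload and inspect the response shape. This is a hypothetical sketch, not part of the provider: the endpoint name and region are placeholders, and it assumes your container accepts the OpenAI chat completion request format:

```python
import json

def build_chat_payload(prompt: str, max_tokens: int = 64) -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def looks_like_chat_completion(body: dict) -> bool:
    """Heuristic check that a response matches the OpenAI chat completion shape."""
    choices = body.get("choices", [])
    return bool(choices) and "message" in choices[0]

# Against a real endpoint (requires AWS credentials and boto3):
# import boto3
# runtime = boto3.client("sagemaker-runtime", region_name="us-west-2")
# resp = runtime.invoke_endpoint(
#     EndpointName="my-llm-endpoint",
#     ContentType="application/json",
#     Body=json.dumps(build_chat_payload("Say hello.")),
# )
# print(looks_like_chat_completion(json.loads(resp["Body"].read())))
```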

## Troubleshooting

### Module Not Found

If you encounter `ModuleNotFoundError: No module named 'boto3'` or similar, install the SageMaker dependencies:

```bash
pip install 'strands-agents[sagemaker]'
```

### Authentication

The SageMaker provider uses standard AWS authentication methods (credentials file, environment variables, IAM roles, or AWS SSO). Ensure your AWS credentials have the necessary SageMaker invoke permissions.
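For example, an IAM policy granting invoke access to a single endpoint might look like the following; the account ID, region, and endpoint name are placeholders, and `sagemaker:InvokeEndpointWithResponseStream` is included on the assumption that you use streaming responses:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sagemaker:InvokeEndpoint",
        "sagemaker:InvokeEndpointWithResponseStream"
      ],
      "Resource": "arn:aws:sagemaker:us-west-2:123456789012:endpoint/my-llm-endpoint"
    }
  ]
}
```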

### Model Compatibility

Ensure your deployed model supports OpenAI-compatible chat completion APIs and verify tool calling capabilities if needed. Refer to the [Model Compatibility](#model-compatibility) section above for detailed requirements and testing recommendations.

## References

- [API Reference](../../../api-reference/models.md)
- [Amazon SageMaker Documentation](https://docs.aws.amazon.com/sagemaker/)
- [SageMaker Runtime API](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_runtime_InvokeEndpoint.html)