Aws Bedrock provider native support #274
Conversation
No proxies needed! refs #86b5n9pdx
Move model pattern matching logic from model_factory to augmented_llm_bedrock for better encapsulation of bedrock logic and separation of concerns. refs #86b5n9pdx
example minimal config file for AWS Bedrock
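A minimal Bedrock section in `fastagent.config.yaml` might look roughly like the sketch below; the key names and model id are illustrative assumptions, not copied from the PR's actual example file:

```yaml
# Illustrative minimal config -- key names and model id are assumptions,
# not the PR's actual example file.
default_model: "bedrock.amazon.nova-lite-v1:0"

bedrock:
  region: "us-east-1"   # AWS region hosting the Bedrock models
  profile: "default"    # optional named AWS credentials profile
```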
Fix issue with message history refs #86b5n9pdx
hi @jdecker76 @jondecker76 -- I haven't looked at this in detail, and I don't have a Bedrock setup. Having said that:
thanks very much for this PR -- it looks great, and I'm keen to get it merged and released (ideally without being driven to tears by AWS setup).
Not all models support structured output. In this first phase of implementation, we will assume no Bedrock models support structured output and emulate it with prompt engineering. In the next phase we will try to figure out which models DO support true structured output, and handle those separately
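The prompt-engineering emulation described above can be sketched roughly as follows; the example schema, prompt wording, and JSON-extraction logic are illustrative assumptions, not the PR's actual implementation:

```python
import json
import re

# Example schema (illustrative; not the exact schema the PR uses)
WEATHER_SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "temperature_c": {"type": "number"},
    },
    "required": ["city", "temperature_c"],
}

def schema_instruction(schema: dict) -> str:
    """Prompt suffix asking the model to reply as JSON matching the schema."""
    return (
        "\n\nRespond ONLY with a JSON object matching this schema, "
        "with no extra commentary:\n" + json.dumps(schema, indent=2)
    )

def parse_structured(text: str) -> dict:
    """Pull the first JSON object out of a (possibly chatty) model reply."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))

# Less capable models often wrap the JSON in prose; strip it and parse.
reply = 'Sure, here you go: {"city": "Denver", "temperature_c": 21.5}'
report = parse_structured(reply)
```

A later phase could route models with native structured output around this path while keeping the prompt-based fallback for the rest.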
…t-agent into aws_bedrock_provider
Sounds great! I have a few more minor changes incoming for this yet; I'll wrap those up and do the tests. Upcoming addition: adding structured output support. Here is the basic idea: I will be in touch soon - thanks
Cool -- let me know if you need any help or have questions at all. The Discord is probably the best place to get hold of me.
- apply structured output similar to Anthropic, using prompt engineering to get the model to output the desired schema - disable the boto3 and botocore logging, as it's enabled by default and flooding the logs
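Quieting the SDK loggers mentioned above is typically a one-time setup step; the exact logger names and level chosen here are an assumption, not the PR's exact code:

```python
import logging

# boto3/botocore emit very chatty logs by default and can flood application
# output; raise their levels so only warnings and errors get through.
for name in ("boto3", "botocore", "urllib3"):
    logging.getLogger(name).setLevel(logging.WARNING)
```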
Adds support for the Titan and Cohere conversation models. Some support system messages and others do not. For those that do not support system messages, inject the system message at the beginning of the first user prompt. Update the fast-agent history - deficiency found from running the smoke_test
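The system-message injection described in that commit can be sketched like this; the message shape (`role`/`content` dicts) and function name are illustrative assumptions:

```python
def inject_system_message(messages: list[dict], system: str) -> list[dict]:
    """For models without native system-message support, prepend the system
    text to the first user message. Message shape is illustrative."""
    out = [dict(m) for m in messages]  # don't mutate the caller's history
    for m in out:
        if m["role"] == "user":
            m["content"] = f"{system}\n\n{m['content']}"
            return out
    # No user message yet: emit the system text as its own user turn.
    return [{"role": "user", "content": system}] + out
```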
Finished testing all of the models for tool use as well as streaming mode with tools
goes along with fast-agent pull request 274: evalstate/fast-agent#274
Pull request for the docs site is done
Will start running the e2e smoke test today and will report back with findings
Add cased-versions of the bedrock type
Due to the number of various models available in Bedrock, some of the less capable models struggle with structured output without additional instruction. Updated the instructions with a more straight-forward schema and additional instruction. With this change, 47 out of 51 tested models are able to repeatedly generate structured output. Unfortunately some of the models just aren't cut out for structured output
Add new tool use logic so that different models can use different tool use schemas. This is mainly to add tool use support to LLAMA models, but may work for Titan models as well
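A per-family dispatch like the one that commit describes could look roughly like this; the family names, strategy labels, and default are illustrative assumptions (the PR's actual mapping lives in augmented_llm_bedrock.py):

```python
# Hypothetical mapping from Bedrock model-id family to a tool-schema strategy.
TOOL_SCHEMA_BY_FAMILY = {
    "anthropic": "anthropic_tools",  # native tool-use content blocks
    "meta": "llama_prompt_tools",    # prompt-embedded tool descriptions
    "amazon": "llama_prompt_tools",  # may work for Titan as well
    "cohere": "converse_tools",
}

def tool_schema_for(model_id: str) -> str:
    """Map e.g. 'anthropic.claude-3-5-sonnet-...' to a tool-schema strategy.

    Bedrock model ids are prefixed with the provider family, so the part
    before the first '.' selects the strategy; unknown families fall back
    to a generic default.
    """
    family = model_id.split(".", 1)[0]
    return TOOL_SCHEMA_BY_FAMILY.get(family, "converse_tools")
```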
- Implements Anthropic tool calling similar to the real Anthropic provider
- Cleaned up tool call responses. Originally, it included the thinking step ("Hey I'm going to call this tool..."), then the response ("Tool responded with: blah blah blah"), then the summary. Now it clears the responses and only returns the summary. The output is much, much cleaner now!
Fold the Anthropic changes into the mix so that all providers now use their associated tool schema type. Testing is finally showing all capable models consistently making tool calls
@evalstate Notes: Failures:
Models Tested: I skipped testing the Amazon Titan models - they are quite bad at anything beyond simple chat interaction, do not support tool calls, and are soon to be deprecated.
============================= warnings summary =============================
tests/e2e/smoke/base/test_e2e_smoke.py: 12 warnings
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
Thanks for the lint fixes - I was just starting that!
No worries -- I saw the list of models at https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html -- do you know if they have a version of that list that includes context window size (and tool calling functionality)? There's not much we can do if a model doesn't support tool calls :) Think this is good to merge then? Can you just run one final test on a big model on your side?
Unfortunately, the AWS documentation is terrible and out of date. I haven't yet found a single bit of documentation that agrees with the others. For example, their pricing page (https://aws.amazon.com/bedrock/pricing/) mentions providers and models not listed anywhere else that I could find (TwelveLabs, Writer, LumaAI, etc.). Also, some models that are available are really meant for fine-tuning, so they don't do well in general use. We exclusively use Bedrock at the company I work for because Bedrock has everything in place to comply with HIPAA regulations. Every Bedrock model that we have used is working great. I think I'll run a test next that just uses one fully capable model from each family, then I'll post the results here.
Yes, just make sure my update to the provider list didn't mess anything up :)
I just did a test with one model from each provider - all is well!
Very nice - finally getting Bedrock into fast-agent! 👍
This is a proper AWS Bedrock provider with full support for all AWS Bedrock models.
features:
I tried using Bedrock with fast-agent via:
- Bedrock Access Gateway: chat worked, but tool support was broken
- LiteLLM proxy: chat worked, but tool support was also broken
- TensorZero gateway: chat worked, but tool support was also broken. Additionally, setting up the TOML file was quite complex - literally every single model AND every single tool call had to be manually set up in the TOML config.
I spent a lot of time trying to get Bedrock models working in fast-agent via these proxies, but the problem was consistent: whether a model was set up as "custom" or "openai", the tool schema was always sent in the OpenAI format, which the models rejected. Since Bedrock aggregates many different models from different vendors, each model must receive tool information in the schema that particular model understands, and neither the proxy layers nor fast-agent could provide the tool data in the correct format. These issues cascaded throughout fast-agent; for example, workflow agents (routers, etc.) would not work due to the broken tool calls.
This pull request fixes all of the above by implementing a native Bedrock provider using the AWS SDK (the boto3 library). It has been thoroughly tested and is stable.
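For context on what a native provider gains over the proxies: boto3's Converse API accepts tool definitions in Bedrock's own `toolConfig`/`toolSpec` shape, so each model family gets tools in a format Bedrock translates for it. The helper below is an illustrative sketch (function name and MCP-style input fields are assumptions), showing how tool definitions might be mapped into that shape:

```python
def to_converse_tool_config(tools: list[dict]) -> dict:
    """Translate MCP-style tool definitions into the Bedrock Converse API's
    toolConfig shape. The input field names ('name', 'description',
    'input_schema') are illustrative assumptions."""
    return {
        "tools": [
            {
                "toolSpec": {
                    "name": t["name"],
                    "description": t.get("description", ""),
                    "inputSchema": {"json": t["input_schema"]},
                }
            }
            for t in tools
        ]
    }

# Usage sketch (requires boto3, AWS credentials, and Bedrock model access):
# import boto3
# client = boto3.client("bedrock-runtime", region_name="us-east-1")
# response = client.converse(
#     modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
#     messages=[{"role": "user", "content": [{"text": "What's the weather?"}]}],
#     toolConfig=to_converse_tool_config(my_tools),
# )
```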