Skip to content

[FEATURE] cache_messages argument for Bedrock model provider #404

@pitta-bread

Description

@pitta-bread

Problem Statement

Just like the existing cache point arguments for system prompts and tools: cache_prompt and cache_tools. Wouldn't it be fairly easy to add in a similar option cache_messages. Looking at the Bedrock model invocation logs, I see this as a big opportunity for significant cost (and latency!) improvements. Also increases the attractiveness of Bedrock as a model provider.

Proposed Solution

When cache_tools="default" is passed to the constructor for BedrockModel the format request method, instead of currently just passing through messages as they are received, iterates through messages to ensure {"cachePoint": {"type": self.config["cache_tools"]}} snippets are added after each assistant message. Similar to the unpacking done for tools.

Use Case

When using Bedrock model provider, saves both time and money ! Even with current usage of both cache_tools and cache_prompt, still 80% of input token usage is not from the cache when it comes to the last few requests of a long-running agent. And the previous messages are mostly the same, just each request 2 messages are being added, with the rest remaining the same and perfect for caching!

Alternatives Solutions

No response

Additional Context

"messages": messages,

https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions