
[Feature]: enable prompt caching by default in model configuration for bedrock claude models #9805

@HaithamMaya

Description

The Feature

Feature Request: Enable Prompt Caching by Default for Specific Models

Description

LiteLLM supports prompt caching for models like Bedrock Claude, but it currently requires users to set cache_control parameters explicitly in their messages. We need the ability to enable prompt caching by default for specific models, particularly Bedrock Claude models, so users don't have to modify their messages on every request.

Current Behavior

  • Users must manually add cache_control parameters to their messages (see the sketch after this list)
  • There is no way to globally enable prompt caching by default for specific models
  • The caching capability exists but is opt-in for every request
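
For reference, this is what the opt-in flow looks like today. A minimal sketch, using the standard cache_control content-block format (the model ID and prompt text are illustrative):

import litellm

# Today: cache_control must be attached to a content block on every request
response = litellm.completion(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "Long, stable system instructions that benefit from caching...",
                    "cache_control": {"type": "ephemeral"},  # opt-in, repeated per request
                }
            ],
        },
        {"role": "user", "content": "Summarize the design doc."},
    ],
)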

Desired Behavior

  • Allow prompt caching to be set as a default for specific models in configuration
  • Automatically inject cache_control parameters for Bedrock Claude models when they are not explicitly specified
  • Provide a simple toggle such as enable_prompt_caching=True in the model configuration

Proposed Implementation

  1. Add a supports_prompt_caching flag in the model configuration
  2. Enhance the message transformation pipeline to automatically add cache_control
  3. Add a preprocessing step in BedrockLLM.completion() that modifies messages for models with default caching enabled (a rough sketch follows)
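
A rough sketch of what step 3 could look like. This is purely illustrative: the helper name inject_default_cache_control and the injection point are hypothetical, not existing LiteLLM code, though litellm.model_cost is the registry that litellm.register_model() updates:

import copy

import litellm

def inject_default_cache_control(model: str, messages: list) -> list:
    # Hypothetical preprocessing helper, not part of LiteLLM today.
    model_info = litellm.model_cost.get(model, {})
    if not model_info.get("default_prompt_caching", False):
        return messages
    # Attach cache_control to the last content block of the final message,
    # leaving any explicitly set cache_control untouched.
    messages = copy.deepcopy(messages)
    content = messages[-1].get("content")
    if isinstance(content, str):
        messages[-1]["content"] = [
            {"type": "text", "text": content, "cache_control": {"type": "ephemeral"}}
        ]
    elif isinstance(content, list) and content:
        content[-1].setdefault("cache_control", {"type": "ephemeral"})
    return messages

BedrockLLM.completion() would call this before the existing message transformation, so requests that already carry cache_control behave exactly as they do now.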

Example Configuration

import litellm

litellm.register_model(
    model_cost={
        "bedrock/anthropic.claude-3-sonnet-20240229-v1:0": {
            "supports_prompt_caching": True,
            "default_prompt_caching": True,  # proposed flag: always apply prompt caching
            "litellm_provider": "bedrock",
            "mode": "chat",
        }
    }
)
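
With that configuration registered, the intended end state is that an ordinary call caches automatically, with no cache_control in the messages (this assumes the proposed default_prompt_caching flag is implemented):

response = litellm.completion(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": "Hello!"}],
)
# The proposed preprocessing step would inject cache_control here automatically.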

Are you an ML Ops Team?

Yes

Twitter / LinkedIn details

http://linkedin.com/in/hmaya
