The Feature
Feature Request: Enable Prompt Caching by Default for Specific Models
Description
LiteLLM supports prompt caching for models like Bedrock Claude, but it currently requires users to set cache_control parameters explicitly in their messages. We need the ability to enable prompt caching by default for specific models, particularly Bedrock Claude models, so users don't have to modify their messages on every request.
Current Behavior
- Users must manually add `cache_control` parameters to their messages (see the sketch after this list)
- No way to globally enable prompt caching by default for specific models
- Caching capability exists but is opt-in for every request
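For context, here is a minimal sketch of what opting in looks like per request today, following LiteLLM's Anthropic-style content blocks; the system prompt text is a placeholder:

```python
import litellm

LONG_SYSTEM_PROMPT = "...large reusable instructions..."  # placeholder

# Today, the caller must attach cache_control to a content block on every call.
response = litellm.completion(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": LONG_SYSTEM_PROMPT,
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        },
        {"role": "user", "content": "Summarize the document."},
    ],
)
```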
Desired Behavior
- Allow setting prompt caching as a default for specific models in configuration
- Automatically inject cache control parameters for Bedrock Claude models when not explicitly specified
- Provide a simple toggle like `enable_prompt_caching=True` in model configuration (illustrated in the sketch below)
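Under the desired behavior, the same request could be written with plain messages. This is a hypothetical sketch of the call once the toggle exists, not current LiteLLM behavior:

```python
import litellm

# Hypothetical: with default prompt caching enabled for this model,
# no cache_control markers are needed in the messages themselves.
response = litellm.completion(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[
        {"role": "system", "content": "...large reusable instructions..."},
        {"role": "user", "content": "Summarize the document."},
    ],
)
```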
Proposed Implementation
- Add a `supports_prompt_caching` flag in model configuration
- Enhance the message transformation pipeline to automatically add cache control
- Add a preprocessing step in `BedrockLLM.completion()` that modifies messages for models with default caching enabled (sketched below)
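A minimal sketch of what that preprocessing step could look like, assuming the flag names from the example configuration below; `inject_default_cache_control` is a hypothetical helper, not an existing LiteLLM function:

```python
def inject_default_cache_control(messages: list, model_info: dict) -> list:
    """Hypothetical helper: mark the first message as cacheable when the
    model is configured with default_prompt_caching and the caller has not
    already set cache_control explicitly."""
    if not model_info.get("default_prompt_caching"):
        return messages
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list) and any("cache_control" in block for block in content):
            return messages  # caller opted in explicitly; leave the request untouched
    if messages:
        first = messages[0]
        content = first.get("content")
        if isinstance(content, str):
            # Promote the plain string to a content block so it can
            # carry the cache_control attribute.
            first["content"] = [
                {"type": "text", "text": content, "cache_control": {"type": "ephemeral"}}
            ]
        elif isinstance(content, list) and content:
            content[0]["cache_control"] = {"type": "ephemeral"}
    return messages
```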
Example Configuration
```python
import litellm

litellm.register_model(
    model_cost={
        "bedrock/anthropic.claude-3-sonnet-20240229-v1:0": {
            "supports_prompt_caching": True,
            "default_prompt_caching": True,  # enforce prompt caching always
            "litellm_provider": "bedrock",
            "mode": "chat",
        }
    }
)
```
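One possible reading of the two flags: `supports_prompt_caching` advertises that the model is capable of caching, while `default_prompt_caching` turns on the automatic injection; keeping them separate would let capability metadata and default behavior be configured independently.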
Are you a ML Ops Team?
Yes