@mfenderov
Contributor

Prompt caching is automatically enabled for models that support it (detected via models.dev) to reduce latency and costs. System prompts, tool definitions, and recent messages are cached with a 5-minute TTL.

To disable:

provider_opts:
  disable_prompt_caching: true
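
The opt-out above combines with the automatic detection: caching is used only when the model supports it and the flag is not set. A minimal sketch of that decision in Go, assuming the provider_opts block is available as a map[string]any (the helper name cachingEnabled and the map representation are hypothetical, not taken from this PR):

```go
package main

// cachingEnabled reports whether prompt caching should be used for a request.
// providerOpts stands in for the parsed provider_opts block; how the real
// code reads that block is an assumption made for this sketch.
func cachingEnabled(providerOpts map[string]any, modelSupportsCaching bool) bool {
	// Detection comes first: a model without caching support never caches.
	if !modelSupportsCaching {
		return false
	}
	// A missing key or a non-bool value is treated as "not disabled",
	// so caching stays on by default.
	if disabled, ok := providerOpts["disable_prompt_caching"].(bool); ok && disabled {
		return false
	}
	return true
}
```

With this shape, omitting provider_opts entirely leaves caching on for supported models, which matches the "automatically enabled" behaviour described above.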

P.S. Benchmarked with examples/pr-reviewer-bedrock.yaml: 92% cache read vs 8% cache write.

Assisted-By: cagent

@mfenderov mfenderov requested a review from a team as a code owner January 13, 2026 11:48
@krissetto
Contributor

This generally looks good to me, thanks for the contribution! ❤️

@dgageot since you've been deep diving on Anthropic caching lately, you might be interested in taking a look here, and maybe comparing Bedrock against some of the tests you've been doing with Anthropic's API.

@krissetto
Contributor

Seems to work fine, I'm merging this

Comment on lines +136 to +152
func detectCachingSupport(ctx context.Context, model string) bool {
	store, err := modelsdev.NewStore()
	if err != nil {
		slog.Debug("Bedrock models store unavailable, prompt caching disabled", "error", err)
		return false
	}

	modelID := "amazon-bedrock/" + model
	m, err := store.GetModel(ctx, modelID)
	if err != nil {
		slog.Debug("Bedrock prompt caching disabled: model not found in models.dev",
			"model_id", modelID, "error", err)
		return false
	}

	return m.Cost != nil && (m.Cost.CacheRead > 0 || m.Cost.CacheWrite > 0)
}
Contributor

This is something we can put in the modelsdev store, as it could be useful for all providers.

@krissetto krissetto merged commit fc56a07 into docker:main Jan 19, 2026
5 checks passed