feat(go/anthropic): add prompt caching support via message metadata #5103
mikewiacek wants to merge 2 commits into
Enable Anthropic's prompt caching feature for system messages. When a
system message has Metadata["cache"] = map[string]any{"type": "ephemeral"},
the corresponding TextBlockParam gets CacheControl set to ephemeral.
This follows the same pattern as the googlegenai plugin's cache support
(which uses Metadata["cache"]["ttlSeconds"]).
Also tracks CacheCreationInputTokens in Usage.Custom for cost tracking.
Changes:
- toAnthropicRequest: check system message metadata for cache config
- toGenkitResponse: add cache creation tokens to Custom metrics
Code Review
This pull request implements prompt caching for Anthropic system messages and adds tracking for cache creation tokens within the model response usage metrics. The review feedback recommends simplifying the metadata parsing logic to improve readability and support generic TTL fields, as well as ensuring the custom metrics map is updated safely without overwriting existing entries.
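The point about updating the custom metrics map without overwriting existing entries can be illustrated with a minimal sketch. The `usage` struct and the `cacheCreationInputTokens` key below are stand-ins invented for illustration, not the plugin's actual types or key names; the idea is only to allocate the map when it is nil and otherwise add to it in place.

```go
package main

import "fmt"

// usage is a hypothetical stand-in for the response usage struct;
// only the Custom map matters for this sketch.
type usage struct {
	Custom map[string]float64
}

// addCustomMetric sets key on u.Custom. The map is allocated only when it is
// nil, so entries written earlier are preserved rather than replaced.
func addCustomMetric(u *usage, key string, value float64) {
	if u.Custom == nil {
		u.Custom = make(map[string]float64)
	}
	u.Custom[key] = value
}

func main() {
	u := &usage{Custom: map[string]float64{"existingMetric": 1}}
	// "cacheCreationInputTokens" is an assumed key name for illustration.
	addCustomMetric(u, "cacheCreationInputTokens", 2048)
	fmt.Println(len(u.Custom), u.Custom["existingMetric"])
}
```

Assigning a fresh map literal to `Custom` instead would silently drop any metric recorded before this point, which is the failure mode the review feedback warns about.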
/gemini review
Thanks for the contrib @mikewiacek - we'll take a peek.
Warning: Gemini encountered an error creating the review. You can try again by commenting /gemini review.
/gemini review
Code Review
This pull request introduces prompt caching for Anthropic system messages by detecting cache metadata and tracks cache creation tokens in custom usage metrics. The review feedback recommends adding support for the 'name' field in the cache metadata to improve consistency across different providers and support the 'ai.WithCacheName' helper.
if cache, ok := message.Metadata["cache"].(map[string]any); ok {
	t, _ := cache["type"].(string)
	_, hasTTL := cache["ttlSeconds"]
	if t == "ephemeral" || hasTTL {
		block.CacheControl = anthropic.CacheControlEphemeralParam{Type: "ephemeral"}
	}
}
To improve cross-provider consistency, consider supporting the name field in the cache metadata. This allows users to enable caching using the ai.WithCacheName helper (commonly used with other providers like Gemini), even though Anthropic will treat it as an ephemeral cache. This aligns with the goal of providing a consistent API for Go developers as mentioned in the PR description.
Suggested change:
- if cache, ok := message.Metadata["cache"].(map[string]any); ok {
- 	t, _ := cache["type"].(string)
- 	_, hasTTL := cache["ttlSeconds"]
- 	if t == "ephemeral" || hasTTL {
- 		block.CacheControl = anthropic.CacheControlEphemeralParam{Type: "ephemeral"}
- 	}
- }
+ if cache, ok := message.Metadata["cache"].(map[string]any); ok {
+ 	t, _ := cache["type"].(string)
+ 	_, hasTTL := cache["ttlSeconds"]
+ 	_, hasName := cache["name"]
+ 	if t == "ephemeral" || hasTTL || hasName {
+ 		block.CacheControl = anthropic.CacheControlEphemeralParam{Type: "ephemeral"}
+ 	}
+ }
@MichaelDoyle happy to accept gemini's suggestion, but it does feel misleading to users where we accept a functional option that's ultimately ignored.
@mikewiacek No worries. Gemini is not always right - if you want, you can hang tight until we get a chance to (human) review the overall PR - we're a bit short staffed on Go, but we'll take a look soon.
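To make the trade-off in this thread concrete, here is a small sketch that factors the metadata check into a standalone function. The function name `wantsEphemeralCache` and the `acceptName` flag are hypothetical and not part of the PR; they exist only to show side by side how the two variants treat metadata that carries nothing but a `name`.

```go
package main

import "fmt"

// wantsEphemeralCache reports whether message metadata requests caching.
// acceptName toggles the reviewer-suggested behavior of treating a "name"
// key as a request for an ephemeral cache (the name itself goes unused).
func wantsEphemeralCache(metadata map[string]any, acceptName bool) bool {
	cache, ok := metadata["cache"].(map[string]any)
	if !ok {
		return false
	}
	t, _ := cache["type"].(string)
	_, hasTTL := cache["ttlSeconds"]
	_, hasName := cache["name"]
	return t == "ephemeral" || hasTTL || (acceptName && hasName)
}

func main() {
	named := map[string]any{"cache": map[string]any{"name": "my-cache"}}
	ephemeral := map[string]any{"cache": map[string]any{"type": "ephemeral"}}

	fmt.Println(wantsEphemeralCache(ephemeral, false)) // original PR behavior
	fmt.Println(wantsEphemeralCache(named, false))     // name alone, original PR behavior
	fmt.Println(wantsEphemeralCache(named, true))      // name alone, suggested behavior
}
```

With `acceptName` set, a name-only request enables caching even though the name has no effect on Anthropic's side, which is the behavior the author describes as potentially misleading.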
Enable Anthropic's prompt caching feature for system messages in the Go plugin. When a system message has Metadata["cache"] = map[string]any{"type": "ephemeral"}, the corresponding TextBlockParam gets CacheControl set to ephemeral.
This follows the same pattern as the googlegenai plugin's cache support (which uses Metadata["cache"]["ttlSeconds"]), giving Go developers a consistent API for enabling caching across providers.
Also tracks CacheCreationInputTokens in Usage.Custom for cost tracking (cache reads were already tracked via CachedContentTokens).
Fixes: #817
Changes:
- toAnthropicRequest: check system message Metadata for cache config, set CacheControl on matching TextBlockParam
- toGenkitResponse: add CacheCreationInputTokens to Usage.Custom metrics
Usage example:
Anthropic's prompt caching reduces input token costs by up to 90% for repeated system prompts (cache reads are $0.30/MTok vs $3/MTok for Sonnet). Anthropic's minimum cacheable prompt length is also far below Gemini's 32K: its documentation lists 1,024 tokens for Sonnet models, and shorter prompts are processed without caching.
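The "up to 90%" figure follows directly from the two prices quoted above. As a rough worked example, the sketch below compares the cost of sending a system prompt at the base input price with the cost of reading it from cache. The 10,000-token prompt size is a made-up value, and the calculation deliberately ignores the cost of the initial cache write, which is billed separately (and is what CacheCreationInputTokens is meant to help track).

```go
package main

import "fmt"

const (
	baseInputPerMTok = 3.00 // USD per million input tokens (Sonnet figure quoted in the PR description)
	cacheReadPerMTok = 0.30 // USD per million cached tokens read (same source)
)

// inputCost returns the USD cost of promptTokens at the given per-MTok price.
func inputCost(promptTokens int, pricePerMTok float64) float64 {
	return float64(promptTokens) / 1_000_000 * pricePerMTok
}

func main() {
	const promptTokens = 10_000 // hypothetical system prompt size

	uncached := inputCost(promptTokens, baseInputPerMTok)
	cached := inputCost(promptTokens, cacheReadPerMTok)
	saving := (1 - cached/uncached) * 100

	fmt.Printf("uncached: $%.4f, cache read: $%.4f, saving: %.0f%%\n",
		uncached, cached, saving)
}
```

Because the saving is a ratio of the two prices, it does not depend on the prompt size; the actual benefit over a session depends on how many requests hit the cache after the first write.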
Checklist: