
feat(go/anthropic): add prompt caching support via message metadata #5103

Open
mikewiacek wants to merge 2 commits into genkit-ai:main from mikewiacek:anthropic-cache-control

Conversation

@mikewiacek

Enable Anthropic's prompt caching feature for system messages in the Go plugin. When a system message has Metadata["cache"] = map[string]any{"type": "ephemeral"}, the corresponding TextBlockParam gets CacheControl set to ephemeral.

This follows the same pattern as the googlegenai plugin's cache support (which uses Metadata["cache"]["ttlSeconds"]), giving Go developers a consistent API for enabling caching across providers.

Also tracks CacheCreationInputTokens in Usage.Custom for cost tracking (cache reads were already tracked via CachedContentTokens).

Fixes: #817

Changes:

  • toAnthropicRequest: check system message Metadata for cache config, set CacheControl on matching TextBlockParam
  • toGenkitResponse: add CacheCreationInputTokens to Usage.Custom metrics
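The request-side check described above can be sketched in isolation. This is a minimal illustration, not the plugin's code: `textBlock` and `systemBlock` are hypothetical stand-ins for the SDK's `TextBlockParam` and the logic inside `toAnthropicRequest`. It also shows one consequence of the type assertion: the nested cache value has to be a `map[string]any`, so a `map[string]string` is silently ignored.

```go
package main

import "fmt"

// textBlock is a hypothetical stand-in for the SDK's TextBlockParam.
type textBlock struct {
	Text         string
	CacheControl string // "" means no cache control; "ephemeral" enables caching
}

// systemBlock builds a block for a system prompt and marks it ephemeral only
// when metadata["cache"] is a map[string]any whose "type" is "ephemeral".
func systemBlock(text string, metadata map[string]any) textBlock {
	block := textBlock{Text: text}
	// Indexing a nil map is safe in Go; the type assertion simply fails.
	if cache, ok := metadata["cache"].(map[string]any); ok {
		if t, _ := cache["type"].(string); t == "ephemeral" {
			block.CacheControl = "ephemeral"
		}
	}
	return block
}

func main() {
	on := systemBlock("You are helpful.", map[string]any{
		"cache": map[string]any{"type": "ephemeral"},
	})
	off := systemBlock("You are helpful.", nil)
	// A map[string]string does not satisfy the map[string]any assertion.
	wrongType := systemBlock("You are helpful.", map[string]any{
		"cache": map[string]string{"type": "ephemeral"},
	})
	fmt.Printf("%q %q %q\n", on.CacheControl, off.CacheControl, wrongType.CacheControl)
}
```

Only the first call ends up with cache control set; the other two fall through without an error, which is worth keeping in mind when building the metadata by hand.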

Usage example:

resp, err := genkit.Generate(ctx, g,
    ai.WithMessages(&ai.Message{
        Role:    ai.RoleSystem,
        Content: []*ai.Part{ai.NewTextPart(systemPrompt)},
        Metadata: map[string]any{
            "cache": map[string]any{"type": "ephemeral"},
        },
    }),
    ai.WithPrompt(userPrompt),
    ai.WithModelName("anthropic/claude-sonnet-4-5-20250929"),
)
// resp.Usage.CachedContentTokens > 0 on cache hits
// resp.Usage.Custom["cacheCreationInputTokens"] > 0 on first call (cache write)

Anthropic's prompt caching reduces input token costs by up to 90% for repeated system prompts (for Sonnet, cache reads are $0.30/MTok versus $3/MTok for uncached input). Anthropic's minimum cacheable prompt length is also far lower than Gemini's 32K minimum: 1,024 tokens for Sonnet models, with shorter prompts processed without caching.
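To make the cost claim concrete, here is a rough worked example. The $3/MTok and $0.30/MTok figures come from the description above; the $3.75/MTok cache write price (1.25x base input) is an assumption that is not stated in this PR, and the sketch further assumes every call after the first hits the cache. The helper names are illustrative only.

```go
package main

import "fmt"

const (
	baseInputPerMTok  = 3.00 // USD per million uncached input tokens (from the PR text)
	cacheReadPerMTok  = 0.30 // USD per million cached input tokens (from the PR text)
	cacheWritePerMTok = 3.75 // ASSUMPTION: cache writes billed at 1.25x base input
)

// cost returns the USD cost of the given number of tokens at a per-MTok price.
func cost(tokens int, perMTok float64) float64 {
	return float64(tokens) * perMTok / 1e6
}

// uncachedCost is the system prompt cost when every call pays full price.
func uncachedCost(promptTokens, calls int) float64 {
	return float64(calls) * cost(promptTokens, baseInputPerMTok)
}

// cachedCost assumes one cache write followed by cache reads on every later call.
func cachedCost(promptTokens, calls int) float64 {
	if calls <= 0 {
		return 0
	}
	return cost(promptTokens, cacheWritePerMTok) +
		float64(calls-1)*cost(promptTokens, cacheReadPerMTok)
}

func main() {
	const promptTokens, calls = 10_000, 100
	plain := uncachedCost(promptTokens, calls)
	cached := cachedCost(promptTokens, calls)
	fmt.Printf("uncached: $%.4f\n", plain)
	fmt.Printf("cached:   $%.4f\n", cached)
	fmt.Printf("saving:   %.2f%%\n", (1-cached/plain)*100)
}
```

Under these assumptions, 100 calls with a 10,000-token system prompt come to $3.00 uncached and about $0.33 cached. The saving lands a little under the 90% ceiling because of the one-time cache write.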

Checklist:

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request implements prompt caching for Anthropic system messages and adds tracking for cache creation tokens within the model response usage metrics. The review feedback recommends simplifying the metadata parsing logic to improve readability and support generic TTL fields, as well as ensuring the custom metrics map is updated safely without overwriting existing entries.
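The second recommendation, updating the custom metrics map without clobbering what is already there, could look roughly like the sketch below. `usage` is a hypothetical stand-in for the response usage struct, and `setCustomMetric` is an illustrative helper, not a function from the plugin.

```go
package main

import "fmt"

// usage is a hypothetical stand-in for the response usage struct;
// only the Custom map matters for this sketch.
type usage struct {
	Custom map[string]float64
}

// setCustomMetric adds one entry to u.Custom. It allocates the map only when
// it is nil, so entries written earlier by other code are preserved.
func setCustomMetric(u *usage, key string, value float64) {
	if u.Custom == nil {
		u.Custom = make(map[string]float64)
	}
	u.Custom[key] = value
}

func main() {
	// Case 1: the map was never initialized.
	fresh := &usage{}
	setCustomMetric(fresh, "cacheCreationInputTokens", 2048)

	// Case 2: another metric is already present and must survive.
	existing := &usage{Custom: map[string]float64{"existing": 1}}
	setCustomMetric(existing, "cacheCreationInputTokens", 2048)

	fmt.Println(len(fresh.Custom), len(existing.Custom))
}
```

The alternative of assigning a fresh map literal on every response would drop any metric that was set before the cache accounting ran, which is the overwrite the review warns about.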

Comment thread go/plugins/internal/anthropic/anthropic.go Outdated
Comment thread go/plugins/internal/anthropic/anthropic.go Outdated
@MichaelDoyle
Contributor

/gemini review

@MichaelDoyle MichaelDoyle requested a review from apascal07 May 6, 2026 15:45
@MichaelDoyle
Contributor

Thanks for the contrib @mikewiacek - we'll take a peek.

@gemini-code-assist
Contributor

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

@MichaelDoyle
Contributor

/gemini review

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces prompt caching for Anthropic system messages by detecting cache metadata and tracks cache creation tokens in custom usage metrics. The review feedback recommends adding support for the 'name' field in the cache metadata to improve consistency across different providers and support the 'ai.WithCacheName' helper.

Comment on lines +232 to +238
if cache, ok := message.Metadata["cache"].(map[string]any); ok {
    t, _ := cache["type"].(string)
    _, hasTTL := cache["ttlSeconds"]
    if t == "ephemeral" || hasTTL {
        block.CacheControl = anthropic.CacheControlEphemeralParam{Type: "ephemeral"}
    }
}
Contributor


medium

To improve cross-provider consistency, consider supporting the name field in the cache metadata. This allows users to enable caching using the ai.WithCacheName helper (commonly used with other providers like Gemini), even though Anthropic will treat it as an ephemeral cache. This aligns with the goal of providing a consistent API for Go developers as mentioned in the PR description.

Suggested change
-if cache, ok := message.Metadata["cache"].(map[string]any); ok {
-    t, _ := cache["type"].(string)
-    _, hasTTL := cache["ttlSeconds"]
-    if t == "ephemeral" || hasTTL {
-        block.CacheControl = anthropic.CacheControlEphemeralParam{Type: "ephemeral"}
-    }
-}
+if cache, ok := message.Metadata["cache"].(map[string]any); ok {
+    t, _ := cache["type"].(string)
+    _, hasTTL := cache["ttlSeconds"]
+    _, hasName := cache["name"]
+    if t == "ephemeral" || hasTTL || hasName {
+        block.CacheControl = anthropic.CacheControlEphemeralParam{Type: "ephemeral"}
+    }
+}
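For comparing the two variants side by side, the condition can be pulled out into a small predicate. This is only a sketch: `wantsCache` and its `acceptName` flag are hypothetical and exist here purely to show which metadata shapes each version of the check would treat as a caching request.

```go
package main

import "fmt"

// wantsCache reports whether message metadata asks for caching.
// With acceptName=false it mirrors the current check ("type" or "ttlSeconds");
// with acceptName=true it mirrors the suggested change, which also honors "name".
func wantsCache(metadata map[string]any, acceptName bool) bool {
	cache, ok := metadata["cache"].(map[string]any)
	if !ok {
		return false
	}
	t, _ := cache["type"].(string)
	_, hasTTL := cache["ttlSeconds"]
	_, hasName := cache["name"]
	return t == "ephemeral" || hasTTL || (acceptName && hasName)
}

func main() {
	byType := map[string]any{"cache": map[string]any{"type": "ephemeral"}}
	byTTL := map[string]any{"cache": map[string]any{"ttlSeconds": 300}}
	byName := map[string]any{"cache": map[string]any{"name": "my-cache"}}

	for _, acceptName := range []bool{false, true} {
		fmt.Println(acceptName,
			wantsCache(byType, acceptName),
			wantsCache(byTTL, acceptName),
			wantsCache(byName, acceptName))
	}
}
```

The two variants differ only for metadata that carries a `name` and nothing else. In both, the values of `ttlSeconds` and `name` are never read; only their presence matters, since the resulting block is always plain ephemeral.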

Author


@MichaelDoyle happy to accept gemini's suggestion, but it does feel misleading to users where we accept a functional option that's ultimately ignored.

Contributor


@mikewiacek No worries. Gemini is not always right - if you want, you can hang tight until we get a chance to (human) review the overall PR - we're a bit short staffed on Go, but we'll take a look soon.
