🤖 fix: avoid extra Anthropic cache breakpoints with explicit TTL#3112
Merged
🤖 fix: avoid extra Anthropic cache breakpoints with explicit TTL#3112
Conversation
Stop emitting top-level Anthropic cacheControl from buildProviderOptions, add regression coverage for final cache-breakpoint counts, and verify that explicit TTLs still propagate through Mux's manual cache-marker path. --- _Generated with `mux` • Model: `openai:gpt-5.4` • Thinking: `xhigh` • Cost: `$12.62`_ <!-- mux-attribution: model=openai:gpt-5.4 thinking=xhigh costs=12.62 -->
Member
Author
|
@codex review |
|
Codex Review: Didn't find any major issues. You're on a roll. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR fixes a direct Anthropic regression where explicitly setting
anthropic.cacheTtlcaused Mux to emit one extra cache-control breakpoint, pushing tool-enabled requests over Anthropic's four-breakpoint limit.Background
Mux already applies Anthropic prompt caching through manual cache markers on the cached system prompt, conversation tail, and last tool. When
buildProviderOptions()also emitted top-levelanthropic.cacheControl, the Anthropic SDK serialized an additional top-levelcache_controlblock on direct requests. That produced the user-visible failure:A maximum of 4 blocks with cache_control may be provided. Found 5.Implementation
The fix stops emitting top-level Anthropic
cacheControlfrombuildProviderOptions()while preserving the existing manual cache-marker flow. To guard against future regressions, the PR also adds a helper that counts Anthropic cache breakpoints in shaped request payloads and tests that pin the intended breakpoint budget. A targeted StreamManager regression test verifies that explicit1hTTL values still propagate through the manual cache path even without the top-level provider option.Validation
bun test src/common/utils/ai/providerOptions.test.ts src/node/services/providerModelFactory.test.ts src/common/utils/ai/cacheStrategy.test.ts src/node/services/streamManager.test.tsnix shell nixpkgs#hadolint -c make static-checkmake dev-server-sandboxinstance using env-backed direct Anthropic credentials:1 hourFound 5Anthropic errorRisks
The main regression risk is Anthropic request shaping across direct and routed paths. This change is intentionally narrow: it removes the redundant top-level direct-provider cache marker while keeping the existing manual cache markers intact, and adds tests at both the provider-options layer and the final shaped-request layer.
Pains
make static-checkrequireshadolint, which was not installed in the workspace environment. I ran it throughnix shell nixpkgs#hadolint -c make static-checkso the full required local validation still passed.📋 Implementation Plan
Fix plan: direct Anthropic cache-marker duplication when explicit cache TTL is set
Recommendation
Recommended approach: keep Mux's existing 3 manual Anthropic cache breakpoints, and stop emitting the extra top-level Anthropic
cacheControlfield frombuildProviderOptions().anthropic.cacheTtlis explicitly set.src/common/utils/ai/cacheStrategy.ts:messagePipeline.ts,streamManager.ts, andproviderModelFactory.tsunless follow-up cleanup is still desired after the regression is fixed.Evidence supporting the root-cause diagnosis
src/common/utils/ai/cacheStrategy.tscreateCachedSystemMessage()applyCacheControl()applyCacheControlToTools()src/node/services/messagePipeline.tsappliesapplyCacheControl()after message transforms.src/node/services/streamManager.tsprepends the cached system message and marks the last tool.src/common/utils/ai/cacheStrategy.tsexplicitly documents Anthropic's 4-breakpoint limit and says the intended design is to use 3 total.src/common/utils/ai/providerOptions.tsis the one place that adds an extra top-level AnthropiccacheControlfield, and it does so only whenmuxProviderOptions.anthropic.cacheTtlis explicitly set.src/node/services/aiService.tsalready passes the explicit TTL separately into both:prepareMessagesForProvider(...)(anthropicCacheTtlargument)streamManager.startStream(...)(anthropicCacheTtlOverrideargument)providerOptions.anthropic.cacheControl.Alternate approach (not recommended for the first fix)
Centralize all Anthropic cache injection in
src/node/services/providerModelFactory.tsand remove the higher-level cache-marker transforms.Implementation plan
Phase 1 — Remove the redundant top-level Anthropic cache-control path
Files/symbols
src/common/utils/ai/providerOptions.tssrc/common/utils/ai/providerOptions.test.tsChanges
buildProviderOptions()so Anthropic models do not emit top-levelanthropic.cacheControl, even whenmuxProviderOptions.anthropic.cacheTtlis set to"5m"or"1h".thinkingeffortdisableParallelToolUsesendReasoningQuality gate after Phase 1
src/common/utils/ai/providerOptions.test.tsto assert that explicit Anthropic TTL no longer appears in top-level provider options.Phase 2 — Add a narrow regression guard at the wire-shaping layer
Files/symbols
src/node/services/providerModelFactory.tssrc/node/services/providerModelFactory.test.tsChanges
wrapFetchWithAnthropicCacheControl()that can count Anthropic cache breakpoints in the final request body.providerOptions.anthropic.cacheControlmessage markers if presentQuality gate after Phase 2
src/node/services/providerModelFactory.test.tsthat builds a representative Anthropic request shape with:cacheTtl: "1h"Phase 3 — Verify TTL still propagates through the manual cache-marker path
Files/symbols
src/node/services/aiService.tssrc/node/services/messagePipeline.tssrc/node/services/streamManager.tssrc/common/utils/ai/cacheStrategy.tssrc/common/utils/ai/cacheStrategy.test.tssrc/node/services/streamManager.test.tsorsrc/node/services/aiService.test.ts(only if a small targeted regression test is needed)Changes
cacheControlis removed.aiService/streamManagertests ifproviderOptions+providerModelFactorytests are not enough to pin the behavior down.cacheStrategy.ts; do not refactor that layer yet.Quality gate after Phase 3
"1h") are preserved on the manual pathAcceptance criteria
anthropic.cacheTtl: "1h"no longer exceed Anthropic's 4-breakpoint limit.providerOptions.anthropic.cacheControlmust not silently disable 1-hour prompt caching.Validation plan
bun test src/common/utils/ai/providerOptions.test.tsbun test src/node/services/providerModelFactory.test.tsbun test src/common/utils/ai/cacheStrategy.test.tsaiService.test.tsorstreamManager.test.tsmake typecheckmake lintif the touched files introduce new lint exposurecacheTtl: "1h"and at least one tool-enabled request.Dogfooding plan
Goal: reproduce the original failure mode on the app path the user actually hit, then verify the fix with evidence a reviewer can inspect.
Setup
anthropic.cacheTtl: "1h".Repro / verification flow
cacheTtlis"1h".Evidence to capture
1hTTLSuggested tooling for verification
Risks / non-goals
messagePipeline,streamManager, andproviderModelFactory.cacheControlmay not be sufficient by itself.anthropic.cacheControlas the source of truth for TTL propagation.Generated with
mux• Model:openai:gpt-5.4• Thinking:xhigh• Cost:$12.62