Skip to content

v3.6.1 — thinking-display extension for Opus 4.7 non-interactive surfaces

Choose a tag to compare

@vsits-proxy-builder vsits-proxy-builder released this 17 May 19:31
· 70 commits to main since this release
9fe1148

Patch release shipping a new proxy extension that restores Opus 4.7 thinking summaries in non-interactive CC surfaces — VS Code chat panel, Antigravity panel, SDK, claude --print, anything spawned with --input-format stream-json.

What's new

thinking-display extension

On Opus 4.7, Anthropic flipped the thinking.display API default to "omitted". Claude Code's CLI propagates display: "summarized" only when the session is interactive (via !getIsNonInteractiveSession()) — so every non-interactive subprocess sends a thinking-enabled request without display, and the API returns thinking blocks whose thinking field is empty plus a multi-KB signature. The UI shows a static "Thinking" stub but no reasoning content.

This extension injects thinking.display = "summarized" at the proxy boundary when:

  • Model matches /^claude-opus-4-7/
  • thinking.type is "enabled" or "adaptive"
  • display is unset (user opt-out always preserved)

Default-on on Opus 4.7 after the cache-prefix test measured 0% absolute drop in steady-state cache_read ratio with injection active (5 sequential claude -p calls per window, baseline vs injected, both windows at 1.000 ratio from call 2 onward).

Override via env var:

export CACHE_FIX_THINKING_DISPLAY=summarized  # default (restores summaries)
export CACHE_FIX_THINKING_DISPLAY=omitted     # force-suppress thinking blocks
export CACHE_FIX_THINKING_DISPLAY=disabled    # extension no-op

Upstream root-cause analysis and patch proposal: anthropics/claude-code#59844. Credit to @ojura for the CLI-binary decode and the two-stacked-special-cases framing — this extension is the proxy-side complement.

docs/parallel-proxy-test-harness.md

Developer test harness pattern for end-to-end extension testing. Spin up a parallel proxy on a different port from the feature branch, route claude -p traffic through it via ANTHROPIC_BASE_URL (bypassing the local wrapper that hardcodes :9801), capture real request bodies via a diagnostic extension, run baseline-vs-injected comparisons against live Anthropic API. The harness surfaced the spec/reality mismatch on this very feature (CC v2.1.131 ships thinking.type: "adaptive", not "enabled" as the upstream issue described) that no unit test would have caught.

Tests

793 → 824 (+31): full coverage of the new extension's resolveMode, MODEL_REGEX, shouldInject, and onRequest paths, including the pinned regression test that the extension never overwrites a user's explicit display opt-out.

Install / upgrade

npm install -g claude-code-cache-fix@3.6.1
# or
npm update -g claude-code-cache-fix

If you're running cache-fix-proxy as a service, restart it to pick up the new extension:

systemctl --user restart cache-fix-proxy

Full changelog

See CHANGELOG.md.