feat: cost-statusline showing routing + cumulative cost in Claude Code #23
Open
BenSheridanEdwards wants to merge 10 commits into
launch_claude now starts proxy/start-proxy.js and points ANTHROPIC_BASE_URL at the local proxy. Previously only --remote did this. Without it, plain deepclaude pointed Claude Code straight at the backend URL, bypassing the proxy entirely, which meant /_proxy/cost always reported zero and no proxy-side feature could fire.

start_proxy is a shared helper that sets PROXY_PID/PROXY_PORT/PROXY_LOG as script globals. It must be called WITHOUT command substitution: command substitution runs in a subshell, and the EXIT trap depends on PROXY_PID reaching the parent shell.

SCRIPT_DIR is symlink-resolved so deepclaude works when installed via a ~/.local/bin symlink. The exec on `claude` is dropped so the EXIT trap fires and the node child is cleaned up. ANTHROPIC_AUTH_TOKEN is left untouched: whatever the user has in their environment flows through.

start-proxy.js legacy mode accepts an optional [defaultMode] third arg so state.mode resolves to e.g. `deepseek` rather than `_single` and MODEL_REMAP[state.mode] fires.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
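The `defaultMode` point is easier to see with a small sketch. This is an illustration only, not the code in start-proxy.js: the table contents, the helper names `resolveMode` and `remapModel`, and the shape of the positional arguments are assumptions. The single remap pair is taken from the status line examples in this PR.

```javascript
// Hypothetical remap table keyed by mode. Only this one pair appears in the PR text.
const MODEL_REMAP = {
  deepseek: {
    "claude-opus-4-7": "deepseek-v4-pro",
  },
};

// Assumption: legacy mode receives positional args, and the optional third one
// names the mode. Without it, the mode falls back to "_single".
function resolveMode(positionalArgs) {
  const defaultMode = positionalArgs[2];
  return defaultMode || "_single";
}

// Returns the wire-side model, or the client model unchanged when the
// current mode has no remap entry for it.
function remapModel(mode, clientModel) {
  const table = MODEL_REMAP[mode] || {};
  return table[clientModel] || clientModel;
}

// "_single" has no table, so nothing is remapped.
console.log(remapModel(resolveMode(["<arg1>", "<arg2>"]), "claude-opus-4-7"));
// With the third argument present, the deepseek table applies.
console.log(remapModel(resolveMode(["<arg1>", "<arg2>", "deepseek"]), "claude-opus-4-7"));
```

Under these assumptions, the first call leaves the model name alone and the second maps it to the DeepSeek name, which is the behaviour the commit message describes.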
A Claude Code statusLine integration that surfaces the actual backend
routing and accumulated cost in the bottom bar. It closes the loop on
the TUI welcome chip lying about the model under --auto, and gives
default mode a live token/cost readout that previously only showed via
`curl /_proxy/cost`.
Output looks like:
[claude-opus-4-7 → deepseek-v4-pro on api.deepseek.com] · 12.3K tokens · $0.04
When the env var name and the wire-side name match (default mode, no
--auto), the arrow is dropped:
[deepseek-v4-pro on api.deepseek.com] · 12.3K tokens · $0.04
Components:
- proxy/model-proxy.js tracks state.lastRequest = { client_model,
wire_model, destination, timestamp } after each /v1/messages remap
and exposes it via /_proxy/status alongside backend_host. The status
line script polls this once per render.
- bin/deepclaude-statusline reads Claude Code's status JSON from stdin,
curls the proxy for status + cost, formats the line, prints. Graceful
fallback when the proxy isn't reachable. Requires jq.
- `deepclaude --install-statusline` merges the statusLine entry into
~/.claude/settings.json idempotently (uses jq's '. + {}' so existing
keys like permissions or hooks are preserved). Documented in --help.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
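The formatting rule described above (arrow only when the two model names differ) can be sketched as follows. The real bin/deepclaude-statusline is a shell script that uses jq; this JavaScript version only illustrates the logic, and the function and parameter names are assumptions.

```javascript
// Abbreviate a token count: 12300 -> "12.3K", 950 -> "950".
function formatTokens(n) {
  return n >= 1000 ? (n / 1000).toFixed(1) + "K" : String(n);
}

// Build the status line. The arrow is shown only when the client-side
// model name differs from the wire-side one.
function formatStatusLine({ clientModel, wireModel, host, tokens, cost }) {
  const route =
    clientModel && clientModel !== wireModel
      ? `${clientModel} → ${wireModel}`
      : wireModel;
  return `[${route} on ${host}] · ${formatTokens(tokens)} tokens · $${cost.toFixed(2)}`;
}

console.log(
  formatStatusLine({
    clientModel: "claude-opus-4-7",
    wireModel: "deepseek-v4-pro",
    host: "api.deepseek.com",
    tokens: 12300,
    cost: 0.04,
  })
);
```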
Previous design required `deepclaude --install-statusline` before the status line would appear. Forgetting that step (or not knowing about it) left the bottom of the TUI empty, failing the cost-savings premise on its main UX surface.

Now `launch_claude` and `launch_remote` call ensure_statusline_installed synchronously before `claude` starts. Behaviour:
- If ~/.claude/settings.json has no statusLine: idempotently merge ours in (preserves all other keys), print a one-line install notice.
- If statusLine is already configured (ours or someone else's): no-op.
- If jq is not on PATH: silent skip. deepclaude still launches, just without the status line.

Removes the explicit `--install-statusline` flag/action. There's no "opt-out": if the user wants their own statusLine, they configure it themselves and ours respects their choice.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
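The install behaviour above comes down to a guarded merge. The actual helper is a shell function that uses jq; the sketch below restates the same decision in JavaScript for illustration. The name `ensureStatusLine`, the shape of the entry, and the command path are assumptions.

```javascript
// Hypothetical statusLine entry; the command path is a placeholder.
const OUR_STATUS_LINE = {
  type: "command",
  command: "/path/to/bin/deepclaude-statusline",
};

// Returns the (possibly updated) settings plus a flag saying whether
// anything was installed. An existing statusLine, ours or not, is left alone.
function ensureStatusLine(settings) {
  if (settings.statusLine !== undefined) {
    return { settings, installed: false };
  }
  // Spread keeps every other key (permissions, hooks, ...) intact.
  return {
    settings: { ...settings, statusLine: OUR_STATUS_LINE },
    installed: true,
  };
}

const first = ensureStatusLine({ permissions: { allow: [] } });
const second = ensureStatusLine(first.settings);
console.log(first.installed, second.installed);
```

Running the function twice shows the idempotence: the second call finds the entry already present and changes nothing.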
…usline

The previous status line collapsed all tokens into one number and showed only the actual cost, burying the headline value of deepclaude (cost saved versus running through Anthropic directly).

New format:

[claude-opus-4-7 → deepseek-v4-pro on api.deepseek.com] · ↑5.2K ↓1.1K · $0.04 (saved $0.13)

`↑` is input tokens, `↓` is output tokens. The savings tail only appears when savings would round to >= $0.01 (no "saved $0.00" noise on a fresh session or in pure-Anthropic mode).

Implementation:
- `getCostSummary()` now exposes `total_input_tokens` and `total_output_tokens` at the top level so the script doesn't have to fold across backend buckets with jq. `savings` was already top-level (anthropic_equivalent - total_cost), now surfaced.
- bin/deepclaude-statusline reads the new fields, formats input/output separately, conditionally appends the savings tail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
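A rough sketch of the top-level totals follows. It is not the real `getCostSummary()`: the bucket layout and the per-bucket field names (`input_tokens`, `output_tokens`, `cost`) are assumptions. Only the top-level field names and the savings formula come from the commit message.

```javascript
// Fold per-backend buckets into the top-level fields the status line reads.
function summarize(buckets, anthropicEquivalent) {
  let totalInput = 0;
  let totalOutput = 0;
  let totalCost = 0;
  for (const bucket of Object.values(buckets)) {
    totalInput += bucket.input_tokens;
    totalOutput += bucket.output_tokens;
    totalCost += bucket.cost;
  }
  return {
    total_input_tokens: totalInput,
    total_output_tokens: totalOutput,
    total_cost: totalCost,
    anthropic_equivalent: anthropicEquivalent,
    // Same definition as in the commit message.
    savings: anthropicEquivalent - totalCost,
  };
}

const summary = summarize(
  {
    deepseek: { input_tokens: 5000, output_tokens: 1000, cost: 0.03 },
    anthropic: { input_tokens: 200, output_tokens: 100, cost: 0.01 },
  },
  0.17
);
console.log(summary.total_input_tokens, summary.total_output_tokens);
```

Doing the fold once on the proxy side keeps the shell script down to plain field reads.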
· $0.04 (saved $0.13, 76%)

Computed in the script as `savings / anthropic_equivalent * 100` (both already top-level fields on /_proxy/cost). Skips the percent if anthropic_equivalent is 0 (which would mean no requests yet, in which case the savings tail itself is also suppressed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
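The tail logic can be sketched like this. The real computation lives in the shell script; this JavaScript version is an illustration, and rounding the percent to the nearest integer is an assumption (the commit message gives only the formula).

```javascript
// Build the "(saved $X, NN%)" tail, or return "" when savings would
// round to less than one cent.
function savingsTail(savings, anthropicEquivalent) {
  if (Math.round(savings * 100) < 1) {
    return "";
  }
  const amount = `saved $${savings.toFixed(2)}`;
  // Guard against division by zero: no percent without a baseline.
  if (anthropicEquivalent <= 0) {
    return ` (${amount})`;
  }
  const percent = Math.round((savings / anthropicEquivalent) * 100);
  return ` (${amount}, ${percent}%)`;
}

console.log("$0.04" + savingsTail(0.13, 0.17));
```

With the numbers from the example above (cost $0.04, Anthropic equivalent $0.17), the ratio is 0.13 / 0.17 ≈ 0.765, which gives the 76% shown.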
A second \n after the content gives a one-row gap below the status line in Claude Code's bottom bar. This is the closest a shell statusLine command can get to CSS-style bottom padding, since terminals can't render sub-row vertical space.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Claude Code makes background subagent calls (haiku → deepseek-v4-flash) for things like topic detection during startup. Tracking those as 'last_request' caused the status line to flicker to flash even when the user is actively conversing with the main opus/pro model.

Two changes:
- /_proxy/status now also exposes `model_remap` (the MODEL_REMAP table for the current state.mode). Lets the shell look up wire-side mapping for any client model without duplicating the table.
- bin/deepclaude-statusline reads Claude Code's `model.id` from stdin (which is the *main* conversation model, stable across subagent activity) and looks the wire side up in model_remap. Falls back to last_request.client_model only if stdin doesn't carry a model field.

Result: status line displays the model the user is actually talking to, not whatever transient call most recently went through the proxy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
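The lookup order described above can be sketched as follows. The shipped script does this in shell with jq; the function name `resolveDisplayModels` and the haiku model id are placeholders. Treating an empty remap result as "already a backend name" is how this sketch models default mode.

```javascript
// Decide which models the status line should show.
// stdinJson: the status JSON Claude Code passes on stdin.
// status:    the /_proxy/status payload.
function resolveDisplayModels(stdinJson, status) {
  const remap = status.model_remap || {};
  const last = status.last_request || {};
  const stdinModel = stdinJson && stdinJson.model && stdinJson.model.id;
  // Prefer the main conversation model; fall back to the last proxied request.
  const clientModel = stdinModel || last.client_model || "";
  // An empty lookup means the model is already a backend name.
  const wireModel = remap[clientModel] || clientModel;
  return { clientModel, wireModel, showArrow: wireModel !== clientModel };
}

const status = {
  model_remap: {
    "claude-opus-4-7": "deepseek-v4-pro",
    "claude-haiku": "deepseek-v4-flash", // placeholder id for the haiku subagent model
  },
  last_request: { client_model: "claude-haiku", wire_model: "deepseek-v4-flash" },
};
// The background haiku call is the latest request, but stdin still names opus.
console.log(resolveDisplayModels({ model: { id: "claude-opus-4-7" } }, status));
```

Because the stdin model takes priority, the background haiku request recorded in `last_request` does not change what is displayed.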
DeepSeek's anthropic-compat endpoint 400s with:

    The `content[].thinking` in the thinking mode must be passed back to
    the API.

…when the request body has `thinking: { type: "enabled", ... }` at
the top level but the messages don't carry thinking content blocks.

Background: foreign-backend thinking blocks are invalid against
Anthropic's signing key, so the proxy strips them from messages on
isModelCall. But it left the top-level `thinking` config in place,
creating the contradictory state DeepSeek rejects.

Fix: drop both `thinking` and `context_management` for isModelCall
routes (mirrors what the image-fallback path on PR aattaran#21 already does on
forceAnthropicForImage). Backends like DeepSeek don't honor Anthropic's
extended-thinking config anyway, so dropping it costs nothing and
fixes the 400.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
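In code, the fix amounts to removing two top-level keys before forwarding. This is a minimal sketch under assumptions: the function name `prepareBodyForBackend` is hypothetical, and the real proxy may mutate the body in place rather than copy it.

```javascript
// Drop Anthropic-specific config from a request body that is about to be
// sent to a non-Anthropic backend. Other fields pass through untouched.
function prepareBodyForBackend(body, isModelCall) {
  if (!isModelCall) {
    return body;
  }
  // Rest-destructuring copies everything except the two dropped keys.
  const { thinking, context_management, ...rest } = body;
  return rest;
}

const forwarded = prepareBodyForBackend(
  {
    model: "deepseek-v4-pro",
    thinking: { type: "enabled" },
    context_management: {},
    messages: [{ role: "user", content: "hi" }],
  },
  true
);
console.log(Object.keys(forwarded));
```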
…inuity

Previous attempt dropped only the top-level `thinking` config; the 400 still fires because DeepSeek's check is on `content[].thinking` inside messages. It expects its own prior thinking blocks to be passed back verbatim for conversation continuity.

The original strip was added to clean up foreign-backend blocks on backend switches (commit 70518b6), but it also removes DeepSeek's own blocks in pure-DeepSeek sessions, breaking continuity.

For now: leave thinking blocks in place on isModelCall so DeepSeek can see its own history. We continue to drop the top-level thinking config since non-Anthropic backends don't honor Anthropic's extended-thinking spec consistently.

Backend-switch case (DeepSeek session → Anthropic) is still handled by the Anthropic-side strip (`hadNonAnthropicSession ? stripAllThinkingBlocks : stripUnsignedThinkingBlocks`), which shouldn't regress.

If a future user reports a foreign-block 400 going INTO DeepSeek (e.g. switching mid-session from openrouter to deepseek), we'll need a finer-grained strip that distinguishes block origin.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
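The Anthropic-side selection quoted above might look roughly like the sketch below. The two function names come from the commit message, but their bodies here are assumptions: "unsigned" is taken to mean a thinking block with no `signature` field, and `stripForAnthropic` is a hypothetical wrapper.

```javascript
// Apply a block-level filter to every message whose content is an array.
function filterBlocks(messages, keep) {
  return messages.map((message) =>
    Array.isArray(message.content)
      ? { ...message, content: message.content.filter(keep) }
      : message
  );
}

// Remove every thinking block, regardless of origin.
function stripAllThinkingBlocks(messages) {
  return filterBlocks(messages, (block) => block.type !== "thinking");
}

// Assumption: only thinking blocks that lack a signature are removed.
function stripUnsignedThinkingBlocks(messages) {
  return filterBlocks(
    messages,
    (block) => block.type !== "thinking" || Boolean(block.signature)
  );
}

// Hypothetical wrapper mirroring the ternary in the commit message.
function stripForAnthropic(messages, hadNonAnthropicSession) {
  return hadNonAnthropicSession
    ? stripAllThinkingBlocks(messages)
    : stripUnsignedThinkingBlocks(messages);
}

const history = [
  {
    role: "assistant",
    content: [
      { type: "thinking", thinking: "signed", signature: "abc" },
      { type: "thinking", thinking: "unsigned" },
      { type: "text", text: "answer" },
    ],
  },
];
console.log(stripForAnthropic(history, false)[0].content.length);
console.log(stripForAnthropic(history, true)[0].content.length);
```

In this sketch neither strip runs on the path into DeepSeek, matching the commit's decision to leave message blocks alone on isModelCall.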
deepclaude already exposes `/_proxy/cost` and `/_proxy/status` over HTTP, but the only way to see live token spend or verify which backend a request actually went to is to `curl` from another terminal. This PR adds a Claude Code statusLine integration so that information lives in the bottom bar of the running session, with a breakdown by direction and visible savings against Anthropic.

Output:

[claude-opus-4-7 → deepseek-v4-pro on api.deepseek.com] · ↑5.2K ↓1.1K · $0.04 (saved $0.13, 76%)

When the env var model and wire-side model match (default mode, no `--auto`), the arrow is dropped:

[deepseek-v4-pro on api.deepseek.com] · ↑5.2K ↓1.1K · $0.04 (saved $0.13, 76%)

The savings tail (`(saved $X, NN%)`) only appears when savings would round to ≥ $0.01, so there is no `saved $0.00` noise on a fresh session. `↑` is input tokens, `↓` is output. Percent is `savings / anthropic_equivalent * 100`.

The status line is auto-installed on every `deepclaude` launch (idempotent; preserves any existing `statusLine` config in `~/.claude/settings.json`). No `--install-statusline` flag to remember.

This is the natural counterpart to a TUI welcome chip that lies about the routing under `--auto` (PR #22): the chip advertises the unlock, the status line restores ground truth.

Components
`proxy/model-proxy.js`:
- `state.lastRequest = { client_model, wire_model, destination, timestamp }` is updated after each `/v1/messages` is processed (post-remap).
- `/_proxy/status` returns `backend_host`, `last_request`, and `model_remap` (the active backend's MODEL_REMAP table) so the script can look up the wire-side mapping for whatever model Claude Code passes on stdin without duplicating the table.
- `/_proxy/cost` exposes `total_input_tokens` and `total_output_tokens` at the top level.

`bin/deepclaude-statusline`:
- Reads Claude Code's status JSON on stdin and takes `model.id` from there (the main conversation model, stable across the haiku/flash subagent calls Claude Code makes for things like topic detection during startup).
- Looks up the wire side via `model_remap[stdin_model]`. If the model is already a backend name (default mode), the lookup returns empty and we drop the arrow.
- Curls `/_proxy/status` and `/_proxy/cost` with a 1s timeout each so a slow proxy doesn't block Claude Code's UI.
- Falls back to `[deepclaude proxy not reachable on :3200]` when the proxy isn't reachable.
- Requires `jq`; `DEEPCLAUDE_PROXY_PORT` and `DEEPCLAUDE_STATUSLINE_TIMEOUT` overridable via env.

`deepclaude.sh`:
- `ensure_statusline_installed` runs synchronously before `claude` starts. Only installs if `statusLine` isn't already configured (respects user customisation). Silent skip when `jq` isn't on PATH.

Foundation also in this PR
- `launch_claude` now starts the proxy in path so the default `deepclaude` flow benefits. Previously only `--remote` did this, which meant `/_proxy/cost` always reported zero in a default session and the status line would have nothing to read.
- `start_proxy` is a shared helper, `SCRIPT_DIR` is symlink-resolved, `start-proxy.js` accepts an optional `[defaultMode]` argv.
- `ANTHROPIC_AUTH_TOKEN` is left untouched: whatever the user has carries through.

Test plan
- … `~/.claude/settings.json` gets the `statusLine` entry without clobbering existing keys.
- … `↑X ↓Y` growing per turn and accumulating cost.
- … `deepseek-v4-pro`) even while Haiku subagent calls flow through the proxy in the background.
- … `savings >= $0.01`; percent reads sensibly.

Notes
- … (`fix: drop top-level thinking/context_management on non-Anthropic routes` and `fix: don't strip thinking blocks on isModelCall — DeepSeek needs continuity`) so users testing this branch get a working session. Those same two commits are also in "fix: thinking-block continuity on non-Anthropic backends" (#24) as a focused fix PR; whichever lands first, the other's rebase will recognise the duplicate patches and drop them.