Repo: amplifier-module-provider-anthropic + amplifier-module-hooks-streaming-ui
Evidence (~/Downloads/image (1).png)
Simple session — amplifier-dev bundle, claude-opus-4-7, NO tools, NO delegation:
📊 Token Usage (anthropic/claude-opus-4-7) [3.2s]
Input: 92,650 (caching...) | Output: 15 | Total: 92,665 | Cost: $0.58
💰 Turn: $1.16 | Session: $1.16
📊 Token Usage (anthropic/claude-opus-4-7) [3.7s]
Input: 92,683 (94% cached) | Output: 110 | Total: 92,793 | Cost: $0.08
💰 Turn: $0.15 | Session: $1.31
| Turn | Per-call Cost: | 💰 Turn: | Ratio |
|---|---|---|---|
| "hi" | $0.58 | $1.16 | exactly 2.00× |
| "tip of the day" | $0.08 | $0.15 | ≈ 2× (actual ~$0.076) |
This is mathematical doubling, not visual confusion
- Cost: $0.58 comes from usage.cost_usd in the content_block:end event, stamped once by compute_cost() inside _convert_to_chat_response()
- Turn: $1.16 comes from collect_contributions("session.cost"), which reads _totals["cost_usd"]
- For the collected total to be $1.16 when one API call cost $0.58, _add_cost($0.58) must have been called twice (or the same _totals["cost_usd"] value must have been collected more than once)
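The ratios quoted above can be checked with exact decimal arithmetic. This is a minimal sketch, not code from either repo: the `displayed` helper and its half-up rounding mode are assumptions about how a two-decimal display might round, and the $0.076 figure is the approximate per-call cost quoted above.

```python
from decimal import Decimal, ROUND_HALF_UP


def displayed(amount: Decimal) -> Decimal:
    """Round to cents (hypothetical helper; the rounding mode is an assumption)."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Figures copied from the two turns in the evidence.
hi_call = Decimal("0.58")
hi_turn = Decimal("1.16")
tip_call_actual = Decimal("0.076")  # approximate unrounded per-call cost
tip_turn = Decimal("0.15")

# Turn 1: the displayed Turn is exactly twice the displayed per-call cost.
ratio_hi = hi_turn / hi_call

# Turn 2: doubling the unrounded cost and then rounding gives the displayed Turn,
# while rounding the single call gives the displayed per-call cost.
tip_call_displayed = displayed(tip_call_actual)
tip_doubled_displayed = displayed(2 * tip_call_actual)

print(ratio_hi, tip_call_displayed, tip_doubled_displayed)
```

Under these assumptions the second turn is also consistent with doubling before rounding: the per-call line rounds up to $0.08, while twice the unrounded cost rounds down to $0.15.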
What we know from code inspection
_add_cost is called exactly once in _convert_to_chat_response() (line 3395 in amplifier_module_provider_anthropic/__init__.py). _convert_to_chat_response is called once in complete() (line 2669). So either:
- complete() is being called twice by the orchestrator. This is possible if the _fallback_on_overload path retries after a partial failure, or if the streaming orchestrator makes two calls (stream-to-display, then parse-for-tools). This is the most likely cause.
- A second session.cost contributor is registered with the same value, e.g. if mount() is called twice and both contributors somehow reflect the same _totals. The Rust coordinator APPENDS contributors (confirmed: coordinator.rs: .push(entry)), so two registrations would both be collected. No second session.cost contributor was found in code inspection.
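Both hypotheses yield the same displayed number, which is why the display alone cannot tell them apart. The toy model below is only a sketch: `ToyCoordinator` and its methods are hypothetical stand-ins, not the real coordinator API, and it assumes the consumer sums whatever contributions are collected.

```python
from decimal import Decimal
from typing import Callable, Dict, List


class ToyCoordinator:
    """Hypothetical stand-in: contributors are appended, never replaced."""

    def __init__(self) -> None:
        self._contributors: Dict[str, List[Callable[[], Decimal]]] = {}

    def register_contributor(self, channel: str, fn: Callable[[], Decimal]) -> None:
        self._contributors.setdefault(channel, []).append(fn)

    def collect_contributions(self, channel: str) -> List[Decimal]:
        return [fn() for fn in self._contributors.get(channel, [])]


def scenario_double_call() -> Decimal:
    """One contributor, but the cost is accumulated twice for one user turn."""
    totals = {"cost_usd": Decimal("0")}
    coord = ToyCoordinator()
    coord.register_contributor("session.cost", lambda: totals["cost_usd"])
    for _ in range(2):  # models complete() running twice
        totals["cost_usd"] += Decimal("0.58")
    return sum(coord.collect_contributions("session.cost"), Decimal("0"))


def scenario_double_mount() -> Decimal:
    """The cost is accumulated once, but the same contributor is registered twice."""
    totals = {"cost_usd": Decimal("0")}
    coord = ToyCoordinator()
    for _ in range(2):  # models mount() running twice
        coord.register_contributor("session.cost", lambda: totals["cost_usd"])
    totals["cost_usd"] += Decimal("0.58")
    return sum(coord.collect_contributions("session.cost"), Decimal("0"))


double_call_total = scenario_double_call()
double_mount_total = scenario_double_mount()
print(double_call_total, double_mount_total)
```

In this model the two scenarios differ only in how many times the accumulator is touched, so counting _add_cost calls per user turn separates them.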
Why Cost: shows half the Turn
Cost: in the per-call line shows chat_response.usage.cost_usd — the cost of the final call only. Turn: shows the accumulated _totals["cost_usd"] across all calls in the turn. If the orchestrator makes 2 calls per user turn, _totals accumulates both while the display only shows the last.
Investigation needed
Add instrumentation to confirm:
    def _add_cost(cost) -> None:
        import traceback
        # Temporary instrumentation: log every call together with its full call stack.
        logger.warning("_add_cost called: cost=%s, stack=\n%s", cost, ''.join(traceback.format_stack()))
        if cost is not None:
            _totals["cost_usd"] = (_totals["cost_usd"] or Decimal("0")) + cost
            _totals["has_data"] = True
This will reveal whether _add_cost is called once or twice per user turn, and from which code path.
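If full stack traces turn out to be too noisy, a lighter variant is to count calls per call site. The sketch below is an assumption-laden illustration, not code from the provider: `count_call_sites`, `add_cost`, `complete`, and the module-level `totals` are hypothetical stand-ins for the real functions and state.

```python
import functools
import traceback
from collections import Counter
from decimal import Decimal

# Maps "function:line" of each caller to the number of calls seen from there.
call_sites = Counter()


def count_call_sites(fn):
    """Decorator that records which caller invoked fn, and how often."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        # With limit=2 the stack is [caller, wrapper]; index 0 is the caller.
        caller = traceback.extract_stack(limit=2)[0]
        call_sites[f"{caller.name}:{caller.lineno}"] += 1
        return fn(*args, **kwargs)

    return wrapper


totals = {"cost_usd": Decimal("0")}


@count_call_sites
def add_cost(cost: Decimal) -> None:  # hypothetical stand-in for _add_cost
    totals["cost_usd"] += cost


def complete() -> None:  # hypothetical stand-in for the provider's complete()
    add_cost(Decimal("0.58"))


# Simulate the suspected bug: two completions for a single user turn.
complete()
complete()
print(dict(call_sites), totals["cost_usd"])
```

A single call site with a count of 2 per user turn would point at complete() being invoked twice; a count of 1 would shift suspicion to duplicate contributor registration.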
Impact
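Every Turn and Session cost shown to users is 2× the actual API cost. The Session total sums the doubled Turn values, so the absolute overstatement grows with session length: in the evidence above, the session shows $1.31 against roughly $0.66 of actual per-call cost.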
Note on previous display fixes
PR/fix on feat/m0-cost-management removed Cost: from the per-call token line (issue #291). That hides the symptom but does NOT fix the Turn: doubling — users will still see doubled Turn/Session costs.