You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Integrate Anthropic's Usage & Cost API (/v1/organizations/cost_report) into the Token Cost Observatory to replace estimation-based cost tracking with authoritative USD billing data. This directly addresses the estimation-drift weakness identified in Discussion #635's adversarial review and provides the accuracy foundation needed for reliable budget circuit breakers under the June 15 Agent SDK billing split.
Market Signal
Anthropic provides a Usage & Cost API that returns per-model, per-workspace cost breakdowns in USD (documented at platform.claude.com). The June 15, 2026 billing split creates two independent credit pools (Interactive Usage Pool and Agent SDK Credit Pool), making accurate cost tracking critical. Agent SDK credits per plan: $20 (Pro), $100 (Max 5x), $200 (Max 20x), $100/seat (Team). Credits don't roll over and overage charges apply at standard API rates. Anthropic's workspace-level cache isolation (February 2026) further complicates estimation because cache hit rates are no longer predictable across workspaces.
User Signal
Discussion #635 (Agent SDK Credit Budget Circuit Breaker) proposes budget tracking based on estimated token costs via JSONL records, with the adversarial review explicitly noting: "Token usage records are estimates (4 chars per token), not actual billing data from Anthropic's API. Budget tracking based on estimates will drift from real charges." The Token Cost Observatory (#332) tracks costs but relies on the same estimation approach. engine.sh already probes Anthropic's rate-limit headers via check_provider_headroom() (line 163), demonstrating API-level integration precedent. The token-metrics.sh library captures real usage from Claude's --output-format json mode but only for per-call metrics, not cumulative billing data.
Technical Opportunity
engine.sh already calls the Anthropic API for rate-limit headers in check_provider_headroom(). Extending this to query the Usage & Cost API adds one API call but provides authoritative USD figures. The token_report.sh workflow could query /v1/organizations/cost_report daily and reconcile JSONL estimates against actual billing, calibrating the estimation model over time.
For the budget circuit breaker (#635), real cost data eliminates the 10-15% safety margin needed for estimates, allowing tighter thresholds and better resource utilization. A hybrid approach works: query the Cost API at workflow start for cumulative spend through the previous period, then add the current session's estimated cost from JSONL records. This gives ground-truth accuracy at the start of each run plus real-time estimation within the run.
Integration points:
scripts/lib/token-metrics.sh: Add cost_report_query() function
scripts/token_report.sh: Daily reconciliation of estimates vs. actual
scripts/engine.sh check_provider_headroom(): Budget-aware routing using real data
Assessment
Dimension
Score
Rationale
Feasibility
med
Requires API organization access with billing permissions; one additional API call per workflow run
Impact
high
Replaces 10-15% estimation drift with authoritative billing data; enables reliable circuit breakers for the new Agent SDK credit pool
Urgency
high
June 15 billing split creates a finite monthly credit pool where estimation errors have real financial consequences
Adversarial Review
Strongest objection: The Usage & Cost API requires organization-level API access with billing permissions, which may not be available on subscription plans (Claude Pro/Max). If the project uses subscription-based Claude Code rather than API access, this endpoint is inaccessible. Additionally, the API reports costs retroactively, not in real-time — there is inherent lag between a call and its appearance in the cost report.
Rebuttal: The project already uses ANTHROPIC_API_KEY for the headroom probe (engine.sh line 174), indicating API-level access is available. The retroactive reporting lag is mitigable: query the API at the start of each workflow run to get cumulative spend through the previous period, then add the current session's estimated cost for a hybrid ground-truth + estimate approach. This is strictly more accurate than pure estimation. Even with lag, daily cost reconciliation via token_report.sh catches estimation drift before it compounds into significant budget errors. The investment is minimal: one additional API call per workflow run.
Suggested Next Step
Verify that the project's ANTHROPIC_API_KEY has permissions to query /v1/organizations/cost_report. If so, add a cost_report_query() function to scripts/lib/token-metrics.sh that fetches the latest billing data. Wire it into token_report.sh for daily reconciliation against JSONL estimates, and into check_provider_headroom() for budget-aware routing decisions.
reacted with thumbs up emoji reacted with thumbs down emoji reacted with laugh emoji reacted with hooray emoji reacted with confused emoji reacted with heart emoji reacted with rocket emoji reacted with eyes emoji
Uh oh!
There was an error while loading. Please reload this page.
-
Summary
Integrate Anthropic's Usage & Cost API (
/v1/organizations/cost_report) into the Token Cost Observatory to replace estimation-based cost tracking with authoritative USD billing data. This directly addresses the estimation-drift weakness identified in Discussion #635's adversarial review and provides the accuracy foundation needed for reliable budget circuit breakers under the June 15 Agent SDK billing split.Market Signal
Anthropic provides a Usage & Cost API that returns per-model, per-workspace cost breakdowns in USD (documented at platform.claude.com). The June 15, 2026 billing split creates two independent credit pools (Interactive Usage Pool and Agent SDK Credit Pool), making accurate cost tracking critical. Agent SDK credits per plan: $20 (Pro), $100 (Max 5x), $200 (Max 20x), $100/seat (Team). Credits don't roll over and overage charges apply at standard API rates. Anthropic's workspace-level cache isolation (February 2026) further complicates estimation because cache hit rates are no longer predictable across workspaces.
User Signal
Discussion #635 (Agent SDK Credit Budget Circuit Breaker) proposes budget tracking based on estimated token costs via JSONL records, with the adversarial review explicitly noting: "Token usage records are estimates (4 chars per token), not actual billing data from Anthropic's API. Budget tracking based on estimates will drift from real charges." The Token Cost Observatory (#332) tracks costs but relies on the same estimation approach.
engine.shalready probes Anthropic's rate-limit headers viacheck_provider_headroom()(line 163), demonstrating API-level integration precedent. Thetoken-metrics.shlibrary captures real usage from Claude's--output-format jsonmode but only for per-call metrics, not cumulative billing data.Technical Opportunity
engine.shalready calls the Anthropic API for rate-limit headers incheck_provider_headroom(). Extending this to query the Usage & Cost API adds one API call but provides authoritative USD figures. Thetoken_report.shworkflow could query/v1/organizations/cost_reportdaily and reconcile JSONL estimates against actual billing, calibrating the estimation model over time.For the budget circuit breaker (#635), real cost data eliminates the 10-15% safety margin needed for estimates, allowing tighter thresholds and better resource utilization. A hybrid approach works: query the Cost API at workflow start for cumulative spend through the previous period, then add the current session's estimated cost from JSONL records. This gives ground-truth accuracy at the start of each run plus real-time estimation within the run.
Integration points:
scripts/lib/token-metrics.sh: Addcost_report_query()functionscripts/token_report.sh: Daily reconciliation of estimates vs. actualscripts/engine.sh check_provider_headroom(): Budget-aware routing using real dataAssessment
Adversarial Review
Strongest objection: The Usage & Cost API requires organization-level API access with billing permissions, which may not be available on subscription plans (Claude Pro/Max). If the project uses subscription-based Claude Code rather than API access, this endpoint is inaccessible. Additionally, the API reports costs retroactively, not in real-time — there is inherent lag between a call and its appearance in the cost report.
Rebuttal: The project already uses
ANTHROPIC_API_KEYfor the headroom probe (engine.shline 174), indicating API-level access is available. The retroactive reporting lag is mitigable: query the API at the start of each workflow run to get cumulative spend through the previous period, then add the current session's estimated cost for a hybrid ground-truth + estimate approach. This is strictly more accurate than pure estimation. Even with lag, daily cost reconciliation viatoken_report.shcatches estimation drift before it compounds into significant budget errors. The investment is minimal: one additional API call per workflow run.Suggested Next Step
Verify that the project's
ANTHROPIC_API_KEYhas permissions to query/v1/organizations/cost_report. If so, add acost_report_query()function toscripts/lib/token-metrics.shthat fetches the latest billing data. Wire it intotoken_report.shfor daily reconciliation against JSONL estimates, and intocheck_provider_headroom()for budget-aware routing decisions.Beta Was this translation helpful? Give feedback.
All reactions