💡 Agent SDK Credit Budget Circuit Breaker for Billing-Aware Orchestration #635
Replies: 1 comment
-
Weekly UpdateWhat ChangedCritical timing signal — June 15 billing change is 2 days away. Research confirms the exact mechanics of Anthropic's Agent SDK billing split, effective June 15, 2026:
New complementary proposals: Discussion #639 (Batch API Cost Arbitrage) and Discussion #643 (Anthropic Usage & Cost API Integration) were created this week to address demand-side and accuracy-side aspects of budget management, respectively. The Cost API integration directly addresses this discussion's adversarial weakness about estimation drift. Updated Assessment
RecommendationAdvance immediately. The billing change is 2 days away. Even a minimal implementation (budget check function + warn-level logging) would provide visibility into Agent SDK credit consumption from day one. The accuracy gap (estimation vs. actual) can be addressed by Discussion #643's Cost API integration as a fast-follow. Priority: wire budget awareness into engine.sh before the billing split takes effect. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
Summary
Implement budget-awareness in engine.sh to track cumulative Anthropic Agent SDK credit consumption per billing period and enforce circuit breakers (warn at 80%, throttle to cheaper models at 90%, emergency-only at 95%) to prevent mid-month exhaustion of the newly separated Agent SDK credit pool. Starting June 15, 2026, Anthropic bills Agent SDK usage (including non-interactive
claude -pruns) against a separate monthly credit, creating a new failure mode the harness must handle.Market Signal
Anthropic announced on June 15, 2026 that Agent SDK usage on subscription plans draws from a separate monthly Agent SDK credit ($20 Pro, $100 Max 5x, $200 Max 20x). This billing change creates a hard monthly ceiling that didn't exist before. Industry research shows LLM cost optimization via dynamic model routing can reduce spend by 40-85% while maintaining 90-95% quality. The LLM router market (Martian, Unify, RouteLLM) is maturing around exactly this budget-aware orchestration pattern.
User Signal
The ET Anomaly Alerting idea (Discussion #565) addresses runaway session detection but not billing-period budget tracking. The engine.sh model chain architecture (
CLAUDE_*_MODEL_CHAIN) already supports falling back to cheaper models on rate limits — the same mechanism can be extended for budget throttling. Issue #206 (proactive headroom guard for daily subscription cap) directly anticipated this need. The Fable 5 pricing ($10/$50 per MTok) makes budget awareness 2x more critical than with Opus 4.8.Technical Opportunity
engine.sh already tracks cumulative token usage via
TOKEN_LOG_FILEand token-metrics.sh. model-pricing.sh can price each call in real-time. The circuit breaker adds a budget check before each LLM invocation: sum JSONL records for the current billing period, compare against the known credit limit, and either proceed, downshift the model chain, or skip non-essential tiers. The existingCLAUDE_*_MODEL_CHAINfallback mechanism is the throttling lever — on budget pressure, shift triage from haiku to the cheapest viable model and skip the rubber-duck cross-engine pass.Assessment
Adversarial Review
Strongest objection: Token usage records are estimates (4 chars per token), not actual billing data from Anthropic's API. Budget tracking based on estimates will drift from real charges, potentially triggering false alarms or missing actual overages.
Rebuttal: Estimates can be calibrated: the model-pricing.tsv rates are authoritative, and the ET formula is designed to be conservative. A 10-15% safety margin on the circuit breaker thresholds (warn at 70% actual to account for estimation error) makes the system fail-safe rather than fail-dangerous. The alternative — no budget awareness at all — means the first signal of exhaustion is silent 429 errors during overnight review runs. Even imperfect budget tracking is vastly better than none.
Suggested Next Step
Add a
budget_checkfunction to engine.sh that sums current-period JSONL records against a configurableAGENT_SDK_BUDGET_USDenv var, returning the budget state (ok/warn/throttle/emergency). Wire it into the model chain selection logic.Beta Was this translation helpful? Give feedback.
All reactions