feat(cli): add --budget-usd run-level cost cap #1118

Merged
christso merged 2 commits into main from feat/1113-budget-usd-flag
Apr 16, 2026

@christso
Collaborator

Summary

Adds a --budget-usd <n> flag to the run subcommand that caps total cost across all eval files in one invocation.

Closes #1113

Changes

  • RunBudgetTracker (packages/core/src/evaluation/run-budget-tracker.ts): New class tracking cumulative cost with add(), isExceeded(), currentCostUsd, budgetCapUsd
  • CLI (run.ts): Added --budget-usd option with validation (rejects ≤ 0)
  • run-eval.ts: Added budgetUsd to NormalizedOptions and RunEvalResult, and integrated the tracker into the sequential file loop: costs accumulate after each file, and once the cap is exceeded the remaining files are skipped with budget_exceeded results
  • Tests: 5 unit tests for RunBudgetTracker
  • Exports: RunBudgetTracker exported from @agentv/core
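
For readers who have not opened the diff, the tracker described above could look roughly like the sketch below. It is not the code in run-budget-tracker.ts: the class name RunBudgetTrackerSketch, the private field names, the handling of non-finite costs, and the strict "greater than" comparison are all assumptions. Only add(), isExceeded(), currentCostUsd and budgetCapUsd come from the change list.

```typescript
// Hypothetical sketch of a run-level budget tracker. The real class in
// packages/core/src/evaluation/run-budget-tracker.ts may differ in detail.
class RunBudgetTrackerSketch {
  private spentUsd = 0;
  private readonly capUsd: number;

  constructor(capUsd: number) {
    // Mirrors the CLI rule that the cap must be a positive number.
    if (!Number.isFinite(capUsd) || capUsd <= 0) {
      throw new RangeError("budget cap must be a positive number");
    }
    this.capUsd = capUsd;
  }

  /** Adds the cost of one finished eval file to the running total. */
  add(costUsd: number): void {
    // Assumption: missing, negative, or non-finite costs are ignored.
    if (Number.isFinite(costUsd) && costUsd > 0) {
      this.spentUsd += costUsd;
    }
  }

  /** Assumption: "exceeded" means strictly greater than the cap. */
  isExceeded(): boolean {
    return this.spentUsd > this.capUsd;
  }

  get currentCostUsd(): number {
    return this.spentUsd;
  }

  get budgetCapUsd(): number {
    return this.capUsd;
  }
}

const tracker = new RunBudgetTrackerSketch(1.0);
tracker.add(0.5);
console.log(tracker.isExceeded()); // false
tracker.add(0.75);
console.log(tracker.isExceeded(), tracker.currentCostUsd); // true 1.25
```

Keeping the tracker free of I/O is what makes it easy to unit test and to reuse outside the CLI.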

E2E Verification

Red (validation rejects invalid values):

$ bun apps/cli/src/cli.ts eval run --budget-usd 0 a.yaml
Error: --budget-usd must be a positive number.

$ bun apps/cli/src/cli.ts eval run --budget-usd -5 a.yaml
Error: --budget-usd must be a positive number.
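
The rejection shown above can be reproduced with a small parser such as the following. This is a sketch under assumptions: parseBudgetUsd is a hypothetical helper name, and the actual option wiring in run.ts is not shown in this PR description. The error message is copied from the output above.

```typescript
// Hypothetical helper: turns the raw --budget-usd argument into a number,
// rejecting anything that is not a finite value greater than zero.
function parseBudgetUsd(raw: string): number {
  const value = Number(raw);
  if (!Number.isFinite(value) || value <= 0) {
    throw new Error("--budget-usd must be a positive number.");
  }
  return value;
}

console.log(parseBudgetUsd("2.50")); // 2.5

for (const bad of ["0", "-5", "abc"]) {
  try {
    parseBudgetUsd(bad);
  } catch (err) {
    console.log((err as Error).message); // --budget-usd must be a positive number.
  }
}
```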

Green (flag appears in help):

$ bun apps/cli/src/cli.ts eval run --help
--budget-usd <number>  Maximum total cost in USD across all eval files...

Test Results

472/472 pass, 0 failures

Add a `--budget-usd` flag to `agentv run` that caps total cost across all
eval files in a single invocation. When the cumulative cost exceeds the cap,
remaining eval files are skipped with `budget_exceeded` results.

Implementation:
- New `RunBudgetTracker` class in packages/core for reusable budget tracking
- CLI flag with validation (must be positive number)
- Integrated into sequential file loop: costs accumulated after each file,
  budget checked before dispatching the next file
- Per-suite `execution.budget_usd` still enforced within files by orchestrator
- Exit code 1 when run-level budget is exceeded
- Summary output shows cap and actual spend when exceeded
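
The loop described in the implementation notes can be sketched as follows. It is an illustration only: runFilesWithBudget, FileResultSketch, and the runFile callback are hypothetical names, the callback is synchronous here to keep the example short (the real runner is presumably asynchronous), and treating "exceeded" as strictly greater than the cap is an assumption.

```typescript
// Hypothetical result shape; only the budget_exceeded status comes from the PR.
type FileResultSketch = {
  file: string;
  status: "completed" | "budget_exceeded";
  costUsd: number;
};

function runFilesWithBudget(
  files: string[],
  runFile: (file: string) => number, // hypothetical: returns the file's cost in USD
  budgetUsd?: number,
): { results: FileResultSketch[]; totalCostUsd: number; exitCode: number } {
  const results: FileResultSketch[] = [];
  let totalCostUsd = 0;

  for (const file of files) {
    // The budget is checked before dispatching the next file.
    if (budgetUsd !== undefined && totalCostUsd > budgetUsd) {
      results.push({ file, status: "budget_exceeded", costUsd: 0 });
      continue;
    }
    // Costs are accumulated after each file finishes.
    const costUsd = runFile(file);
    totalCostUsd += costUsd;
    results.push({ file, status: "completed", costUsd });
  }

  // Assumption: exit code 1 whenever the final spend is above the cap,
  // even if no file was actually skipped.
  const exceeded = budgetUsd !== undefined && totalCostUsd > budgetUsd;
  return { results, totalCostUsd, exitCode: exceeded ? 1 : 0 };
}

const outcome = runFilesWithBudget(["a.yaml", "b.yaml", "c.yaml"], () => 0.75, 1);
console.log(outcome.results.map((r) => r.status).join(","));
// completed,completed,budget_exceeded
console.log(outcome.totalCostUsd, outcome.exitCode); // 1.5 1
```

Because the check runs only between files, the file that crosses the cap still runs to completion, so actual spend can overshoot the cap by up to one file's cost.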

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@cloudflare-workers-and-pages

cloudflare-workers-and-pages bot commented Apr 16, 2026

Deploying agentv with Cloudflare Pages

Latest commit: 78af506
Status: ⚡️ Build in progress...

@christso christso merged commit 0ff9c32 into main Apr 16, 2026
3 of 4 checks passed
@christso christso deleted the feat/1113-budget-usd-flag branch April 16, 2026 04:21

Development

Successfully merging this pull request may close these issues.

Add run-level --budget-usd flag to cap total cost across all eval files in one invocation