[FAQ] Add: billing model, spend controls, and loop cost FAQ entries#29023
Add three new FAQ entries under 'Costs & Usage' to address common enterprise questions about billing predictability:

- Are Actions minutes charged in addition to AI costs?
- How do retries and agent loops affect costs?
- How do I control spend and set budgets?

These topics were raised in github/agentic-workflows#495 and are broadly applicable to enterprise users evaluating gh-aw.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds new Costs & Usage FAQ entries to clarify billing, loop/retry cost controls, and spend/budget management guidance.
Changes:
- Documented that GitHub Actions billing and AI inference billing are separate cost components.
- Added guidance on controlling loop depth via `max-turns` and `max-continuations`.
- Added provider-level spend/budget control recommendations and per-repo tracking guidance.
| File | Description |
|---|---|
| docs/src/content/docs/reference/faq.md | Adds three new Costs & Usage FAQ Q&As covering Actions minutes vs inference, loop/retry cost levers, and spend/budget controls. |
Copilot's findings
Comments suppressed due to low confidence (3)
docs/src/content/docs/reference/faq.md:559
- “gh-aw has no automatic retry mechanism” is inaccurate as written: the Copilot harness includes automatic retry/backoff for certain partial-execution failures, which can increase both inference usage and wall-clock time within a single workflow run. Suggest clarifying the scope (e.g., no automatic workflow re-run on failure) and noting that engine drivers may retry transient errors inside the run.
gh-aw has no automatic retry mechanism — each workflow trigger produces exactly one run. However, you can control reasoning depth and autopilot continuation, which directly affects how many tokens and how much wall-clock time (Actions minutes) a run consumes:
docs/src/content/docs/reference/faq.md:562
- `max-continuations` is described here as “multiple consecutive triggered runs,” which reads like multiple separate workflow triggers. Elsewhere (e.g., engines reference) it’s documented as multiple consecutive runs/continuations in Copilot autopilot mode. Recommend aligning wording to avoid implying new workflow runs are triggered.
- `max-turns` (Claude only) — limits the number of AI chat iterations per run
- `max-continuations` (Copilot only) — enables autopilot mode with multiple consecutive triggered runs
docs/src/content/docs/reference/faq.md:577
- This bullet groups Gemini together with Anthropic/OpenAI spend controls (“Anthropic Console or OpenAI platform”), but Gemini keys come from Google AI Studio (per auth docs) and budgeting/quotas are managed in Google’s console/AI Studio rather than Anthropic/OpenAI. Suggest splitting the provider bullets (Claude → Anthropic Console, Codex → OpenAI platform, Gemini → Google AI Studio/Google Cloud) so readers aren’t sent to the wrong place.
- **Actions minutes**: Set an org spending limit in GitHub Billing settings.
- **Claude / Codex / Gemini**: Configure spend limits in the Anthropic Console or OpenAI platform. These apply at the API key or project level.
- **Copilot**: Usage is quota-based (premium requests per month) rather than dollar-metered, so the natural cap is the plan's monthly request quota.
- Files reviewed: 1/1 changed files
- Comments generated: 1
Yes. Every agentic workflow run is a GitHub Actions workflow run, so it consumes Actions minutes alongside AI inference. These are billed separately:

- **Actions minutes**: Standard GitHub Actions billing applies: free for public repos, metered for private repos based on your plan. Set a [spending limit](https://docs.github.com/en/billing/managing-billing-for-your-products/managing-billing-for-github-actions/managing-your-spending-limit-for-github-actions) at the org level to cap Actions spend.
The answer implies every run always consumes billable GitHub Actions minutes. That’s not true when using self-hosted runners (they don’t consume Actions minutes, though they still use compute). Consider rephrasing to clarify this applies to GitHub-hosted runners / GitHub-hosted minutes billing, and optionally mention self-hosted runner cost is out-of-band.
This issue also appears in the following locations of the same file:
- line 558
- line 560
- line 575
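To make the distinction raised in this comment concrete, here is a minimal sketch of how the two separately billed components of a run could be estimated. It is illustrative only: the function, its parameter names, and every rate are hypothetical placeholders, not gh-aw or GitHub APIs or published prices. It assumes a token-metered inference provider (so it does not model Copilot's quota-based usage) and assumes GitHub-hosted job time is billed in whole minutes rounded up, while self-hosted runners consume no Actions minutes.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RunCost:
    """Estimated cost of one workflow run, split by billing component."""
    actions_usd: float
    inference_usd: float

    @property
    def total_usd(self) -> float:
        return self.actions_usd + self.inference_usd


def estimate_run_cost(
    job_minutes: float,
    input_tokens: int,
    output_tokens: int,
    *,
    actions_rate_usd_per_min: float,  # placeholder: look up your plan's rate
    input_usd_per_mtok: float,        # placeholder: provider price per 1M input tokens
    output_usd_per_mtok: float,       # placeholder: provider price per 1M output tokens
    github_hosted: bool = True,
) -> RunCost:
    # Assumption: GitHub-hosted time is billed per whole minute, rounded up.
    # Self-hosted runners consume no Actions minutes; their compute cost is
    # paid out-of-band and is not modelled here.
    billable_minutes = math.ceil(job_minutes) if github_hosted else 0
    actions_usd = billable_minutes * actions_rate_usd_per_min

    # Inference is billed by the AI provider, independently of Actions.
    inference_usd = (
        input_tokens * input_usd_per_mtok + output_tokens * output_usd_per_mtok
    ) / 1_000_000
    return RunCost(actions_usd=actions_usd, inference_usd=inference_usd)


if __name__ == "__main__":
    # All numbers below are made up for illustration.
    cost = estimate_run_cost(
        4.2, 1_000_000, 500_000,
        actions_rate_usd_per_min=0.01,
        input_usd_per_mtok=2.0,
        output_usd_per_mtok=4.0,
    )
    print(f"actions=${cost.actions_usd:.2f} inference=${cost.inference_usd:.2f}")
```

Passing `github_hosted=False` zeroes the Actions component while leaving inference unchanged, which is the case the FAQ wording currently overlooks.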
Adds three new entries to the Costs & Usage section of the FAQ, addressing enterprise questions raised in github/agentic-workflows#495:
`max-turns`/`max-continuations` as the cost levers for reasoning depth.

These are new entries; no existing FAQ entries covered these topics.
Source issue: github/agentic-workflows#495