[q] optimize: reduce token usage in high-consumption workflows #31555

@github-actions

Q Workflow Optimization Report

Issues Found (from live data)

Contribution Check

  • Run analyzed: 25686395186
  • Token usage: 663K tokens / 6.7M effective tokens / 24 turns
  • Issues identified:
    • No timeout-minutes: the run could continue indefinitely if subagent dispatch stalls
    • No max-continuations cap: the orchestrator plus 3 subagent calls took 24 turns, with no hard ceiling

Matt Pocock Skills Reviewer

  • Run analyzed: 25686664431
  • Token usage: 456K tokens / 4.5M effective tokens / 10 turns
  • Issues identified:
    • No max-continuations cap for Copilot engine
    • Agent fetches full gh pr diff inline — large PRs produce multi-thousand-line diffs fed directly into the context window

Test Quality Sentinel

  • Run analyzed: 25686664489
  • Token usage: 285K tokens / 2.8M effective tokens / 7 turns
  • Issues identified:
    • max-continuations: 40 is about 5.7× the observed run length (7 turns), which allows significant cost overrun on adversarial or complex PRs

Changes Made

test-quality-sentinel.md

  • Reduced max-continuations from 40 → 15 (roughly 2× the 7 turns observed, which still leaves comfortable headroom)

contribution-check.md

  • Added max-continuations: 60 (covers 3 subagent runs of ~20 turns each)
  • Added timeout-minutes: 30 (hard wall-clock limit to prevent runaway retry loops)

mattpocock-skills-reviewer.md

  • Changed engine: copilot → engine: {id: copilot, max-continuations: 15}
  • Added a pre-agent step, "Pre-fetch PR diff (truncated)", that caps the diff at 8,000 lines before the agent starts, so huge diffs no longer inflate the context window on every turn
  • Updated Step 1 in the prompt to read the pre-fetched file instead of re-running gh pr diff
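
The pre-fetch step described above could look roughly like the sketch below. This is not the step from the actual change: the helper name `truncate_diff`, the output path, and the truncation marker line are all assumptions made for illustration. The only details taken from the report are that the diff is fetched before the agent starts and capped at 8,000 lines.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not from the actual workflow): copy at most MAX_LINES
# lines from stdin into OUT_FILE. If the input was longer, append a marker line
# so the agent can tell the diff is incomplete (the marker is an assumption).
truncate_diff() {
  local max_lines="$1" out_file="$2"
  awk -v max="$max_lines" '
    NR <= max { print }
    END { if (NR > max) printf "[diff truncated: showing %d of %d lines]\n", max, NR }
  ' > "$out_file"
}
```

In a pre-agent step this might be invoked as `gh pr diff "$PR_NUMBER" | truncate_diff 8000 /tmp/pr-diff.txt`, where `PR_NUMBER` and the path are placeholders. Using `awk` rather than `head` means the whole input is consumed, so `gh` is not killed by a closed pipe when `pipefail` is set.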

Expected Improvements

  • Test Quality Sentinel: worst-case continuation budget reduced by ~63% (cap lowered from 40 → 15), with worst-case token spend expected to fall accordingly
  • Matt Pocock Reviewer: large-PR scenarios bounded; diff no longer re-fetched live inside the agent loop
  • Contribution Check: runaway protection added; cost capped even if subagent retry loops occur
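
The figures quoted for Test Quality Sentinel can be checked with a little arithmetic. The sketch below uses two hypothetical helper functions (not part of the workflows) and assumes worst-case spend scales roughly linearly with the continuation cap. That is a simplification, since per-turn cost can grow as context accumulates.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: percentage reduction when a cap drops from OLD to NEW.
cap_reduction_pct() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (old - new) * 100 / old }'
}

# Hypothetical helper: headroom of a cap relative to the observed turn count.
headroom_ratio() {
  awk -v cap="$1" -v observed="$2" 'BEGIN { printf "%.1f\n", cap / observed }'
}

cap_reduction_pct 40 15   # prints 62.5 (the "~63%" in the report)
headroom_ratio 40 7       # prints 5.7  (old cap vs. 7 observed turns)
headroom_ratio 15 7       # prints 2.1  (new cap vs. 7 observed turns)
```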

Validation

All modified workflows compiled successfully:

  • test-quality-sentinel
  • contribution-check
  • mattpocock-skills-reviewer

Note: .lock.yml files will be regenerated automatically after merge.

🎩 Equipped by Q · ● 26.1M

  • expires on May 13, 2026, 6:18 PM UTC

Note

This was originally intended as a pull request, but the git push operation failed.

Workflow Run: View run details and download bundle artifact

The bundle file is available in the agent artifact in the workflow run linked above.

To create a pull request with the changes:

# Download the artifact from the workflow run
gh run download 25688146365 -n agent -D /tmp/agent-25688146365

# Fetch the bundle into a local branch
git fetch /tmp/agent-25688146365/aw-q-optimize-token-usage.bundle refs/heads/q/optimize-token-usage:refs/heads/q/optimize-token-usage-20fef7476b3af908
git checkout q/optimize-token-usage-20fef7476b3af908

# Push the branch to origin
git push origin q/optimize-token-usage-20fef7476b3af908

# Create the pull request
gh pr create --title '[q] optimize: reduce token usage in high-consumption workflows' --base main --head q/optimize-token-usage-20fef7476b3af908 --repo github/gh-aw
