Target Workflow
File: .github/workflows/mattpocock-skills-reviewer.md
Engine: copilot (claude-sonnet-4.6)
Recent token usage: ~1.97M tokens across 1 run (~1.97M avg/run, last 3 days)
Why This Workflow
The reviewer runs on every PR `ready_for_review` event and on the `/matt` slash command, so usage grows with PR volume. Each run currently consumes ~1.97M tokens, and the prompt body contains a clear deterministic triage phase (Step 3: classify change type → pick 1–2 skills from a 7-row table) that a smaller model can handle independently of the judgment-heavy review work. The workflow has no existing inline sub-agents and no shared tool prefixes across phases, so a single bundled triage extraction is the highest-value remaining lever.
Optimization — Inline Sub-Agent (PR Triage)
LLM Expert Reasoning
- Step 3 (Identify Change Type and Select Skills) is a textbook small-model task: read PR meta + diff, match against a 7-row table, emit a small structured result — extractive classification with bounded output.
- Steps 4–6 are judgment-heavy and tightly coupled to each other (review verdict depends on issues found from skill-application). They must stay with the main model.
- Steps 1 and 2 are trivial file reads (a single `cat`/`find`); extracting them on their own would just add round-trip latency without saving tokens.
- The strongest extraction is to bundle Step 3 with diff-impact summarization: a single triage call that classifies the PR and pre-ranks files by impact. The main agent then enters Step 4 with a structured triage object instead of having to derive it from the raw 3000-line diff.
- Bundling raises the candidate's score from a borderline 5/10 (Step 3 alone) to a solid 9/10 (Step 3 + diff impact summary) because the combined scope absorbs meaningful prompt content and produces a richer JSON output downstream.
Proposed Sub-Agents
1. pr-triage (small)
Extracted task: Classify the PR's change type, recommend 1–2 Matt Pocock skills, and pre-rank changed files by impact.
Why small: Extractive classification against a fixed 7-row table plus file-level summarization — both heuristics on the "small-handles-well" list (classification into a predefined set; extracting fields from semi-structured text).
Score: 9/10 (independence: 3, haiku-adequacy: 3, parallelism: 1, size: 2)
Estimated savings: ~250K–500K tokens/run (the main agent no longer needs to scan the full 3000-line diff for triage purposes; it consumes a compact JSON triage object instead).
Agent definition (copy-paste ready)
## agent: `pr-triage`
---
description: Classify PR change type, recommend Matt Pocock skills, and rank changed files by impact
model: small
---
Read `/tmp/gh-aw/agent/pr-meta.json` and `/tmp/gh-aw/agent/pr-diff.patch`.
Classify the PR's primary change type. Choose exactly one of:
`bug_fix`, `new_feature`, `refactor_cleanup`, `architecture_change`, `tests_only`, `documentation`, `mixed_unclear`.
Recommend 1–2 skills based on this mapping:
- `bug_fix` → `/diagnose` + `/tdd`
- `new_feature` → `/tdd` + `/grill-with-docs`
- `refactor_cleanup` → `/zoom-out` + `/improve-codebase-architecture`
- `architecture_change` → `/improve-codebase-architecture` + `/zoom-out`
- `tests_only` → `/tdd`
- `documentation` → `/grill-with-docs`
- `mixed_unclear` → `/zoom-out` + `/tdd`
Rank changed files by review impact (high/medium/low). Skip lock files, generated code, dist/build paths.
Output strict JSON only:
`{"change_type": "...", "recommended_skills": ["..."], "high_impact_files": ["path1", "path2"], "key_signals": "one-sentence summary"}`
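The skip rule in the ranking step can also be checked deterministically, for example to spot-check which paths the triage agent should never list. The following is a minimal sketch, not part of the workflow: the function names and the concrete patterns (`package-lock.json`, `.lock.yml`, and so on) are illustrative assumptions, since the agent definition only says "lock files, generated code, dist/build paths".

```python
import re
from pathlib import PurePosixPath

# Assumed, illustrative patterns; the agent definition does not enumerate them.
LOCK_FILE_NAMES = {"package-lock.json", "yarn.lock", "pnpm-lock.yaml", "go.sum", "Cargo.lock"}
SKIPPED_DIRS = {"dist", "build"}
GENERATED_MARKERS = (".generated.", ".lock.yml")

# Matches the per-file header line of a unified git diff.
DIFF_HEADER = re.compile(r"^diff --git a/(.+?) b/(.+)$")


def changed_paths(patch_text):
    """Return the post-change path of every file in a git patch."""
    paths = []
    for line in patch_text.splitlines():
        match = DIFF_HEADER.match(line)
        if match:
            paths.append(match.group(2))
    return paths


def is_low_signal(path):
    """True for lock files, dist/build paths, and assumed generated files."""
    p = PurePosixPath(path)
    if p.name in LOCK_FILE_NAMES:
        return True
    if SKIPPED_DIRS.intersection(p.parts[:-1]):
        return True
    return any(marker in p.name for marker in GENERATED_MARKERS)


def reviewable_paths(patch_text):
    """Changed paths that remain after applying the skip rule."""
    return [p for p in changed_paths(patch_text) if not is_low_signal(p)]
```

Any path in `high_impact_files` that `is_low_signal` flags would suggest the small model ignored the skip instruction.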
Invocation change in main prompt:
Before (`### Step 3: Identify Change Type and Select Skills`, lines 146–160 of the current file):
Based on the PR diff, classify the changes:
| Change Type | Recommended Skill(s) |
|-------------|---------------------|
| **Bug fix** | `/diagnose` + `/tdd` |
...
Select 1–2 skills most relevant to this PR and apply their guidance to your review.
After:
Use the `pr-triage` agent to classify the PR. It reads the pre-fetched data and returns
JSON with `change_type`, `recommended_skills`, `high_impact_files`, and `key_signals`.
Apply the recommended skills in Step 4, prioritising the listed `high_impact_files`.
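Because the main agent now depends on the triage JSON, it may be worth validating that object before Step 4 consumes it. The sketch below is one way to do that under stated assumptions: `parse_triage` and `SKILLS_BY_CHANGE_TYPE` are hypothetical names, and the rule that recommended skills must come from the matching mapping row is inferred from the agent definition rather than enforced by the workflow itself.

```python
import json

# Mirrors the mapping in the pr-triage agent definition.
SKILLS_BY_CHANGE_TYPE = {
    "bug_fix": ["/diagnose", "/tdd"],
    "new_feature": ["/tdd", "/grill-with-docs"],
    "refactor_cleanup": ["/zoom-out", "/improve-codebase-architecture"],
    "architecture_change": ["/improve-codebase-architecture", "/zoom-out"],
    "tests_only": ["/tdd"],
    "documentation": ["/grill-with-docs"],
    "mixed_unclear": ["/zoom-out", "/tdd"],
}


def parse_triage(raw):
    """Parse the triage agent's output; raise ValueError if it is malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"triage output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("triage output must be a JSON object")

    change_type = data.get("change_type")
    if change_type not in SKILLS_BY_CHANGE_TYPE:
        raise ValueError(f"unknown change_type: {change_type!r}")

    # Assumption: skills must be drawn from the row for this change type.
    allowed = SKILLS_BY_CHANGE_TYPE[change_type]
    skills = data.get("recommended_skills")
    if (
        not isinstance(skills, list)
        or not 1 <= len(skills) <= 2
        or any(skill not in allowed for skill in skills)
    ):
        raise ValueError(f"recommended_skills must be 1-2 of {allowed}")

    files = data.get("high_impact_files")
    if not isinstance(files, list) or not all(isinstance(f, str) for f in files):
        raise ValueError("high_impact_files must be a list of paths")

    if not isinstance(data.get("key_signals"), str):
        raise ValueError("key_signals must be a string")
    return data
```

On a `ValueError`, a reasonable fallback would be for the main agent to classify the PR itself, as it does today.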
Estimated Impact
| Metric | Before | After (estimated) |
|--------|--------|-------------------|
| Avg tokens/run | ~1.97M | ~1.5M–1.7M (~15–25% reduction) |
| Main-model context saved | n/a | ~250K–500K tokens/run (diff triage offloaded) |
| Parallelism opportunity | None | Triage runs as a single small-model step before review |
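The arithmetic behind the first two rows can be reproduced directly from the stated baseline and savings range. This is a sketch using only the figures above; `projected` is a hypothetical helper name. Computed this way, the low end comes out closer to 13% than 15%, which is within the rounding of these estimates.

```python
BASELINE_TOKENS = 1_970_000          # ~1.97M avg tokens/run
SAVINGS_RANGE = (250_000, 500_000)   # estimated main-model tokens saved per run


def projected(baseline, saved):
    """Return (remaining tokens per run, percentage reduction)."""
    remaining = baseline - saved
    return remaining, saved / baseline * 100


for saved in SAVINGS_RANGE:
    remaining, pct = projected(BASELINE_TOKENS, saved)
    print(f"{remaining / 1e6:.2f}M tokens/run, {pct:.1f}% reduction")
# prints:
# 1.72M tokens/run, 12.7% reduction
# 1.47M tokens/run, 25.4% reduction
```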
Implementation Steps
- Add the `pr-triage` agent block at the bottom of `.github/workflows/mattpocock-skills-reviewer.md`, after all workflow content.
- Replace the Step 3 table and selection paragraph (lines 146–160) with the 3-line invocation shown above.
- In Step 4, prepend one sentence: "Focus your skill application on files listed in `pr-triage`'s `high_impact_files`." This makes the triage output actually steer the main review.
- Compile: `gh aw compile mattpocock-skills-reviewer`
- Test: `gh workflow run mattpocock-skills-reviewer.lock.yml` on a representative PR, then compare token usage against a baseline run.
Notes on What Was Considered and Rejected
- Common tool prefix extraction: rejected. Phase 4 found no shared opening tool calls, and PR-data fetching is already consolidated in `pre-agent-steps`.
- Extracting Step 1 (Load PR Data) as its own sub-agent: rejected. It's a single `cat` invocation that produces data the main agent must consume anyway.
- Extracting Step 2 (Read Available Skills) as its own sub-agent: rejected. The inline guidance covers most cases; only when guidance is insufficient does the main agent read a SKILL.md, and that is rare.
- Extracting Steps 5/6 (post comments / submit review): rejected. Both are tightly coupled to Step 4's judgment work and require the full review context.
Generated by ⚡ Daily Sub-Agent Optimizer · ● 13.8M