### Target Workflow
File: `.github/workflows/ab-testing-advisor.md`
Engine: `copilot` (bare mode)
7-day token usage: ~1.49M tokens across 2 runs (~746K avg/run, ~7.7 min avg duration)
### Why This Workflow
Daily A/B Testing Advisor has the highest token usage among workflows that neither use inline sub-agents already nor were optimized recently. Developer Documentation Consolidator, Package Specification Librarian, Q, and Test Quality Sentinel already use sub-agents; Contribution Check and Design Decision Gate were optimized in the last 14 days. The workflow has 5 primary-quest steps plus a 3-area side quest, and two of those phases are clean file-reading and extraction tasks: exactly the kind of work where delegating to a small model keeps large file contents out of the main model's context.
### Optimization 1: Common Tool Prefix
#### Repeated Tool Calls Found
The following tool invocations appear at the start of 2 sections of the prompt body:
```bash
cat pkg/workflow/compiler_experiments.go
cat actions/setup/js/pick_experiment.cjs
```
Sections affected:
- Side Quest intro (line 313): "Assess the current implementation by reading: ..."
- Side Quest Area 1 (line 326): "Before proposing additions, verify what is already implemented by reading the source files: ..."

These 1,083 lines of source are pulled into the main model's context twice. Score: Moderate (2 sections share 2 common opening tool calls).
#### Proposed Resolution
This prefix is fully subsumed by the `field-presence-checker` sub-agent in Optimization 2: when the agent reads those two files and returns a `name`/`present`/`evidence` entry per field, the main model never needs to `cat` them at all. If Optimization 2 is implemented, both sets of `cat` calls can be removed from the prompt body.
Estimated savings (prefix-only, if applied without sub-agent): ~10K tokens/run (one fewer load of the two files, which cost ~20K tokens across both loads)
### Optimization 2: Inline Sub-Agents
#### LLM Expert Reasoning
- Two phases (Step 2 "Analyze the Selected Workflow" and Side Quest Area 1 "Verify Genuine Gaps") are pure file-reading + structured extraction. They match the heuristic "Summarizing a single file or code section" and "Checking whether something meets a stated criterion (yes/no + reason)".
- Independence is high for both — Step 2 only needs the selected workflow path; Area 1 only needs two file paths plus three field names.
- The field-presence checker has clear parallelism potential: it can run alongside Steps 3 and 4 of the primary quest because Area 1's verdict only affects the side-quest issue-creation decision.
- The creative/synthesis work (Step 3 experiment design, Step 4 issue authoring, Side Quest Areas 2 & 3 proposals) stays with the main model.
- Together the two sub-agents move ~30K tokens of file-reading off the main model per run while preserving the main model's role as the authoritative author of issue bodies.
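
The parallelism point above can be sketched as follows. This is a minimal illustration in Python, assuming the two phases can be modeled as independent coroutines. `field_presence_checker` and `primary_steps_3_and_4` are hypothetical stand-ins with made-up return values; how gh-aw actually schedules sub-agents is not shown here.

```python
import asyncio

# Hypothetical stand-ins for the real phases. Names and return values are
# illustrative assumptions, not part of the workflow.
async def field_presence_checker() -> list[dict]:
    await asyncio.sleep(0.01)  # simulates the sub-agent reading two files
    return [{"name": "tags", "present": "no", "evidence": "not found"}]

async def primary_steps_3_and_4() -> str:
    await asyncio.sleep(0.01)  # simulates experiment design + issue authoring
    return "primary issue body"

async def run_quests() -> tuple[str, list[dict]]:
    # Neither task reads the other's output, so both can be awaited together.
    issue_body, presence = await asyncio.gather(
        primary_steps_3_and_4(), field_presence_checker()
    )
    return issue_body, presence

issue_body, presence = asyncio.run(run_quests())
```

The checker's verdict is only consumed after both tasks finish, which mirrors the claim that Area 1 only affects the side-quest issue-creation decision.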
#### Proposed Sub-Agents
##### 1. `workflow-characterizer` (small)
Extracted task: Read a selected workflow .md file and return a structured 7-field characterization for Step 3 to consume.
Why small: Extractive summarization against a fixed schema — no cross-source reasoning required.
Score: 8/10 (independence: 3, model-adequacy: 3, parallelism: 0, size: 2)
Estimated savings: ~10K tokens/run (main model receives a ~500-token summary instead of the full workflow file, which can run 200–500 lines)
Agent definition (copy-paste ready)
```markdown
## agent: `workflow-characterizer`
---
description: Characterize an agentic workflow file by extracting purpose, triggers, engine, prompt style, tools, outputs, and quality signals
model: small
---
Read the workflow markdown file at the path provided by the caller. Return a compact JSON object summarizing:
- `purpose`: One sentence stating what problem the workflow solves
- `triggers`: List of events that start it (e.g., `schedule`, `pull_request`, `workflow_dispatch`)
- `engine`: AI engine ID and any explicit model setting
- `prompt_density`: One of `terse`, `moderate`, `verbose` based on body length and instruction style
- `tools`: Top-level tool categories enabled (e.g., `bash`, `github`, MCP servers)
- `outputs`: What `safe-outputs` it produces (e.g., `create-issue`, `add-comment`)
- `quality_signals`: Any visible TODOs, brittle patterns, or unclear sections worth flagging
Be concise. Cite line numbers when quoting specifics. If a field is unclear, mark it `unknown` rather than guessing.
```
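If the main prompt (or a test harness) wants to sanity-check the sub-agent's reply before Step 3 consumes it, a check along these lines might help. This is a sketch, not part of the proposal: `validate_characterization` is a hypothetical helper, the sample reply is invented, and allowing `unknown` for `prompt_density` is an assumption based on the agent's "mark it `unknown`" instruction.

```python
import json

REQUIRED_FIELDS = ("purpose", "triggers", "engine", "prompt_density",
                   "tools", "outputs", "quality_signals")
# Assumption: `unknown` is accepted because the agent may mark unclear fields.
DENSITY_VALUES = {"terse", "moderate", "verbose", "unknown"}

def validate_characterization(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the reply is usable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if name not in data]
    density = data.get("prompt_density")
    if density is not None and density not in DENSITY_VALUES:
        problems.append(f"unexpected prompt_density: {density!r}")
    return problems

# Illustrative reply; every value here is made up for the example.
sample = json.dumps({
    "purpose": "Suggest A/B experiments for one workflow per day",
    "triggers": ["schedule", "workflow_dispatch"],
    "engine": "copilot",
    "prompt_density": "verbose",
    "tools": ["bash", "github"],
    "outputs": ["create-issue"],
    "quality_signals": "unknown",
})
```

An empty result from `validate_characterization(sample)` means all seven fields are present and `prompt_density` holds an expected value.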
Invocation change in main prompt (Step 2):
Before:
Read the selected workflow file in full. Study:
1. **Purpose & trigger** — What problem does it solve? What events trigger it?
2. **Engine & model** — Which AI engine is used? Is there a specific model set?
3. **Prompt design** — What instructions does the agent receive? How verbose/prescriptive are they?
4. **Tool configuration** — Which tools and MCP servers are enabled?
5. **Output structure** — What safe-outputs are configured? What does it produce?
6. **Current performance characteristics** — Look at recent workflow run history ...
7. **Existing quality signals** — Are there any reported issues, quality labels, or patterns in runs?
After:
Use the `workflow-characterizer` agent with the selected workflow file path. Use the returned characterization (`purpose`, `triggers`, `engine`, `prompt_density`, `tools`, `outputs`, `quality_signals`) as the basis for Step 3.
Then check recent run performance with `gh run list --workflow="$(basename "$SELECTED" .md).lock.yml" --limit 10 --json conclusion,createdAt,displayTitle,startedAt,updatedAt`.
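For the run-history check, the JSON output could be condensed before the main model reasons about it. The sketch below assumes each run object carries `conclusion`, `startedAt`, and `updatedAt`, and treats `updatedAt - startedAt` as an approximation of run duration. `summarize_runs` is a hypothetical helper and the sample timestamps are invented.

```python
import json
from datetime import datetime

def parse_ts(value: str) -> datetime:
    # Timestamps are assumed to look like 2026-05-21T10:00:00Z; swap Z for an
    # explicit offset so datetime.fromisoformat accepts it on older Pythons.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def summarize_runs(raw: str) -> dict:
    """Condense a JSON array of runs into counts and an average duration."""
    runs = json.loads(raw)
    durations = [
        (parse_ts(r["updatedAt"]) - parse_ts(r["startedAt"])).total_seconds()
        for r in runs
    ]
    failures = sum(1 for r in runs if r["conclusion"] != "success")
    avg = round(sum(durations) / len(durations) / 60, 1) if durations else None
    return {"runs": len(runs), "failures": failures, "avg_minutes": avg}

# Invented sample data, shaped like the assumed JSON output.
sample = json.dumps([
    {"conclusion": "success",
     "startedAt": "2026-05-21T10:00:00Z", "updatedAt": "2026-05-21T10:08:00Z"},
    {"conclusion": "failure",
     "startedAt": "2026-05-22T10:00:00Z", "updatedAt": "2026-05-22T10:07:00Z"},
])

print(summarize_runs(sample))
# prints {'runs': 2, 'failures': 1, 'avg_minutes': 7.5}
```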
##### 2. `field-presence-checker` (small)
Extracted task: Read the experiments compiler and the `pick_experiment.cjs` setup script and decide whether each of three named fields is genuinely implemented.
Why small: Grep/lookup-style classification (yes/partial/no) against a fixed list of field names — no design judgment.
Score: 10/10 (independence: 3, model-adequacy: 3, parallelism: 2, size: 2)
Estimated savings: ~20K tokens/run (1,083 lines of source code never enter the main model's context; can run in parallel with Steps 3–4)
Agent definition (copy-paste ready)
```markdown
## agent: `field-presence-checker`
---
description: Check whether named configuration fields are implemented end-to-end across two source files
model: small
---
Read the two source files at the paths provided by the caller. For each named field provided, determine whether it is implemented and surfaced through the configuration pipeline.
For each field, return a JSON entry with:
- `name`: The field name
- `present`: One of `yes`, `partial`, `no`
- `evidence`: A 1-2 line snippet with `file:line` citation, or `not found`
Be strict: a field counts as `yes` only if it is both parsed in the compiler AND emitted in the runtime artifact. `partial` means parsed but not surfaced (or vice versa).
Return the result as a compact JSON array.
```
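The way the main model might turn this array into the Area 1 decision can be sketched as below. It assumes only `no` counts as "genuinely absent"; whether `partial` should also count is a judgment call the proposal leaves open, so it is exposed as a parameter. `genuinely_absent` is a hypothetical helper, and the verdicts in `sample` are invented placeholders, not findings about the real files.

```python
import json

def genuinely_absent(raw: str, absent_values=("no",)) -> list[str]:
    """Names of fields whose `present` value counts as absent."""
    return [entry["name"] for entry in json.loads(raw)
            if entry["present"] in absent_values]

# Invented verdicts for the three field names used in the invocation;
# the evidence strings are placeholders, not real citations.
sample = json.dumps([
    {"name": "analysis_type", "present": "yes", "evidence": "<file:line snippet>"},
    {"name": "tags", "present": "partial", "evidence": "<file:line snippet>"},
    {"name": "notify", "present": "no", "evidence": "not found"},
])

missing = genuinely_absent(sample)
# Gate: create the sub-issue only if at least one field is genuinely absent.
create_sub_issue = bool(missing)
print(missing, create_sub_issue)
# prints ['notify'] True
```

Passing `absent_values=("no", "partial")` would also flag half-implemented fields, at the cost of a looser gate.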
Invocation change in main prompt (Side Quest Area 1):
Before:
**Important**: Before proposing additions, verify what is already implemented by reading the source files:
```bash
cat pkg/workflow/compiler_experiments.go
cat actions/setup/js/pick_experiment.cjs
```
After:
Use the `field-presence-checker` agent with file paths `pkg/workflow/compiler_experiments.go` and `actions/setup/js/pick_experiment.cjs`, and field names `analysis_type`, `tags`, `notify`. Use the returned `present` and `evidence` values for each field to decide which fields are genuinely absent.

The Side Quest intro's duplicate `cat` block can also be removed; the agent's output already provides the implementation summary the main model needs.
---
### Estimated Impact
| Metric | Before | After (estimated) |
|---|---|---|
| Avg tokens/run | ~746K | ~600K (~20% reduction) |
| Main-model context saved | — | ~30K tokens/run (workflow file + 1,083 lines of source code) |
| Parallelism opportunity | None | `field-presence-checker` runs concurrently with Steps 3–4 |
| Prompt body lines affected | — | ~46 lines refactored (~15% of body) |
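
The headline figures in the table can be re-derived with a few lines of arithmetic. This is only a check of the numbers quoted in this issue; `percent_reduction` is a helper introduced for the example.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100

avg_before = 746_000              # ~746K avg tokens/run, from the table
avg_after = 600_000               # ~600K estimated tokens/run, from the table
context_saved = 10_000 + 20_000   # per-run savings quoted for the two sub-agents

print(f"total reduction: {percent_reduction(avg_before, avg_after):.1f}%")
print(f"main-model context saved: ~{context_saved // 1000}K tokens/run")
# prints:
# total reduction: 19.6%
# main-model context saved: ~30K tokens/run
```
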
### Implementation Steps
1. Add the two `## agent:` blocks at the bottom of `.github/workflows/ab-testing-advisor.md` (after line 386, before `{{#runtime-import shared/noop-reminder.md}}` or wherever sub-agent blocks belong per repository convention).
2. Replace Step 2 of the primary quest with the `workflow-characterizer` invocation shown above.
3. Replace Side Quest Area 1's verification block (and the duplicate `cat` calls in the Side Quest intro) with the `field-presence-checker` invocation shown above.
4. Compile: `gh aw compile ab-testing-advisor`
5. Test manually: `gh workflow run ab-testing-advisor.lock.yml` and verify the resulting issue still cites correct field-presence findings and workflow characterization.
### Notes for Reviewers
- The `field-presence-checker` is intentionally strict about `present` semantics so the Area 1 gate ("create sub-issue only if at least one field is genuinely absent") remains correct.
- The `workflow-characterizer` returns `quality_signals` so item 7 of the original Step 2 list ("Existing quality signals") is preserved. This is the only subjective field, and the small model only needs to surface flags for the main model to act on.
### References
- Optimizer run: [§26295992302](https://github.com/github/gh-aw/actions/runs/26295992302)
- Recent run sample: [§26294227315](https://github.com/github/gh-aw/actions/runs/26294227315)
<!-- gh-aw-tracker-id: daily-subagent-optimizer -->
> Generated by [⚡ Daily Sub-Agent Optimizer](https://github.com/github/gh-aw/actions/runs/26295992302) · ● 12.5M · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Fdaily-subagent-optimizer%22&type=issues)
> - [x] expires <!-- gh-aw-expires: 2026-05-29T15:29:15.777Z --> on May 29, 2026, 3:29 PM UTC
<!-- gh-aw-agentic-workflow: Daily Sub-Agent Optimizer, gh-aw-tracker-id: daily-subagent-optimizer, engine: claude, model: auto, id: 26295992302, workflow_id: daily-subagent-optimizer, run: https://github.com/github/gh-aw/actions/runs/26295992302 -->
<!-- gh-aw-workflow-id: daily-subagent-optimizer -->
<!-- gh-aw-workflow-call-id: github/gh-aw/daily-subagent-optimizer -->