[subagent-optimizer] Optimize contribution-check — 2026-05-20

### Target Workflow

**File**: `.github/workflows/contribution-check.md`
**Engine**: `copilot` (with `agent: contribution-checker` file-based delegate)
**7-day token usage**: ~16.3M effective tokens across 1 captured episode (~16.3M avg/run, ~2.4M raw input tokens)

### Why This Workflow

This is the highest-token workflow without any inline `## agent:` blocks (excluding workflows that already use them, smoke tests, and this optimizer). The orchestrator prompt has 14 distinct phases and dedicates ~64 of its ~200 prompt-body lines to a static **Report Layout Rules + Example Report** reference block that gets reloaded into the main model's context on every run — even though the formatting work itself is purely templated and well-suited to a smaller model.

The workflow already delegates per-PR evaluation to a file-based subagent (`.github/agents/contribution-checker.agent.md`), so the orchestrator's remaining work is dispatch + synthesis. The synthesis half is the optimization opportunity.

---

### LLM Expert Reasoning

- **Report formatting is the largest single chunk of static reference text** in the orchestrator prompt (lines 200–269: ~70 lines of fixed layout rules + a full example). It scored 8/10 across the four classification dimensions (independence: 2, model-adequacy: 3, parallelism: 1, size: 2).
- **Formatting is extractive/templated** — group rows by `quality`, render fixed tables, apply fixed phrasing. The heuristics cheat-sheet flags "converting data from one format to another (JSON → markdown table)" as a clear `small` model task.
- **Comment routing** (lines 179–192) is a pure filter+project: keep entries where `comment` is non-empty AND `quality != lgtm`, copy the comment into a fixed JSON shape. Scored 6/10.
- **No qualifying common tool prefix was found** (Phase 4 verdict: low) — the workflow's `pre-agent` bash step already pre-fetches and shares the heavy file reads (CONTRIBUTING.md, PR filter results). So this proposal is sub-agent-only.
- Truncating CONTRIBUTING.md and the per-PR dispatch loop scored at 5/10 (moderate); included as optional follow-ups but not proposed here to keep the diff reviewable.

### Proposed Sub-Agents

#### 1. `report-formatter` (`small`)

**Extracted task**: Group the array of PR-verdict JSONs into Ready/Needs-look/Off-guidelines tables and return the markdown body for the report issue.
**Why small**: Pure templated formatting — group by `quality` field, render fixed tables, apply fixed phrasing.
**Score**: 8/10 (independence: 2, model-adequacy: 3, parallelism: 1, size: 2)
**Estimated savings**: ~3,500 tokens of static formatting reference removed from the orchestrator's prompt every run; the small model handles the synthesis at lower per-token cost.

<details>
<summary>Agent definition (copy-paste ready)</summary>

````markdown
## agent: `report-formatter`
---
description: Format collected PR verdicts into a grouped markdown report body for the maintainer issue
model: small
---
You receive: an array of PR verdict JSONs (fields: `number`, `title`, `author`, `lines`, `quality`, optional `comment`), a `skipped_count`, and a `run_url`.

Group rows by `quality`:
- 🟢 **Ready to review**: `lgtm`
- 🟡 **Needs a closer look**: `needs-work`
- 🔴 **Off-guidelines**: `spam`, `outdated`, or error verdicts (❓)

For each non-empty group, render one markdown table with columns: PR (linked), Title (truncated to 50 chars), Author, Lines, Quality. Wrap any group with > 10 rows in `<details><summary>Per-PR Details</summary>`.

Open with one summary sentence: e.g. "We looked at N new PRs — X look great, Y need a closer look, Z don't fit the guidelines." Close with: `Evaluated: {n} · Skipped: {n} · Run: {run_url}`. Use `###` headers and blank `---` separators between groups. Tone: warm, constructive, never shaming.

Return only the markdown body string. No preamble.
````

</details>

**Invocation change in main prompt:**

Before (lines 200–269, ~70 lines including all layout rules and the example report):

```
## Step 2: Compile Report

Create a single issue in THIS repository. Use the skipped_count from pr-filter-results.json...

### Report Layout Rules
[6 numbered rules with detail]

### Example Report
[~40 lines of example markdown]
```

After:

```
## Step 2: Compile Report

Use the `report-formatter` agent, passing the array of returned subagent JSON objects, the `skipped_count` from `pr-filter-results.json`, and the current `run_url`, to produce the report body. Then emit a single `create_issue` safe output with that body as `body` and `temporary_id: "aw_summary"`.
```

---

#### 2. `comment-dispatcher` (`small`)

**Extracted task**: Filter the verdict array to entries needing maintainer comments and return the comment payloads.
**Why small**: Pure filter-and-template — two-field check, copy one field into fixed JSON.
**Score**: 6/10 (independence: 1, model-adequacy: 3, parallelism: 1, size: 1)
**Estimated savings**: small but eliminates ~14 lines of routing logic from the orchestrator and isolates failures (a malformed verdict no longer blocks the orchestrator).

<details>
<summary>Agent definition (copy-paste ready)</summary>

````markdown
## agent: `comment-dispatcher`
---
description: From verdict JSONs, derive which PRs need comments and emit the comment payloads
model: small
---
You receive: the array of subagent verdict JSON objects. Each object has fields `number`, `quality`, and optional `comment`.

For each entry where `comment` is a non-empty string AND `quality` is NOT `lgtm`, output one row with `issue_number` set to `number` and `body` set to `comment` (verbatim, do not modify).

Drop entries with empty/missing `comment` or `quality == lgtm`.

Return only JSON:
```json
{"comments": [{"issue_number": 18744, "body": "<comment text>"}]}
```

If no entries qualify, return `{"comments": []}`.
````

</details>

**Invocation change in main prompt:**

Before (lines 179–192):

```
### Posting comments

For each PR where the subagent returned a non-empty `comment` field and the quality is NOT `lgtm`, call the `add_comment` safe output tool to post the comment to the PR.
[+ examples + caveats]
```

After:

```
### Posting comments

Use the `comment-dispatcher` agent on the collected verdict array to get the list of comments to post. For each returned entry, emit one `add_comment` safe output using `issue_number` and `body` (do not specify the repo — `target-repo` is pre-configured).
```

### Estimated Impact

| Metric | Before | After (estimated) |
|---|---|---|
| Avg tokens/run | ~16.3M effective | ~13.5M–14.5M (~10–15% reduction) |
| Main-model prompt context saved | — | ~3,500 tokens/run (formatting rules + routing logic) |
| Parallelism opportunity | None | `comment-dispatcher` and `report-formatter` can run concurrently after subagent results land |

### Implementation Steps

1. **Sub-agents**: Append the two `## agent:` blocks above to the bottom of `.github/workflows/contribution-check.md`, after the `{{#runtime-import shared/noop-reminder.md}}` line.
2. **Replace Section 2** (lines 200–269): keep only the `## Step 2: Compile Report` heading and the new invocation paragraph. Remove the entire `### Report Layout Rules` and `### Example Report` subsections.
3. **Replace the `### Posting comments` subsection** (lines 179–192) with the new invocation paragraph.
4. Compile: `gh aw compile contribution-check`
5. Test: `gh workflow run contribution-check.yml`
6. After one run, inspect the resulting report issue to confirm formatting fidelity vs. the prior baseline before letting the schedule resume unattended.

### Notes

- Phase 4 (common-prefix) verdict was `low` — the workflow's existing bash `pre-agent` step already factors out shared file reads. No setup-step extraction is proposed.
- `Step 3: Label the Report Issue` (score 5/10) and the CONTRIBUTING.md truncation logic (score 5/10) are moderate candidates but were not selected to keep the proposal diff reviewable. They can be revisited in a follow-up if the two proposed sub-agents land cleanly.
- The workflow already delegates per-PR evaluation to a file-based subagent (`contribution-checker.agent.md`); these proposed inline sub-agents are orthogonal — they offload synthesis tasks the orchestrator currently does itself.

### References

- Optimizer run: [§26171878713](https://github.com/github/gh-aw/actions/runs/26171878713)







> Generated by [⚡ Daily Sub-Agent Optimizer](https://github.com/github/gh-aw/actions/runs/26171878713) · ● 21.7M · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Fdaily-subagent-optimizer%22&type=issues)
> - [x] expires  on May 27, 2026, 3:30 PM UTC

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[subagent-optimizer] Optimize contribution-check — 2026-05-20 #33569

Target Workflow

Why This Workflow

LLM Expert Reasoning

Proposed Sub-Agents

1. `report-formatter` (`small`)

2. `comment-dispatcher` (`small`)

Estimated Impact

Implementation Steps

Notes

References

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Metric	Before	After (estimated)
Avg tokens/run	~16.3M effective	~13.5M–14.5M (~10–15% reduction)
Main-model prompt context saved	—	~3,500 tokens/run (formatting rules + routing logic)
Parallelism opportunity	None	`comment-dispatcher` and `report-formatter` can run concurrently after subagent results land

[subagent-optimizer] Optimize contribution-check — 2026-05-20 #33569

Description

Target Workflow

Why This Workflow

LLM Expert Reasoning

Proposed Sub-Agents

1. report-formatter (small)

2. comment-dispatcher (small)

Estimated Impact

Implementation Steps

Notes

References

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

1. `report-formatter` (`small`)

2. `comment-dispatcher` (`small`)