[subagent-optimizer] Optimize contribution-check — 2026-05-20 #33569

@github-actions

Target Workflow

File: .github/workflows/contribution-check.md
Engine: copilot (with agent: contribution-checker file-based delegate)
7-day token usage: ~16.3M effective tokens across 1 captured episode (~16.3M avg/run, ~2.4M raw input tokens)

Why This Workflow

This is the highest-token workflow that has no inline ## agent: blocks (smoke tests and this optimizer are excluded). The orchestrator prompt has 14 distinct phases and dedicates ~64 of its ~200 prompt-body lines to a static Report Layout Rules + Example Report reference block. That block is reloaded into the main model's context on every run, even though the formatting work itself is purely templated and well suited to a smaller model.

The workflow already delegates per-PR evaluation to a file-based subagent (.github/agents/contribution-checker.agent.md), so the orchestrator's remaining work is dispatch + synthesis. The synthesis half is the optimization opportunity.


LLM Expert Reasoning

  • Report formatting is the largest single chunk of static reference text in the orchestrator prompt (lines 200–269: ~70 lines of fixed layout rules + a full example). It scored 8/10 across the four classification dimensions (independence: 2, model-adequacy: 3, parallelism: 1, size: 2).
  • Formatting is extractive/templated — group rows by quality, render fixed tables, apply fixed phrasing. The heuristics cheat-sheet flags "converting data from one format to another (JSON → markdown table)" as a clear small model task.
  • Comment routing (lines 179–192) is a pure filter+project: keep entries where comment is non-empty AND quality != lgtm, copy the comment into a fixed JSON shape. Scored 6/10.
  • No qualifying common tool prefix was found (Phase 4 verdict: low) — the workflow's pre-agent bash step already pre-fetches and shares the heavy file reads (CONTRIBUTING.md, PR filter results). So this proposal is sub-agent-only.
  • Truncating CONTRIBUTING.md and the per-PR dispatch loop each scored 5/10 (moderate). They are noted as optional follow-ups but are not proposed here, to keep the diff reviewable.

Proposed Sub-Agents

1. report-formatter (small)

Extracted task: Group the array of PR-verdict JSONs into Ready/Needs-look/Off-guidelines tables and return the markdown body for the report issue.
Why small: Pure templated formatting — group by quality field, render fixed tables, apply fixed phrasing.
Score: 8/10 (independence: 2, model-adequacy: 3, parallelism: 1, size: 2)
Estimated savings: ~3,500 tokens of static formatting reference removed from the orchestrator's prompt every run; the small model handles the synthesis at lower per-token cost.

Agent definition (copy-paste ready)
## agent: `report-formatter`
---
description: Format collected PR verdicts into a grouped markdown report body for the maintainer issue
model: small
---
You receive: an array of PR verdict JSONs (fields: `number`, `title`, `author`, `lines`, `quality`, optional `comment`), a `skipped_count`, and a `run_url`.

Group rows by `quality`:
- 🟢 **Ready to review**: `lgtm`
- 🟡 **Needs a closer look**: `needs-work`
- 🔴 **Off-guidelines**: `spam`, `outdated`, or error verdicts (❓)

For each non-empty group, render one markdown table with columns: PR (linked), Title (truncated to 50 chars), Author, Lines, Quality. Wrap any group with more than 10 rows in a `<details><summary>Per-PR Details</summary> … </details>` block, and always emit the closing `</details>` tag.

Open with one summary sentence: e.g. "We looked at N new PRs — X look great, Y need a closer look, Z don't fit the guidelines." Close with: `Evaluated: {n} · Skipped: {n} · Run: {run_url}`. Use `###` headers and blank `---` separators between groups. Tone: warm, constructive, never shaming.

Return only the markdown body string. No preamble.

Invocation change in main prompt:

Before (lines 200–269, ~70 lines including all layout rules and the example report):

## Step 2: Compile Report

Create a single issue in THIS repository. Use the skipped_count from pr-filter-results.json...

### Report Layout Rules
[6 numbered rules with detail]

### Example Report
[~40 lines of example markdown]

After:

## Step 2: Compile Report

Use the `report-formatter` agent, passing the array of returned subagent JSON objects, the `skipped_count` from `pr-filter-results.json`, and the current `run_url`, to produce the report body. Then emit a single `create_issue` safe output with that body as `body` and `temporary_id: "aw_summary"`.
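
Because the `report-formatter` rules are deterministic, a plain-Python rendering of the same rules can serve as a baseline when checking the small model's output. The following is a minimal sketch, not part of the workflow: the function names, the `#number` link style, the ellipsis used for truncation, and the choice to route every unrecognized `quality` value to the Off-guidelines group are all assumptions made for illustration.

```python
# Hypothetical reference renderer for the report-formatter rules (illustrative only).
GROUPS = [
    ("🟢 Ready to review", {"lgtm"}),
    ("🟡 Needs a closer look", {"needs-work"}),
]
OFF_LABEL = "🔴 Off-guidelines"  # spam, outdated, and (by assumption) anything else


def truncate(title, limit=50):
    """Cut a title to `limit` characters; the trailing ellipsis is an assumption."""
    return title if len(title) <= limit else title[: limit - 1] + "…"


def group_verdicts(verdicts):
    """Bucket verdicts by their `quality` field, preserving input order."""
    groups = {label: [] for label, _ in GROUPS}
    groups[OFF_LABEL] = []
    for verdict in verdicts:
        for label, qualities in GROUPS:
            if verdict.get("quality") in qualities:
                groups[label].append(verdict)
                break
        else:
            groups[OFF_LABEL].append(verdict)
    return groups


def render_table(rows):
    """Render one group as a markdown table, collapsed when it has more than 10 rows."""
    lines = [
        "| PR | Title | Author | Lines | Quality |",
        "| --- | --- | --- | --- | --- |",
    ]
    for row in rows:
        title = truncate(row["title"]).replace("|", "\\|")  # keep pipes from breaking the table
        quality = row.get("quality", "❓")
        lines.append(f"| #{row['number']} | {title} | @{row['author']} | {row['lines']} | {quality} |")
    table = "\n".join(lines)
    if len(rows) > 10:
        table = f"<details><summary>Per-PR Details</summary>\n\n{table}\n\n</details>"
    return table


def render_report(verdicts, skipped_count, run_url):
    """Build the full report body: summary sentence, grouped tables, footer."""
    groups = group_verdicts(verdicts)
    ready, closer, off = (len(rows) for rows in groups.values())
    parts = [
        f"We looked at {len(verdicts)} new PRs: {ready} look great, "
        f"{closer} need a closer look, {off} don't fit the guidelines."
    ]
    sections = [f"### {label}\n\n{render_table(rows)}" for label, rows in groups.items() if rows]
    if sections:
        parts.append("\n\n---\n\n".join(sections))
    parts.append(f"Evaluated: {len(verdicts)} · Skipped: {skipped_count} · Run: {run_url}")
    return "\n\n".join(parts)
```

Running `render_report` on the same verdict array that was passed to the agent gives a body to diff against the agent's output; differences in tone or phrasing are expected, while differences in grouping or row counts are worth investigating.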

2. comment-dispatcher (small)

Extracted task: Filter the verdict array to entries needing maintainer comments and return the comment payloads.
Why small: Pure filter-and-template — two-field check, copy one field into fixed JSON.
Score: 6/10 (independence: 1, model-adequacy: 3, parallelism: 1, size: 1)
Estimated savings: small in token terms, but it removes ~14 lines of routing logic from the orchestrator and isolates failures (a malformed verdict no longer blocks the orchestrator).

Agent definition (copy-paste ready)
## agent: `comment-dispatcher`
---
description: From verdict JSONs, derive which PRs need comments and emit the comment payloads
model: small
---
You receive: the array of subagent verdict JSON objects. Each object has fields `number`, `quality`, and optional `comment`.

For each entry where `comment` is a non-empty string AND `quality` is NOT `lgtm`, output one row with `issue_number` set to `number` and `body` set to `comment` (verbatim, do not modify).

Drop entries with empty/missing `comment` or `quality == lgtm`.

Return only JSON:
```json
{"comments": [{"issue_number": 18744, "body": "<comment text>"}]}
```

If no entries qualify, return `{"comments": []}`.

Invocation change in main prompt:

Before (lines 179–192):

### Posting comments

For each PR where the subagent returned a non-empty `comment` field and the quality is NOT `lgtm`, call the `add_comment` safe output tool to post the comment to the PR.
[+ examples + caveats]

After:

### Posting comments

Use the `comment-dispatcher` agent on the collected verdict array to get the list of comments to post. For each returned entry, emit one `add_comment` safe output using `issue_number` and `body` (do not specify the repo — `target-repo` is pre-configured).
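
The filter that `comment-dispatcher` is asked to apply is small enough to state exactly. A minimal Python sketch of the same rule follows, useful as an oracle when spot-checking the agent's JSON; the function name `select_comments` is hypothetical, and treating a whitespace-only `comment` as non-empty is an assumption (the agent definition only says "non-empty string").

```python
import json


def select_comments(verdicts):
    """Keep verdicts with a non-empty string `comment` whose `quality` is not `lgtm`."""
    comments = []
    for verdict in verdicts:
        comment = verdict.get("comment")
        if isinstance(comment, str) and comment != "" and verdict.get("quality") != "lgtm":
            # The body is copied verbatim, as the agent definition requires.
            comments.append({"issue_number": verdict["number"], "body": comment})
    return {"comments": comments}


if __name__ == "__main__":
    sample = [
        {"number": 18744, "quality": "needs-work", "comment": "<comment text>"},
        {"number": 18745, "quality": "lgtm", "comment": "Looks good"},
        {"number": 18746, "quality": "spam"},
    ]
    print(json.dumps(select_comments(sample)))
```

Comparing this output with the agent's `comments` array for the same input shows whether the small model dropped or altered any entry.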

Estimated Impact

| Metric | Before | After (estimated) |
| --- | --- | --- |
| Avg tokens/run | ~16.3M effective | ~13.5M–14.5M (~10–15% reduction) |
| Main-model prompt context saved | n/a | ~3,500 tokens/run (formatting rules + routing logic) |
| Parallelism opportunity | None | comment-dispatcher and report-formatter can run concurrently after subagent results land |

Implementation Steps

  1. Sub-agents: Append the two ## agent: blocks above to the bottom of .github/workflows/contribution-check.md, after the {{#runtime-import shared/noop-reminder.md}} line.
  2. Replace Step 2 (lines 200–269): keep only the ## Step 2: Compile Report heading and the new invocation paragraph. Remove the entire ### Report Layout Rules and ### Example Report subsections.
  3. Replace the ### Posting comments subsection (lines 179–192) with the new invocation paragraph.
  4. Compile: gh aw compile contribution-check
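  5. Test: gh workflow run contribution-check.lock.yml (gh aw compile writes the compiled workflow to a .lock.yml file next to the .md source, so that file is the one to dispatch)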
  6. After one run, inspect the resulting report issue to confirm formatting fidelity vs. the prior baseline before letting the schedule resume unattended.
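
For the fidelity check in step 6, a structural lint of the report body can catch the most likely regressions before a manual read. The sketch below is an assumption-laden helper, not an existing tool: `check_report_body` is a hypothetical name, and it only tests the structural rules stated in the `report-formatter` definition (footer line, `###` group headers, the five table columns, balanced `<details>` tags).

```python
import re

# Footer format taken from the report-formatter definition.
FOOTER_RE = re.compile(r"^Evaluated: (\d+) · Skipped: (\d+) · Run: (\S+)$")
EXPECTED_COLUMNS = ["PR", "Title", "Author", "Lines", "Quality"]


def check_report_body(body, expected_evaluated=None):
    """Return a list of structural problems found in a report body (empty list = OK)."""
    lines = body.strip().splitlines()
    if not lines:
        return ["body is empty"]

    problems = []
    evaluated = None
    footer = FOOTER_RE.match(lines[-1].strip())
    if footer is None:
        problems.append("last line is not the Evaluated/Skipped/Run footer")
    else:
        evaluated = int(footer.group(1))
        if expected_evaluated is not None and evaluated != expected_evaluated:
            problems.append(f"footer reports {evaluated} evaluated, expected {expected_evaluated}")

    # Group headers are only required when at least one PR was evaluated.
    if evaluated != 0 and not any(line.startswith("### ") for line in lines):
        problems.append("no ### group headers found")

    # Every table header row must list exactly the five expected columns.
    for line in lines:
        if line.startswith("| PR"):
            columns = [cell.strip() for cell in line.strip().strip("|").split("|")]
            if columns != EXPECTED_COLUMNS:
                problems.append(f"unexpected table columns: {columns}")

    if body.count("<details>") != body.count("</details>"):
        problems.append("unbalanced <details> tags")
    return problems
```

An empty result does not prove the report reads well, so it complements rather than replaces the manual comparison against the prior baseline.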

Notes

  • Phase 4 (common-prefix) verdict was low — the workflow's existing bash pre-agent step already factors out shared file reads. No setup-step extraction is proposed.
  • Step 3: Label the Report Issue (score 5/10) and the CONTRIBUTING.md truncation logic (score 5/10) are moderate candidates but were not selected to keep the proposal diff reviewable. They can be revisited in a follow-up if the two proposed sub-agents land cleanly.
  • The workflow already delegates per-PR evaluation to a file-based subagent (contribution-checker.agent.md); these proposed inline sub-agents are orthogonal — they offload synthesis tasks the orchestrator currently does itself.

References

Generated by ⚡ Daily Sub-Agent Optimizer · ● 21.7M ·

  • expires on May 27, 2026, 3:30 PM UTC
