feat(skills): add review-loop skill with opt-in gates by sourcehawk · Pull Request #14 · sourcehawk/feature-dev-workflow

sourcehawk · 2026-05-30T22:55:35Z

Description

Adds an opt-in review-loop skill that drives a pull request's automated (Copilot) review toward clean, and wires it into the feature workflow at two default-off gates. Nothing in the workflow currently loops a PR through automated review; this lets a session request Copilot, wait without blocking the session, address the feedback with real rigor, and re-request — only when the user asks for it.

Changes

New review-loop skill: request Copilot via gh pr edit --add-reviewer "@copilot" (the REST requested_reviewers endpoint rejects Copilot), detect unavailability, wait in the background, triage every open comment through superpowers:receiving-code-review, address, re-request; terminate when a fresh review is silent and threads are resolved, with a 3-round cap.
Background-wait script await-copilot-review.sh (gh + shell builtins only — no new dependency).
Two opt-in, default-off gates in developing-a-feature: a per-sub-PR gate (sub_pr_review_loop in the state file, run by fanning-out-with-worktrees at ripening) and a PR-to-main gate before external review. Pushback pauses for the user interactively, or bubbles up to the wave checkpoint in autonomous fan-out.
README and project-CLAUDE skill-table rows.

Challenges

The mechanism is the load-bearing part and was easy to get wrong in two places, both verified against live GitHub rather than assumed:

Requesting Copilot looks like a requested_reviewers API call, but that endpoint 422s on Copilot — the only working path is gh pr edit --add-reviewer "@copilot" (gh ≥ 2.88.0).
Re-requesting (round 2+) is unreliable: once Copilot has reviewed, re-adding it sometimes fires a fresh review (observed: two reviews landing on one commit) and sometimes silently does nothing, and Copilot does not dependably auto-review on push. The skill re-triggers with remove-then-re-add, confirms a genuinely new review by watermark rather than the command's exit code, and stops-and-reports if none arrives — pointing the user at the durable path (a repo ruleset running Copilot review on every push) instead of spinning.

Testing

Built via superpowers:writing-skills (RED → GREEN → REFACTOR) with isolated baseline subagents. RED baselines reached for the wrong (422) request command, polled synchronously holding the session, and looped without an unavailability check. With the skill, isolated agents used the correct command, waited in the background, triaged before applying, detected the unavailable case and exited, and respected the 3-round cap. The re-request fix was driven by a live test on a real PR (confirming re-request is unreliable) and re-verified in isolation: the agent captures a watermark, re-triggers via remove-then-re-add, confirms by watermark, and stops-and-reports on timeout rather than spinning. The wait script's watermark filter was checked against real Copilot-reviewed PRs (finds a newer review, excludes a stale one).

Opt-in skill that drives a PR to review-clean by looping: request the automated (Copilot) review, wait for it in the background so the session is never held hostage, triage every open comment with the rigor of receiving-code-review, address the correct ones, and re-request until a fresh review is silent and threads are resolved. A 3-round cap stops non-converging loops. The verified mechanism (gh pr edit --add-reviewer "@copilot"; the REST requested_reviewers endpoint rejects Copilot) and a gh-and-builtins-only background-wait script keep the skill within the plugin's superpowers + gh dependency rule. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Wires feature-dev-workflow:review-loop into developing-a-feature at two opt-in, default-off points: a per-sub-PR gate (recorded as sub_pr_review_loop in the state file, run by fanning-out-with-worktrees at each sub-PR ripening before the orchestrator's own review) and a PR-to-main gate before external review. Pushback is interactive at the PR-to-main gate and bubbles up to the wave checkpoint in autonomous fan-out. Adds the README and project-CLAUDE skill-table rows. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The loop's round-2+ step assumed re-requesting Copilot works like the first request. It doesn't: once Copilot has reviewed, re-adding it as a reviewer is unreliable — sometimes a fresh review fires (even on the same commit), sometimes nothing happens, and Copilot does not dependably auto-review on push. Verified against a live PR (two reviews landed on one commit) and GitHub's own docs/community reports. Re-trigger with remove-then-re-add, confirm a genuinely new review by watermark rather than the gh exit code, and STOP-and-report if none arrives before the wait times out instead of spinning — pointing the user at the durable path (a repo ruleset running Copilot review on every push). Also note that a fired re-review may repeat already-addressed comments, so triage re-verifies against the current diff. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…uesting When a repo auto-reviews Copilot on every push, the push that lands the round's fixes already triggers a review. The previous step 6 then also did an explicit remove-then-re-add, producing a second redundant review. Capture the watermark before pushing, wait for a push-triggered review first, and fall back to the explicit remove-then-re-add only if that wait times out. Same command sequence either way; the repo's behavior decides which branch resolves it, with no double review when auto-review is on. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Adds an opt-in review-loop skill to drive PRs through automated Copilot review, and wires that skill into the feature-development workflow for sub-PR and final PR gates.

Changes:

Adds the review-loop skill and background polling script for detecting fresh Copilot reviews.
Adds default-off review-loop gates to multi-PR fan-out and PR-to-main workflow steps.
Updates the project CLAUDE skill table to include the new skill.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
`skills/review-loop/SKILL.md`	Defines the review-loop workflow, Copilot request/re-request mechanics, termination rules, and pushback handling.
`skills/review-loop/templates/await-copilot-review.sh`	Adds a polling helper that waits for a fresh Copilot review newer than a watermark.
`skills/developing-a-feature/SKILL.md`	Adds opt-in review-loop gates for sub-PRs and final PRs.
`skills/fanning-out-with-worktrees/SKILL.md`	Runs the review-loop during sub-PR ripening when enabled.
`templates/project-CLAUDE.md`	Lists the new review-loop skill in the workflow skill table.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

sourcehawk · 2026-05-30T23:17:03Z

+filter="[.[]
+  | select((.user.login==\"$BOT\") or (.user.login|test(\"[Cc]opilot\")))
+  | select(.submitted_at > \"$SINCE\")
+  | {id, state, submitted_at, body}]
+  | sort_by(.submitted_at) | reverse"
+
+while :; do
+  found=$(gh api "repos/$REPO/pulls/$PR/reviews" --paginate --jq "$filter")


Fixed in 79a30e6. Verified the bug is real (a multi-page no-match did print repeated []). --slurp isn't usable here — gh rejects it together with --jq — so instead the filter now emits each matching review as a bare object rather than wrapping in an outer array. A no-match then prints nothing on every page, so the emptiness check is correct. Confirmed against a real paginated PR: past-watermark finds the reviews, future-watermark across forced single-item pages yields zero bytes.

Fixed in 79a30e6. Confirmed the bug is real (a multi-page no-match did print repeated []). --slurp isn't usable here — gh rejects it together with --jq — so instead the filter now emits each matching review as a bare object rather than wrapping in an outer array; a no-match then prints nothing on every page and the emptiness check is correct. Verified against a real paginated PR: past-watermark finds the reviews, future-watermark across forced single-item pages yields zero bytes.


 Sub-PRs into the feature branch are owned by `feature-dev-workflow:fanning-out-with-worktrees`, not this step.

+Once the PR to main is open — the single-PR PR, or the multi-PR integration PR — ask the user (via `AskUserQuestion`) whether to run an automated review-loop on it before handing off for external human review. If yes, **OPTIONAL SUB-SKILL:** `feature-dev-workflow:review-loop` against this PR; it drives the PR's automated (Copilot) review to clean. This is the interactive context, so a comment the loop wants to push back on pauses for the user. If no, hand off as-is.


…oller Address Copilot review on #14. Under set -euo pipefail a transient gh/network failure in the poll loop killed the script with gh's exit code, which the caller could not distinguish from the 124 timeout. Guard the poll with || true so a blip or rate-limit just retries next interval; only a real timeout exits 124. Also document why the submitted_at watermark comparison is a plain string compare (GitHub always returns UTC ISO-8601 Z timestamps, whose lexical order is chronological) rather than converting to epoch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Two issues Copilot flagged, both verified against real PRs before fixing: - Wait poller false positive across pages. `gh api --paginate --jq` runs the filter once per page and concatenates, so an array-wrapped filter emitted `[]` per page — a multi-page no-match printed `[]\n[]…`, which is non-empty and != `[]`, read as "review found". (`--slurp` is not an option: gh rejects it together with `--jq`.) The filter now emits each matching review as a bare object, so a no-match prints nothing on every page and the emptiness test is correct. Documented why the watermark stays a string comparison (GitHub returns fixed-width UTC ISO-8601 Z). - Stale review commit at Gate B. The loop ran at PR-open, but Step 7's teardown is the last commit on the branch, so a clean review before teardown is stale against what external reviewers see. Gate B now runs the loop on the final PR diff — deferred until after the teardown commit (or re-run), and when CI gates teardown, once that commit is green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

+filter="[.[]
+  | select((.user.login==\"$BOT\") or (.user.login|test(\"[Cc]opilot\")))
+  | select(.submitted_at > \"$SINCE\")
+  | {id, state, submitted_at, body}]
+  | sort_by(.submitted_at) | reverse"


@@ -0,0 +1,129 @@
+---
+name: review-loop


…fixes My earlier commit (8b98754) left the wait poller broken — it kept the array-wrapped jq filter but dropped the != "[]" guard, so the first poll always false-positived. This rebuilds the poller and properly lands the two issues Copilot flagged on #14, both verified against a real PR: - Paginated false positive. gh runs the --jq filter once per page under --paginate and concatenates (and --slurp is rejected with --jq). The filter now emits each matching review as a bare object instead of an array, so a multi-page no-match prints nothing rather than repeated empty arrays. - Error bodies no longer mistaken for reviews. On an HTTP error gh prints the error JSON to stdout and exits non-zero; output is now captured only when gh succeeds, so a 404/rate-limit/blip leaves the result empty and the loop retries instead of false-positiving. (Found while testing.) - Stale review commit at Gate B (developing-a-feature). The loop ran at PR-open, but Step 7's teardown is the last commit on the branch, so a clean review before teardown is stale against what external reviewers see. Gate B now runs the loop on the final PR diff. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…kill # Conflicts: # skills/fanning-out-with-worktrees/SKILL.md

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

sourcehawk and others added 3 commits May 30, 2026 23:33

Copilot AI review requested due to automatic review settings May 30, 2026 22:55

Copilot started reviewing on behalf of sourcehawk May 30, 2026 22:55 View session

Copilot AI reviewed May 30, 2026

View reviewed changes

sourcehawk and others added 2 commits May 31, 2026 01:08

Copilot AI review requested due to automatic review settings May 30, 2026 23:16

Copilot started reviewing on behalf of sourcehawk May 30, 2026 23:17 View session

Copilot AI reviewed May 30, 2026

View reviewed changes

sourcehawk and others added 2 commits May 31, 2026 01:37

Merge remote-tracking branch 'origin/main' into feature/review-loop-s…

23d4f27

…kill # Conflicts: # skills/fanning-out-with-worktrees/SKILL.md

Copilot AI review requested due to automatic review settings May 30, 2026 23:48

Copilot started reviewing on behalf of sourcehawk May 30, 2026 23:48 View session

sourcehawk merged commit 4c1dc56 into main May 30, 2026
1 check failed

sourcehawk deleted the feature/review-loop-skill branch May 30, 2026 23:48

Copilot AI reviewed May 30, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(skills): add review-loop skill with opt-in gates#14

feat(skills): add review-loop skill with opt-in gates#14
sourcehawk merged 8 commits into
mainfrom
feature/review-loop-skill

sourcehawk commented May 30, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

sourcehawk May 30, 2026

Uh oh!

sourcehawk May 30, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants


		Sub-PRs into the feature branch are owned by `feature-dev-workflow:fanning-out-with-worktrees`, not this step.

		Once the PR to main is open — the single-PR PR, or the multi-PR integration PR — ask the user (via `AskUserQuestion`) whether to run an automated review-loop on it before handing off for external human review. If yes, OPTIONAL SUB-SKILL: `feature-dev-workflow:review-loop` against this PR; it drives the PR's automated (Copilot) review to clean. This is the interactive context, so a comment the loop wants to push back on pauses for the user. If no, hand off as-is.

Conversation

sourcehawk commented May 30, 2026

Description

Changes

Challenges

Related

Testing

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

sourcehawk May 30, 2026

Choose a reason for hiding this comment

Uh oh!

sourcehawk May 30, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants