feat(skills): add review-loop skill with opt-in gates#14
Conversation
Opt-in skill that drives a PR to review-clean by looping: request the automated (Copilot) review, wait for it in the background so the session is never held hostage, triage every open comment with the rigor of receiving-code-review, address the correct ones, and re-request until a fresh review is silent and threads are resolved. A 3-round cap stops non-converging loops. The verified mechanism (gh pr edit --add-reviewer "@copilot"; the REST requested_reviewers endpoint rejects Copilot) and a gh-and-builtins-only background-wait script keep the skill within the plugin's superpowers + gh dependency rule. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wires feature-dev-workflow:review-loop into developing-a-feature at two opt-in, default-off points: a per-sub-PR gate (recorded as sub_pr_review_loop in the state file, run by fanning-out-with-worktrees at each sub-PR ripening before the orchestrator's own review) and a PR-to-main gate before external review. Pushback is interactive at the PR-to-main gate and bubbles up to the wave checkpoint in autonomous fan-out. Adds the README and project-CLAUDE skill-table rows. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The loop's round-2+ step assumed re-requesting Copilot works like the first request. It doesn't: once Copilot has reviewed, re-adding it as a reviewer is unreliable — sometimes a fresh review fires (even on the same commit), sometimes nothing happens, and Copilot does not dependably auto-review on push. Verified against a live PR (two reviews landed on one commit) and GitHub's own docs/community reports. Re-trigger with remove-then-re-add, confirm a genuinely new review by watermark rather than the gh exit code, and STOP-and-report if none arrives before the wait times out instead of spinning — pointing the user at the durable path (a repo ruleset running Copilot review on every push). Also note that a fired re-review may repeat already-addressed comments, so triage re-verifies against the current diff. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…uesting When a repo auto-reviews Copilot on every push, the push that lands the round's fixes already triggers a review. The previous step 6 then also did an explicit remove-then-re-add, producing a second redundant review. Capture the watermark before pushing, wait for a push-triggered review first, and fall back to the explicit remove-then-re-add only if that wait times out. Same command sequence either way; the repo's behavior decides which branch resolves it, with no double review when auto-review is on. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Adds an opt-in review-loop skill to drive PRs through automated Copilot review, and wires that skill into the feature-development workflow for sub-PR and final PR gates.
Changes:
- Adds the
review-loopskill and background polling script for detecting fresh Copilot reviews. - Adds default-off review-loop gates to multi-PR fan-out and PR-to-main workflow steps.
- Updates the project CLAUDE skill table to include the new skill.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
skills/review-loop/SKILL.md |
Defines the review-loop workflow, Copilot request/re-request mechanics, termination rules, and pushback handling. |
skills/review-loop/templates/await-copilot-review.sh |
Adds a polling helper that waits for a fresh Copilot review newer than a watermark. |
skills/developing-a-feature/SKILL.md |
Adds opt-in review-loop gates for sub-PRs and final PRs. |
skills/fanning-out-with-worktrees/SKILL.md |
Runs the review-loop during sub-PR ripening when enabled. |
templates/project-CLAUDE.md |
Lists the new review-loop skill in the workflow skill table. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| filter="[.[] | ||
| | select((.user.login==\"$BOT\") or (.user.login|test(\"[Cc]opilot\"))) | ||
| | select(.submitted_at > \"$SINCE\") | ||
| | {id, state, submitted_at, body}] | ||
| | sort_by(.submitted_at) | reverse" | ||
|
|
||
| while :; do | ||
| found=$(gh api "repos/$REPO/pulls/$PR/reviews" --paginate --jq "$filter") |
There was a problem hiding this comment.
Fixed in 79a30e6. Verified the bug is real (a multi-page no-match did print repeated []). --slurp isn't usable here — gh rejects it together with --jq — so instead the filter now emits each matching review as a bare object rather than wrapping in an outer array. A no-match then prints nothing on every page, so the emptiness check is correct. Confirmed against a real paginated PR: past-watermark finds the reviews, future-watermark across forced single-item pages yields zero bytes.
There was a problem hiding this comment.
Fixed in 79a30e6. Confirmed the bug is real (a multi-page no-match did print repeated []). --slurp isn't usable here — gh rejects it together with --jq — so instead the filter now emits each matching review as a bare object rather than wrapping in an outer array; a no-match then prints nothing on every page and the emptiness check is correct. Verified against a real paginated PR: past-watermark finds the reviews, future-watermark across forced single-item pages yields zero bytes.
|
|
||
| Sub-PRs into the feature branch are owned by `feature-dev-workflow:fanning-out-with-worktrees`, not this step. | ||
|
|
||
| Once the PR to main is open — the single-PR PR, or the multi-PR integration PR — ask the user (via `AskUserQuestion`) whether to run an automated review-loop on it before handing off for external human review. If yes, **OPTIONAL SUB-SKILL:** `feature-dev-workflow:review-loop` against this PR; it drives the PR's automated (Copilot) review to clean. This is the interactive context, so a comment the loop wants to push back on pauses for the user. If no, hand off as-is. |
…oller Address Copilot review on #14. Under set -euo pipefail a transient gh/network failure in the poll loop killed the script with gh's exit code, which the caller could not distinguish from the 124 timeout. Guard the poll with || true so a blip or rate-limit just retries next interval; only a real timeout exits 124. Also document why the submitted_at watermark comparison is a plain string compare (GitHub always returns UTC ISO-8601 Z timestamps, whose lexical order is chronological) rather than converting to epoch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two issues Copilot flagged, both verified against real PRs before fixing: - Wait poller false positive across pages. `gh api --paginate --jq` runs the filter once per page and concatenates, so an array-wrapped filter emitted `[]` per page — a multi-page no-match printed `[]\n[]…`, which is non-empty and != `[]`, read as "review found". (`--slurp` is not an option: gh rejects it together with `--jq`.) The filter now emits each matching review as a bare object, so a no-match prints nothing on every page and the emptiness test is correct. Documented why the watermark stays a string comparison (GitHub returns fixed-width UTC ISO-8601 Z). - Stale review commit at Gate B. The loop ran at PR-open, but Step 7's teardown is the last commit on the branch, so a clean review before teardown is stale against what external reviewers see. Gate B now runs the loop on the final PR diff — deferred until after the teardown commit (or re-run), and when CI gates teardown, once that commit is green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| filter="[.[] | ||
| | select((.user.login==\"$BOT\") or (.user.login|test(\"[Cc]opilot\"))) | ||
| | select(.submitted_at > \"$SINCE\") | ||
| | {id, state, submitted_at, body}] | ||
| | sort_by(.submitted_at) | reverse" |
| @@ -0,0 +1,129 @@ | |||
| --- | |||
| name: review-loop | |||
…fixes My earlier commit (8b98754) left the wait poller broken — it kept the array-wrapped jq filter but dropped the != "[]" guard, so the first poll always false-positived. This rebuilds the poller and properly lands the two issues Copilot flagged on #14, both verified against a real PR: - Paginated false positive. gh runs the --jq filter once per page under --paginate and concatenates (and --slurp is rejected with --jq). The filter now emits each matching review as a bare object instead of an array, so a multi-page no-match prints nothing rather than repeated empty arrays. - Error bodies no longer mistaken for reviews. On an HTTP error gh prints the error JSON to stdout and exits non-zero; output is now captured only when gh succeeds, so a 404/rate-limit/blip leaves the result empty and the loop retries instead of false-positiving. (Found while testing.) - Stale review commit at Gate B (developing-a-feature). The loop ran at PR-open, but Step 7's teardown is the last commit on the branch, so a clean review before teardown is stale against what external reviewers see. Gate B now runs the loop on the final PR diff. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…kill # Conflicts: # skills/fanning-out-with-worktrees/SKILL.md
Description
Adds an opt-in
review-loopskill that drives a pull request's automated (Copilot) review toward clean, and wires it into the feature workflow at two default-off gates. Nothing in the workflow currently loops a PR through automated review; this lets a session request Copilot, wait without blocking the session, address the feedback with real rigor, and re-request — only when the user asks for it.Changes
review-loopskill: request Copilot viagh pr edit --add-reviewer "@copilot"(the RESTrequested_reviewersendpoint rejects Copilot), detect unavailability, wait in the background, triage every open comment throughsuperpowers:receiving-code-review, address, re-request; terminate when a fresh review is silent and threads are resolved, with a 3-round cap.await-copilot-review.sh(gh + shell builtins only — no new dependency).developing-a-feature: a per-sub-PR gate (sub_pr_review_loopin the state file, run byfanning-out-with-worktreesat ripening) and a PR-to-main gate before external review. Pushback pauses for the user interactively, or bubbles up to the wave checkpoint in autonomous fan-out.Challenges
The mechanism is the load-bearing part and was easy to get wrong in two places, both verified against live GitHub rather than assumed:
requested_reviewersAPI call, but that endpoint 422s on Copilot — the only working path isgh pr edit --add-reviewer "@copilot"(gh ≥ 2.88.0).Related
Touches
developing-a-featurealongside #14 (single-PR SDD), in a different section; both land review wiring infanning-out-with-worktrees/developing-a-featureand will need a trivial rebase depending on merge order.Testing
Built via
superpowers:writing-skills(RED → GREEN → REFACTOR) with isolated baseline subagents. RED baselines reached for the wrong (422) request command, polled synchronously holding the session, and looped without an unavailability check. With the skill, isolated agents used the correct command, waited in the background, triaged before applying, detected the unavailable case and exited, and respected the 3-round cap. The re-request fix was driven by a live test on a real PR (confirming re-request is unreliable) and re-verified in isolation: the agent captures a watermark, re-triggers via remove-then-re-add, confirms by watermark, and stops-and-reports on timeout rather than spinning. The wait script's watermark filter was checked against real Copilot-reviewed PRs (finds a newer review, excludes a stale one).