feat(cos): multiple ordered code-review agents per task#459
Conversation
Replace the single `reviewer` field with an ordered `reviewers` list plus `reviewStopMode` and `reviewerApplies`, so the custom-task and scheduled-task forms can run codex/gemini/copilot in a chosen order. Threads the list through to slashdo as `--review-with a,b,c [--review-stop-on-*] [--reviewer-applies]`. Bumps the bundled slashdo submodule to v3.1.0, which adds the multi-reviewer flag this feature drives.
- Only pre-request the native Copilot review when copilot LEADS the reviewer list; when it trails a CLI reviewer the follow-up requests it at its turn so Copilot reviews the post-fix diff, not a stale one. - Under an explicit stop-mode, allow merging on a `partial` aggregate (an intentional short-circuit) — previously the merge step only accepted `clean`/`too-large`, leaving stop-mode runs open.
- Emit only the per-reviewer-kind bullet that applies (copilot vs CLI) in the ordered follow-up step instead of always printing both. - Qualify "(for Copilot) resolve threads" in the compact follow-up variant. - Fix "the the review loop" double article in the merge step for the multi/CLI-reviewer case.
There was a problem hiding this comment.
Pull request overview
This PR upgrades the CoS “review loop” from a single per-task reviewer to an ordered multi-reviewer pipeline (e.g., codex → gemini → copilot), with configurable stop behavior and an option for CLI reviewers to apply fixes automatically. It threads the new reviewer list through task creation, validation/sanitization, agent spawning/cleanup (runner + direct CLI spawn paths), prompt generation, and the client UI (task + schedule forms, agent card badge), while keeping legacy reviewer tasks working via normalizeReviewers().
Changes:
- Replace singular
reviewerwith orderedreviewers[], plusreviewStopModeandreviewerApplies, end-to-end (schemas, metadata sanitization, addTask, spawning/cleanup, prompts). - Add
ReviewerPickerUI (add in click-order, reorder, remove; conditional stop-mode + reviewer-applies controls) and wire it into TaskAddForm and ScheduleTab. - Update prompt instructions + tests to reflect multi-reviewer slashdo flags (
--review-with a,b,c,--review-stop-on-*,--reviewer-applies) and multi-reviewer follow-up behavior.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| server/services/taskSchedule.js | Updates task prompt wording to describe ordered multi-reviewer review steps. |
| server/services/cos.js | Threads normalized reviewer lists into task prompt templating and task metadata construction. |
| server/services/cleanupAgentWorktree.test.js | Expands coverage for multi-reviewer follow-up spawning and Copilot pre-request rules. |
| server/services/agentPromptBuilder.test.js | Updates/extends prompt expectations for reviewer lists and new flags/stop-modes. |
| server/services/agentPromptBuilder.js | Implements multi-reviewer prompt wiring, flag construction, and updated completion workflow text. |
| server/services/agentLifecycle.js | Updates spawn/cleanup paths to pass reviewer lists and handle Copilot pre-request only when leading. |
| server/services/agentCliSpawning.test.js | Ensures direct CLI spawning path passes reviewers/stop-mode/applies into cleanup. |
| server/services/agentCliSpawning.js | Threads normalized reviewers + flags through direct-spawn cleanup options. |
| server/lib/validation.test.js | Adds tests for reviewers list normalization and slashdo arg building. |
| server/lib/validation.js | Adds normalizeReviewers, stop-mode constants, and buildReviewWithArgs; extends schemas/sanitization. |
| client/src/components/cos/TaskAddForm.jsx | Replaces reviewer dropdown with ReviewerPicker and submits new reviewer metadata fields. |
| client/src/components/cos/tabs/ScheduleTab.jsx | Replaces reviewer dropdown with ReviewerPicker for scheduled tasks and writes new metadata keys. |
| client/src/components/cos/tabs/AgentCard.jsx | Updates review-loop badge to display ordered reviewer lists. |
| client/src/components/cos/ReviewerPicker.test.jsx | Adds unit tests for add/remove/reorder and conditional controls. |
| client/src/components/cos/ReviewerPicker.jsx | New ordered multi-reviewer picker component. |
| client/src/components/cos/constants.js | Adds client-side reviewer normalization + stop-mode constants (mirroring server). |
| .changelog/NEXT.md | Documents the new multi-reviewer capability and slashdo submodule bump. |
Comments suppressed due to low confidence (1)
server/services/agentLifecycle.js:1547
- The documentation comment above
spawnReviewLoopFollowUpstill describes spawning a Copilot-only/do:rprloop and GitHub-only behavior. The implementation now supports an orderedreviewerslist (droppingcopilotonly on non-GitHub forges) plusreviewStopMode/reviewerApplies. Please update that docblock to match the new multi-reviewer semantics and parameter names.
* branch (via createWorktree's `existingBranch` option) so it can fix-and-push
* without trampling concurrent agents.
*/
export async function spawnReviewLoopFollowUp({ originalAgentId, originalTask, prUrl, prBranch, sourceWorkspace, reviewers = DEFAULT_REVIEWERS, reviewStopMode = DEFAULT_REVIEW_STOP_MODE, reviewerApplies = false }) {
if (!prUrl || !prBranch) return null;
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…eviewer flow The block comment still described the singular `reviewer` choice and Copilot-only follow-up; document the ordered `reviewers` list, copilot-leads pre-request rule, and reviewStopMode/reviewerApplies threading.
`key={value}` would collide if the stored reviewers list ever contained
duplicates (legacy metadata / manual edits), corrupting reorder/remove. Render
an order-preserving de-duped list so keys stay unique without reintroducing the
empty-state coercion.
…doc fixes - Always hand the full ordered reviewer list to the follow-up; it requests Copilot at its turn, so a failed/absent pre-request (incl. copilot-only) no longer leaves the PR open. The only drop (copilot on non-GitHub forge) stays centralized in spawnReviewLoopFollowUp. - ScheduleTab feeds the picker the stored list without empty→copilot coercion so an explicit clear shows the empty-state hint (consistent with TaskAddForm). - Update spawnReviewLoopFollowUp docblock + Phase 6 schedule prose for the multi-reviewer / stop-mode / reviewer-applies flow.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
server/services/agentPromptBuilder.js:286
- The follow-up merge/verification steps are hardcoded to
gh pr mergeandgh pr view, but follow-up tasks may be spawned for GitLab MRs when CLI reviewers are configured (copilot is stripped, not the whole follow-up). On non-GitHub forges these commands will fail, leaving the MR open and the worktree/branch potentially leaking. Please choose the correct forge CLI (ghvsglab) for merge/view based onreviewLoopPRHost/reviewLoopPRUrl(or, if review-loop follow-ups are intended to be GitHub-only, block spawning on non-GitHub hosts even for CLI reviewers and emit a clear warning).
gh pr merge "${prUrl}" --squash --delete-branch
\`\`\`
${prOwner && prRepo && prNumber ? `(Equivalent: \`gh pr merge ${prNumber} --repo ${prOwner}/${prRepo} --squash --delete-branch\`.)` : ''}
You have already verified the review is clean, so force the immediate merge. Adding any merge-deferral flag would leave the PR open after you exit.
5. Confirm the PR is actually merged before exiting: \`gh pr view "${prUrl}" --json state -q .state\` must return \`MERGED\`. If it returns \`OPEN\` or \`CLOSED\`, investigate (a check is failing, a thread is still unresolved, or branch protection is blocking) — fix and retry the merge. Do NOT exit until state is \`MERGED\`.
…utral The follow-up keeps CLI reviewers (codex/gemini/claude) on non-GitHub forges (only copilot is stripped there), so the review-loop prompt no longer hardcodes `gh pr diff` — it points at the CLI's own base-diff mode / `git diff <base>...HEAD`, noting `gh pr diff` works on GitHub.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
server/services/agentLifecycle.js:1597
- This metadata block includes a comment about
skipCommitCheck, but no such metadata field is actually set here (and there are no other references in the codebase). This is misleading for future maintainers—either add the intendedskipCommitCheckflag (if it exists) or remove/update the comment to match actual behavior.
sourceTaskId: originalTask?.id || null,
sourceAgentId: originalAgentId || null,
// skipCommitCheck: this task may exit cleanly with zero new commits if the
// initial Copilot review came back clean. Don't false-positive that as failure.
readOnly: false
… comment - buildPostPRMergeSteps matches the known stop-modes (on-findings/on-clean) explicitly, so an invalid reviewStopMode falls through to the safe clean/too-large gate instead of allowing a `partial` merge. - Replace the stale `skipCommitCheck` comment (no such field) with an accurate note about the zero-commit clean exit.
The "short-circuits when every record is already subscribed" test cleared the peerFetch mock after a fixed 10ms sleep, but a subscribe's fire-and-forget push fires peerFetch a tick later — under CI load that call leaked past the sleep and tripped `expect(peerFetch).not.toHaveBeenCalled()`. Add a drain-only `__drainForTests()` hook (awaits the write/push tail without resetting state) and use it to settle the initial pushes deterministically before the assertion.
…ording - Update sanitizeTaskMetadata docblock to describe the value-constrained reviewer/reviewers/reviewStopMode keys (no longer boolean-only). - ScheduleTab coerces reviewerApplies as true/'true' so a legacy string 'false' doesn't read as checked. - The multi-reviewer follow-up copilot bullet now says "wait for the pre-requested review" when Copilot leads (it was already requested) and "request at its turn" only when it trails — avoids a redundant request.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 23 out of 23 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
server/services/agentLifecycle.js:1485
canSpawnFollowUpis computed from the full normalized reviewer list, butspawnReviewLoopFollowUp()may dropcopiloton non-GitHub forges and returnnullwhen that empties the effective list. Right now that results in silently not spawning any follow-up (and no warning) even though reviewLoop was enabled. Consider detecting anullreturn here and emitting a log/warning explaining that the reviewer chain is empty after forge filtering (e.g., copilot-only on GitLab), so the PR/MR won’t be auto-reviewed/merged.
const canSpawnFollowUp = shouldRequestCopilot && reviewerList.length > 0;
if (canSpawnFollowUp) {
await spawnReviewLoopFollowUp({
originalAgentId: agentId,
originalTask,
…ering The full-path "Code Review" note implied the system always pre-requests a native Copilot review when copilot is in the list. Branch on whether copilot LEADS: system pre-requests only when it leads; otherwise the follow-up requests it at its turn (and skips it on non-GitHub forges).
…viewers} The placeholder now substitutes an ordered, comma-joined reviewer list, so the singular name was misleading. Rename it across the schedule/reference prompt templates and their substitution sites (cos.js, referenceRepos.js) + test mocks.
…rename - Correct the changelog: Copilot is pre-requested only when it LEADS the list (not merely "in the list"). - Clarify the claim-flow's 3-round cap is intentionally more conservative than the CoS follow-up's 10-iteration cap (the two are distinct flows). - Rename the cos.js local `reviewer` to `reviewersCsv` — it holds the comma-joined reviewer list, not a single reviewer.
Summary
reviewerwith an orderedreviewerslist, plusreviewStopMode(all/on-findings/on-clean) andreviewerApplies, so the custom-task form and scheduled-task form can runcodex/gemini/copilotin a chosen order.ReviewerPickercomponent (click-order pills with ↑↓ reorder + ✕ remove, stop-mode select shown for 2+ reviewers, reviewer-applies toggle shown when a non-copilot reviewer is present). Wired intoTaskAddForm,ScheduleTab, and theAgentCardbadge.sanitizeTaskMetadata,cos.addTask, both spawn paths' cleanup (agentLifecyclerunner path andagentCliSpawningdirect path), and the agent prompts as slashdo--review-with a,b,c [--review-stop-on-findings|--review-stop-on-clean] [--reviewer-applies]. The system-spawned review-loop follow-up runs each reviewer in order; Copilot is pre-requested only when it's in the list, and CLI reviewers still run on non-GitHub forges.normalizeReviewers()(servervalidation.js+ clientconstants.jsmirror) keeps legacy single-reviewertasks/schedules working.lib/slashdosubmodule v2.18.0 → v3.1.0 (adds the multi-reviewer--review-withlist + stop-mode +--reviewer-applies), and adds.claude/commands/do/{depfree,pr-better,scan}.mdsymlinks to match the new command set.Test plan
cd server && npm test— full suite green (incl. newvalidation,agentPromptBuilder,cleanupAgentWorktree,agentCliSpawningreviewer cases)cd client && npm test— green (incl. newReviewerPickertests)npm run build— clean