Skip to content

feat(cos): multiple ordered code-review agents per task#459

Merged
atomantic merged 13 commits into
mainfrom
feat/multi-reviewer-ordered
May 23, 2026
Merged

feat(cos): multiple ordered code-review agents per task#459
atomantic merged 13 commits into
mainfrom
feat/multi-reviewer-ordered

Conversation

@atomantic
Copy link
Copy Markdown
Owner

Summary

  • Replaces the single per-task reviewer with an ordered reviewers list, plus reviewStopMode (all/on-findings/on-clean) and reviewerApplies, so the custom-task form and scheduled-task form can run codex/gemini/copilot in a chosen order.
  • New ReviewerPicker component (click-order pills with ↑↓ reorder + ✕ remove, stop-mode select shown for 2+ reviewers, reviewer-applies toggle shown when a non-copilot reviewer is present). Wired into TaskAddForm, ScheduleTab, and the AgentCard badge.
  • Threads the list end-to-end: schema + sanitizeTaskMetadata, cos.addTask, both spawn paths' cleanup (agentLifecycle runner path and agentCliSpawning direct path), and the agent prompts as slashdo --review-with a,b,c [--review-stop-on-findings|--review-stop-on-clean] [--reviewer-applies]. The system-spawned review-loop follow-up runs each reviewer in order; Copilot is pre-requested only when it's in the list, and CLI reviewers still run on non-GitHub forges.
  • Read-side normalizeReviewers() (server validation.js + client constants.js mirror) keeps legacy single-reviewer tasks/schedules working.
  • Bumps the bundled lib/slashdo submodule v2.18.0 → v3.1.0 (adds the multi-reviewer --review-with list + stop-mode + --reviewer-applies), and adds .claude/commands/do/{depfree,pr-better,scan}.md symlinks to match the new command set.

Test plan

  • cd server && npm test — full suite green (incl. new validation, agentPromptBuilder, cleanupAgentWorktree, agentCliSpawning reviewer cases)
  • cd client && npm test — green (incl. new ReviewerPicker tests)
  • npm run build — clean
  • Browser: enabled Review Loop in the task form, added/reordered/removed reviewers, confirmed stop-mode + reviewer-applies appear conditionally
  • Multi-reviewer review loop (codex → gemini → copilot) runs on this PR

atomantic added 3 commits May 23, 2026 12:42
Replace the single `reviewer` field with an ordered `reviewers` list plus
`reviewStopMode` and `reviewerApplies`, so the custom-task and scheduled-task
forms can run codex/gemini/copilot in a chosen order. Threads the list through
to slashdo as `--review-with a,b,c [--review-stop-on-*] [--reviewer-applies]`.
Bumps the bundled slashdo submodule to v3.1.0, which adds the multi-reviewer
flag this feature drives.
- Only pre-request the native Copilot review when copilot LEADS the reviewer
  list; when it trails a CLI reviewer the follow-up requests it at its turn so
  Copilot reviews the post-fix diff, not a stale one.
- Under an explicit stop-mode, allow merging on a `partial` aggregate (an
  intentional short-circuit) — previously the merge step only accepted
  `clean`/`too-large`, leaving stop-mode runs open.
- Emit only the per-reviewer-kind bullet that applies (copilot vs CLI) in the
  ordered follow-up step instead of always printing both.
- Qualify "(for Copilot) resolve threads" in the compact follow-up variant.
- Fix "the the review loop" double article in the merge step for the
  multi/CLI-reviewer case.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR upgrades the CoS “review loop” from a single per-task reviewer to an ordered multi-reviewer pipeline (e.g., codex → gemini → copilot), with configurable stop behavior and an option for CLI reviewers to apply fixes automatically. It threads the new reviewer list through task creation, validation/sanitization, agent spawning/cleanup (runner + direct CLI spawn paths), prompt generation, and the client UI (task + schedule forms, agent card badge), while keeping legacy reviewer tasks working via normalizeReviewers().

Changes:

  • Replace singular reviewer with ordered reviewers[], plus reviewStopMode and reviewerApplies, end-to-end (schemas, metadata sanitization, addTask, spawning/cleanup, prompts).
  • Add ReviewerPicker UI (add in click-order, reorder, remove; conditional stop-mode + reviewer-applies controls) and wire it into TaskAddForm and ScheduleTab.
  • Update prompt instructions + tests to reflect multi-reviewer slashdo flags (--review-with a,b,c, --review-stop-on-*, --reviewer-applies) and multi-reviewer follow-up behavior.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
server/services/taskSchedule.js Updates task prompt wording to describe ordered multi-reviewer review steps.
server/services/cos.js Threads normalized reviewer lists into task prompt templating and task metadata construction.
server/services/cleanupAgentWorktree.test.js Expands coverage for multi-reviewer follow-up spawning and Copilot pre-request rules.
server/services/agentPromptBuilder.test.js Updates/extends prompt expectations for reviewer lists and new flags/stop-modes.
server/services/agentPromptBuilder.js Implements multi-reviewer prompt wiring, flag construction, and updated completion workflow text.
server/services/agentLifecycle.js Updates spawn/cleanup paths to pass reviewer lists and handle Copilot pre-request only when leading.
server/services/agentCliSpawning.test.js Ensures direct CLI spawning path passes reviewers/stop-mode/applies into cleanup.
server/services/agentCliSpawning.js Threads normalized reviewers + flags through direct-spawn cleanup options.
server/lib/validation.test.js Adds tests for reviewers list normalization and slashdo arg building.
server/lib/validation.js Adds normalizeReviewers, stop-mode constants, and buildReviewWithArgs; extends schemas/sanitization.
client/src/components/cos/TaskAddForm.jsx Replaces reviewer dropdown with ReviewerPicker and submits new reviewer metadata fields.
client/src/components/cos/tabs/ScheduleTab.jsx Replaces reviewer dropdown with ReviewerPicker for scheduled tasks and writes new metadata keys.
client/src/components/cos/tabs/AgentCard.jsx Updates review-loop badge to display ordered reviewer lists.
client/src/components/cos/ReviewerPicker.test.jsx Adds unit tests for add/remove/reorder and conditional controls.
client/src/components/cos/ReviewerPicker.jsx New ordered multi-reviewer picker component.
client/src/components/cos/constants.js Adds client-side reviewer normalization + stop-mode constants (mirroring server).
.changelog/NEXT.md Documents the new multi-reviewer capability and slashdo submodule bump.
Comments suppressed due to low confidence (1)

server/services/agentLifecycle.js:1547

  • The documentation comment above spawnReviewLoopFollowUp still describes spawning a Copilot-only /do:rpr loop and GitHub-only behavior. The implementation now supports an ordered reviewers list (dropping copilot only on non-GitHub forges) plus reviewStopMode/reviewerApplies. Please update that docblock to match the new multi-reviewer semantics and parameter names.
 * branch (via createWorktree's `existingBranch` option) so it can fix-and-push
 * without trampling concurrent agents.
 */
export async function spawnReviewLoopFollowUp({ originalAgentId, originalTask, prUrl, prBranch, sourceWorkspace, reviewers = DEFAULT_REVIEWERS, reviewStopMode = DEFAULT_REVIEW_STOP_MODE, reviewerApplies = false }) {
  if (!prUrl || !prBranch) return null;

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread server/services/agentLifecycle.js
…eviewer flow

The block comment still described the singular `reviewer` choice and Copilot-only
follow-up; document the ordered `reviewers` list, copilot-leads pre-request rule,
and reviewStopMode/reviewerApplies threading.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.

Comment thread client/src/components/cos/ReviewerPicker.jsx
`key={value}` would collide if the stored reviewers list ever contained
duplicates (legacy metadata / manual edits), corrupting reorder/remove. Render
an order-preserving de-duped list so keys stay unique without reintroducing the
empty-state coercion.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 4 comments.

Comment thread client/src/components/cos/tabs/ScheduleTab.jsx
Comment thread server/services/agentLifecycle.js Outdated
Comment thread server/services/agentLifecycle.js
Comment thread server/services/taskSchedule.js Outdated
…doc fixes

- Always hand the full ordered reviewer list to the follow-up; it requests
  Copilot at its turn, so a failed/absent pre-request (incl. copilot-only) no
  longer leaves the PR open. The only drop (copilot on non-GitHub forge) stays
  centralized in spawnReviewLoopFollowUp.
- ScheduleTab feeds the picker the stored list without empty→copilot coercion so
  an explicit clear shows the empty-state hint (consistent with TaskAddForm).
- Update spawnReviewLoopFollowUp docblock + Phase 6 schedule prose for the
  multi-reviewer / stop-mode / reviewer-applies flow.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

server/services/agentPromptBuilder.js:286

  • The follow-up merge/verification steps are hardcoded to gh pr merge and gh pr view, but follow-up tasks may be spawned for GitLab MRs when CLI reviewers are configured (copilot is stripped, not the whole follow-up). On non-GitHub forges these commands will fail, leaving the MR open and the worktree/branch potentially leaking. Please choose the correct forge CLI (gh vs glab) for merge/view based on reviewLoopPRHost/reviewLoopPRUrl (or, if review-loop follow-ups are intended to be GitHub-only, block spawning on non-GitHub hosts even for CLI reviewers and emit a clear warning).
   gh pr merge "${prUrl}" --squash --delete-branch
   \`\`\`
   ${prOwner && prRepo && prNumber ? `(Equivalent: \`gh pr merge ${prNumber} --repo ${prOwner}/${prRepo} --squash --delete-branch\`.)` : ''}
   You have already verified the review is clean, so force the immediate merge. Adding any merge-deferral flag would leave the PR open after you exit.
5. Confirm the PR is actually merged before exiting: \`gh pr view "${prUrl}" --json state -q .state\` must return \`MERGED\`. If it returns \`OPEN\` or \`CLOSED\`, investigate (a check is failing, a thread is still unresolved, or branch protection is blocking) — fix and retry the merge. Do NOT exit until state is \`MERGED\`.

Comment thread server/services/agentPromptBuilder.js Outdated
…utral

The follow-up keeps CLI reviewers (codex/gemini/claude) on non-GitHub forges
(only copilot is stripped there), so the review-loop prompt no longer hardcodes
`gh pr diff` — it points at the CLI's own base-diff mode / `git diff <base>...HEAD`,
noting `gh pr diff` works on GitHub.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

server/services/agentLifecycle.js:1597

  • This metadata block includes a comment about skipCommitCheck, but no such metadata field is actually set here (and there are no other references in the codebase). This is misleading for future maintainers—either add the intended skipCommitCheck flag (if it exists) or remove/update the comment to match actual behavior.
      sourceTaskId: originalTask?.id || null,
      sourceAgentId: originalAgentId || null,
      // skipCommitCheck: this task may exit cleanly with zero new commits if the
      // initial Copilot review came back clean. Don't false-positive that as failure.
      readOnly: false

Comment thread server/services/agentPromptBuilder.js
… comment

- buildPostPRMergeSteps matches the known stop-modes (on-findings/on-clean)
  explicitly, so an invalid reviewStopMode falls through to the safe
  clean/too-large gate instead of allowing a `partial` merge.
- Replace the stale `skipCommitCheck` comment (no such field) with an accurate
  note about the zero-commit clean exit.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.

Comment thread server/lib/validation.js
Comment thread client/src/components/cos/tabs/ScheduleTab.jsx
Comment thread server/services/agentPromptBuilder.js
atomantic added 2 commits May 23, 2026 13:43
The "short-circuits when every record is already subscribed" test cleared the
peerFetch mock after a fixed 10ms sleep, but a subscribe's fire-and-forget push
fires peerFetch a tick later — under CI load that call leaked past the sleep and
tripped `expect(peerFetch).not.toHaveBeenCalled()`. Add a drain-only
`__drainForTests()` hook (awaits the write/push tail without resetting state)
and use it to settle the initial pushes deterministically before the assertion.
…ording

- Update sanitizeTaskMetadata docblock to describe the value-constrained
  reviewer/reviewers/reviewStopMode keys (no longer boolean-only).
- ScheduleTab coerces reviewerApplies as true/'true' so a legacy string
  'false' doesn't read as checked.
- The multi-reviewer follow-up copilot bullet now says "wait for the
  pre-requested review" when Copilot leads (it was already requested) and
  "request at its turn" only when it trails — avoids a redundant request.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

server/services/agentLifecycle.js:1485

  • canSpawnFollowUp is computed from the full normalized reviewer list, but spawnReviewLoopFollowUp() may drop copilot on non-GitHub forges and return null when that empties the effective list. Right now that results in silently not spawning any follow-up (and no warning) even though reviewLoop was enabled. Consider detecting a null return here and emitting a log/warning explaining that the reviewer chain is empty after forge filtering (e.g., copilot-only on GitLab), so the PR/MR won’t be auto-reviewed/merged.
      const canSpawnFollowUp = shouldRequestCopilot && reviewerList.length > 0;
      if (canSpawnFollowUp) {
        await spawnReviewLoopFollowUp({
          originalAgentId: agentId,
          originalTask,

Comment thread server/services/agentPromptBuilder.js Outdated
…ering

The full-path "Code Review" note implied the system always pre-requests a native
Copilot review when copilot is in the list. Branch on whether copilot LEADS:
system pre-requests only when it leads; otherwise the follow-up requests it at
its turn (and skips it on non-GitHub forges).
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 2 comments.

Comment thread server/services/taskSchedule.js Outdated
Comment thread server/services/taskSchedule.js Outdated
…viewers}

The placeholder now substitutes an ordered, comma-joined reviewer list, so the
singular name was misleading. Rename it across the schedule/reference prompt
templates and their substitution sites (cos.js, referenceRepos.js) + test mocks.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.

Comment thread server/services/taskSchedule.js Outdated
Comment thread .changelog/NEXT.md Outdated
Comment thread server/services/cos.js Outdated
…rename

- Correct the changelog: Copilot is pre-requested only when it LEADS the list
  (not merely "in the list").
- Clarify the claim-flow's 3-round cap is intentionally more conservative than
  the CoS follow-up's 10-iteration cap (the two are distinct flows).
- Rename the cos.js local `reviewer` to `reviewersCsv` — it holds the
  comma-joined reviewer list, not a single reviewer.
@atomantic atomantic merged commit 29d3f2a into main May 23, 2026
2 checks passed
@atomantic atomantic deleted the feat/multi-reviewer-ordered branch May 23, 2026 21:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants