feat(cos): multiple ordered code-review agents per task by atomantic · Pull Request #459 · atomantic/PortOS

atomantic · 2026-05-23T19:43:14Z

Summary

Replaces the single per-task reviewer with an ordered reviewers list, plus reviewStopMode (all/on-findings/on-clean) and reviewerApplies, so the custom-task form and scheduled-task form can run codex/gemini/copilot in a chosen order.
New ReviewerPicker component (click-order pills with ↑↓ reorder + ✕ remove, stop-mode select shown for 2+ reviewers, reviewer-applies toggle shown when a non-copilot reviewer is present). Wired into TaskAddForm, ScheduleTab, and the AgentCard badge.
Threads the list end-to-end: schema + sanitizeTaskMetadata, cos.addTask, both spawn paths' cleanup (agentLifecycle runner path and agentCliSpawning direct path), and the agent prompts as slashdo --review-with a,b,c [--review-stop-on-findings|--review-stop-on-clean] [--reviewer-applies]. The system-spawned review-loop follow-up runs each reviewer in order; Copilot is pre-requested only when it's in the list, and CLI reviewers still run on non-GitHub forges.
Read-side normalizeReviewers() (server validation.js + client constants.js mirror) keeps legacy single-reviewer tasks/schedules working.
Bumps the bundled lib/slashdo submodule v2.18.0 → v3.1.0 (adds the multi-reviewer --review-with list + stop-mode + --reviewer-applies), and adds .claude/commands/do/{depfree,pr-better,scan}.md symlinks to match the new command set.

Test plan

cd server && npm test — full suite green (incl. new validation, agentPromptBuilder, cleanupAgentWorktree, agentCliSpawning reviewer cases)
cd client && npm test — green (incl. new ReviewerPicker tests)
npm run build — clean
Browser: enabled Review Loop in the task form, added/reordered/removed reviewers, confirmed stop-mode + reviewer-applies appear conditionally
Multi-reviewer review loop (codex → gemini → copilot) runs on this PR

Replace the single `reviewer` field with an ordered `reviewers` list plus `reviewStopMode` and `reviewerApplies`, so the custom-task and scheduled-task forms can run codex/gemini/copilot in a chosen order. Threads the list through to slashdo as `--review-with a,b,c [--review-stop-on-*] [--reviewer-applies]`. Bumps the bundled slashdo submodule to v3.1.0, which adds the multi-reviewer flag this feature drives.

- Only pre-request the native Copilot review when copilot LEADS the reviewer list; when it trails a CLI reviewer the follow-up requests it at its turn so Copilot reviews the post-fix diff, not a stale one. - Under an explicit stop-mode, allow merging on a `partial` aggregate (an intentional short-circuit) — previously the merge step only accepted `clean`/`too-large`, leaving stop-mode runs open.

- Emit only the per-reviewer-kind bullet that applies (copilot vs CLI) in the ordered follow-up step instead of always printing both. - Qualify "(for Copilot) resolve threads" in the compact follow-up variant. - Fix "the the review loop" double article in the merge step for the multi/CLI-reviewer case.

Copilot

Pull request overview

This PR upgrades the CoS “review loop” from a single per-task reviewer to an ordered multi-reviewer pipeline (e.g., codex → gemini → copilot), with configurable stop behavior and an option for CLI reviewers to apply fixes automatically. It threads the new reviewer list through task creation, validation/sanitization, agent spawning/cleanup (runner + direct CLI spawn paths), prompt generation, and the client UI (task + schedule forms, agent card badge), while keeping legacy reviewer tasks working via normalizeReviewers().

Changes:

Replace singular reviewer with ordered reviewers[], plus reviewStopMode and reviewerApplies, end-to-end (schemas, metadata sanitization, addTask, spawning/cleanup, prompts).
Add ReviewerPicker UI (add in click-order, reorder, remove; conditional stop-mode + reviewer-applies controls) and wire it into TaskAddForm and ScheduleTab.
Update prompt instructions + tests to reflect multi-reviewer slashdo flags (--review-with a,b,c, --review-stop-on-*, --reviewer-applies) and multi-reviewer follow-up behavior.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
server/services/taskSchedule.js	Updates task prompt wording to describe ordered multi-reviewer review steps.
server/services/cos.js	Threads normalized reviewer lists into task prompt templating and task metadata construction.
server/services/cleanupAgentWorktree.test.js	Expands coverage for multi-reviewer follow-up spawning and Copilot pre-request rules.
server/services/agentPromptBuilder.test.js	Updates/extends prompt expectations for reviewer lists and new flags/stop-modes.
server/services/agentPromptBuilder.js	Implements multi-reviewer prompt wiring, flag construction, and updated completion workflow text.
server/services/agentLifecycle.js	Updates spawn/cleanup paths to pass reviewer lists and handle Copilot pre-request only when leading.
server/services/agentCliSpawning.test.js	Ensures direct CLI spawning path passes reviewers/stop-mode/applies into cleanup.
server/services/agentCliSpawning.js	Threads normalized reviewers + flags through direct-spawn cleanup options.
server/lib/validation.test.js	Adds tests for reviewers list normalization and slashdo arg building.
server/lib/validation.js	Adds `normalizeReviewers`, stop-mode constants, and `buildReviewWithArgs`; extends schemas/sanitization.
client/src/components/cos/TaskAddForm.jsx	Replaces reviewer dropdown with `ReviewerPicker` and submits new reviewer metadata fields.
client/src/components/cos/tabs/ScheduleTab.jsx	Replaces reviewer dropdown with `ReviewerPicker` for scheduled tasks and writes new metadata keys.
client/src/components/cos/tabs/AgentCard.jsx	Updates review-loop badge to display ordered reviewer lists.
client/src/components/cos/ReviewerPicker.test.jsx	Adds unit tests for add/remove/reorder and conditional controls.
client/src/components/cos/ReviewerPicker.jsx	New ordered multi-reviewer picker component.
client/src/components/cos/constants.js	Adds client-side reviewer normalization + stop-mode constants (mirroring server).
.changelog/NEXT.md	Documents the new multi-reviewer capability and slashdo submodule bump.

Comments suppressed due to low confidence (1)

server/services/agentLifecycle.js:1547

The documentation comment above spawnReviewLoopFollowUp still describes spawning a Copilot-only /do:rpr loop and GitHub-only behavior. The implementation now supports an ordered reviewers list (dropping copilot only on non-GitHub forges) plus reviewStopMode/reviewerApplies. Please update that docblock to match the new multi-reviewer semantics and parameter names.

 * branch (via createWorktree's `existingBranch` option) so it can fix-and-push
 * without trampling concurrent agents.
 */
export async function spawnReviewLoopFollowUp({ originalAgentId, originalTask, prUrl, prBranch, sourceWorkspace, reviewers = DEFAULT_REVIEWERS, reviewStopMode = DEFAULT_REVIEW_STOP_MODE, reviewerApplies = false }) {
  if (!prUrl || !prBranch) return null;

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…eviewer flow The block comment still described the singular `reviewer` choice and Copilot-only follow-up; document the ordered `reviewers` list, copilot-leads pre-request rule, and reviewStopMode/reviewerApplies threading.

Copilot

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.

`key={value}` would collide if the stored reviewers list ever contained duplicates (legacy metadata / manual edits), corrupting reorder/remove. Render an order-preserving de-duped list so keys stay unique without reintroducing the empty-state coercion.

Copilot

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 4 comments.

…doc fixes - Always hand the full ordered reviewer list to the follow-up; it requests Copilot at its turn, so a failed/absent pre-request (incl. copilot-only) no longer leaves the PR open. The only drop (copilot on non-GitHub forge) stays centralized in spawnReviewLoopFollowUp. - ScheduleTab feeds the picker the stored list without empty→copilot coercion so an explicit clear shows the empty-state hint (consistent with TaskAddForm). - Update spawnReviewLoopFollowUp docblock + Phase 6 schedule prose for the multi-reviewer / stop-mode / reviewer-applies flow.

Copilot

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

server/services/agentPromptBuilder.js:286

The follow-up merge/verification steps are hardcoded to gh pr merge and gh pr view, but follow-up tasks may be spawned for GitLab MRs when CLI reviewers are configured (copilot is stripped, not the whole follow-up). On non-GitHub forges these commands will fail, leaving the MR open and the worktree/branch potentially leaking. Please choose the correct forge CLI (gh vs glab) for merge/view based on reviewLoopPRHost/reviewLoopPRUrl (or, if review-loop follow-ups are intended to be GitHub-only, block spawning on non-GitHub hosts even for CLI reviewers and emit a clear warning).

   gh pr merge "${prUrl}" --squash --delete-branch
   \`\`\`
   ${prOwner && prRepo && prNumber ? `(Equivalent: \`gh pr merge ${prNumber} --repo ${prOwner}/${prRepo} --squash --delete-branch\`.)` : ''}
   You have already verified the review is clean, so force the immediate merge. Adding any merge-deferral flag would leave the PR open after you exit.
5. Confirm the PR is actually merged before exiting: \`gh pr view "${prUrl}" --json state -q .state\` must return \`MERGED\`. If it returns \`OPEN\` or \`CLOSED\`, investigate (a check is failing, a thread is still unresolved, or branch protection is blocking) — fix and retry the merge. Do NOT exit until state is \`MERGED\`.

…utral The follow-up keeps CLI reviewers (codex/gemini/claude) on non-GitHub forges (only copilot is stripped there), so the review-loop prompt no longer hardcodes `gh pr diff` — it points at the CLI's own base-diff mode / `git diff <base>...HEAD`, noting `gh pr diff` works on GitHub.

Copilot

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

server/services/agentLifecycle.js:1597

This metadata block includes a comment about skipCommitCheck, but no such metadata field is actually set here (and there are no other references in the codebase). This is misleading for future maintainers—either add the intended skipCommitCheck flag (if it exists) or remove/update the comment to match actual behavior.

      sourceTaskId: originalTask?.id || null,
      sourceAgentId: originalAgentId || null,
      // skipCommitCheck: this task may exit cleanly with zero new commits if the
      // initial Copilot review came back clean. Don't false-positive that as failure.
      readOnly: false

… comment - buildPostPRMergeSteps matches the known stop-modes (on-findings/on-clean) explicitly, so an invalid reviewStopMode falls through to the safe clean/too-large gate instead of allowing a `partial` merge. - Replace the stale `skipCommitCheck` comment (no such field) with an accurate note about the zero-commit clean exit.

Copilot

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.

The "short-circuits when every record is already subscribed" test cleared the peerFetch mock after a fixed 10ms sleep, but a subscribe's fire-and-forget push fires peerFetch a tick later — under CI load that call leaked past the sleep and tripped `expect(peerFetch).not.toHaveBeenCalled()`. Add a drain-only `__drainForTests()` hook (awaits the write/push tail without resetting state) and use it to settle the initial pushes deterministically before the assertion.

…ording - Update sanitizeTaskMetadata docblock to describe the value-constrained reviewer/reviewers/reviewStopMode keys (no longer boolean-only). - ScheduleTab coerces reviewerApplies as true/'true' so a legacy string 'false' doesn't read as checked. - The multi-reviewer follow-up copilot bullet now says "wait for the pre-requested review" when Copilot leads (it was already requested) and "request at its turn" only when it trails — avoids a redundant request.

Copilot

Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

server/services/agentLifecycle.js:1485

canSpawnFollowUp is computed from the full normalized reviewer list, but spawnReviewLoopFollowUp() may drop copilot on non-GitHub forges and return null when that empties the effective list. Right now that results in silently not spawning any follow-up (and no warning) even though reviewLoop was enabled. Consider detecting a null return here and emitting a log/warning explaining that the reviewer chain is empty after forge filtering (e.g., copilot-only on GitLab), so the PR/MR won’t be auto-reviewed/merged.

      const canSpawnFollowUp = shouldRequestCopilot && reviewerList.length > 0;
      if (canSpawnFollowUp) {
        await spawnReviewLoopFollowUp({
          originalAgentId: agentId,
          originalTask,

…ering The full-path "Code Review" note implied the system always pre-requests a native Copilot review when copilot is in the list. Branch on whether copilot LEADS: system pre-requests only when it leads; otherwise the follow-up requests it at its turn (and skips it on non-GitHub forges).

Copilot

Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 2 comments.

…viewers} The placeholder now substitutes an ordered, comma-joined reviewer list, so the singular name was misleading. Rename it across the schedule/reference prompt templates and their substitution sites (cos.js, referenceRepos.js) + test mocks.

Copilot

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.

…rename - Correct the changelog: Copilot is pre-requested only when it LEADS the list (not merely "in the list"). - Clarify the claim-flow's 3-round cap is intentionally more conservative than the CoS follow-up's 10-iteration cap (the two are distinct flows). - Rename the cos.js local `reviewer` to `reviewersCsv` — it holds the comma-joined reviewer list, not a single reviewer.

atomantic added 3 commits May 23, 2026 12:42

atomantic requested a review from Copilot May 23, 2026 19:57

Copilot started reviewing on behalf of atomantic May 23, 2026 19:58 View session