Skip to content

subagent-driven-development: phantom completion fix + persistent validator pool (Team Mode)#578

Open
STRML wants to merge 23 commits into
obra:mainfrom
STRML:feature/persistent-validator-pool
Open

subagent-driven-development: phantom completion fix + persistent validator pool (Team Mode)#578
STRML wants to merge 23 commits into
obra:mainfrom
STRML:feature/persistent-validator-pool

Conversation

@STRML
Copy link
Copy Markdown

@STRML STRML commented Feb 28, 2026

Background

This PR addresses two confirmed failure modes in subagent-driven-development, plus a third discovered during testing.

Problem 1: Phantom completion

The most common complaint: a plan finishes, everything shows green, and the implementation is partially or entirely hallucinated. Tasks are "complete" but the code does not actually do what was asked.

The root cause is structural. The current spec reviewer prompt asks the reviewer to "read the actual code" and "do not trust the report" -- but there is nothing that forces it to produce verifiable output. A reviewer can (and does) write a plausible-sounding prose narrative, tick the boxes, and return passed without ever opening a file.

Why citations specifically? A file:line citation is the smallest piece of evidence that is both cheap to produce and hard to fake convincingly. Writing "requirement X is implemented at src/validators.py:42 -- def validate_email(addr): return bool(EMAIL_RE.match(addr))" requires either (a) having read that line of code, or (b) generating a plausible-looking but fabricated citation. Fabricated citations fail immediately on audit -- the file does not exist, the line number is wrong, or the excerpt does not implement the requirement.

Problem 2: Slow validation on large plans

Team mode parallelizes implementers, but validation is still sequential per task: spec review, then quality review, each as a fresh subagent. Each fresh subagent pays ~30-60 seconds of startup overhead and then re-reads the codebase from scratch. On a 10-task plan, that is 20 fresh review invocations, each re-discovering the same project structure.

There is a second problem: fresh subagents have no memory of prior tasks. If task 5 is supposed to use a function that was supposed to be implemented in task 2 but was not, the task 5 reviewer has no way to know.

Why persistent validators? A validator that stays alive across tasks accumulates codebase context. Combined with the SHA map (team lead passes prior task SHAs in every review request, validator re-reads the relevant diffs from git rather than relying on memory), this gives validators a reliable mechanism to check dependencies without trusting their own recall. The combined spec+quality pass also eliminates the spec->quality sequential bottleneck. One agent call per task instead of two.


What this PR does

Phase 1: Mandatory evidence (both modes, no architectural change)

implementer-prompt.md: New mandatory evidence section. Before reporting complete, implementers must paste full test command output (verbatim), git diff --stat output (verbatim), list of changed files with line counts, and behavior demonstration output if the task requires it. Reports without this section are explicitly rejected by the controller.

spec-reviewer-prompt.md: New mandatory citation section. For each requirement in the task spec, the reviewer must provide the specific file:line and a 2-3 line code excerpt. Requirements with no citation found are marked MISSING. A passed verdict without the per-requirement citation table is invalid and must be rejected by the controller.

Phase 2: Persistent validator pool (Team Mode only, requires beta teams API)

BETA DEPENDENCY: Phase 2 requires Claude Code experimental agent teams feature (TeamCreate, SendMessage, TaskList). Phase 1 ships independently and works without it.

validator-prompt.md (new file): A SendMessage payload template for persistent validator team members. Contains both spec compliance (with file:line citations) and code quality in a single pass. Includes SHA map field for cross-task dependency re-reads, mandatory citation format, verdict structure, and re-review protocol.

SKILL.md: Full Team Mode section covering:

  • Pre-flight: synthesize-and-clear -- agent synthesizes a complete self-contained execution prompt (full plan text, architecture context, team composition, initial task assignments) and presents it via EnterPlanMode. User selects "clear context and auto-accept edits" (Shift+Tab) to start team execution with a clean slate -- no /compact needed.
  • Team composition: 2-3 implementers (parallel) + 2 persistent validators
  • Review routing: how to send review requests including BASE/HEAD SHAs and the SHA map
  • Validator availability: message-based tracking (verdict receipt = validator available)
  • Re-review loop bound: 3 cycles max, then escalate to human with full rejection history
  • Context poisoning mitigations: re-read rule, bounded re-open, re-open cascade, validator rotation (fresh validator every 5 tasks for plans >8 tasks)
  • Validator failure recovery: 60s timeout -> one ping -> fall back to fresh subagent

code-quality-reviewer-prompt.md: Deleted. Its role is absorbed by validator-prompt.md in team mode; in standard mode, superpowers:requesting-code-review is referenced directly.

Problem 3: Context overload (found during first use)

When a user invokes team mode after a long brainstorming or planning session, the team lead starts with most of its context already consumed. As the plan executes, each task report and verdict adds more. A team lead that hits the context limit mid-plan stops orchestrating.

The original fix used a mandatory interactive /compact check: before TeamCreate, the agent would ask the user how much context they had left and wait for confirmation before proceeding. This worked but was slow and wasteful -- compaction summarizes accumulated noise rather than eliminating it.

Revised fix: The pre-flight now uses Claude Code's built-in "clear context and auto-accept edits" dialog instead. The agent synthesizes a complete self-contained execution prompt embedding the full plan, all task requirements, architecture context, team composition, and initial assignments -- everything the fresh session needs. It presents this via EnterPlanMode. The user selects option 1 (Shift+Tab) to clear context and begin execution. The team lead starts with zero accumulated context, not summarized context. No /compact wait, no back-and-forth. One keypress.

writing-plans: Team Mode surfacing + beta gating

Rebased on upstream main, which restructured writing-plans execution handoff to be capability-based (subagents vs. no subagents). Within the subagent-capable path, this PR adds Team Mode as a conditional option -- only offered when Team Mode is actually available, since it's a beta feature not present in most clients. When Team Mode is unavailable, Standard mode is used directly with no spurious choice presented.


SDK memory validation

Before shipping Phase 2, we validated that Claude Code team members retain conversation context between SendMessage calls. Test: spawn agent, send "remember the number 42", send separate message "what number did I ask you to remember?", agent recalled correctly. Result: RETAINED (tested 2026-02-28). The persistent validator topology depends on this.


Design doc

docs/plans/2026-02-28-persistent-validator-pool-design.md in this branch. Includes the full problem analysis, both phases, the SDK validation gate, context poisoning mitigations with their explicit tradeoffs, and success criteria. The design went through 4 rounds of iterative review before implementation.


Test plan

Phase 1 (standard and team mode):

  • Run a 3-5 task plan: verify every spec reviewer verdict contains file:line citations for each requirement, not just prose
  • Verify implementer reports contain test output + diff stat before the controller accepts them
  • Submit a report without the evidence section: verify controller rejects it

Phase 2 (team mode, requires beta teams feature):

  • Run a 3-5 task plan in team mode: verify validator verdicts contain per-requirement citations, combined spec+quality in single message
  • Plant a cross-task gap: task N spec mentions using a function from task N-2 that was never implemented. Verify the persistent validator catches it via the SHA map re-read, not just memory.
  • Trigger 3 consecutive rejections on a single task: verify team lead escalates to human after exactly 3 cycles
  • Measure wall-clock review latency vs. current sequential two-stage fresh-subagent flow on a 5-task plan: target 40% reduction

Pre-flight clean-context launch:

  • Invoke team mode after a long session: verify the agent synthesizes a self-contained execution prompt and enters plan mode before TeamCreate
  • Verify the synthesized prompt contains full task text, architecture context, and team composition -- sufficient to execute without prior conversation history
  • Confirm user selecting "clear context and auto-accept edits" starts team execution with clean context

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Feb 28, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a two‑phase Persistent Validator Pool: Phase 1 enforces mandatory evidence and per‑requirement citations; Phase 2 introduces Team Mode with persistent validators, combined spec+quality reviews, cross‑task SHA maps, re‑review loops, rotation and recovery rules, and a fallback path if memory retention fails. Documentation and prompt templates updated.

Changes

Cohort / File(s) Summary
Planning & Implementation
docs/plans/2026-02-28-persistent-validator-pool-design.md, docs/plans/2026-02-28-persistent-validator-pool-implementation.md
Added comprehensive two‑phase design and implementation plans, rollout, success criteria, failure modes, gating (SDK memory test), fallback path, and detailed Phase 1/Phase 2 tasks.
Phase 1: Evidence & Citation Requirements
skills/subagent-driven-development/implementer-prompt.md, skills/subagent-driven-development/spec-reviewer-prompt.md
Implementer prompt now requires a "Mandatory Evidence" block (full test output, git diff --stat, changed files list, optional demo). Spec reviewer prompt requires per‑requirement file:line citations and short code excerpts; uncited requirements are flagged as missing.
Phase 2: Persistent Validator Pool & Prompts
skills/subagent-driven-development/validator-prompt.md, skills/subagent-driven-development/SKILL.md, skills/subagent-driven-development/...
Added validator-prompt.md (combined spec+quality validator template with BASE_SHA/HEAD_SHA, SHA map, re‑review protocol). SKILL.md reworked for Team Mode (validator topology, idle tracking, rotation rules, re‑review loops, cross‑task SHA handling, failure recovery, fallback).
Prompt Template Cleanup
skills/subagent-driven-development/code-quality-reviewer-prompt.md
Removed the separate code quality reviewer prompt; its responsibilities are subsumed by the combined validator prompt and Team Mode flow.

Sequence Diagram(s)

sequenceDiagram
    actor Implementer
    participant Repo as Git/Repo
    participant SDK as SDK Memory Test
    participant Orchestrator as Orchestrator
    participant ValidatorPool as Validator Pool
    participant Validator as Validator Agent

    Implementer->>Repo: Push changes + Mandatory Evidence (tests, diff stat, files)
    Orchestrator->>SDK: Run memory validation
    alt SDK passes (Team Mode)
        Orchestrator->>ValidatorPool: Activate team (assign validators)
        ValidatorPool->>Validator: Dispatch combined spec+quality review (BASE_SHA/HEAD_SHA, SHA map)
        Validator->>Repo: Fetch diffs & verify file:line citations
        Validator->>Orchestrator: Submit Report (Spec Compliance, Code Quality, Verdict)
        alt Fix required
            Orchestrator->>Implementer: Request fix (include validator citations)
            Implementer->>Repo: Push fix
            Orchestrator->>ValidatorPool: Re‑review (bounded cycles)
        end
    else SDK fails (fallback)
        Orchestrator->>Validator: Dispatch combined single‑agent review (fallback path)
        Validator->>Repo: Fetch diffs & verify claims
        Validator->>Orchestrator: Submit Report
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐰 I nibbled through diffs and hopped the queue,
Citations in paw, validators in view,
Phase one secures the evidence trail,
Phase two keeps memory so reviews don’t fail,
A little rabbit cheers: persistence prevails!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and specifically describes the main changes: a phantom completion fix (Phase 1 evidence requirement) and persistent validator pool implementation (Phase 2, Team Mode).
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description check ✅ Passed The PR description is detailed and directly related to the changeset, comprehensively explaining the three problems (phantom completion, slow validation, context overload) and how the changes address them across Phase 1 and Phase 2.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (9)
skills/subagent-driven-development/validator-prompt.md (1)

5-5: Address static analysis hint: add language specifier.

The fenced code block at line 5 (and continuing through the document) lacks a language specifier. While this is a documentation file with a template, consider adding a language hint for better rendering:

📝 Optional formatting improvement
-**Purpose:** Single-pass combined spec compliance and code quality review. Validators are persistent agents that retain context across tasks — use this to detect cross-task gaps by re-reading prior task diffs.
+**Purpose:** Single-pass combined spec compliance and code quality review. Validators are persistent agents that retain context across tasks — use this to detect cross-task gaps by re-reading prior task diffs.

Change the code block delimiter at line 13 from ``` to ```markdown or ```text for proper syntax highlighting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/subagent-driven-development/validator-prompt.md` at line 5, The fenced
code block uses a plain triple-backtick delimiter (```); update that delimiter
to include a language specifier such as ```markdown or ```text so the block
renders with proper syntax highlighting and clearer formatting (replace the
existing ``` fence with ```markdown or ```text throughout the document).
skills/subagent-driven-development/SKILL.md (1)

122-122: Address markdown static analysis hints (optional).

Three fenced code blocks lack language specifiers (lines 122, 185, 327). While these are non-critical formatting issues, adding language hints (e.g., ```text) would improve rendering in some Markdown viewers.

Also applies to: 185-185, 327-327

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/subagent-driven-development/SKILL.md` at line 122, Three fenced code
blocks in SKILL.md are missing language specifiers; update each triple-backtick
block (the three occurrences of ``` with no language) to include an explicit
language hint such as ```text (or ```bash/```json as appropriate for the
content) so Markdown linters and renderers correctly highlight and render them.
skills/subagent-driven-development/implementer-prompt.md (2)

89-89: Clarify who the "controller" is.

The term "controller" appears without prior definition. Based on the PR context, this seems to refer to either the team lead (in team mode) or the main agent (in standard mode). Consider making this explicit: "rejected by the controller (team lead or main agent)" for clarity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/subagent-driven-development/implementer-prompt.md` at line 89, The
phrase "controller" in the sentence "A report submitted without this evidence
section will be **rejected by the controller**..." is ambiguous; update that
sentence in implementer-prompt.md to explicitly define "controller" by replacing
or appending clarification such as "rejected by the controller (team lead in
team mode, or the main agent in standard mode)" so reviewers/readers immediately
understand who enforces rejection; locate the exact string "rejected by the
controller" and change it to the clarified wording.

79-79: Consider more flexible git diff guidance.

The instruction git diff --stat <base-sha> assumes the implementer knows the base SHA. In practice, implementers might be working from various starting points (feature branch, multiple commits, etc.). Consider adding guidance like "or use git diff --stat $(git merge-base main HEAD) to compare against the branch point" for cases where the base isn't obvious.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/subagent-driven-development/implementer-prompt.md` at line 79, Update
the "**2. Diff stat** — run `git diff --stat <base-sha>`" line to offer flexible
alternatives when the base SHA isn't known: keep the original suggestion but add
guidance to obtain the branch point via the merge-base of main and the current
branch (so users can pass that as the base), and also mention using a simple
fallback like comparing against the previous commit (HEAD~1); reference the
"**2. Diff stat**" heading and the git diff --stat instruction to locate and
update the text.
docs/plans/2026-02-28-persistent-validator-pool-design.md (1)

95-95: Dependency inference limitation is clearly acknowledged.

The design acknowledges that validators can only catch dependencies "where the task spec explicitly mentions prior tasks" and that "implicit dependencies that go unmentioned in the spec may not be caught." This is an important limitation. Consider adding a recommendation to the writing-plans skill to explicitly document all task dependencies in the task spec text itself.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/plans/2026-02-28-persistent-validator-pool-design.md` at line 95, Update
the design note to add a concrete recommendation for the writing-plans skill:
require authors to include an explicit "Dependencies" section in each task spec
that lists prior task IDs, diffs, and any modules/schemas referenced; update
references to the writing-plans validator logic (the "Infer dependencies"
guidance) to point authors at this required section so implicit dependencies are
not missed and validators can reliably re-read prior diffs when listed.
docs/plans/2026-02-28-persistent-validator-pool-implementation.md (2)

4-4: SDK memory test already completed - Task 4 is documentation only.

Line 4 states the SDK memory validation test was already run and RETAINED (tested 2026-02-28). This means:

  1. Task 4 (lines 200-246) serves as documentation of the test protocol, not an action item
  2. Task 4b (lines 248-368) should be skipped entirely

Consider adding a note at Task 4 (line 200) stating: "Note: This test was already completed (see header line 4). This task documents the test protocol for reference."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/plans/2026-02-28-persistent-validator-pool-implementation.md` at line 4,
Add a short clarifying note at the start of Task 4 indicating the SDK memory
validation was already completed (referencing the header line that shows "SDK
memory test result: RETAINED (tested 2026-02-28)") and mark Task 4 as
documentation of the protocol only; also remove or explicitly mark Task 4b as
"skipped" or "not required" (update the Task 4 and Task 4b section
headings/messages in the document to reflect these statuses so readers know Task
4 is reference-only and Task 4b should be skipped).

47-68: Address markdown static analysis hints for embedded code blocks.

Multiple fenced code blocks in the plan lack language specifiers (lines 47, 109, and others per static analysis). While this is a documentation file, adding language hints would improve readability. However, given the volume of code blocks and the nature of this file as an implementation plan (not user-facing docs), this is a low-priority nitpick.

Also applies to: 109-124

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/plans/2026-02-28-persistent-validator-pool-implementation.md` around
lines 47 - 68, Several fenced code blocks in the plan (the blocks labeled "Test
output", "Diff stat", and the command/output examples under "Behavior
demonstration"/"Files changed") lack language specifiers; update each
triple-backtick fence to include an appropriate language hint (e.g., ```bash or
```sh for commands, ```text or ```console for pasted output, and ```diff for git
diff output) so syntax highlighters and linters stop flagging them; locate the
blocks by searching for the literal headings "Test output", "Diff stat", and
"Behavior demonstration" in the plan and add the language tags to their opening
``` lines.
skills/subagent-driven-development/spec-reviewer-prompt.md (2)

77-77: Redundant statement - consider consolidating.

This line repeats the message already stated at lines 70-71 ("A ✅ approval without per-requirement citations is invalid"). While repetition for emphasis can be effective, having it immediately after the first statement (only 6 lines apart) may be unnecessary. Consider either consolidating these statements or adding different emphasis if the repetition is intentional.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/subagent-driven-development/spec-reviewer-prompt.md` at line 77,
Remove the redundant sentence "A ✅ without the per-requirement citation table is
not accepted." (the duplicate statement that repeats the earlier "A ✅ approval
without per-requirement citations is invalid") by either deleting this second
occurrence or consolidating it into the earlier sentence so the rule appears
once with any needed emphasis; update the surrounding text in
spec-reviewer-prompt.md to keep a single clear statement about per-requirement
citations.

70-71: "Controller" terminology appears here too.

Same suggestion as in implementer-prompt.md: consider defining "controller" explicitly as "controller (team lead or main agent)" for clarity across both modes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/subagent-driven-development/spec-reviewer-prompt.md` around lines 70 -
71, The term "controller" is ambiguous in spec-reviewer-prompt.md; update the
first occurrence of "controller" (the bold sentence starting with "A ✅
approval...") to explicitly define it as "controller (team lead or main agent)"
and apply the same explicit parenthetical definition in implementer-prompt.md so
both modes use the same terminology; ensure the wording is brief and consistent
wherever "controller" is introduced.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/plans/2026-02-28-persistent-validator-pool-design.md`:
- Line 121: The Validator rotation tradeoff paragraph ("Validator rotation
(large plans):") is too weak given it undermines cross-task gap detection;
update docs and design to (1) explicitly document the limitation in SKILL.md
under "Validator rotation" and add a clear recommendation against using team
mode for plans where cross-task gap detection is critical, (2) propose and
document two concrete alternatives in the design doc: increase the rotation
threshold (e.g., rotate every 10 tasks instead of 5) and a "handoff protocol"
where the outgoing validator emits a summarized dependency digest (task IDs, key
outputs, unresolved references) that the incoming validator ingests, and (3) add
a short example of the handoff data shape and where it is produced/consumed
(referencing the Validator rotation paragraph and the term "handoff protocol")
so reviewers can evaluate feasibility.
- Around line 148-151: Revise the Phase 2 bullet list to accurately describe the
files: update the `spec-reviewer-prompt.md` bullet to say it remains for Phase 1
citation requirements (standard mode), add a new bullet for
`validator-prompt.md` stating "**create new.** Combined spec+quality prompt for
persistent validators (team mode)", and keep the
`code-quality-reviewer-prompt.md` bullet marked "**delete.**" so the intent and
file roles are unambiguous.

In `@docs/plans/2026-02-28-persistent-validator-pool-implementation.md`:
- Around line 248-368: The document contains an inconsistent fallback (Task 4b)
that describes a "Fresh Subagent" validator-prompt but the repo actually
contains the persistent-team validator-prompt used when SDK memory was retained;
update the plan by either removing Task 4b entirely or clearly marking it as
skipped: find the "Task 4b: (FALLBACK ONLY — skip if SDK retains context)"
section and the referenced validator-prompt.md and SKILL.md entries and replace
the fallback content with a single prominent note like "SKIPPED - SDK memory
retained. See main path Tasks 5-10" (or delete the fallback block), and ensure
any references to code-quality-reviewer-prompt.md in SKILL.md and docs are
updated to point to the actual validator-prompt.md if needed.

In `@skills/subagent-driven-development/SKILL.md`:
- Around line 196-200: The text is inconsistent about how validators signal
availability: update the SKILL.md lines describing validator signaling to match
validator-prompt.md and the implementation plan by specifying that validators
must send both TaskUpdate(status: "completed") and a SendMessage when their
verdict is ready; update the phrasing that currently says the team lead "uses
message receipt as the availability signal" to state the team lead treats either
the TaskUpdate or the SendMessage (preferably requiring TaskUpdate as the
canonical signal and SendMessage as a human-readable notification) as the
availability/ completion signal; ensure references to SendMessage and TaskUpdate
and the implementation plan (Task 6) are aligned so all docs consistently
require TaskUpdate(status: "completed") plus SendMessage.

---

Nitpick comments:
In `@docs/plans/2026-02-28-persistent-validator-pool-design.md`:
- Line 95: Update the design note to add a concrete recommendation for the
writing-plans skill: require authors to include an explicit "Dependencies"
section in each task spec that lists prior task IDs, diffs, and any
modules/schemas referenced; update references to the writing-plans validator
logic (the "Infer dependencies" guidance) to point authors at this required
section so implicit dependencies are not missed and validators can reliably
re-read prior diffs when listed.

In `@docs/plans/2026-02-28-persistent-validator-pool-implementation.md`:
- Line 4: Add a short clarifying note at the start of Task 4 indicating the SDK
memory validation was already completed (referencing the header line that shows
"SDK memory test result: RETAINED (tested 2026-02-28)") and mark Task 4 as
documentation of the protocol only; also remove or explicitly mark Task 4b as
"skipped" or "not required" (update the Task 4 and Task 4b section
headings/messages in the document to reflect these statuses so readers know Task
4 is reference-only and Task 4b should be skipped).
- Around line 47-68: Several fenced code blocks in the plan (the blocks labeled
"Test output", "Diff stat", and the command/output examples under "Behavior
demonstration"/"Files changed") lack language specifiers; update each
triple-backtick fence to include an appropriate language hint (e.g., ```bash or
```sh for commands, ```text or ```console for pasted output, and ```diff for git
diff output) so syntax highlighters and linters stop flagging them; locate the
blocks by searching for the literal headings "Test output", "Diff stat", and
"Behavior demonstration" in the plan and add the language tags to their opening
``` lines.

In `@skills/subagent-driven-development/implementer-prompt.md`:
- Line 89: The phrase "controller" in the sentence "A report submitted without
this evidence section will be **rejected by the controller**..." is ambiguous;
update that sentence in implementer-prompt.md to explicitly define "controller"
by replacing or appending clarification such as "rejected by the controller
(team lead in team mode, or the main agent in standard mode)" so
reviewers/readers immediately understand who enforces rejection; locate the
exact string "rejected by the controller" and change it to the clarified
wording.
- Line 79: Update the "**2. Diff stat** — run `git diff --stat <base-sha>`" line
to offer flexible alternatives when the base SHA isn't known: keep the original
suggestion but add guidance to obtain the branch point via the merge-base of
main and the current branch (so users can pass that as the base), and also
mention using a simple fallback like comparing against the previous commit
(HEAD~1); reference the "**2. Diff stat**" heading and the git diff --stat
instruction to locate and update the text.

In `@skills/subagent-driven-development/SKILL.md`:
- Line 122: Three fenced code blocks in SKILL.md are missing language
specifiers; update each triple-backtick block (the three occurrences of ``` with
no language) to include an explicit language hint such as ```text (or
```bash/```json as appropriate for the content) so Markdown linters and
renderers correctly highlight and render them.

In `@skills/subagent-driven-development/spec-reviewer-prompt.md`:
- Line 77: Remove the redundant sentence "A ✅ without the per-requirement
citation table is not accepted." (the duplicate statement that repeats the
earlier "A ✅ approval without per-requirement citations is invalid") by either
deleting this second occurrence or consolidating it into the earlier sentence so
the rule appears once with any needed emphasis; update the surrounding text in
spec-reviewer-prompt.md to keep a single clear statement about per-requirement
citations.
- Around line 70-71: The term "controller" is ambiguous in
spec-reviewer-prompt.md; update the first occurrence of "controller" (the bold
sentence starting with "A ✅ approval...") to explicitly define it as "controller
(team lead or main agent)" and apply the same explicit parenthetical definition
in implementer-prompt.md so both modes use the same terminology; ensure the
wording is brief and consistent wherever "controller" is introduced.

In `@skills/subagent-driven-development/validator-prompt.md`:
- Line 5: The fenced code block uses a plain triple-backtick delimiter (```);
update that delimiter to include a language specifier such as ```markdown or
```text so the block renders with proper syntax highlighting and clearer
formatting (replace the existing ``` fence with ```markdown or ```text
throughout the document).

ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e4a2375 and 582927a.

📒 Files selected for processing (7)
  • docs/plans/2026-02-28-persistent-validator-pool-design.md
  • docs/plans/2026-02-28-persistent-validator-pool-implementation.md
  • skills/subagent-driven-development/SKILL.md
  • skills/subagent-driven-development/code-quality-reviewer-prompt.md
  • skills/subagent-driven-development/implementer-prompt.md
  • skills/subagent-driven-development/spec-reviewer-prompt.md
  • skills/subagent-driven-development/validator-prompt.md
💤 Files with no reviewable changes (1)
  • skills/subagent-driven-development/code-quality-reviewer-prompt.md

Comment thread docs/plans/2026-02-28-persistent-validator-pool-design.md
Comment thread docs/plans/2026-02-28-persistent-validator-pool-design.md Outdated
Comment thread docs/plans/2026-02-28-persistent-validator-pool-implementation.md
Comment thread skills/subagent-driven-development/SKILL.md
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
skills/subagent-driven-development/SKILL.md (2)

202-206: ⚠️ Potential issue | 🟡 Minor

Inconsistency: validator availability signaling mechanism (duplicate of past review).

Lines 203-205 describe availability tracking solely via SendMessage receipt, but line 126 states validators are "Declared idle/busy via TaskUpdate." According to the past review comment, validator-prompt.md instructs validators to use both TaskUpdate(status: "completed") AND SendMessage.

Recommendation: Update this section to clarify that:

  1. Validators send both TaskUpdate(status: "completed") (for programmatic tracking) and SendMessage (for human-readable notification)
  2. Team lead tracks availability using TaskUpdate as the canonical signal
  3. Align with validator-prompt.md and the implementation plan (Task 6, lines 597-598 per past review)

This directly addresses the unresolved past review comment.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/subagent-driven-development/SKILL.md` around lines 202 - 206, Update
the text to resolve the inconsistency by stating that validators emit both
TaskUpdate(status: "completed") for programmatic availability tracking and
SendMessage for human-readable notification; specify that the team lead treats
TaskUpdate as the canonical availability signal (use TaskUpdate events to decide
whether to send a new review) while SendMessage is used only for human
notifications; mention this aligns with validator-prompt.md and the
implementation plan (Task 6) so the section referencing validator availability
via SendMessage is changed to reflect both signals and the canonical role of
TaskUpdate.

126-126: ⚠️ Potential issue | 🟡 Minor

Inconsistency: validator availability signaling mechanism persists.

Line 126 states validators are "Declared idle/busy via TaskUpdate," but lines 203-205 state "Validators notify the team lead when their verdict is ready via SendMessage. The team lead uses message receipt as the availability signal."

This is the same inconsistency flagged in the past review comment on lines 202-206. The description needs to be unified across the file.

Recommendation: If validators use both TaskUpdate(status: "completed") AND SendMessage (as suggested by validator-prompt.md per the past review), update line 126 and lines 203-205 to reflect both mechanisms. Clarify which signal is canonical (likely TaskUpdate for programmatic tracking, with SendMessage as a human-readable notification).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/subagent-driven-development/SKILL.md` at line 126, Unify the
availability signaling description for validator agents: update the "Validator
agents (2)" line and the later paragraph that mentions "Validators notify the
team lead..." so they consistently state that validators declare status via
TaskUpdate(status: "completed") as the canonical, programmatic availability
signal and also send a SendMessage as a human-readable notification (per
validator-prompt.md); explicitly mark TaskUpdate as authoritative for workflow
tracking and SendMessage as optional/notification to the team lead.
🧹 Nitpick comments (1)
skills/subagent-driven-development/SKILL.md (1)

128-128: Optional: Add language specifiers to fenced code blocks.

Static analysis flags three fenced code blocks without language specifiers (lines 128, 191, 333). While these are templates/pseudo-code rather than executable code, adding text as the language identifier would silence the linter warnings:

-```
+```text

This is purely a stylistic improvement and does not affect functionality.

Also applies to: 191-191, 333-333

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/subagent-driven-development/SKILL.md` at line 128, Three fenced code
blocks use bare triple-backticks (``` ) causing linter warnings; update each
opening fence to include the language specifier "text" (i.e., change ``` to
```text) for the three occurrences of empty-language code blocks so the linter
stops flagging them and the markdown remains unchanged in content; search the
SKILL.md for code fences that are just three backticks and replace their opening
fence accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@skills/subagent-driven-development/SKILL.md`:
- Around line 202-206: Update the text to resolve the inconsistency by stating
that validators emit both TaskUpdate(status: "completed") for programmatic
availability tracking and SendMessage for human-readable notification; specify
that the team lead treats TaskUpdate as the canonical availability signal (use
TaskUpdate events to decide whether to send a new review) while SendMessage is
used only for human notifications; mention this aligns with validator-prompt.md
and the implementation plan (Task 6) so the section referencing validator
availability via SendMessage is changed to reflect both signals and the
canonical role of TaskUpdate.
- Line 126: Unify the availability signaling description for validator agents:
update the "Validator agents (2)" line and the later paragraph that mentions
"Validators notify the team lead..." so they consistently state that validators
declare status via TaskUpdate(status: "completed") as the canonical,
programmatic availability signal and also send a SendMessage as a human-readable
notification (per validator-prompt.md); explicitly mark TaskUpdate as
authoritative for workflow tracking and SendMessage as optional/notification to
the team lead.

---

Nitpick comments:
In `@skills/subagent-driven-development/SKILL.md`:
- Line 128: Three fenced code blocks use bare triple-backticks (``` ) causing
linter warnings; update each opening fence to include the language specifier
"text" (i.e., change ``` to ```text) for the three occurrences of empty-language
code blocks so the linter stops flagging them and the markdown remains unchanged
in content; search the SKILL.md for code fences that are just three backticks
and replace their opening fence accordingly.

ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 582927a and fdec277.

📒 Files selected for processing (1)
  • skills/subagent-driven-development/SKILL.md

@STRML
Copy link
Copy Markdown
Author

STRML commented Feb 28, 2026

Addressed all coderabbit review comments in commit 0a0a054:

Comment 1 (validator rotation tradeoff, design.md:121): Added a rotation handoff protocol — before the outgoing validator is replaced, the team lead asks it for a dependency summary (cross-task relationships observed across its reviewed tasks). This summary is included in the first review request to the incoming validator, preserving the most critical cross-task knowledge across the rotation boundary. Applied in both design doc and SKILL.md.

Comment 2 (Files to Change description, design.md:151): Corrected the Phase 2 entry — it now accurately describes creating a new validator-prompt.md for persistent team members, with spec-reviewer-prompt.md unchanged from Phase 1 (standard mode only).

Comment 3 (Task 4b fallback, impl.md:368): Added a prominent ⚠️ SKIPPED notice at the top of Task 4b explaining that SDK memory was retained, the persistent topology was implemented instead, and the fallback description is preserved for reference only in case SDK behavior changes.

Comment 4 (TaskUpdate inconsistency, SKILL.md:206): Already resolved in commit 582927a (before this review was posted). validator-prompt.md line 78 was changed from TaskUpdate(status: "completed") to a plain SendMessage notification. Both files now use message-based availability signaling only.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
docs/plans/2026-02-28-persistent-validator-pool-implementation.md (2)

128-144: Consider adding language identifier to code block for clarity.

The code block at line 128 showing the "Current" content could benefit from a language identifier (e.g., markdown or text) to improve readability and satisfy linters. This is a minor formatting improvement that would align with markdown best practices.

📝 Proposed improvement
-Current (lines 58-60 inside template):
-```
+Current (lines 58-60 inside template):
+```text
     Report:
     - ✅ Spec compliant (if everything matches after code inspection)
     - ❌ Issues found: [list specifically what's missing or extra, with file:line references]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/plans/2026-02-28-persistent-validator-pool-implementation.md` around
lines 128 - 144, Update the triple-backtick code block that shows the "Current
(lines 58-60 inside template):" snippet so it includes a language identifier
(e.g., change ``` to ```text or ```markdown) to satisfy linters and improve
readability; specifically edit the fenced block containing the "Report:" lines
in the docs/plans template so the opening fence is annotated (look for the exact
"Report:" snippet shown) and leave the block contents unchanged.

186-191: Consider adding language identifier to bash code block.

The bash code block at line 187 could include the bash language identifier for syntax highlighting and linter compliance. This is a minor formatting improvement.

📝 Proposed improvement
 **Step 3: Check git log**
 
-```bash
+```bash
 git log --oneline -5

Note: This may already be correct in your editor; if line 187 already shows `bash`, disregard this comment.
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @docs/plans/2026-02-28-persistent-validator-pool-implementation.md around
lines 186 - 191, Add the bash language identifier to the fenced code block
containing the git command so the block starting with the triple backticks and
containing "git log --oneline -5" is annotated as bash; locate the code fence in
the document (the block that currently reads ``` followed by "git log --oneline
-5") and change it to include the language token (bash) on the opening fence to
enable proper syntax highlighting and linter compliance.


</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @skills/subagent-driven-development/SKILL.md:

  • Line 126: The doc is inconsistent about how validator availability is tracked:
    keep the validators' declaration "via TaskUpdate" in the "Validator agents (2)"
    line and update the team-lead description (the paragraph that currently says
    "Team lead tracks validator availability via message flow" and "uses message
    receipt as the availability signal") to explicitly state that the Team lead
    observes message flow for activity but relies on the validators' TaskUpdate
    state (TaskUpdate) as the authoritative availability signal; ensure references
    to "Validator agents (2)", "Team lead", and "TaskUpdate" are mentioned so the
    workflow example remains consistent.

Nitpick comments:
In @docs/plans/2026-02-28-persistent-validator-pool-implementation.md:

  • Around line 128-144: Update the triple-backtick code block that shows the
    "Current (lines 58-60 inside template):" snippet so it includes a language
    identifier (e.g., change totext or ```markdown) to satisfy linters and
    improve readability; specifically edit the fenced block containing the "Report:"
    lines in the docs/plans template so the opening fence is annotated (look for the
    exact "Report:" snippet shown) and leave the block contents unchanged.
  • Around line 186-191: Add the bash language identifier to the fenced code block
    containing the git command so the block starting with the triple backticks and
    containing "git log --oneline -5" is annotated as bash; locate the code fence in
    the document (the block that currently reads ``` followed by "git log --oneline
    -5") and change it to include the language token (bash) on the opening fence to
    enable proper syntax highlighting and linter compliance.

</details>

---

<details>
<summary>ℹ️ Review info</summary>

**Configuration used**: Repository UI

**Review profile**: CHILL

**Plan**: Pro

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between fdec277854872b9195d6d8c4c5fc5d4382469796 and 0a0a054c424434d38edcfdf9cf3137e474fb9da6.

</details>

<details>
<summary>📒 Files selected for processing (3)</summary>

* `docs/plans/2026-02-28-persistent-validator-pool-design.md`
* `docs/plans/2026-02-28-persistent-validator-pool-implementation.md`
* `skills/subagent-driven-development/SKILL.md`

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Comment thread skills/subagent-driven-development/SKILL.md Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
skills/subagent-driven-development/SKILL.md (1)

203-203: Optional: vary sentence structure for improved readability.

Three consecutive sentences begin with "Team lead" which slightly impacts flow. Consider varying the structure for better readability.

♻️ Proposed variation
-Validator agents (2) declare their state via TaskUpdate: `in_progress` when they receive a review request, `completed` when they return a verdict. Team lead relies on this TaskUpdate state as the authoritative availability signal. Team lead also observes message flow as a secondary activity indicator — a validator is treated as busy from the moment they receive a review request until their verdict arrives. Team lead does not send a second review to a validator before receiving their verdict for the first.
+Validator agents (2) declare their state via TaskUpdate: `in_progress` when they receive a review request, `completed` when they return a verdict. The team lead relies on this TaskUpdate state as the authoritative availability signal, while also observing message flow as a secondary activity indicator — a validator is treated as busy from the moment they receive a review request until their verdict arrives. A second review is never sent to a validator before receiving their verdict for the first.

As per static analysis hint: LanguageTool flagged three successive sentences beginning with "Team lead" (ENGLISH_WORD_REPEAT_BEGINNING_RULE).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/subagent-driven-development/SKILL.md` at line 203, The three
successive sentences starting with "Team lead" reduce readability; rewrite the
passage that describes validator agents declaring state via TaskUpdate
(`in_progress`, `completed`) and how the team lead uses that state and message
flow to gate further requests so the sentences don't all begin with "Team lead"
— e.g., start one sentence with the action ("The team lead relies on this
TaskUpdate state..."), another with the observation ("They also observe message
flow..."), or use passive/compound structures to vary rhythm while keeping the
same meaning and preserving references to TaskUpdate, `in_progress`,
`completed`, validator, and verdict.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skills/subagent-driven-development/SKILL.md`:
- Line 8: Update the sentence "Execute plan by dispatching fresh subagent per
task, with combined spec+quality review after each. In team mode: persistent
validator agents eliminate per-review startup overhead and enable cross-task gap
detection." to clarify that the "combined spec+quality review after each"
applies only to Team Mode; explicitly state that Standard Mode performs two
separate passes (first a spec compliance review, then a code quality review).
Edit the SKILL.md phrasing around that sentence to mention "Team Mode: combined
spec+quality review..." and "Standard Mode: sequential spec compliance review
followed by code quality review" so readers are not misled.

---

Nitpick comments:
In `@skills/subagent-driven-development/SKILL.md`:
- Line 203: The three successive sentences starting with "Team lead" reduce
readability; rewrite the passage that describes validator agents declaring state
via TaskUpdate (`in_progress`, `completed`) and how the team lead uses that
state and message flow to gate further requests so the sentences don't all begin
with "Team lead" — e.g., start one sentence with the action ("The team lead
relies on this TaskUpdate state..."), another with the observation ("They also
observe message flow..."), or use passive/compound structures to vary rhythm
while keeping the same meaning and preserving references to TaskUpdate,
`in_progress`, `completed`, validator, and verdict.

ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0a0a054 and 141b990.

📒 Files selected for processing (1)
  • skills/subagent-driven-development/SKILL.md

Comment thread skills/subagent-driven-development/SKILL.md Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
skills/subagent-driven-development/SKILL.md (5)

48-48: Consider clarifying that this section describes Standard Mode.

The section header "The Process" could be misinterpreted as applying to both Standard and Team modes. Since Team Mode has its own dedicated section (line 106), explicitly labeling this as "The Standard Mode Process" would improve clarity.

📝 Suggested clarification
-## The Process
+## The Process (Standard Mode)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/subagent-driven-development/SKILL.md` at line 48, Rename the ambiguous
section header "The Process" to explicitly "The Standard Mode Process" in
SKILL.md so readers understand it refers only to Standard Mode (contrast with
the existing "Team Mode" section); update any nearby references or links that
mention "The Process" to match the new header text to avoid broken anchors and
ensure the Team Mode section remains distinct.

335-399: Consider adding language specifiers to fenced code blocks.

The fenced code block at line 335 (Team Mode example workflow) doesn't have a language specifier. Consider adding text to follow markdown best practices.

📝 Suggested fix
-```
+```text
 You: I'm using Subagent-Driven Development (team mode) for this plan.
 
 [Pre-flight: "How much context do you have left? If below ~50%, run /compact before I proceed."]

(Apply to the opening fence tag of the code block)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/subagent-driven-development/SKILL.md` around lines 335 - 399, The
fenced code block containing the "You: I'm using Subagent-Driven Development
(team mode) for this plan." Team Mode example should include a language
specifier; change the opening fence from ``` to ```text (leave the contents and
closing fence unchanged) so the block is marked as plain text in SKILL.md and
improves markdown rendering.

203-203: Consider varying sentence structure for better readability.

Three consecutive sentences begin with "Team lead," which slightly impacts readability. Consider rewording to vary the sentence structure.

✍️ Suggested rewording
-Validator agents (2) declare their state via TaskUpdate: `in_progress` when they receive a review request, `completed` when they return a verdict. Team lead relies on this TaskUpdate state as the authoritative availability signal. Team lead also observes message flow as a secondary activity indicator — a validator is treated as busy from the moment they receive a review request until their verdict arrives. Team lead does not send a second review to a validator before receiving their verdict for the first.
+Validator agents (2) declare their state via TaskUpdate: `in_progress` when they receive a review request, `completed` when they return a verdict. The team lead relies on this TaskUpdate state as the authoritative availability signal, while also observing message flow as a secondary activity indicator. A validator is treated as busy from the moment they receive a review request until their verdict arrives, and the team lead does not send a second review to a validator before receiving their verdict for the first.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/subagent-driven-development/SKILL.md` at line 203, The three
consecutive sentences starting with "Team lead" in the paragraph describing
Validator agents and TaskUpdate create a repetitive rhythm; rewrite them to vary
sentence structure while preserving meaning: keep the phrases "Validator agents
(2)", "TaskUpdate: `in_progress`", and "completed" and that the Team lead relies
on TaskUpdate as the authoritative availability signal, but rephrase so one
sentence explains TaskUpdate states, another explains the Team lead's reliance
on that state, and a third describes observing message flow as a secondary
indicator and the rule that a validator won’t receive a second review before
their verdict—all while avoiding starting each sentence with "Team lead".

128-133: Consider adding language specifiers to fenced code blocks.

The fenced code block at line 128 (team composition diagram) doesn't have a language specifier. While this doesn't affect functionality, adding text as a language identifier follows markdown best practices and may improve rendering in some viewers.

📝 Suggested fix
-```
+```text
 Team:
   - team-lead (you)
   - implementer-1, implementer-2, implementer-3   ← parallel, one per task group
   - validator-1, validator-2                       ← persistent, combined spec+quality

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @skills/subagent-driven-development/SKILL.md around lines 128 - 133, Add a
language specifier to the fenced code block that contains the "Team:"
composition diagram in SKILL.md by changing the opening fence from ``` to

entries like implementer-1, validator-1) is marked as plain text for better
rendering and markdown tooling.

191-197: Consider adding language specifiers to fenced code blocks.

The fenced code block at line 191 (review request template) doesn't have a language specifier. Consider adding text or markdown to improve rendering consistency.

📝 Suggested fix
-```
+```text
 Task spec: [FULL TEXT of task requirements]
 Implementer report: [full report including test output, diff stat, files changed]
 BASE_SHA: [commit before this task]
 HEAD_SHA: [current HEAD after implementer's commit]
 SHA map (for dependency re-reads): [dict of prior task → their HEAD_SHA, e.g. {"task-1": "abc123", "task-2": "def456"}]

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @skills/subagent-driven-development/SKILL.md around lines 191 - 197, The
fenced code block in the review-request template (the Task spec / Implementer
report / BASE_SHA block) lacks a language tag; update the triple-backtick fence
to include a language specifier such as text or markdown (e.g., change totext) so the block renders consistently—edit the fenced block in SKILL.md
where the Task spec/Implementer report/BASE_SHA/HEAD_SHA/SHA map snippet appears
to add the specifier.


</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @skills/subagent-driven-development/SKILL.md:

  • Around line 203-206: The text incorrectly claims validators update TaskUpdate
    to signal availability; remove references to TaskUpdate and instead state that
    validators use SendMessage to notify the team lead that their verdict is ready
    and that the team lead treats SendMessage as the authoritative availability
    signal. Update the paragraph to say validators are treated as busy from receipt
    of a review request until their SendMessage verdict arrives, and the team lead
    will not send a second review to a validator before receiving that SendMessage;
    ensure references to TaskUpdate are deleted and replace with SendMessage-based
    signaling (referencing SendMessage and the team lead's routing behavior).

Nitpick comments:
In @skills/subagent-driven-development/SKILL.md:

  • Line 48: Rename the ambiguous section header "The Process" to explicitly "The
    Standard Mode Process" in SKILL.md so readers understand it refers only to
    Standard Mode (contrast with the existing "Team Mode" section); update any
    nearby references or links that mention "The Process" to match the new header
    text to avoid broken anchors and ensure the Team Mode section remains distinct.
  • Around line 335-399: The fenced code block containing the "You: I'm using
    Subagent-Driven Development (team mode) for this plan." Team Mode example should
    include a language specifier; change the opening fence from totext
    (leave the contents and closing fence unchanged) so the block is marked as plain
    text in SKILL.md and improves markdown rendering.
  • Line 203: The three consecutive sentences starting with "Team lead" in the
    paragraph describing Validator agents and TaskUpdate create a repetitive rhythm;
    rewrite them to vary sentence structure while preserving meaning: keep the
    phrases "Validator agents (2)", "TaskUpdate: in_progress", and "completed" and
    that the Team lead relies on TaskUpdate as the authoritative availability
    signal, but rephrase so one sentence explains TaskUpdate states, another
    explains the Team lead's reliance on that state, and a third describes observing
    message flow as a secondary indicator and the rule that a validator won’t
    receive a second review before their verdict—all while avoiding starting each
    sentence with "Team lead".
  • Around line 128-133: Add a language specifier to the fenced code block that
    contains the "Team:" composition diagram in SKILL.md by changing the opening
    fence from totext so the block (the lines starting with "Team:" and the
    subsequent list entries like implementer-1, validator-1) is marked as plain text
    for better rendering and markdown tooling.
  • Around line 191-197: The fenced code block in the review-request template (the
    Task spec / Implementer report / BASE_SHA block) lacks a language tag; update
    the triple-backtick fence to include a language specifier such as text or
    markdown (e.g., change totext) so the block renders consistently—edit
    the fenced block in SKILL.md where the Task spec/Implementer
    report/BASE_SHA/HEAD_SHA/SHA map snippet appears to add the specifier.

</details>

---

<details>
<summary>ℹ️ Review info</summary>

**Configuration used**: Repository UI

**Review profile**: CHILL

**Plan**: Pro

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 141b990b8eb1aab769e38d6a2ae489fab6d8bd48 and 4ec9a5080784d9d23f3ec7a39c5d68d23121221f.

</details>

<details>
<summary>📒 Files selected for processing (1)</summary>

* `skills/subagent-driven-development/SKILL.md`

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Comment thread skills/subagent-driven-development/SKILL.md Outdated
rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 1, 2026
Define three evidence types (command, citation, diff) that all
other skills reference. Add rejection rule for missing evidence.
Add re-review loop bound (max 3 cycles). Add state.yml integration
for verification result persistence. Generalize "24 failure memories"
reference. Update Common Failures and Agent delegation patterns to
reference specific evidence types.

Addresses Findings V, Y from skills audit.
Inspired by PR obra#578 (@author) evidence-based review.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 1, 2026
…rsistence

Require command+diff evidence for task completion. Add TDD gate
(RED+GREEN evidence before marking complete). Add controller
anti-rationalization (never skip review because task seems simple).
Add re-review loop bound (max 3 cycles, then escalate). Add
pre-flight context check before each subagent dispatch. Add
state.yml progress tracking for cross-session resume. Fix final
reviewer scope to whole-feature (not just last task). Add token
optimization note for directory-based plans. Add review tiering
(light/standard/critical). Update example workflow to show
evidence collection and state.yml updates.

Addresses obra#485, obra#463, obra#291, obra#147, obra#87, PR obra#578, Findings D, K, V, W.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 1, 2026
…ates

Support mixed QA+implementation waves (Wave 0 = QA writes tests,
Wave N = implementation + QA writes tests for N+1). Add evidence
requirements (command + diff) for all completion reports. Add
state.yml wave tracking (plan.completed_tasks, plan.current_wave,
worktree.implementers.*). Add model tiering (opus for lead/review,
sonnet for implementers). Trim implementer prompt for project agents.
Add worktree lifecycle documentation, re-review loop bound (max 3
cycles), pre-flight context check, and citation evidence requirement
in reviewer prompts.

Addresses obra#429, obra#464, PR obra#578, Findings D, P, Q, T, U, V.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 1, 2026
Reject prose-only verdicts — require file:line citations per requirement.
Add anti-skip enforcement in Red Flags. Add re-review loop bound (max
3 cycles, then escalate). Add security review tier for auth/payment/data
tasks. Add per-implementer SHA guidance for team mode. Add uncommitted
changes check (git status before dispatch). References canonical evidence
format from verification-before-completion.

Addresses obra#528, obra#463, obra#479, PR obra#560, PR obra#578, Findings T, V.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 1, 2026
Define three evidence types (command, citation, diff) that all
other skills reference. Add rejection rule for missing evidence.
Add re-review loop bound (max 3 cycles). Add state.yml integration
for verification result persistence. Generalize "24 failure memories"
reference. Update Common Failures and Agent delegation patterns to
reference specific evidence types.

Addresses Findings V, Y from skills audit.
Inspired by PR obra#578 (@author) evidence-based review.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 1, 2026
…rsistence

Require command+diff evidence for task completion. Add TDD gate
(RED+GREEN evidence before marking complete). Add controller
anti-rationalization (never skip review because task seems simple).
Add re-review loop bound (max 3 cycles, then escalate). Add
pre-flight context check before each subagent dispatch. Add
state.yml progress tracking for cross-session resume. Fix final
reviewer scope to whole-feature (not just last task). Add token
optimization note for directory-based plans. Add review tiering
(light/standard/critical). Update example workflow to show
evidence collection and state.yml updates.

Addresses obra#485, obra#463, obra#291, obra#147, obra#87, PR obra#578, Findings D, K, V, W.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 1, 2026
…ates

Support mixed QA+implementation waves (Wave 0 = QA writes tests,
Wave N = implementation + QA writes tests for N+1). Add evidence
requirements (command + diff) for all completion reports. Add
state.yml wave tracking (plan.completed_tasks, plan.current_wave,
worktree.implementers.*). Add model tiering (opus for lead/review,
sonnet for implementers). Trim implementer prompt for project agents.
Add worktree lifecycle documentation, re-review loop bound (max 3
cycles), pre-flight context check, and citation evidence requirement
in reviewer prompts.

Addresses obra#429, obra#464, PR obra#578, Findings D, P, Q, T, U, V.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 1, 2026
Reject prose-only verdicts — require file:line citations per requirement.
Add anti-skip enforcement in Red Flags. Add re-review loop bound (max
3 cycles, then escalate). Add security review tier for auth/payment/data
tasks. Add per-implementer SHA guidance for team mode. Add uncommitted
changes check (git status before dispatch). References canonical evidence
format from verification-before-completion.

Addresses obra#528, obra#463, obra#479, PR obra#560, PR obra#578, Findings T, V.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 2, 2026
… PR obra#578)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 2, 2026
… PR obra#578)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@obra obra added bug Something isn't working claude-code Claude Code (Anthropic CLI) issues labels Mar 4, 2026
@STRML STRML force-pushed the feature/persistent-validator-pool branch from 9bec260 to 72d572f Compare March 10, 2026 00:53
@STRML
Copy link
Copy Markdown
Author

STRML commented Mar 10, 2026

Rebase note + design decision: surfacing Standard vs Team Mode

Just rebased on upstream main. One conflict in skills/writing-plans/SKILL.md worth documenting:

Upstream change: Restructured the execution handoff to be capability-based — if the harness has subagents, use subagent-driven-development; if not, fall back to executing-plans. No explicit choice offered.

This branch's change: Added Team Mode as an explicit option alongside Standard mode (the original fix that motivated this PR).

Resolution: Kept upstream's capability-based structure as the outer shell, but gated the Standard vs Team Mode choice on Team Mode availability — since it's a beta feature not present in most clients, presenting the choice unconditionally would be misleading:

If harness has subagents (Claude Code, etc.):
- REQUIRED: Use superpowers:subagent-driven-development
- If Team Mode is available (beta): offer Standard vs Team Mode — Team mode runs parallel implementers with persistent validators; best for 3+ independent tasks
- Otherwise: use Standard mode — sequential, fresh subagent per task, two-pass review

The design rationale: upstream's capability-based gating is correct, but within the subagent-capable path we want to surface Team Mode when it's available — otherwise users never discover it. Gating on availability avoids presenting a choice that most clients can't act on.

rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 10, 2026
Define three evidence types (command, citation, diff) that all
other skills reference. Add rejection rule for missing evidence.
Add re-review loop bound (max 3 cycles). Add state.yml integration
for verification result persistence. Generalize "24 failure memories"
reference. Update Common Failures and Agent delegation patterns to
reference specific evidence types.

Addresses Findings V, Y from skills audit.
Inspired by PR obra#578 (@author) evidence-based review.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 10, 2026
…ates

Support mixed QA+implementation waves (Wave 0 = QA writes tests,
Wave N = implementation + QA writes tests for N+1). Add evidence
requirements (command + diff) for all completion reports. Add
state.yml wave tracking (plan.completed_tasks, plan.current_wave,
worktree.implementers.*). Add model tiering (opus for lead/review,
sonnet for implementers). Trim implementer prompt for project agents.
Add worktree lifecycle documentation, re-review loop bound (max 3
cycles), pre-flight context check, and citation evidence requirement
in reviewer prompts.

Addresses obra#429, obra#464, PR obra#578, Findings D, P, Q, T, U, V.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 10, 2026
… PR obra#578)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 10, 2026
…rsistence

Require command+diff evidence for task completion. Add TDD gate
(RED+GREEN evidence before marking complete). Add controller
anti-rationalization (never skip review because task seems simple).
Add re-review loop bound (max 3 cycles, then escalate). Add
pre-flight context check before each subagent dispatch. Add
state.yml progress tracking for cross-session resume. Fix final
reviewer scope to whole-feature (not just last task). Add token
optimization note for directory-based plans. Add review tiering
(light/standard/critical). Update example workflow to show
evidence collection and state.yml updates.

Addresses obra#485, obra#463, obra#291, obra#147, obra#87, PR obra#578, Findings D, K, V, W.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 10, 2026
Reject prose-only verdicts — require file:line citations per requirement.
Add anti-skip enforcement in Red Flags. Add re-review loop bound (max
3 cycles, then escalate). Add security review tier for auth/payment/data
tasks. Add per-implementer SHA guidance for team mode. Add uncommitted
changes check (git status before dispatch). References canonical evidence
format from verification-before-completion.

Addresses obra#528, obra#463, obra#479, PR obra#560, PR obra#578, Findings T, V.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 11, 2026
…e management

Comprehensive audit of all 16 skills with fixes and enhancements:

Skills improvements:
- verification-before-completion: canonical evidence format (command/citation/diff)
- using-git-worktrees: rework to layer on native EnterWorktree
- test-driven-development: add pipelined TDD mode and evidence gates
- agent-team-driven-development: add pipelined TDD and evidence gates
- dispatching-parallel-agents: add worktree isolation and autonomy
- composing-teams: add agent hierarchy and test-writer role
- receiving-code-review: add evidence demand and pedantry filter
- systematic-debugging: improve CSO, add fork and worktree support
- writing-skills: align with Anthropic guide, add versioning
- finishing-a-development-branch: fix worktree CWD and add state.yml
- requesting-code-review: require citation evidence from reviewers
- writing-plans: add directory-based plans and test expectations
- brainstorming: add verification gates, research step, assumption challenging
- executing-plans: add evidence gates and cold session resume
- subagent-driven-development: add evidence gates and progress persistence
- using-superpowers: trim red flags, fix subagent context gap

Cross-cutting fixes:
- Generalize "your human partner" to "the user" across all skills
- Restore Cursor/Codex/multi-platform support across skills
- Correct spec-reviewer agent type and remove bias language (CX-4, CX-18, CX-19)
- Align code-reviewer placeholder to {PLAN_OR_REQUIREMENTS} (CX-7)
- Add worktree env file guidance and fix hook bash compatibility (obra superpowers obraGH-521, obraGH-572, obraGH-584)
- Add integration section and sequential fallback to dispatching (CX-17, markelz0r)
- Add announce and cross-task dependency check to agent-team (CX-15, obra superpowers obraGH-578)
- Update stale skill references and verification links (CX-9)
- Address CodeRabbit review findings

Documentation:
- Skills audit brainstorm, design document, and implementation plan
- Skills audit v2 design, research, and implementation plan
- Upstream research: open issues/PRs, closed issues, forks survey, unmerged PRs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rahulsc added a commit to rahulsc/superpowers that referenced this pull request Mar 12, 2026
…e management

Comprehensive audit of all 16 skills with fixes and enhancements:

Skills improvements:
- verification-before-completion: canonical evidence format (command/citation/diff)
- using-git-worktrees: rework to layer on native EnterWorktree
- test-driven-development: add pipelined TDD mode and evidence gates
- agent-team-driven-development: add pipelined TDD and evidence gates
- dispatching-parallel-agents: add worktree isolation and autonomy
- composing-teams: add agent hierarchy and test-writer role
- receiving-code-review: add evidence demand and pedantry filter
- systematic-debugging: improve CSO, add fork and worktree support
- writing-skills: align with Anthropic guide, add versioning
- finishing-a-development-branch: fix worktree CWD and add state.yml
- requesting-code-review: require citation evidence from reviewers
- writing-plans: add directory-based plans and test expectations
- brainstorming: add verification gates, research step, assumption challenging
- executing-plans: add evidence gates and cold session resume
- subagent-driven-development: add evidence gates and progress persistence
- using-superpowers: trim red flags, fix subagent context gap

Cross-cutting fixes:
- Generalize "your human partner" to "the user" across all skills
- Restore Cursor/Codex/multi-platform support across skills
- Correct spec-reviewer agent type and remove bias language (CX-4, CX-18, CX-19)
- Align code-reviewer placeholder to {PLAN_OR_REQUIREMENTS} (CX-7)
- Add worktree env file guidance and fix hook bash compatibility (obra superpowers obraGH-521, obraGH-572, obraGH-584)
- Add integration section and sequential fallback to dispatching (CX-17, markelz0r)
- Add announce and cross-task dependency check to agent-team (CX-15, obra superpowers obraGH-578)
- Update stale skill references and verification links (CX-9)
- Address CodeRabbit review findings

Documentation:
- Skills audit brainstorm, design document, and implementation plan
- Skills audit v2 design, research, and implementation plan
- Upstream research: open issues/PRs, closed issues, forks survey, unmerged PRs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@STRML STRML force-pushed the feature/persistent-validator-pool branch from 9928621 to b049f7e Compare March 13, 2026 23:37
@STRML
Copy link
Copy Markdown
Author

STRML commented Mar 13, 2026

Code review

Found 3 issues:

  1. Validator state tracking contradiction: SendMessage vs TaskUpdate — The "Validator Idle State Tracking" section (line 242) declares SendMessage receipt as the authoritative availability signal, explicitly saying "not TaskUpdate state." But the "Recovering After Compaction" section (line 277) says to "Check validator TaskUpdate state to see which validators are busy," and the example workflow (lines 405-407) shows using TaskList and TaskUpdate to track validator state. These cannot all be correct. Additionally, validator-prompt.md (line 78) only instructs validators to send a SendMessage — it never tells them to call TaskUpdate — so the compaction recovery procedure has no working mechanism for validator state.

### Validator Idle State Tracking
A validator is treated as busy from the moment they receive a review request until their SendMessage verdict arrives. Team lead treats SendMessage receipt as the authoritative availability signal — not TaskUpdate state. Team lead does not send a second review to a validator before receiving their SendMessage verdict for the first.
Validators notify the team lead when their verdict is ready via `SendMessage`. The team lead routes the next available review request only after receiving that message.

1. Run `TaskList` immediately to read authoritative task state (completed, in_progress, pending)
2. Check validator TaskUpdate state to see which validators are busy
3. Only send status-check messages to implementers whose tasks show `in_progress` but have not yet sent a report — do not broadcast to all members

After sending this verdict, notify the team lead via `SendMessage` so they know you are available for the next review.

  1. Stale compaction reference in Advantages section — Line 469 says "Lean team lead context (compact before spawning keeps orchestration turns fast)" but the Pre-Flight section and Key Constraints both replaced compaction with the EnterPlanMode approach. The parenthetical describes the old flow.

- Reviews run concurrently with ongoing implementation
- Lean team lead context (compact before spawning keeps orchestration turns fast)

  1. Dangling reference to deleted code-quality-reviewer-prompt.mdRELEASE-NOTES.md line 445 still lists code-quality-reviewer-prompt.md as an existing prompt template, but the PR deletes this file.

- `spec-reviewer-prompt.md` - Skeptical verification against requirements
- `code-quality-reviewer-prompt.md` - Standard code review

@STRML STRML force-pushed the feature/persistent-validator-pool branch from 13cd1ca to 4ba475c Compare March 14, 2026 01:36
@STRML
Copy link
Copy Markdown
Author

STRML commented Mar 15, 2026

This is ready to merge after using on a large number of internal projects. It is by my estimation 2-5x faster than subagent mode and appears to be easier on tokens as well, as the team members persist and do not need to re-read files for context.

wxtry added a commit to wxtry/superpowers that referenced this pull request Mar 23, 2026
…olution

Merges Phase 1 (mandatory evidence + file:line citations) and Phase 2
(persistent validator pool, team mode) from STRML's PR obra#578.

Conflict in writing-plans/SKILL.md resolved: kept both execution options
(Subagent-Driven + Inline Execution) while adding Team Mode detection
within the subagent path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@obra obra added the subagents Subagent-driven development and dispatch label Mar 23, 2026
STRML added 23 commits April 2, 2026 21:36
Implementers must now include full test output, git diff --stat,
and file list before reporting complete. Reports without evidence
will be rejected by the controller.
Spec reviewer must now produce per-requirement file:line citations
with code excerpts. Prose-only approvals are rejected. This makes
phantom completion immediately detectable from the evidence trail.
Add TeamCreate branch to When to Use diagram. Update description
and core principle to reflect team mode with persistent validators.
Combined spec+quality review in a single SendMessage payload.
Includes SHA map for cross-task dependency re-reads, mandatory
file:line citations per requirement, and self-reporting via TaskUpdate.
Add Team Mode section to subagent-driven-development:
- Pre-flight context compaction recommendation before TeamCreate
- 2 persistent validator agents (combined spec+quality review)
- SHA map in review requests for cross-task dependency re-reads
- 3-cycle review cap with escalation
- Context poisoning mitigations: re-read rule, bounded re-open, cascade handling
- Validator rotation for plans >8 tasks with explicit tradeoff disclosure
- Validator failure recovery (60s timeout -> fresh subagent fallback)
…prompt

Remove code-quality-reviewer-prompt.md (superseded by validator-prompt.md
in team mode; standard mode uses requesting-code-review skill).
Update Red Flags to cover both modes including team-mode-specific rules.
Require file:line citations before accepting any verdict.
Add compaction reminder to team mode never list.
Show team mode orchestration with parallel implementers, SHA map
maintenance, and validator round-trip. Update cost analysis and
add context compaction note to team lead overhead.
Align with revised task instructions:
- Prompt Templates: list code-quality-reviewer-prompt.md explicitly for standard mode
- Pre-Flight: use concise single-paragraph format
- Key Constraints: add 9th constraint (never skip compaction)
- Team Lifecycle: simplified numbering (6 steps instead of 7)
- Example Workflow: update standard example inline with validator pattern
- Cost section: match revised format with per-mode breakdown
…Templates

code-quality-reviewer-prompt.md is deleted; standard mode now references
superpowers:requesting-code-review directly (which the file called anyway).
Standard mode example workflow was incorrectly updated to show combined
validator dispatch. Standard mode still uses separate spec and quality
reviewers. Team mode example (separate section) correctly shows validator.
- Quality Gates: distinguish standard (two-stage) vs team (single-pass) review
- Validator Idle Tracking: remove TaskUpdate-based availability (task ID never
  defined); replace with message-based flow (verdict receipt = available)
- Red Flags "If reviewer finds issues": add mode qualifiers (standard: same
  subagent; team: SendMessage to implementer + route to validator)
- validator-prompt.md: remove TaskUpdate(completed) call (no validator task
  exists to update); replace with SendMessage notification instruction
- Process digraph and Key Constraints: update to match message-based tracking
Previous guidance was passive ("if 20+ turns, consider compacting") and
easily skipped. A team lead with near-full context will fail mid-plan
with no recovery path.

Now: before TeamCreate, the agent must ask the user how much context
remains and wait for /compact before proceeding if below ~50%. This is
a hard gate, not a suggestion. Updated Pre-Flight, Key Constraints,
Team Lifecycle, Red Flags, example workflow, and Cost section.
- design doc "Files to Change": correct Phase 2 entry to show
  validator-prompt.md (new file) instead of extending spec-reviewer-prompt.md
- design doc rotation: add handoff protocol — outgoing validator sends
  dependency summary to team lead before being replaced
- SKILL.md rotation: same handoff protocol added
- implementation plan Task 4b: add prominent SKIPPED notice (SDK memory
  was retained; persistent topology was implemented instead)

The TaskUpdate inconsistency flagged in SKILL.md:206 was already
resolved in commit 582927a.
…signal

Team lead now explicitly checks TaskUpdate state before routing reviews,
with message flow as a secondary activity indicator. Resolves inconsistency
between line 126 (validators declare idle/busy via TaskUpdate) and the
Validator Idle State Tracking section (which only described message flow).
Standard mode performs sequential two-pass review (spec then quality);
team mode runs combined spec+quality in a single validator pass.
The opening sentence incorrectly implied combined review for all modes.
After compaction, TaskList persists as external storage — team lead
should read it immediately rather than assuming state was lost and
pinging all implementers. Blind broadcasts are unnecessary and expensive.
…dator availability signal

Validators signal availability via SendMessage verdict only.
TaskUpdate is not used for availability tracking. Removes inconsistency
introduced in previous review pass.
Mike reported a bug where the coordinator bypassed verification by doing
implementation work itself instead of delegating to implementer subagents.
Adds an explicit constraint and Red Flag entry to prevent this.
Execution handoff previously offered only 2 options (standard, parallel
session), leaving team mode unreachable. subagent-driven-development gates
team mode on user opt-in, but the opt-in question was never asked.

Now presents 3 explicit choices with descriptions to help users decide.
…xt launch

Team mode now synthesizes a complete self-contained execution prompt and
presents it via EnterPlanMode. User selects 'clear context and auto-accept
edits' to start team execution with clean context — no /compact needed.
…rite pattern

SendMessage is the primary signal during normal operation; TaskUpdate
provides durability for compaction recovery. Team lead owns idle→busy
(in_progress on assign), validator owns busy→idle (completed after
verdict). Also fixes stale compaction reference in advantages section
and dangling code-quality-reviewer-prompt.md reference in release notes.
@STRML STRML force-pushed the feature/persistent-validator-pool branch from 4ba475c to faed54f Compare April 3, 2026 01:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working claude-code Claude Code (Anthropic CLI) issues subagents Subagent-driven development and dispatch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants