[enhancement] acceptance-review 缺在 IDD mode model — 「人不 review」需要 first-class 做法#109
[enhancement] acceptance-review 缺在 IDD mode model — 「人不 review」需要 first-class 做法#109kiki830621 wants to merge 5 commits into
Conversation
…octrine (Refs #102) Per /spectra-discuss conclusion (issue-comment-4486107094): IDD's human-in-the-loop is the NSQL confirmation loop. The human confirms intent BEFORE execution (at issue + idd-diagnose); idd-verify is an execution-fidelity check, not a confirmation loop. verify-gated PASS becomes the named, sanctioned terminal default disposition. New MANIFESTO section between '兩個正交的維度' axis discussion and 'TDD/SDD special case'. Contains: - NSQL confirmation loop ⇆ IDD pipeline mapping table - verify-gated terminal-disposition doctrine - Verify-as-review reframe (5 adversarial agents + Codex exceed single human merge reviewer on correctness) - --review flag: opt-in re-open of the confirmation loop, NOT a quality gate; per-invocation flag, NOT a standing config field - auto-merge clarification: legitimate under verify-gated PASS, justified by 'verify is the gate' (not 'merges are reversible'); auto-merge != auto-close; #37 owns autopilot mechanics Doctrine deliberately does NOT change idd-all default (鐵律 keeps '永遠不 auto-merge PR'). MANIFESTO frames what verify-gated MEANS; #37 handles when autopilot actually performs auto-merge. NSQL reference at kiki830621/NSQL (v4.1.0), already registered in CLAUDE.md Reference Projects (commit fd2f21c via #103).
…ing (Refs #102) Implements the doctrine added in 4aef425 (MANIFESTO Human-in-the-loop / NSQL Confirmation Protocol section). idd-all Phase 6 terminal report now reflects the verify-gated default disposition: - New --review flag parses to $REVIEW_FLAG (orthogonal to --pr/--no-pr/ --in-chain, no mutex) - argument-hint updated to advertise --review - Step 0 bootstrap adds parse_review_flag TaskCreate entry - Phase 6 PR-mode report dispatches on $REVIEW_FLAG: - default: 'Verify: verify-gated PASS' + 'Next: merge ..., then /idd-close' - --review: 'Verify: verify-gated PASS — awaiting human acceptance (re-opened confirmation loop per --review)' + 'Next: review PR ..., merge after acceptance, then /idd-close' - Phase 6 direct-commit-mode report: analogous swap - 鐵律 line 834 ('永遠不 auto-merge PR') UNCHANGED — auto-merge mechanic belongs to #37 autopilot, not idd-all default. Doctrine permits; default behavior does not perform. --review is messaging-only — does NOT make idd-all wait, does NOT change verify/implement/close behavior. Effect: terminal report wording. Per discussion conclusion (issue-comment-4486107094), a standing acceptance_policy config field is deferred until #37 needs it.
…conditional wording (Refs #102) Mirrors idd-all #102 changes (commit 1fdcba0) for the chain orchestrator: - argument-hint advertises --review (alongside --bfs and --cwd) - Phase 0 args parsing recognizes --review → $REVIEW_FLAG - Phase 2 chain loop propagates $REVIEW_FLAG to each chained sub-skill call: 'Skill(skill=idd-all, args=#$CURRENT --in-chain --cwd $CWD${REVIEW_FLAG:+ $REVIEW_FLAG})' — ${REVIEW_FLAG:+ ...} guarantees a leading space ONLY when set, avoiding stray-space fragility when flag unset. - Step 0 bootstrap adds parse_review_flag task entry - Cluster PR body Pending Review checklist dispatches on $REVIEW_FLAG: - default: '- [x] **Verify-gated**: per-issue verify PASS — cluster ready to merge' - --review: '- [ ] **Pending: human acceptance review of cluster PR** (per --review flag)' Per MANIFESTO 'Human-in-the-loop: IDD 即 NSQL Confirmation Protocol' doctrine (commit 4aef425). --review is messaging + propagation only — chain shell still stops at verified, never auto-closes, never auto-merges. idd-implement Step 5.5 / idd-all Phase 5 / pr-flow.md canonical PR-body templates intentionally untouched in this PR — tracked in #108 as the 4-template consistency family sync follow-up (per Step 2.5 tangential).
CHANGELOG v2.65.0 entry covers the #102 NSQL doctrine + --review flag: - MANIFESTO Human-in-the-loop section (commit 4aef425) - idd-all --review flag + Phase 6 verify-gated wording (commit 1fdcba0) - idd-all-chain --review propagation + Phase 4 / PR body conditional (commit 11a99e0) Notes section explicitly records that idd-implement Step 5.5 / idd-all Phase 5 / pr-flow.md canonical PR-body checklist wording is intentionally left at old wording for this release — 4-template consistency family follow-up tracked in #108. plugin.json version 2.64.0 → 2.65.0 (minor: terminal messaging behavior change + additive --review flag; MANIFESTO doctrine is documentation). plugin.json description field intentionally not rewritten — that is plugin-update Phase 2 marketplace.json sync's job at /idd-close Step 6.5, not in this PR (same pattern as #103's 2f63feb).
…tion + CHANGELOG chain-flow.md note (Refs #102) In-PR polish from /idd-verify --pr 109 (6-AI ensemble: 3/6 reviewers caught F1, 3/6 missed — regression / DA / Codex convergent). F1 (HIGH blocking, caught by regression + DA + Codex): - idd-all-chain/SKILL.md PR-body checklist conditional used ${REVIEW_FLAG:+A}${REVIEW_FLAG:-B} thinking it was a mutex. It is NOT — ${var:-word} returns $var when set, not the alternative branch. Under --review, the line rendered with literal '--review' appended at the end. Reproduced empirically: REVIEW_FLAG=--review yielded '...after merge--review' instead of '...after merge'. - Rewrote using explicit 'if [ -n "$REVIEW_FLAG" ]; then ...; else ...; fi' BEFORE the heredoc, composing the line into REVIEW_CHECKLIST_LINE which the heredoc then interpolates. Empirically validated both branches render correctly. - Also renamed the '## Pending review' heading to '## Review status' since it is no longer always 'pending' under the default (verify-gated) path. DA1 (LOW, Devil's Advocate): - MANIFESTO doctrine clarified to extend the 'auto-merge is not the idd-all default' constraint to /loop / ralph-loop / external CI callers too. Phase 6 'verify-gated PASS, ready to merge' is a state declaration, NOT an authorization for any caller to script 'gh pr merge'. Autopilot remains #37's territory. F2 (LOW, regression): - CHANGELOG Notes section extended to mention 'references/chain-flow.md' as the 4th file in the consistency family — Step 5.7 sister sweep caught it during /idd-implement #102 and updated #108 to 5-template scope. Notes section now matches that scope. Deferred (recorded in verify findings, not blocking): - F3 (regression LOW): Phase 4 final report stdout text not --review-aware - DA2 (DA INFO): logic reviewer's bash analysis error — process observation - DA3 (DA INFO): 'messaging-only' could be qualified to 'orchestrator-scope'
Verify Report — PR #109Engine5 general-purpose Agents (Claude reviewers, file-based output) + Codex (gpt-5.5 xhigh, run_in_background) → 6-AI ensemble AggregateCONDITIONAL PASS → fixed in-PR → effectively PASS — 1 HIGH blocking (F1) caught by 3 of 6 reviewers (regression / DA / Codex convergent), fixed in commit Scope coveragePR Pre-merge gates
#102 — IDD acceptance-review doctrine +
|
| Deliverable | Status | Evidence |
|---|---|---|
| 1. MANIFESTO new section (Human-in-the-loop / NSQL Confirmation Protocol) | FULLY | MANIFESTO.md:95-135 — mapping table + verify-gated terminal default + verify-as-review reframe + --review semantics + auto-merge clarification + (post-fix) /loop external-caller clarification. 4aef425 + dc61ffb |
| 2. idd-all + idd-all-chain Phase 6/4 messaging swap | FULLY (post-fix) | idd-all/SKILL.md:778-843 (Phase 6 PR+direct-commit reports dispatch on $REVIEW_FLAG). idd-all-chain/SKILL.md:561-578 (PR-body checklist conditional, now mutex-correct after dc61ffb). Section heading renamed ## Pending review → ## Review status. 1fdcba0 + 11a99e0 + dc61ffb |
3. --review per-invocation flag |
FULLY | Both skills' Phase 0 args parsers + argument-hints + Step 0 bootstrap parse_review_flag tasks. idd-all-chain Phase 2 chain loop propagates via ${REVIEW_FLAG:+ $REVIEW_FLAG} to sub-/idd-all --in-chain. Orthogonal (no mutex) with --pr/--no-pr/--in-chain/--bfs/--cwd. 1fdcba0 + 11a99e0 |
Scope compliance ✓: out-of-scope items intact —
idd-implementStep 5.5 +idd-allPhase 5 +references/pr-flow.md+references/chain-flow.mdPR-body checklist wording untouched → tracked in Sync 4-template PR-body checklist wording to match #102 NSQL doctrine #108 (5-template family follow-up, scope extended during Step 5.7)- Autopilot auto-merge mechanics → feature: /idd-all (無參數) bulk-solve mode — AI 自動 sequence + bundle 所有 open issue #37 territory
references/pr-flow.mdmode resolution algorithm →--reviewis not a third axisidd-verify/idd-implement/idd-closeskill internals untouched — doctrine interprets, doesn't change themREADME.md→ deferred to/idd-closeStep 6.5 //plugin-tools:plugin-updatePhase 2.5 (per IDD human-in-the-loop should conform to the NSQL confirmation protocol (v4.1.0) #103 precedentdb25514)- Standing
acceptance_policyconfig field → deferred per Discuss Conclusion
Findings
| # | Severity | Action | Source | Description | Status |
|---|---|---|---|---|---|
| F1 | HIGH | In-PR fix | agents:regression+devils-advocate+codex (3 sources convergent) | idd-all-chain/SKILL.md:566 used ${REVIEW_FLAG:+A}${REVIEW_FLAG:-B} thinking it was a mutex. ${var:-word} returns $var when set, NOT the alternative branch. Under --review, line rendered with stray --review literal appended (empirically reproduced: ... after merge--review). |
FIXED in dc61ffb — explicit if/else builds $REVIEW_CHECKLIST_LINE before the heredoc; empirically validated both branches mutually exclusive |
| DA1 | LOW | In-PR fix | agents:devils-advocate | MANIFESTO doctrine could be misread by /loop / ralph-loop / external CI callers as authorizing gh pr merge after Phase 6 verify-gated PASS report. Doctrine sanctions verify-gated as terminal state, NOT as auto-merge authorization. |
FIXED in dc61ffb — MANIFESTO doctrine extended with explicit /loop / external-CI clarification: state declaration ≠ merge authorization; autopilot remains #37 territory |
| F2 | LOW | In-PR fix | agents:regression | CHANGELOG Notes section listed 4-template scope for #108, but Step 5.7 sister sweep already extended it to 5-template (added chain-flow.md:254). Notes drifted. |
FIXED in dc61ffb — CHANGELOG Notes section now lists chain-flow.md as the 4th file + adds parenthetical explaining the 4→5 scope expansion |
| F3 | LOW | Follow-up | agents:regression | idd-all-chain Phase 4 final stdout text (forest tree + STOP message) is not --review-aware. PR-body checklist is (post-fix). |
Defer — terminal output cosmetic, PR-body is the audit artifact |
| DA2 | INFO | Process note | agents:devils-advocate | Logic reviewer's claim "bash-correct mutex" was an analysis error (didn't empirically test the ${VAR:-word} semantics). Process improvement: complex bash idioms should be empirically reproduced, not eyeballed. |
Defer — process observation only, no code change |
| DA3 | INFO | Follow-up | agents:devils-advocate | --review "messaging-only" claim is true at the orchestrator scope but downstream (humans, CI parsers) can react to the text differently. Refining language to "orchestrator-scope messaging-only" might be tighter. |
Defer — wording refinement, not regression-blocking |
| LOW (cosmetic) | LOW | Defer | agents:requirements | 3 cosmetic observations: stale Trace 1 example block in idd-all (lines 893-901) still shows old Verify: PASS wording; argument-hint length growing; future-robustness note on single-source REVIEW_FLAG parsing. |
Defer — examples/hints stale but not user-visible-broken; can address in a doc polish pass alongside #108 |
Process Gaps
(none — all 5 Claude reviewer Agents + Codex produced findings files first attempt, no Step 2.5 Recovery Protocol fires)
Self-referentiality check (DA adversarial angle 1)
Doctrine self-referentiality (PR adds doctrine that verify-gated PASS is sufficient terminal, ships via 6-AI verify whose PASS would itself be the sanctioned disposition) — DA dismissed as blocker because:
- The discuss conclusion (commit
4aef425references it) is external grounding — the doctrine isn't bootstrapped from this PR's verify alone. - NSQL (
kiki830621/NSQL) is an independent external reference protocol that IDD conforms to, not invents. - The 6-AI ensemble's adversarial design (Devil's Advocate + Codex cross-model) is itself the falsifiable check the doctrine names — convergent F1 catch demonstrates the design works.
Next
- F1 + DA1 + F2 already fixed in
dc61ffb(committed + pushed). PR [enhancement] acceptance-review 缺在 IDD mode model — 「人不 review」需要 first-class 做法 #109 now ready for merge. - F3 / DA2 / DA3 / cosmetic findings → Step 5b triage (most defer; some land in Sync 4-template PR-body checklist wording to match #102 NSQL doctrine #108 or new follow-up).
- After polish triage: PR [enhancement] acceptance-review 缺在 IDD mode model — 「人不 review」需要 first-class 做法 #109 ready for merge →
/idd-close #102.
Reports referenced
- Requirements: PASS (3 cosmetic obs)
- Logic: PASS (analysis error on F1, but 6 other checks correct)
- Security: PASS (0 findings)
- Regression: CONDITIONAL → PASS (F1 fixed)
- Devil's Advocate: CONDITIONAL → PASS (F1 + DA1 fixed)
- Codex: PARTIALLY → PASS (F1 fixed; concurred on chain-flow drift now in Sync 4-template PR-body checklist wording to match #102 NSQL doctrine #108)
Refs #102
Summary
Ships the MANIFESTO doctrine +
--reviewflag that came out of/spectra-discuss#102. Three sequenced changes implement the conclusion:MANIFESTO doctrine — new section
## Human-in-the-loop: IDD 即 NSQL Confirmation Protocolbetween the verification/closure-axes discussion and the TDD/SDD special-case section. Names the structural fact that IDD's human-in-the-loop is an NSQL Confirmation Protocol instance: human's confirmation loop closes before execution (atissue+idd-diagnose);idd-verifyis execution-fidelity, not a confirmation loop;verify-gatedis the named, sanctioned terminal default disposition;--reviewre-opens the confirmation loop (opt-in, NOT a quality gate); auto-merge legitimate but not theidd-alldefault — autopilot mechanics belong to feature: /idd-all (無參數) bulk-solve mode — AI 自動 sequence + bundle 所有 open issue #37.idd-all--review flag + Phase 6 wording — new--reviewflag parses to$REVIEW_FLAG. Phase 6 PR-mode + direct-commit-mode reports dispatch on the flag:Verify: verify-gated PASS+Next: merge, then /idd-close--review:Verify: verify-gated PASS — awaiting human acceptance (re-opened confirmation loop per --review)+Next: review PR, merge after acceptance, then /idd-closeDrops the legacy
Pending: human reviewframing that implied a default second gate. 鐵律永遠不 auto-merge PRunchanged — the doctrine permits; the default does not perform.idd-all-chain--review propagation + PR body conditional — Phase 0 args parsing recognizes--review. Phase 2 chain loop appends$REVIEW_FLAGto each/idd-all #M --in-chaininvocation (via${REVIEW_FLAG:+ $REVIEW_FLAG}for safe whitespace handling). Cluster PR body checklist dispatches:- [x] Verify-gated: per-issue verify PASS — cluster ready to merge--review:- [ ] Pending: human acceptance review of cluster PR (per --review flag)Plugin bumped to v2.65.0.
Out-of-scope (deliberate)
idd-implementStep 5.5 /idd-allPhase 5 /pr-flow.mdcanonical PR-body checklist wording — filed as Sync 4-template PR-body checklist wording to match #102 NSQL doctrine #108 (4-template consistency follow-up). Discuss conclusion's narrow scope kept--reviewto the orchestrators only; Sync 4-template PR-body checklist wording to match #102 NSQL doctrine #108 syncs the rest post-merge.pr-flow.mdmode resolution algorithm —--reviewis per-invocation, NOT a third axis.pr-flow.mduntouched.acceptance_policyconfig field — deferred per Discuss Conclusion./idd-closeStep 6.5 via/plugin-tools:plugin-updatePhase 2.5 (per IDD human-in-the-loop should conform to the NSQL confirmation protocol (v4.1.0) #103 precedentdb25514).Checklist
Generated by /idd-implement on PR path. Do NOT add a GitHub close trailer (Closes/Fixes/Resolves) — IDD discipline requires manual /idd-close after merge to enforce checklist gate + closing summary.