Skip to content

[enhancement] acceptance-review 缺在 IDD mode model — 「人不 review」需要 first-class 做法#109

Open
kiki830621 wants to merge 5 commits into
mainfrom
idd/102-acceptance-review-doctrine
Open

[enhancement] acceptance-review 缺在 IDD mode model — 「人不 review」需要 first-class 做法#109
kiki830621 wants to merge 5 commits into
mainfrom
idd/102-acceptance-review-doctrine

Conversation

@kiki830621
Copy link
Copy Markdown
Contributor

Refs #102

Summary

Ships the MANIFESTO doctrine + --review flag that came out of /spectra-discuss #102. Three sequenced changes implement the conclusion:

  1. MANIFESTO doctrine — new section ## Human-in-the-loop: IDD 即 NSQL Confirmation Protocol between the verification/closure-axes discussion and the TDD/SDD special-case section. Names the structural fact that IDD's human-in-the-loop is an NSQL Confirmation Protocol instance: human's confirmation loop closes before execution (at issue + idd-diagnose); idd-verify is execution-fidelity, not a confirmation loop; verify-gated is the named, sanctioned terminal default disposition; --review re-opens the confirmation loop (opt-in, NOT a quality gate); auto-merge legitimate but not the idd-all default — autopilot mechanics belong to feature: /idd-all (無參數) bulk-solve mode — AI 自動 sequence + bundle 所有 open issue #37.

  2. idd-all --review flag + Phase 6 wording — new --review flag parses to $REVIEW_FLAG. Phase 6 PR-mode + direct-commit-mode reports dispatch on the flag:

    • default: Verify: verify-gated PASS + Next: merge, then /idd-close
    • --review: Verify: verify-gated PASS — awaiting human acceptance (re-opened confirmation loop per --review) + Next: review PR, merge after acceptance, then /idd-close

    Drops the legacy Pending: human review framing that implied a default second gate. 鐵律 永遠不 auto-merge PR unchanged — the doctrine permits; the default does not perform.

  3. idd-all-chain --review propagation + PR body conditional — Phase 0 args parsing recognizes --review. Phase 2 chain loop appends $REVIEW_FLAG to each /idd-all #M --in-chain invocation (via ${REVIEW_FLAG:+ $REVIEW_FLAG} for safe whitespace handling). Cluster PR body checklist dispatches:

    • default: - [x] Verify-gated: per-issue verify PASS — cluster ready to merge
    • --review: - [ ] Pending: human acceptance review of cluster PR (per --review flag)

Plugin bumped to v2.65.0.

Out-of-scope (deliberate)

Checklist

  • Diagnose ✓ (Diagnosis — Complexity initially Spectra)
  • Discuss ✓ (Conclusion — re-routed Spectra → Plan)
  • Plan ✓ (Implementation Plan — EnterPlanMode approved 2026-05-20)
  • Implement (4 commits, +123/−12 across 5 files)
  • Verify (run /idd-verify --pr )
  • Pending: human review of this PR + /idd-close after merge

Generated by /idd-implement on PR path. Do NOT add a GitHub close trailer (Closes/Fixes/Resolves) — IDD discipline requires manual /idd-close after merge to enforce checklist gate + closing summary.

…octrine (Refs #102)

Per /spectra-discuss conclusion (issue-comment-4486107094): IDD's
human-in-the-loop is the NSQL confirmation loop. The human confirms
intent BEFORE execution (at issue + idd-diagnose); idd-verify is an
execution-fidelity check, not a confirmation loop. verify-gated PASS
becomes the named, sanctioned terminal default disposition.

New MANIFESTO section between '兩個正交的維度' axis discussion and
'TDD/SDD special case'. Contains:

- NSQL confirmation loop ⇆ IDD pipeline mapping table
- verify-gated terminal-disposition doctrine
- Verify-as-review reframe (5 adversarial agents + Codex exceed single
  human merge reviewer on correctness)
- --review flag: opt-in re-open of the confirmation loop, NOT a quality
  gate; per-invocation flag, NOT a standing config field
- auto-merge clarification: legitimate under verify-gated PASS,
  justified by 'verify is the gate' (not 'merges are reversible');
  auto-merge != auto-close; #37 owns autopilot mechanics

Doctrine deliberately does NOT change idd-all default (鐵律 keeps
'永遠不 auto-merge PR'). MANIFESTO frames what verify-gated MEANS;
#37 handles when autopilot actually performs auto-merge.

NSQL reference at kiki830621/NSQL (v4.1.0), already registered in
CLAUDE.md Reference Projects (commit fd2f21c via #103).
…ing (Refs #102)

Implements the doctrine added in 4aef425 (MANIFESTO Human-in-the-loop /
NSQL Confirmation Protocol section). idd-all Phase 6 terminal report
now reflects the verify-gated default disposition:

- New --review flag parses to $REVIEW_FLAG (orthogonal to --pr/--no-pr/
  --in-chain, no mutex)
- argument-hint updated to advertise --review
- Step 0 bootstrap adds parse_review_flag TaskCreate entry
- Phase 6 PR-mode report dispatches on $REVIEW_FLAG:
  - default: 'Verify: verify-gated PASS' + 'Next: merge ..., then /idd-close'
  - --review: 'Verify: verify-gated PASS — awaiting human acceptance (re-opened
    confirmation loop per --review)' + 'Next: review PR ..., merge after
    acceptance, then /idd-close'
- Phase 6 direct-commit-mode report: analogous swap
- 鐵律 line 834 ('永遠不 auto-merge PR') UNCHANGED — auto-merge mechanic
  belongs to #37 autopilot, not idd-all default. Doctrine permits;
  default behavior does not perform.

--review is messaging-only — does NOT make idd-all wait, does NOT
change verify/implement/close behavior. Effect: terminal report
wording. Per discussion conclusion (issue-comment-4486107094), a
standing acceptance_policy config field is deferred until #37 needs
it.
…conditional wording (Refs #102)

Mirrors idd-all #102 changes (commit 1fdcba0) for the chain orchestrator:

- argument-hint advertises --review (alongside --bfs and --cwd)
- Phase 0 args parsing recognizes --review → $REVIEW_FLAG
- Phase 2 chain loop propagates $REVIEW_FLAG to each chained sub-skill
  call: 'Skill(skill=idd-all, args=#$CURRENT --in-chain --cwd $CWD${REVIEW_FLAG:+ $REVIEW_FLAG})'
  — ${REVIEW_FLAG:+ ...} guarantees a leading space ONLY when set, avoiding
  stray-space fragility when flag unset.
- Step 0 bootstrap adds parse_review_flag task entry
- Cluster PR body Pending Review checklist dispatches on $REVIEW_FLAG:
  - default: '- [x] **Verify-gated**: per-issue verify PASS — cluster ready to merge'
  - --review: '- [ ] **Pending: human acceptance review of cluster PR** (per --review flag)'

Per MANIFESTO 'Human-in-the-loop: IDD 即 NSQL Confirmation Protocol' doctrine
(commit 4aef425). --review is messaging + propagation only — chain shell still
stops at verified, never auto-closes, never auto-merges.

idd-implement Step 5.5 / idd-all Phase 5 / pr-flow.md canonical PR-body
templates intentionally untouched in this PR — tracked in #108 as the
4-template consistency family sync follow-up (per Step 2.5 tangential).
CHANGELOG v2.65.0 entry covers the #102 NSQL doctrine + --review flag:
- MANIFESTO Human-in-the-loop section (commit 4aef425)
- idd-all --review flag + Phase 6 verify-gated wording (commit 1fdcba0)
- idd-all-chain --review propagation + Phase 4 / PR body conditional (commit 11a99e0)

Notes section explicitly records that idd-implement Step 5.5 / idd-all
Phase 5 / pr-flow.md canonical PR-body checklist wording is intentionally
left at old wording for this release — 4-template consistency family
follow-up tracked in #108.

plugin.json version 2.64.0 → 2.65.0 (minor: terminal messaging behavior
change + additive --review flag; MANIFESTO doctrine is documentation).
plugin.json description field intentionally not rewritten — that is
plugin-update Phase 2 marketplace.json sync's job at /idd-close Step 6.5,
not in this PR (same pattern as #103's 2f63feb).
…tion + CHANGELOG chain-flow.md note (Refs #102)

In-PR polish from /idd-verify --pr 109 (6-AI ensemble: 3/6 reviewers
caught F1, 3/6 missed — regression / DA / Codex convergent).

F1 (HIGH blocking, caught by regression + DA + Codex):
- idd-all-chain/SKILL.md PR-body checklist conditional used
  ${REVIEW_FLAG:+A}${REVIEW_FLAG:-B} thinking it was a mutex. It is NOT —
  ${var:-word} returns $var when set, not the alternative branch. Under
  --review, the line rendered with literal '--review' appended at the end.
  Reproduced empirically: REVIEW_FLAG=--review yielded
  '...after merge--review' instead of '...after merge'.
- Rewrote using explicit 'if [ -n "$REVIEW_FLAG" ]; then ...; else
  ...; fi' BEFORE the heredoc, composing the line into REVIEW_CHECKLIST_LINE
  which the heredoc then interpolates. Empirically validated both branches
  render correctly.
- Also renamed the '## Pending review' heading to '## Review status' since
  it is no longer always 'pending' under the default (verify-gated) path.

DA1 (LOW, Devil's Advocate):
- MANIFESTO doctrine clarified to extend the 'auto-merge is not the
  idd-all default' constraint to /loop / ralph-loop / external CI callers
  too. Phase 6 'verify-gated PASS, ready to merge' is a state declaration,
  NOT an authorization for any caller to script 'gh pr merge'. Autopilot
  remains #37's territory.

F2 (LOW, regression):
- CHANGELOG Notes section extended to mention 'references/chain-flow.md'
  as the 4th file in the consistency family — Step 5.7 sister sweep
  caught it during /idd-implement #102 and updated #108 to 5-template
  scope. Notes section now matches that scope.

Deferred (recorded in verify findings, not blocking):
- F3 (regression LOW): Phase 4 final report stdout text not --review-aware
- DA2 (DA INFO): logic reviewer's bash analysis error — process observation
- DA3 (DA INFO): 'messaging-only' could be qualified to 'orchestrator-scope'
@kiki830621
Copy link
Copy Markdown
Contributor Author

Verify Report — PR #109

Engine

5 general-purpose Agents (Claude reviewers, file-based output) + Codex (gpt-5.5 xhigh, run_in_background) → 6-AI ensemble

Aggregate

CONDITIONAL PASS → fixed in-PR → effectively PASS — 1 HIGH blocking (F1) caught by 3 of 6 reviewers (regression / DA / Codex convergent), fixed in commit dc61ffb (also picked up DA1 + F2 polish in same commit). Mergeable as-is.

Scope coverage

PR Refs #102. Other #N mentions in PR body (#103 / #108 / #37) are prose context. Verified scope: #102 only.

Pre-merge gates

  • Step 0.7 PR↔issue correspondence ✓ (Refs #102 at line 1 of PR body)
  • Step 0.8 auto-close trap ✓ both sources clean (closingIssuesReferences empty + per-commit headline/body regex 0 hits across 4aef425 / 1fdcba0 / 11a99e0 / 8a41ee7 / dc61ffb)

#102 — IDD acceptance-review doctrine + --review flag

Requirements coverage: 3/3 deliverables FULLY addressed post-fix (1 PARTIALLY → FULLY after dc61ffb).

Deliverable Status Evidence
1. MANIFESTO new section (Human-in-the-loop / NSQL Confirmation Protocol) FULLY MANIFESTO.md:95-135 — mapping table + verify-gated terminal default + verify-as-review reframe + --review semantics + auto-merge clarification + (post-fix) /loop external-caller clarification. 4aef425 + dc61ffb
2. idd-all + idd-all-chain Phase 6/4 messaging swap FULLY (post-fix) idd-all/SKILL.md:778-843 (Phase 6 PR+direct-commit reports dispatch on $REVIEW_FLAG). idd-all-chain/SKILL.md:561-578 (PR-body checklist conditional, now mutex-correct after dc61ffb). Section heading renamed ## Pending review## Review status. 1fdcba0 + 11a99e0 + dc61ffb
3. --review per-invocation flag FULLY Both skills' Phase 0 args parsers + argument-hints + Step 0 bootstrap parse_review_flag tasks. idd-all-chain Phase 2 chain loop propagates via ${REVIEW_FLAG:+ $REVIEW_FLAG} to sub-/idd-all --in-chain. Orthogonal (no mutex) with --pr/--no-pr/--in-chain/--bfs/--cwd. 1fdcba0 + 11a99e0

Scope compliance ✓: out-of-scope items intact —

Findings

# Severity Action Source Description Status
F1 HIGH In-PR fix agents:regression+devils-advocate+codex (3 sources convergent) idd-all-chain/SKILL.md:566 used ${REVIEW_FLAG:+A}${REVIEW_FLAG:-B} thinking it was a mutex. ${var:-word} returns $var when set, NOT the alternative branch. Under --review, line rendered with stray --review literal appended (empirically reproduced: ... after merge--review). FIXED in dc61ffb — explicit if/else builds $REVIEW_CHECKLIST_LINE before the heredoc; empirically validated both branches mutually exclusive
DA1 LOW In-PR fix agents:devils-advocate MANIFESTO doctrine could be misread by /loop / ralph-loop / external CI callers as authorizing gh pr merge after Phase 6 verify-gated PASS report. Doctrine sanctions verify-gated as terminal state, NOT as auto-merge authorization. FIXED in dc61ffb — MANIFESTO doctrine extended with explicit /loop / external-CI clarification: state declaration ≠ merge authorization; autopilot remains #37 territory
F2 LOW In-PR fix agents:regression CHANGELOG Notes section listed 4-template scope for #108, but Step 5.7 sister sweep already extended it to 5-template (added chain-flow.md:254). Notes drifted. FIXED in dc61ffb — CHANGELOG Notes section now lists chain-flow.md as the 4th file + adds parenthetical explaining the 4→5 scope expansion
F3 LOW Follow-up agents:regression idd-all-chain Phase 4 final stdout text (forest tree + STOP message) is not --review-aware. PR-body checklist is (post-fix). Defer — terminal output cosmetic, PR-body is the audit artifact
DA2 INFO Process note agents:devils-advocate Logic reviewer's claim "bash-correct mutex" was an analysis error (didn't empirically test the ${VAR:-word} semantics). Process improvement: complex bash idioms should be empirically reproduced, not eyeballed. Defer — process observation only, no code change
DA3 INFO Follow-up agents:devils-advocate --review "messaging-only" claim is true at the orchestrator scope but downstream (humans, CI parsers) can react to the text differently. Refining language to "orchestrator-scope messaging-only" might be tighter. Defer — wording refinement, not regression-blocking
LOW (cosmetic) LOW Defer agents:requirements 3 cosmetic observations: stale Trace 1 example block in idd-all (lines 893-901) still shows old Verify: PASS wording; argument-hint length growing; future-robustness note on single-source REVIEW_FLAG parsing. Defer — examples/hints stale but not user-visible-broken; can address in a doc polish pass alongside #108

Process Gaps

(none — all 5 Claude reviewer Agents + Codex produced findings files first attempt, no Step 2.5 Recovery Protocol fires)

Self-referentiality check (DA adversarial angle 1)

Doctrine self-referentiality (PR adds doctrine that verify-gated PASS is sufficient terminal, ships via 6-AI verify whose PASS would itself be the sanctioned disposition) — DA dismissed as blocker because:

  1. The discuss conclusion (commit 4aef425 references it) is external grounding — the doctrine isn't bootstrapped from this PR's verify alone.
  2. NSQL (kiki830621/NSQL) is an independent external reference protocol that IDD conforms to, not invents.
  3. The 6-AI ensemble's adversarial design (Devil's Advocate + Codex cross-model) is itself the falsifiable check the doctrine names — convergent F1 catch demonstrates the design works.

Next

Reports referenced

  • Requirements: PASS (3 cosmetic obs)
  • Logic: PASS (analysis error on F1, but 6 other checks correct)
  • Security: PASS (0 findings)
  • Regression: CONDITIONAL → PASS (F1 fixed)
  • Devil's Advocate: CONDITIONAL → PASS (F1 + DA1 fixed)
  • Codex: PARTIALLY → PASS (F1 fixed; concurred on chain-flow drift now in Sync 4-template PR-body checklist wording to match #102 NSQL doctrine #108)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant