2 changes: 1 addition & 1 deletion plugins/issue-driven-dev/.claude-plugin/plugin.json

Large diffs are not rendered by default.

12 changes: 12 additions & 0 deletions plugins/issue-driven-dev/CHANGELOG.md
@@ -7,6 +7,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

## [2.65.0] - 2026-05-20

### Added

- **`MANIFESTO.md`: new section "Human-in-the-loop: IDD 即 NSQL Confirmation Protocol"** ([#102](https://github.com/PsychQuant/issue-driven-development/issues/102)): formalizes the doctrine that IDD's human-in-the-loop **is** an instance of the NSQL Confirmation Protocol ([kiki830621/NSQL](https://github.com/kiki830621/NSQL) v4.1.0, already registered as a reference project in CLAUDE.md via #103's `fd2f21c`). Doctrine elements:
  1. NSQL confirmation loop ⇆ IDD pipeline mapping table: the human's confirmation loop closes **before** execution (at `issue` + `idd-diagnose`); `idd-verify` is an execution-fidelity check, not a confirmation loop.
  2. **`verify-gated` is the named, sanctioned terminal default disposition**: one clean 6/6 verify PASS is sufficient to merge; the issue was the acceptance contract, and verify confirmed delivery.
  3. Verify-as-review reframe: 5 specialized adversarial agents plus an independent model (Codex) exceed a single human merge reviewer's thoroughness on correctness; reading "AI verify PASS = no review" gets it backwards.
  4. **`--review` flag: opt-in to re-open the confirmation loop**, NOT a quality gate. It is a per-invocation flag, NOT a standing config field (exceptions don't warrant standing policy).
  5. Auto-merge legitimacy under verify-gated PASS, justified by "verify is the gate" (not "merges are reversible"); guardrails are mandatory; `auto-merge ≠ auto-close`; autopilot mechanics belong to [#37](https://github.com/PsychQuant/issue-driven-development/issues/37). `idd-all` default behavior is unchanged (the iron rule `永遠不 auto-merge PR`, "never auto-merge a PR", stays).

- **`--review` flag on `idd-all` + `idd-all-chain`** ([#102](https://github.com/PsychQuant/issue-driven-development/issues/102)): per-invocation messaging-only flag implementing the MANIFESTO doctrine above. Default Phase 6 report on `idd-all`: `Verify: verify-gated PASS` + `Next: merge <PR>, then /idd-close #N` (drops the legacy `Pending: human review` framing that implied a default second gate). With `--review`: `Verify: verify-gated PASS — awaiting human acceptance (re-opened confirmation loop per --review)` + `Next: review PR, merge after acceptance, then /idd-close #N`. `idd-all-chain` mirrors the same pattern: Phase 0 args parsing recognizes `--review`, Phase 2 chain loop propagates the flag to each chained `/idd-all #M --in-chain` invocation (so per-issue Phase 6 reports also reflect), Phase 4 cluster PR body checklist dispatches conditionally — default `- [x] Verify-gated: per-issue verify PASS — cluster ready to merge`, `--review` → `- [ ] Pending: human acceptance review of cluster PR (per --review flag)`. Flag is orthogonal to `--pr`/`--no-pr`/`--in-chain`/`--bfs`/`--cwd` (no mutex). Effect is messaging-only — does NOT make the orchestrator wait, does NOT change `idd-implement`/`idd-verify`/`idd-close` internals.

### Notes

- Discuss-conclusion-aligned scope: `idd-implement` Step 5.5 + `idd-all` Phase 5 + `references/pr-flow.md` + `references/chain-flow.md` PR-body checklist wording **intentionally left at old wording** in this release. Sister consistency follow-up tracked as [#108](https://github.com/PsychQuant/issue-driven-development/issues/108) — "Sync PR-body checklist wording to match #102 NSQL doctrine" — to land in a separate PR. (Originally 4 templates; surfaced as 5-template family during /idd-implement #102 Step 5.7 sister sweep — `chain-flow.md:254` is the canonical chain-shell contract doc that mirrors the same `Pending: human review of cluster PR` wording the orchestrator skills used to emit.)

## [2.64.0] - 2026-05-20

### Changed
44 changes: 44 additions & 0 deletions plugins/issue-driven-dev/MANIFESTO.md
@@ -92,6 +92,50 @@ falsifiability(IDD) = falsifiability(TDD) ← idd-implement Step 3 RED

---

## Human-in-the-loop: IDD 即 NSQL Confirmation Protocol

IDD's human-in-the-loop is not a set of ad-hoc confirmations scattered across the pipeline; it is an instance of [NSQL](https://github.com/kiki830621/NSQL) (Human-AI Confirmation Protocol). NSQL's core contract: **AI detects ambiguity → renders its structured understanding to the user → user confirms or corrects intent → execute**. In one sentence: **clarify before execute, never guess**.

| NSQL confirmation loop | IDD pipeline |
|---|---|
| AI parses & identifies ambiguity | `idd-diagnose` + Layer V / Plan / Spectra (ambiguity detector) |
| AI shows structured understanding | Diagnosis comment / `EnterPlanMode` plan |
| Human confirms: "correct" / "wrong, change it to…" | issue authored / clarified / plan approved |
| Execute | `idd-implement` |
| (loop closed) | `idd-verify` = **execution-fidelity check**, not a confirmation loop |

The key asymmetry: **the human's confirmation loop closes before execution**. `idd-verify` checks whether the execution result is faithful to the already-confirmed intent; it does not open another confirmation loop. After execution there is nothing left for a human to confirm: the issue (confirmed intent) plus verify (fidelity check) is enough.

### `verify-gated` is the default terminal disposition

One clean 6/6 verify PASS is sufficient to merge: the issue is the acceptance contract, and verify confirms delivery. The `verify-gated PASS, ready to merge` line in the `idd-all` Phase 6 and `idd-all-chain` Phase 4 reports is the concrete expression of this doctrine.

Treating a pass from `idd-verify`'s 6-AI ensemble as "no review" reads it backwards: **an AI verify PASS is itself a review**. Five specialized adversarial agents (requirements / logic / security / regression / devil's advocate) plus one independent model (Codex), cross-lens and falsifiable, **exceed** a single human merge reviewer's thoroughness on the correctness dimension. Verify is the falsifiable part of acceptance.

### `--review` flag: opt-in to re-open the confirmation loop

In the rare case where a user comes to doubt their own issue ("I may have written the issue wrong"), the `--review` flag re-opens the confirmation loop. The Phase 6 report switches from `verify-gated PASS, ready to merge` to `verify-gated PASS — awaiting human acceptance (re-opened confirmation loop per --review)`.

`--review` is **not a quality gate**; it is an opt-in re-opening of the confirmation loop. Verify's falsifiability is unaffected; the only change is the user's explicit statement that they want to go over the result once more themselves. `--review` is a per-invocation flag, **not** a standing config field: an exception should not be promoted to standing policy.
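As a rough illustration of the messaging-only switch, the sketch below renders the two Phase 6 report lines from the flag value. The report strings are the ones quoted in this PR's CHANGELOG entry; the function name `phase6_report`, its argument order, and the `PR-123` value are hypothetical and not taken from the skill source.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and signature are assumptions, not from the PR).
#   $1 = review flag: "" (default) or "--review"
#   $2 = PR reference to show in the "Next:" line
#   $3 = issue number
phase6_report() {
  local review_flag="$1" pr="$2" issue="$3"
  if [ -n "$review_flag" ]; then
    # --review: same verify result, but the user has asked to re-open the loop
    printf '%s\n' "Verify: verify-gated PASS — awaiting human acceptance (re-opened confirmation loop per --review)"
    printf '%s\n' "Next: review PR, merge after acceptance, then /idd-close #${issue}"
  else
    # default: verify-gated is the terminal disposition
    printf '%s\n' "Verify: verify-gated PASS"
    printf '%s\n' "Next: merge ${pr}, then /idd-close #${issue}"
  fi
}

# Illustrative calls with made-up values
phase6_report "" "PR-123" 102
phase6_report "--review" "PR-123" 102
```

Both branches report the same verify outcome; only the wording of the hand-off changes, which matches the statement that the flag does not make the orchestrator wait.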

### Legitimacy and limits of auto-merge

Under the `verify-gated PASS` doctrine, auto-merging on top of a clean 6/6 verify PASS **is legitimate**. The justification is "verify is the gate" (the independent ensemble has already done the catching), **not** "merges are reversible" (a revert is asymmetric: one reviewed change, plus two history scars, plus a distribution that has already shipped).

Guardrails are mandatory, never skipped:

- Clean 6/6 PASS only (never CONDITIONAL / FAIL)
- Step 0.8 auto-close-trap scan must be clean
- Squash commit message must be clean

**auto-merge ≠ auto-close**: `idd-close` must still run (the closing summary is an audit artifact and cannot be skipped).
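The three guardrails can be read as a conjunction: any single failure blocks auto-merge. A minimal sketch under stated assumptions follows. The function `auto_merge_allowed` and its four inputs are hypothetical; how the verify verdict, the Step 0.8 scan result, and the squash-message check are actually produced is not shown in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical guardrail check; inputs are assumed to be computed by the caller.
#   $1 = verify verdict: PASS | CONDITIONAL | FAIL
#   $2 = number of verify lenses that passed (a clean run is 6)
#   $3 = Step 0.8 auto-close-trap scan result: clean | dirty
#   $4 = squash commit message check result: clean | dirty
auto_merge_allowed() {
  local verdict="$1" passed="$2" trap_scan="$3" squash_msg="$4"
  [ "$verdict" = "PASS" ]      || return 1   # never CONDITIONAL / FAIL
  [ "$passed" -eq 6 ]          || return 1   # clean 6/6 only
  [ "$trap_scan" = "clean" ]   || return 1   # auto-close-trap scan must be clean
  [ "$squash_msg" = "clean" ]  || return 1   # squash commit message must be clean
  return 0
}

if auto_merge_allowed "PASS" 6 "clean" "clean"; then
  echo "eligible for auto-merge; idd-close must still run afterwards"
else
  echo "not eligible: stop at the Phase 6 report"
fi
```

The check deliberately says nothing about closing the issue, since eligibility to merge does not replace the `idd-close` step.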

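By default `idd-all` / `idd-all-chain` do **not** auto-merge (the iron rule `永遠不 auto-merge PR`, "never auto-merge a PR", stays). Auto-merge is a mechanic of the **#37** bulk-solve autopilot; #37 handles the details of the auto-merge → auto-close sequence, which is **not** in #102's scope. This section only promotes `verify-gated` to a named, sanctioned terminal disposition, so that #37 has a doctrine to rely on when it later takes over autopilot.

The same constraint applies to automated callers such as `/loop`, `ralph-loop`, and external CI: treating the Phase 6 "verify-gated PASS, ready to merge" line as a trigger to run `gh pr merge` is a **misreading**. The doctrine only sanctions that disposition as a legitimate terminal state; it does **not** authorize any caller to merge automatically. The auto-merge path is handled solely by **#37**, using a guardrail-aware mechanic; until #37 ships, automated callers must still stop at the Phase 6 report and let the user / `idd-close` take over the actual merge.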
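For an automated caller, the rule above amounts to never mapping the Phase 6 report to a merge command. The sketch below is one hypothetical way to encode that: `caller_next_action` and its `handoff` / `surface` return tokens are invented for illustration, and no branch of it invokes `gh pr merge`.

```shell
#!/usr/bin/env bash
# Hypothetical decision helper for an automation wrapper (not part of the plugin).
#   $1 = the Phase 6 report text
# Prints the caller's next action; neither outcome is "merge".
caller_next_action() {
  local report="$1"
  case "$report" in
    *"verify-gated PASS"*) echo "handoff" ;;  # stop; the user merges, then runs /idd-close
    *)                     echo "surface" ;;  # stop; show the report, nothing to merge
  esac
}

action="$(caller_next_action "verify-gated PASS, ready to merge")"
echo "next action: $action"
```

Keeping the set of possible actions free of any merge outcome makes the misreading structurally impossible in the wrapper, rather than relying on each caller to interpret the report wording correctly.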

---

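## TDD/SDD are special cases of IDD

The industry usually treats TDD, SDD, and issue tracking as three independent methodologies, leaving each team to decide which to use and how to combine them. IDD's core claim is that **they are not parallel choices but stand in a containment relationship.**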
37 changes: 31 additions & 6 deletions plugins/issue-driven-dev/skills/idd-all-chain/SKILL.md
@@ -5,7 +5,7 @@ description: |
Recursive shell over /idd-all — sub-skill spawns (sister bug / follow-up finding / tangential / sister concern) detected via spawn manifest, chain-eligible enqueued automatically.
Use when: root issue likely ripples (refactor with sister bugs / spec change with cross-spec impact / multi-layer feature) and you want single PR review.
Stops at verified — never auto-close, /idd-close per issue still required.
argument-hint: "[#NNN] [--cwd /path/to/clone] e.g. '#28', '#28 --cwd /path/to/repo'"
argument-hint: "[#NNN ...] [--bfs] [--review] [--cwd /path/to/clone] e.g. '#28', '#A #B #C --bfs', '#28 --review' (--review opt-in re-opens NSQL confirmation loop at terminal report)"
allowed-tools:
- Bash(gh:*)
- Bash(git:*)
@@ -76,7 +76,8 @@ Inherits `/idd-all` config protocol (walked-up `.claude/issue-driven-dev.local.j
**Before doing anything**, use `TaskCreate` to build a stage-level todo list:

```
TaskCreate(name="preflight", description="Phase 0: parse args (≥1 root + optional --bfs), gh auth, confirm every root issue is OPEN")
TaskCreate(name="preflight", description="Phase 0: parse args (≥1 root + optional --bfs/--review), gh auth, confirm every root issue is OPEN")
TaskCreate(name="parse_review_flag", description="Phase 0: parse --review flag → $REVIEW_FLAG (Phase 2 chain loop passes it to sub-/idd-all --in-chain; Phase 4 final report wording switches between verify-gated default and awaiting human acceptance; per #102 NSQL doctrine)")
TaskCreate(name="check_diagnosis_readiness", description="Phase 0.4 (v2.55+ #47, helper extracted v2.57+ #51, multi-root v2.60+ #46): invoke scripts/check-diagnosis-readiness.sh <github-repo> <root1> [<root2> ...] → JSON {ready/not_ready}; not_ready=0 → silent pass; not_ready>0 → AskUserQuestion 3-option (run /idd-diagnose first / proceed anyway / cancel). Placed before cluster branch / manifest creation so cancel has zero side effect.")
TaskCreate(name="setup_cluster_branch", description="Phase 0.5: create cluster branch: N=1 uses idd/chain-<N>-<slug>, N>1 uses idd/chain-multi-<hash8>-<root1-slug> from default branch + initialize spawn manifest schema v2 (root_issues + traversal)")
TaskCreate(name="init_queue", description="Phase 1: QUEUE seeded with all roots (sorted asc), per-root DEPTH_MAP[$root]=0, ROOT_ID_MAP, FAIL_ROOTS set, CHAIN_MAX_DEPTH=3 + CHAIN_MAX_ISSUES=10")
@@ -102,18 +103,26 @@ Same as `/idd-all`, plus:
declare -a ROOT_ISSUES=()
TRAVERSAL="dfs" # default
CWD_FLAG=""
REVIEW_FLAG="" # "" | "--review" — set by --review flag (v2.65+ #102)
for ((i=0; i<${#ARGS[@]}; i++)); do
arg="${ARGS[i]}"
case "$arg" in
\#[0-9]*)
ROOT_ISSUES+=("${arg#\#}") ;;
--bfs)
TRAVERSAL="bfs" ;;
--review)
# v2.65+ #102 — opt-in re-open NSQL confirmation loop.
# Propagated to each chained /idd-all #M --in-chain in Phase 2 so per-issue
# Phase 6 reports also reflect; Phase 4 chain final report also dispatches.
# Messaging-only effect — does NOT make chain wait. Per MANIFESTO
# "Human-in-the-loop: IDD 即 NSQL Confirmation Protocol" doctrine.
REVIEW_FLAG="--review" ;;
--cwd=*) CWD_FLAG="${arg#--cwd=}" ;;
--cwd) i=$((i+1)); CWD_FLAG="${ARGS[i]}" ;;
esac
done
[ ${#ROOT_ISSUES[@]} -eq 0 ] && abort "Usage: /idd-all-chain #NNN [#MMM ...] [--bfs] [--cwd /path]"
[ ${#ROOT_ISSUES[@]} -eq 0 ] && abort "Usage: /idd-all-chain #NNN [#MMM ...] [--bfs] [--review] [--cwd /path]"

# Sort roots ascending for deterministic hash + lowest-root-first slug selection
IFS=$'\n' ROOT_ISSUES_SORTED=($(sort -n <<<"${ROOT_ISSUES[*]}"))
@@ -382,8 +391,12 @@ while [ ${#QUEUE[@]} -gt 0 ]; do

# Invoke /idd-all in chain context. Export current root_id so sub-skills can
# propagate it to manifest-append.sh (per D1 schema v2 root_id field).
# Propagate $REVIEW_FLAG (v2.65+ #102) so each per-issue Phase 6 report also
# reflects the verify-gated vs awaiting-human-acceptance disposition.
# ${REVIEW_FLAG:+ $REVIEW_FLAG} appends with a leading space ONLY when set,
# avoiding a stray space when REVIEW_FLAG="" — otherwise args parse fragility.
export IDD_CHAIN_CURRENT_ROOT_ID="$CURRENT_ROOT"
Skill(skill="issue-driven-dev:idd-all", args="#$CURRENT --in-chain --cwd $CWD")
Skill(skill="issue-driven-dev:idd-all", args="#$CURRENT --in-chain --cwd $CWD${REVIEW_FLAG:+ $REVIEW_FLAG}")
unset IDD_CHAIN_CURRENT_ROOT_ID

# Determine /idd-all completion state — read latest verify comment phase
@@ -529,6 +542,18 @@ else
SUMMARY_LINE="Multi-root chain (N=${N_ROOTS} roots: ${ROOT_ISSUES_SORTED[*]}) solved as one cluster via \`/idd-all-chain\` (v2.60+, traversal=${TRAVERSAL}). Total ${#CHAINED_ORDER[@]} processed issues across all root subtrees."
fi

# Compose review-state checklist line with explicit if/else BEFORE heredoc
# interpolation (v2.65.1+ fix for the broken ${VAR:-word} mutex attempt that
# this file shipped with — that idiom returns $VAR when set, not the
# alternative branch, so the --review path leaked the literal `--review` at
# the end of the rendered line. Build the line in a single var, then
# interpolate, so the heredoc only sees the final string.)
if [ -n "$REVIEW_FLAG" ]; then
REVIEW_CHECKLIST_LINE="- [ ] **Pending: human acceptance review of cluster PR** (per --review flag) + /idd-close $REFS_LIST after merge"
else
REVIEW_CHECKLIST_LINE="- [x] **Verify-gated**: per-issue verify PASS — cluster ready to merge → /idd-close $REFS_LIST per issue after merge"
fi

PR_BODY=$(cat <<EOF
Refs $REFS_LIST

@@ -545,12 +570,12 @@ $OVERVIEW_ROWS
## Per-issue details
$DETAILS_BLOCKS

## Pending review
## Review status

- [x] Diagnose ✓ for all ${#CHAINED_ORDER[@]} issues
- [x] Implement ✓
- [x] Verify ✓ (per-issue 6-AI ensemble)
- [ ] **Pending: human review of cluster PR + /idd-close $REFS_LIST after merge**
$REVIEW_CHECKLIST_LINE

---
