feat: idd-verify orchestration playbook (5 Agents + Recovery Protocol)#73
Conversation
…otocol (Refs #52, resolves #70 structurally) Per #52 Plan tier (#52): - Step 0 TaskList bootstrap: rename launch_parallel_reviewers description to reflect new 6 tool call pattern (5 Agent + 1 Bash codex, no TeamCreate). Add NEW recovery_protocol task entry. - Step 2 engine note preamble: explicit warning subagent_type=Explore 不適合 verify (read-only, 無 Write tool — #47 incident proof). Document why TeamCreate (pre-v2.59.0 model) is rejected: Write-missing teammate tools + wait_for_idle context-loss + #70 cleanup gap. - Step 2 spawn: replace TeamCreate teammates with 5 parallel Agent(subagent_type=general-purpose) calls. Each prompt now mandatorily contains (a) explicit findings file output path, (b) 'DO NOT idle without producing output' rule, (c) retry-context-re-paste hint. - Step 2 Devil's Advocate sequencing: bash polling loop on sibling findings files (max 30 iter × 5s = 2.5min timeout) replaces TeamCreate wait_for_idle primitive. Timeout fallback writes 'skipped: timeout' findings. - NEW Step 2.5 Recovery Protocol section between Step 2 spawn and Step 3 merge: per-role file existence check + SendMessage/spawn-fresh retry with FULL context re-paste (never assume context survived idle/wake) + second-idle coordinator self-review fallback + explicit 'Process Gaps' section in master report (no silent engine degradation). - Step 3 prose updates: 'Agent Team and Codex' → '5 reviewer Agents and Codex'; source tag '[team:logic+codex]' → '[agents:logic+codex]'. - Step 4 master report templates (both local/branch and PR mode): sweep 'Agent Team (5 Claude reviewers)' → '5 general-purpose Agents (Claude reviewers, file-based output)'. - Frontmatter: remove TeamCreate from allowed-tools (no longer used). - 驗證架構 ASCII tree: update 'Agent Team' label. - Engine: team alias: keep CLI alias for backward compat, document underlying as 5 standalone Agent calls. Side effect: #70 (TeamDelete cleanup gap on idle teammates) is structurally dissolved by this Plan — no team to delete since reviewer Agents are standalone calls that return to coordinator after completion. #70 will be closed in a separate /idd-close cycle citing this Plan. Refs #52 Refs #70
…stence, team→agents sweep (Refs #52) Per /idd-verify --pr 73 round 1 codex findings: - P1.1 — Devil's Advocate timeout was silently treated as valid review: DA polling loop wrote non-empty timeout content + exit 0; Step 2.5a only checked file non-empty (-s test). Fix: DA writes sentinel header '[STAGE 2.5 RECOVERY: DEVILS_ADVOCATE_TIMEOUT_<n>/4]' on timeout; Step 2.5a detects sentinel via head -1 + grep, routes to retry/fallback same as missing file case. - P1.2 — Recovery Protocol Step 2.5b retry used '/tmp/verify_<N>_prompt_<role>.md' but Step 2 never instructed coordinator to save those prompts. Added explicit pre-spawn prompt persistence note in Step 2a with cat > tmpfile <<EOF pattern. Coordinator now saves all 5 prompts before invoking 5 parallel Agent calls. - P1.3 — Stale 'team:' source tags in 3 master report templates (line ~628-630, 658, 668, 688-690) contradicted Step 3 prose update. Swept all 'team:logic+codex' / 'team:security' / 'team:regression' / 'team:devils-advocate' → 'agents:*'. Also updated architecture ASCII tree ('看不到 team 的討論' → '看不到其他 reviewer Agents 的 findings 檔') and 鐵律 rule ('看不到 team 的討論' → '看不到 5 reviewer Agents 的 findings 檔'). Refs #52
…ting (Refs #52) Per /idd-verify --pr 73 round 2 codex findings: - P1.1 — sentinel file persists past Step 2.5a, downstream retry (-s) and fallback (! -s) pass without action: sentinel content IS non-empty, so retry polling would see it and exit; fallback ! -s test fails. Fix: after detecting sentinel in 2.5a, rm -f the file. Now retry/fallback correctly see role as missing and proceed. - P1.2 — Devil's Advocate timeout writer had bash quoting syntax error: 'Devil\'s' inside single-quoted printf format cannot escape apostrophe (bash single quotes don't honor backslash). Rewrote to printf '%s\n\n%s\n' with double-quoted format-string args; apostrophe lives inside double quotes where no escaping is needed. Empirical bash smoke: PASS (sentinel writes + grep detects). Refs #52
Verify Report — PR #73EngineCodex-only degraded mode (Anthropic API rate-limit blocked Claude reviewer ensemble throughout this session). Engine: Codex CLI (
AggregatePASS — 0 blocking, 0 follow-up after 3 codex verify rounds. Scope coverage
Verify history
#52 — idd-verify orchestration playbookRequirements coverage: 6/6 addressed. Plan tier decisions (D1-D5) all implemented:
Cumulative changes:
#70 — TeamDelete cleanup gap (structurally resolved)Resolution mechanism: PR #73 removes TeamCreate from idd-verify Step 2. No team is created → no TeamDelete failure surface exists. The bug surface that #70 reports is now structurally impossible in idd-verify (the originating skill where #70 surfaced this session via verify-pr58 attempt). Recommendation: Recommendation✅ Ready to merge + invoke Process Gap CaveatThis PR's verify was conducted under codex-only degraded mode. The NEW 5-Agent + Codex orchestration the PR introduces has NOT been exercised in production yet (since the verify itself didn't use the new mechanism due to Anthropic API limit). First-real-use validation will occur on the NEXT Should the next real-use surface new bugs, they will be filed as standard Cumulative findings: 5 P1 caught + fixed across 3 codex rounds (zero P0 throughout, on a structural change of this magnitude). |
* fix: drop literal 'Closes #N' from pr-flow.md canonical PR-body template (Refs #87 #74) The canonical PR-body template embedded `Closes #${N}` in the anti-trailer warning. Skills (idd-implement, idd-all) copied this pattern; on heredoc substitution it becomes a real `Closes #<num>` that GitHub auto-close matches context-blind, bypassing /idd-close. Reword to a digit-free form so no `<keyword> #<number>` substring can exist. Refs #87 #74 * fix: drop literal 'Closes #N' from idd-implement/idd-all PR-body templates (Refs #74 #87) idd-implement Step 5.5 and idd-all Phase 5 PR-body heredocs substituted ${NUMBER}/${N} into the anti-trailer warning, producing a real `Closes #<num>` that GitHub auto-close matched on merge — bypassing /idd-close's gate. Confirmed incidents: #559, che-apple-mail-mcp#99, #73, #56. Reword all to the digit-free form. idd-all-chain already used literal '#N' (safe) but is aligned to identical wording so the safe pattern is uniform and cannot be re-parameterized. Refs #74 #87 * feat: add idd-verify Step 0.8 PR-body auto-close-trailer scan gate (Refs #74 #87) Defence-in-depth gate complementing the #87/#74 template rewords. In PR mode, scans the rendered PR body for GitHub auto-close trailers (closes|fixes|resolves #<digit>) and warns — these would auto-close the issue at merge, bypassing /idd-close's checklist gate. Warn-only (a PR body may legitimately quote the keywords in prose). Adds matching scan_pr_body_trailers entry to the Step 0 bootstrap TaskList. Refs #74 #87 * fix: idd-verify Step 0.8 regex — exclude /idd-close skill-invocation false positive (Refs #74 #87) The initial Step 0.8 regex used \b word boundaries, which matched the 'close' inside '/idd-close #N' — an IDD skill invocation instruction, not a GitHub auto-close trailer. GitHub treats 'idd-close' as one hyphenated token and does NOT auto-close it (verified: PR #94 closingIssuesReferences is empty despite its body containing '/idd-close #87 #74'). Replace \b prefix with (^|[^-/[:alnum:]]) so the keyword must not be preceded by '-', '/', or an alphanumeric — precisely matching GitHub's behavior while still catching context-blind hits like (Closes #N) / **Closes #N**. Found by dogfooding the new gate against this PR's own body. Refs #74 #87 * fix: idd-verify Step 0.8 — use closingIssuesReferences instead of hand-rolled regex (Refs #74 #87) R1 verify (/idd-verify --pr 94) found the Step 0.8 regex missed the colon form 'Closes: #87' — a GitHub-documented auto-close form. Reimplementing GitHub's keyword parser by regex is inherently fragile (colon form, cross-repo form, issue-URL form all need separate handling, and cross-repo behavior is itself disputed). Replace the regex with GitHub's own authoritative parse: gh pr view --json closingIssuesReferences lists exactly which issues the PR will auto-close on merge — covering every trailer form GitHub honors, with zero false positive (it IS GitHub's determination) and zero parser-reimplementation risk. This supersedes the regex from 3ba7435 + the regex false-positive fix from 5831128; /idd-close #N skill invocations naturally never appear in closingIssuesReferences so no special exclusion is needed. Refs #74 #87 * fix: idd-verify Step 0.8 — drop orphan regex doc, surface gh-failure, use .url (Refs #74 #87) R2 verify (/idd-verify --pr 94 round 2) found 3 cleanup items, all in Step 0.8: - DEF-1 (6-AI unanimous, blocking): the R2 regex->closingIssuesReferences rewrite deleted the regex code but left the 'Regex 設計' paragraph documenting it — orphan dead doc contradicting the new prose. Deleted. - L-1: '2>/dev/null || true' conflated a gh failure with a clean PR (silent fail-open). Switched to 'if CMD; then ... else ...' so a gh failure prints a 'Step 0.8 skipped' note instead of silently passing. - L-2: softened the 'zero false negative by definition' overclaim — note that closingIssuesReferences is eventually-consistent (settled by verify time). - L-3 (Codex polish): query .url instead of bare .number so a cross-repo close ref is unambiguous in the warning output. Refs #74 #87
Refs #52
Refs #70
Summary
/idd-verifyStep 2 spawn 機制重構:從TeamCreate(5 teammates withRead/Grep/Glob/Bashtools, no Write)改為 5 個平行Agent(subagent_type=general-purpose)calls + 1Bash codex execbackground。NEW Step 2.5 Recovery Protocol section 處理 file-existence check + retry context re-paste + coordinator self-review fallback。Why
/idd-verify #47跑時 spawn 5 個 Claude reviewer agents 卻無一產生 findings — verify 從預期的 6-AI ensemble 退化成 1-AI(只剩 Codex)。Root cause 三層(per #52 issue body):subagent_type=Explore沒 Write tool → 無法寫 findings 檔Plus 本 session 多次 verify cycle 撞到 #70(TeamDelete fail on idle teammates,team 殘留)。
Side effect — #70 structurally dissolves
Switching from TeamCreate to standalone Agent calls structurally removes the TeamDelete failure surface — no team exists to cleanup. #70 will be closed in a separate
/idd-closecycle citing this Plan.Changes (1 commit)
63d2474feat(idd-verify): Step 2 spawn restructure + NEW Step 2.5 Recovery Protocol (Refs feature: idd-verify orchestration playbook (subagent_type guidance + recovery protocol) #52, resolves bug: /idd-verify 結束沒確認 teams 被關掉 — TeamDelete fails on idle-but-active reviewers + no cleanup SOP #70 structurally)Files Changed
plugins/issue-driven-dev/skills/idd-verify/SKILL.md(+259/-104) — Step 0 bootstrap rename + Step 2 engine note + Step 2 spawn restructure + NEW Step 2.5 Recovery Protocol + Step 3 prose update + Step 4 templates sweep + frontmatter allowed-tools cleanupPlan tier decisions (D1-D5 from approved plan)
Agent(subagent_type=general-purpose)callstoolsfield lacks Write — same failure mode as Explore. Switching to general-purpose Agent gives full tool set + dissolves #70references/verify-orchestration.md)wait_for_idleprimitive; polling-on-file gives deterministic semanticVerification — first-real-use validation track
No automated test harness for orchestration — orchestration validates only via first real
/idd-verifyinvocation after merge. Manual smoke matrix from Plan:/idd-verify #Xfirst invocation post-mergeChecklist
/idd-verify --pr <N>)/idd-close #52after merge/idd-close #70separately after this merges (cites feature: idd-verify orchestration playbook (subagent_type guidance + recovery protocol) #52 as structural resolution)🤖 Generated by
/idd-implementon PR path (Plan tier approved). Do NOT addCloses #52— IDD discipline requires manual/idd-closeafter merge.