feat: idd-verify orchestration playbook (5 Agents + Recovery Protocol) by kiki830621 · Pull Request #73 · PsychQuant/issue-driven-development

kiki830621 · 2026-05-11T14:01:58Z

Refs #52
Refs #70

Summary

Restructures the /idd-verify Step 2 spawn mechanism: from TeamCreate (5 teammates with Read/Grep/Glob/Bash tools, no Write) to 5 parallel Agent(subagent_type=general-purpose) calls + 1 background Bash codex exec. A NEW Step 2.5 Recovery Protocol section handles the file-existence check, retry with context re-paste, and the coordinator self-review fallback.

Why

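When /idd-verify ran on #47, it spawned 5 Claude reviewer agents but none of them produced findings, so verify degraded from the expected 6-AI ensemble to a single AI (Codex only). The root cause has three layers (per the #52 issue body):

subagent_type=Explore has no Write tool → the agent cannot write a findings file
After an idle/wake cycle the agent no longer holds its prompt context → a SendMessage retry stays silent unless the prompt is explicitly re-pasted
The prompt did not state the output mechanism (file vs SendMessage reply) → an idle agent does not reply on its own

In addition, several verify cycles in this session hit #70 (TeamDelete fails on idle teammates, leaving the team behind).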

Side effect — #70 structurally dissolves

Switching from TeamCreate to standalone Agent calls structurally removes the TeamDelete failure surface: no team exists to clean up. #70 will be closed in a separate /idd-close cycle citing this Plan.

Changes (1 commit)

63d2474 feat(idd-verify): Step 2 spawn restructure + NEW Step 2.5 Recovery Protocol (Refs #52, resolves #70 structurally)

Files Changed

plugins/issue-driven-dev/skills/idd-verify/SKILL.md (+259/-104) — Step 0 bootstrap rename + Step 2 engine note + Step 2 spawn restructure + NEW Step 2.5 Recovery Protocol + Step 3 prose update + Step 4 templates sweep + frontmatter allowed-tools cleanup

Plan tier decisions (D1-D5 from approved plan)

| D# | Decision | Why |
|----|----------|-----|
| D1 | TeamCreate teammates → 5 standalone `Agent(subagent_type=general-purpose)` calls | Existing teammate `tools` field lacks Write, the same failure mode as Explore. Switching to general-purpose Agent gives the full tool set and dissolves #70 |
| D2 | NEW Step 2.5 Recovery Protocol section (not inline Step 3 preamble) | Recovery is a per-reviewer state check, merge is a per-finding scan. Different granularity deserves a dedicated section |
| D3 | Inline in idd-verify SKILL.md (not separate `references/verify-orchestration.md`) | ~80 lines of additions is below the extraction threshold (~150); verify orchestration is intrinsic to idd-verify |
| D4 | ALWAYS re-paste original full prompt on retry | #47 confirmed context is lost across idle/wake; re-paste ~500 tokens ≪ second-idle coordinator-fallback cost |
| D5 | Devil's Advocate polling loop on sibling findings files (max 30 iter × 5s = 2.5min) | Standalone Agent calls lack TeamCreate's `wait_for_idle` primitive; polling on files gives deterministic semantics |
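
The D5 polling loop could look roughly like the sketch below. It is an illustration only: the function name `wait_for_siblings` and the idea of passing the findings paths as arguments are assumptions, since the PR text does not show the actual loop or name the findings files. The iteration count and interval are parameters so the documented values (30 and 5) can be supplied by the caller.

```shell
# Sketch of the Devil's Advocate wait (D5): poll until every sibling
# findings file is non-empty, or give up after max_iter attempts.
# ASSUMPTION: function name and argument layout are hypothetical.
wait_for_siblings() {
  local max_iter="$1" interval="$2"
  shift 2
  local i=0 f all_present
  while [ "$i" -lt "$max_iter" ]; do
    all_present=1
    for f in "$@"; do
      # -s is true only when the file exists and has size > 0
      [ -s "$f" ] || { all_present=0; break; }
    done
    if [ "$all_present" -eq 1 ]; then
      return 0            # all siblings have written findings
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  return 1                # timed out
}
```

Under these assumptions, `wait_for_siblings 30 5 <sibling files...>` would match the 30 × 5s = 2.5min budget; a non-zero return is the point at which the timeout fallback would run.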

Verification — first-real-use validation track

No automated test harness for orchestration — orchestration validates only via first real /idd-verify invocation after merge. Manual smoke matrix from Plan:

| # | Setup | Expected |
|---|-------|----------|
| 1 | `/idd-verify #X` first invocation post-merge | 5 Agent calls return + 5 findings files non-empty + Bash codex completes + 6-source merged master |
| 2 | Force one Agent idle (if inducible) | Recovery Protocol fires: retry + second-idle → coordinator self-review + process gap noted |
| 3 | All 5 siblings present, Devil's Advocate polling succeeds | DA writes findings citing 4 sibling files |
| 4 | TeamCreate/TeamDelete absent from session log post-restructure | confirms #70 dissolved |
| 5 | Cluster-PR mode (multi-issue) verify | per-issue findings sections appear correctly |

Checklist

Diagnose
Plan (Plan tier with EnterPlanMode approval gate)
Implement (1 commit)
Verify (run /idd-verify --pr <N>)
Pending: human review of this PR + /idd-close #52 after merge
Pending: /idd-close #70 separately after this merges (cites #52 as structural resolution)

🤖 Generated by /idd-implement on PR path (Plan tier approved). Do NOT add Closes #52 — IDD discipline requires manual /idd-close after merge.

…otocol (Refs #52, resolves #70 structurally)

Per #52 Plan tier (#52):

- Step 0 TaskList bootstrap: rename the launch_parallel_reviewers description to reflect the new 6-tool-call pattern (5 Agent + 1 Bash codex, no TeamCreate). Add a NEW recovery_protocol task entry.
- Step 2 engine note preamble: explicit warning that subagent_type=Explore is unsuitable for verify (read-only, no Write tool; the #47 incident is the proof). Document why TeamCreate (pre-v2.59.0 model) is rejected: Write-missing teammate tools + wait_for_idle context loss + #70 cleanup gap.
- Step 2 spawn: replace TeamCreate teammates with 5 parallel Agent(subagent_type=general-purpose) calls. Each prompt now mandatorily contains (a) an explicit findings file output path, (b) a 'DO NOT idle without producing output' rule, (c) a retry-context-re-paste hint.
- Step 2 Devil's Advocate sequencing: a bash polling loop on sibling findings files (max 30 iter × 5s = 2.5min timeout) replaces the TeamCreate wait_for_idle primitive. The timeout fallback writes 'skipped: timeout' findings.
- NEW Step 2.5 Recovery Protocol section between Step 2 spawn and Step 3 merge: per-role file existence check + SendMessage/spawn-fresh retry with FULL context re-paste (never assume context survived idle/wake) + second-idle coordinator self-review fallback + explicit 'Process Gaps' section in the master report (no silent engine degradation).
- Step 3 prose updates: 'Agent Team and Codex' → '5 reviewer Agents and Codex'; source tag '[team:logic+codex]' → '[agents:logic+codex]'.
- Step 4 master report templates (both local/branch and PR mode): sweep 'Agent Team (5 Claude reviewers)' → '5 general-purpose Agents (Claude reviewers, file-based output)'.
- Frontmatter: remove TeamCreate from allowed-tools (no longer used).
- Verification architecture (驗證架構) ASCII tree: update the 'Agent Team' label.
- Engine: team alias: keep the CLI alias for backward compatibility, document the underlying mechanism as 5 standalone Agent calls.

Side effect: #70 (TeamDelete cleanup gap on idle teammates) is structurally dissolved by this Plan. There is no team to delete, since reviewer Agents are standalone calls that return to the coordinator after completion. #70 will be closed in a separate /idd-close cycle citing this Plan.

Refs #52
Refs #70
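
The per-role file existence check described for Step 2.5 can be sketched as follows. This is a hedged illustration, not the SKILL.md text: the findings file naming (`verify_<N>_findings_<role>.md`), the directory argument, and the function name `missing_roles` are assumptions modelled on the prompt path pattern the PR does mention.

```shell
# Sketch of a Step 2.5a-style check: print every role whose findings file
# is missing or empty, so the coordinator knows which reviewers to retry.
# ASSUMPTION: the findings file naming scheme is hypothetical.
missing_roles() {
  local dir="$1" number="$2"
  shift 2
  local role
  for role in "$@"; do
    # -s fails for both "file does not exist" and "file is empty"
    [ -s "$dir/verify_${number}_findings_${role}.md" ] || printf '%s\n' "$role"
  done
}
```

Each role printed by `missing_roles` would then go through the retry with full prompt re-paste, and on a second idle through the coordinator self-review fallback.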

…stence, team→agents sweep (Refs #52)

Per /idd-verify --pr 73 round 1 codex findings:

- P1.1: Devil's Advocate timeout was silently treated as a valid review. The DA polling loop wrote non-empty timeout content and exited 0, while Step 2.5a only checked that the file was non-empty (-s test). Fix: DA writes the sentinel header '[STAGE 2.5 RECOVERY: DEVILS_ADVOCATE_TIMEOUT_<n>/4]' on timeout; Step 2.5a detects the sentinel via head -1 + grep and routes to retry/fallback, same as the missing-file case.
- P1.2: Recovery Protocol Step 2.5b retry used '/tmp/verify_<N>_prompt_<role>.md', but Step 2 never instructed the coordinator to save those prompts. Added an explicit pre-spawn prompt persistence note in Step 2a with the cat > tmpfile <<EOF pattern. The coordinator now saves all 5 prompts before invoking the 5 parallel Agent calls.
- P1.3: Stale 'team:' source tags in 3 master report templates (line ~628-630, 658, 668, 688-690) contradicted the Step 3 prose update. Swept all 'team:logic+codex' / 'team:security' / 'team:regression' / 'team:devils-advocate' → 'agents:*'. Also updated the architecture ASCII tree ('看不到 team 的討論', "cannot see the team's discussion" → '看不到其他 reviewer Agents 的 findings 檔', "cannot see the other reviewer Agents' findings files") and the iron rule (鐵律) ('看不到 team 的討論' → '看不到 5 reviewer Agents 的 findings 檔', "cannot see the 5 reviewer Agents' findings files").

Refs #52
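
The prompt persistence fix from P1.2 might be sketched like this. The helper names `save_prompt` and `load_prompt` and the directory parameter are assumptions made for illustration; the commit message only states that prompts are saved to /tmp/verify_<N>_prompt_<role>.md with a cat > tmpfile <<EOF pattern.

```shell
# Sketch of pre-spawn prompt persistence (P1.2) and the matching reload
# used when a reviewer has to be retried with its full prompt.
# ASSUMPTION: helper names and the dir argument are hypothetical.
save_prompt() {
  local file="$1/verify_${2}_prompt_${3}.md"
  # The prompt text arrives on stdin, so callers can feed it a heredoc.
  cat > "$file"
}

load_prompt() {
  local file="$1/verify_${2}_prompt_${3}.md"
  # Refuse to retry with an empty prompt: that would repeat the #47 failure.
  [ -s "$file" ] || { printf 'no saved prompt: %s\n' "$file" >&2; return 1; }
  cat "$file"
}
```

A coordinator would call `save_prompt /tmp 73 security <<'EOF' ... EOF` once per role before spawning, then `load_prompt /tmp 73 security` to re-paste the full text on retry. Quoting the heredoc delimiter keeps `$` and backticks in the prompt from being expanded by the shell.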

…ting (Refs #52)

Per /idd-verify --pr 73 round 2 codex findings:

- P1.1: the sentinel file persisted past Step 2.5a, so the downstream retry (-s) and fallback (! -s) checks passed without acting. The sentinel content IS non-empty, so retry polling would see it and exit, and the fallback ! -s test fails. Fix: after detecting the sentinel in 2.5a, rm -f the file. Retry/fallback now correctly see the role as missing and proceed.
- P1.2: the Devil's Advocate timeout writer had a bash quoting syntax error. 'Devil\'s' inside a single-quoted printf format cannot escape the apostrophe (bash single quotes don't honor backslash). Rewrote to printf '%s\n\n%s\n' with double-quoted arguments; the apostrophe lives inside double quotes, where no escaping is needed. Empirical bash smoke: PASS (sentinel writes + grep detects).

Refs #52
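
The two round-2 fixes can be sketched together: a timeout writer that keeps the apostrophe inside double quotes, and a detector that removes the sentinel so later -s checks see the role as missing. The function names, the wording of the second line, and the reading of `<n>` as "siblings present" are assumptions; only the sentinel header format and the printf '%s\n\n%s\n' shape come from the commit messages.

```shell
# Sketch of the sentinel write / detect / delete cycle.
# ASSUMPTION: function names and the message wording are hypothetical.
SENTINEL_PREFIX='[STAGE 2.5 RECOVERY: DEVILS_ADVOCATE_TIMEOUT_'

write_timeout_sentinel() {
  local file="$1" present="$2"
  # The format string is single-quoted; the text with the apostrophe is a
  # double-quoted argument, so no backslash escaping is involved.
  printf '%s\n\n%s\n' \
    "${SENTINEL_PREFIX}${present}/4]" \
    "Devil's Advocate skipped: timeout waiting for sibling findings." \
    > "$file"
}

clear_if_sentinel() {
  local file="$1"
  [ -f "$file" ] || return 1
  # grep -F treats the prefix literally; '[' would otherwise open a
  # bracket expression in a regular expression.
  if head -n 1 "$file" | grep -qF "$SENTINEL_PREFIX"; then
    rm -f "$file"         # downstream -s checks now see the role as missing
    return 0
  fi
  return 1
}
```

Deleting the file rather than special-casing the sentinel in every later check keeps the retry and fallback paths identical to the plain missing-file case.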

kiki830621 · 2026-05-11T14:26:11Z

Verify Report — PR #73

Engine

Codex-only degraded mode (Anthropic API rate-limit blocked Claude reviewer ensemble throughout this session). Engine: Codex CLI (gpt-5.5, xhigh reasoning), 3 verify rounds.

Process Gap (per new Step 2.5d, dogfooded immediately on first-real-use): 5 Claude reviewer Agents were NOT spawned this round — Anthropic API limit persisted from #45/#51/#55 verify cycles into #52 verify. Master report still produced (codex carries), but the new Step 2.5 Recovery Protocol's coordinator-self-review-fallback pathway was NOT exercised in production. First-real-use validation deferred until API capacity restored OR fresh session retries /idd-verify --pr 73.

Aggregate

PASS — 0 blocking, 0 follow-up after 3 codex verify rounds.

Scope coverage

PR refs: #47 (historical incident reference), #52 (primary), #70 (structurally resolved)
Verified scope: #52 + #70 cross-link

Verify history

| Round | Verdict | New P1 |
|-------|---------|--------|
| R1 | FAIL | 3 (DA timeout silent / prompt file missing / stale `team:` source tags) |
| R2 | FAIL | 2 (sentinel file persists past retry, only `-s` checked / DA printf bash quoting `'Devil\'s'` syntax error) |
| R3 | PASS | 0 |

#52 — idd-verify orchestration playbook

Requirements coverage: 6/6 addressed.

Plan tier decisions (D1-D5) all implemented:

D1: TeamCreate → 5 standalone Agent(subagent_type=general-purpose) calls ✓
D2: NEW Step 2.5 Recovery Protocol section ✓
D3: Inline in idd-verify SKILL.md (no separate reference doc) ✓
D4: ALWAYS re-paste full prompt on retry ✓ (per-role prompt persistence in /tmp/verify_${NUMBER}_prompt_<role>.md)
D5: Devil's Advocate polling-on-files (max 30 iter × 5s) ✓ + sentinel marker on timeout

Cumulative changes:

63d2474 feat(idd-verify): Step 2 spawn restructure + NEW Step 2.5 Recovery Protocol
d149b81 fix: round-1 P1 fixes — DA timeout sentinel, prompt persistence, team→agents sweep
c3fcfb4 fix: round-2 P1 fixes — sentinel deletion + DA printf quoting

#70 — TeamDelete cleanup gap (structurally resolved)

Resolution mechanism: PR #73 removes TeamCreate from idd-verify Step 2. No team is created → no TeamDelete failure surface exists. The bug surface that #70 reports is now structurally impossible in idd-verify (the originating skill where #70 surfaced this session via verify-pr58 attempt).

Recommendation: /idd-close #70 after this PR merges, citing PR #73's 63d2474-c3fcfb4 commits as resolution. The closing summary should note "resolved as a side effect of the #52 Plan implementation: no team to delete = no cleanup gap to fix".

Recommendation

✅ Ready to merge + invoke /idd-close #52 after merge + /idd-close #70 separately.

Process Gap Caveat

This PR's verify was conducted in codex-only degraded mode. The NEW 5-Agent + Codex orchestration that the PR introduces has NOT been exercised in production yet, because the verify itself could not use the new mechanism under the Anthropic API limit. First-real-use validation will occur on the NEXT /idd-verify invocation in a fresh session with restored API capacity. This follows the Plan's "first-real-use validation track" disclaimer (orchestration cannot be tested without an actual verify to run).

Should the next real-use surface new bugs, they will be filed as standard /idd-issue follow-ups citing PR #73 as origin.

Cumulative findings: 5 P1 caught + fixed across 3 codex rounds (zero P0 throughout, on a structural change of this magnitude).

* fix: drop literal 'Closes #N' from pr-flow.md canonical PR-body template (Refs #87 #74)

  The canonical PR-body template embedded `Closes #${N}` in the anti-trailer warning. Skills (idd-implement, idd-all) copied this pattern; on heredoc substitution it becomes a real `Closes #<num>` that GitHub auto-close matches context-blind, bypassing /idd-close. Reword to a digit-free form so no `<keyword> #<number>` substring can exist. Refs #87 #74

* fix: drop literal 'Closes #N' from idd-implement/idd-all PR-body templates (Refs #74 #87)

  idd-implement Step 5.5 and idd-all Phase 5 PR-body heredocs substituted ${NUMBER}/${N} into the anti-trailer warning, producing a real `Closes #<num>` that GitHub auto-close matched on merge, bypassing /idd-close's gate. Confirmed incidents: #559, che-apple-mail-mcp#99, #73, #56. Reword all to the digit-free form. idd-all-chain already used literal '#N' (safe) but is aligned to identical wording so the safe pattern is uniform and cannot be re-parameterized. Refs #74 #87

* feat: add idd-verify Step 0.8 PR-body auto-close-trailer scan gate (Refs #74 #87)

  Defence-in-depth gate complementing the #87/#74 template rewords. In PR mode, scans the rendered PR body for GitHub auto-close trailers (closes|fixes|resolves #<digit>) and warns: these would auto-close the issue at merge, bypassing /idd-close's checklist gate. Warn-only (a PR body may legitimately quote the keywords in prose). Adds a matching scan_pr_body_trailers entry to the Step 0 bootstrap TaskList. Refs #74 #87

* fix: idd-verify Step 0.8 regex — exclude /idd-close skill-invocation false positive (Refs #74 #87)

  The initial Step 0.8 regex used \b word boundaries, which matched the 'close' inside '/idd-close #N', an IDD skill invocation instruction and not a GitHub auto-close trailer. GitHub treats 'idd-close' as one hyphenated token and does NOT auto-close it (verified: PR #94 closingIssuesReferences is empty despite its body containing '/idd-close #87 #74'). Replace the \b prefix with (^|[^-/[:alnum:]]) so the keyword must not be preceded by '-', '/', or an alphanumeric, precisely matching GitHub's behavior while still catching context-blind hits like (Closes #N) / **Closes #N**. Found by dogfooding the new gate against this PR's own body. Refs #74 #87

* fix: idd-verify Step 0.8 — use closingIssuesReferences instead of hand-rolled regex (Refs #74 #87)

  R1 verify (/idd-verify --pr 94) found the Step 0.8 regex missed the colon form 'Closes: #87', a GitHub-documented auto-close form. Reimplementing GitHub's keyword parser by regex is inherently fragile (colon form, cross-repo form, and issue-URL form all need separate handling, and cross-repo behavior is itself disputed). Replace the regex with GitHub's own authoritative parse: gh pr view --json closingIssuesReferences lists exactly which issues the PR will auto-close on merge, covering every trailer form GitHub honors, with zero false positives (it IS GitHub's determination) and zero parser-reimplementation risk. This supersedes the regex from 3ba7435 + the regex false-positive fix from 5831128; /idd-close #N skill invocations naturally never appear in closingIssuesReferences, so no special exclusion is needed. Refs #74 #87

* fix: idd-verify Step 0.8 — drop orphan regex doc, surface gh-failure, use .url (Refs #74 #87)

  R2 verify (/idd-verify --pr 94 round 2) found 3 cleanup items, all in Step 0.8:

  - DEF-1 (6-AI unanimous, blocking): the R2 regex->closingIssuesReferences rewrite deleted the regex code but left the 'Regex 設計' (Regex design) paragraph documenting it, an orphan dead doc contradicting the new prose. Deleted.
  - L-1: '2>/dev/null || true' conflated a gh failure with a clean PR (silent fail-open). Switched to 'if CMD; then ... else ...' so a gh failure prints a 'Step 0.8 skipped' note instead of silently passing.
  - L-2: softened the 'zero false negative by definition' overclaim; note that closingIssuesReferences is eventually-consistent (settled by verify time).
  - L-3 (Codex polish): query .url instead of bare .number so a cross-repo close ref is unambiguous in the warning output.

  Refs #74 #87
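
A minimal sketch of the Step 0.8 gate as the last two commits describe it, assuming the GitHub CLI (`gh`) is installed and authenticated. The function name mirrors the scan_pr_body_trailers TaskList entry, but the body and the exact message wording are illustrative guesses, not the SKILL.md text.

```shell
# Sketch: ask GitHub which issues a PR would auto-close on merge, warn if
# any exist, and report a gh failure instead of silently passing (L-1).
# ASSUMPTION: message wording is hypothetical.
scan_pr_body_trailers() {
  local pr="$1" refs
  if refs=$(gh pr view "$pr" --json closingIssuesReferences \
              --jq '.closingIssuesReferences[].url'); then
    if [ -n "$refs" ]; then
      # .url keeps cross-repo references unambiguous (L-3)
      printf 'WARNING: merging PR #%s will auto-close:\n%s\n' "$pr" "$refs"
    else
      printf 'Step 0.8: no auto-close references on PR #%s\n' "$pr"
    fi
  else
    printf 'Step 0.8 skipped: gh could not read PR #%s\n' "$pr"
  fi
}
```

The gate stays warn-only in this sketch: every branch prints a note and the function never fails the verify run on its own.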

kiki830621 mentioned this pull request May 11, 2026

feature: idd-verify orchestration playbook (subagent_type guidance + recovery protocol) #52

Closed

kiki830621 added 2 commits May 11, 2026 22:13

kiki830621 mentioned this pull request May 11, 2026

bug: /idd-verify 結束沒確認 teams 被關掉 — TeamDelete fails on idle-but-active reviewers + no cleanup SOP #70

Closed

kiki830621 merged commit c31b097 into main May 11, 2026

kiki830621 deleted the idd/52-verify-orchestration-playbook branch May 11, 2026 22:53

kiki830621 merged 3 commits into main from idd/52-verify-orchestration-playbook
