docs(mt#1616): Rewrite implement-task Step 9 with post-PR convergence protocol#975
Conversation
…nvergence protocol Originating retrospective: 2026-05-07 PR #970/mt#1610 idle-drift incident. Old Step 9 (Hand off to verify) said stop working on the session and run /verify-task. Both stale post-mt#1551: /verify-task is now a closeout wrapper for the bypass-merge path only, and the actual review bridge is the reviewer-bot via webhook. The instruction to stop produced idle drift when no wait mechanism was set up. New Step 9 (Drive PR to convergence — IN-REVIEW to merge): - Names mcp__minsky__session_pr_wait-for-review as the default mechanism with explicit reviewer filter and since-default note. - Names ScheduleWakeup as the alternative for multi-day waits. - Forbids idling without a named mechanism. - Notes that pr_watch_create is inert today per the survey-toolkit memory (stub client, no scheduler, desktop-only OperatorNotify) — do not recommend until production gaps close. - Branches on review state: APPROVE → session_pr_merge atomically sets DONE; CHANGES_REQUESTED → cascade-defense + class-not-instance sweep per §7; COMMENT → assess. - Lists four convergence-failure escape valves with tracking memories: self-authored bot PR (gh api PUT bypass + /verify-task closeout), round-N self-reversal, CoT-leakage twice on same HEAD, and webhook-miss silence after push. - Adds pre-bypass discipline cross-reference to verify_ci_fired_before_bypass_merge. Two adjacent updates for consistency with the same architectural shift: - Process overview line 41 retitled. - §0 IN-REVIEW transition message updated: instead of pointing at /verify-task as the next step, points at the §9 convergence loop and notes the standard session_pr_merge path auto-sets DONE. Bridge memory feedback_post_pr_convergence_idle_drift will be updated post-merge to mark the structural task complete.
There was a problem hiding this comment.
Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown
Solid and necessary doc rewrite that replaces a passive handoff with an explicit convergence loop aligned with CLAUDE.md. However, Step 9 names the default wait tool with a hyphenated identifier that likely doesn’t exist (inconsistent with underscore-only tool naming), risking runtime failures and recreating idle drift — please correct/verify the exact tool name across the section. Also, several out-of-repo memory references are unverified; double-check slugs to avoid stale cross-references. Once the tool ID is fixed, the rest looks coherent and on-spec.
Findings
- [BLOCKING] .claude/skills/implement-task/SKILL.md:1 — Inconsistent/likely invalid tool name:
mcp__minsky__session_pr_wait-for-reviewuses a hyphen; all other Minsky tool IDs use underscores
In §9 you instruct the agent to callmcp__minsky__session_pr_wait-for-reviewas the default mechanism. This tool name is inconsistent with the established naming convention used throughout the repo (mcp__minsky__session_pr_create,session_pr_merge,session_pr_get,pr_watch_create— all use underscores, not hyphens). A hyphenated tool ID is very likely not resolvable by the MCP registry and will cause a "tool not found" error at run time, recreating the very idle-drift you’re trying to prevent.
Please correct the identifier to the actual registered tool name (presumably mcp__minsky__session_pr_wait_for_review) everywhere it appears in this section (default mechanism paragraph and any later references), or adjust the guidance if the canonical tool is named differently. Include a one-line parenthetical with the exact parameter shape to reduce call-site ambiguity.
- [NON-BLOCKING] .claude/skills/implement-task/SKILL.md:116 — Out-of-repo memory path references cannot be verified here
This section cites multiple bridge memories by filename (e.g.,feedback_post_pr_convergence_idle_drift,feedback_survey_event_resumption_toolkit_before_proposing_self_poll_or_user_ping, others under~/.claude/...). Those references are out-of-repo. [NEEDS VERIFICATION: out-of-repo path — reviewer cannot verify] Please double-check that the memory entries exist with the exact slugs used and that their content matches the behaviors described (e.g., pr_watch_create being inert). If names differ even slightly, cross-references will mislead operators.
There was a problem hiding this comment.
Review: mt#1616 — Rewrite implement-task Step 9 with post-PR convergence protocol
CI status: in_progress at review time on commit 0429797a7 (build + Prevent Placeholder Tests both running). Skill markdown only; no behavioral risk if a check flakes.
Coverage: 1/1 file reviewed (single-file skill markdown change, 95 lines net; under the proportionality rule's 1–20 file threshold for direct review).
Summary
Rewrites implement-task SKILL.md Step 9 from "Hand off to verify" (stale post-mt#1551) to "Drive PR to convergence (IN-REVIEW → merge)". Names mcp__minsky__session_pr_wait-for-review as the default action, forbids idling without a named wait/poll mechanism, and lists four convergence-failure escape valves with their tracking memories. Two adjacent updates (Process overview line + §0 IN-REVIEW transition message) for consistency with the same architectural shift.
The earlier reviewer-bot review (CHANGES_REQUESTED, dismissed at b4dc77d-equivalent dismissal id) raised one BLOCKING finding which I dismissed as factually false — the bot claimed mcp__minsky__session_pr_wait-for-review was "likely not resolvable by the MCP registry" because of the hyphen. That call was the very mechanism that surfaced the review (returned success: true, matched: true), the deferred-tools manifest contains 9+ hyphenated mcp__minsky__* names, and three other authoritative sources (feedback_user_does_not_review, the convergence memo, the bridge memory) also use the hyphenated form. The bot's "all use underscores" claim is empirically wrong. Pattern matches feedback_verify_reviewer_bot_claims_at_cited_location_before_fixing.
The NON-BLOCKING (verify memory slugs) was verified — all five primary slugs return matching memory_search records.
Cross-cutting concerns
None. Single-file markdown change with three internally consistent edits.
Spec verification
Task: mt#1616
| Criterion | Status | Evidence |
|---|---|---|
| Step 9 rewritten with new convergence-protocol content | Met | .claude/skills/implement-task/SKILL.md:315+ — entire §9 body replaced |
| Step 9 explicitly states "this step does not stop the session — it actively waits" | Met | "This step does NOT stop the session — it actively waits." present in §9 |
mcp__minsky__session_pr_wait-for-review named as the default mechanism |
Met | "Default mechanism: call mcp__minsky__session_pr_wait-for-review..." present in §9 |
ScheduleWakeup named as the alternative for multi-day waits |
Met | "Alternative for genuinely-async multi-day waits: ScheduleWakeup with delaySeconds in the 1200–1800s range" present in §9 |
| All five referenced feedback memories cited by name | Met | grep returned 6 hits (one cited twice). Verified slugs: feedback_user_does_not_review, feedback_self_authored_pr_merge_constraints, feedback_bot_pr_convergence_via_bypass, feedback_reviewer_bot_self_reversal_signal, feedback_reviewer_bot_cot_leakage_forces_bypass. All five returned matching memory_search records. |
/verify-task framing corrected |
Met | §0 IN-REVIEW transition message updated to point at §9 convergence loop and note the standard session_pr_merge path auto-sets DONE; §9 closes with explicit "Do NOT pre-emptively call /verify-task — it only fires on the bypass-merge fallback path." |
Bridge memory feedback_post_pr_convergence_idle_drift marked retired |
Pending post-merge | Will be marked retired/superseded in a memory_update call after merge |
All 7 success criteria met (one is post-merge).
Acceptance tests
| Test | Result |
|---|---|
AT #1 — agent invoking /implement-task and reaching Step 9 sees explicit session_pr_wait-for-review instruction |
✓ |
| AT #2 — Step 9 references all 5 convergence-related memories by name | ✓ (grep -c returned 6, with one cited twice) |
AT #3 — Step 9 does not contain "Run /verify-task mt#X to verify" |
✓ (grep -c returned 0) |
AT #4 — bun run validate-all is N/A for skill markdown |
✓ |
AT #5 — walk-through: agent following new Step 9 calls session_pr_wait-for-review rather than idling |
✓ (this very PR exercised the wait per the new flow) |
bunx prettier --check clean.
Documentation impact
This PR IS the doc update. Skill markdown only. No additional docs need updating; the §0 IN-REVIEW transition message and Process overview line were updated alongside §9 for consistency with the same mt#1551 architectural shift.
Concurrency analysis
(N/A — no check-then-act pattern introduced; skill markdown is static content.)
Live verification
(N/A — docs-only change with no live external behavior dependency. Per AT #5, the walk-through verification is encoded in the protocol's own use during this PR's lifecycle.)
Event rationale
COMMENT event chosen because the PR was opened by minsky-ai[bot] and this review is being submitted by the same App identity — GitHub structurally blocks self-approval. The merge gate is satisfied by this review's spec-verification + documentation-impact sections per feedback_self_authored_pr_merge_constraints.
(Had Claude run the review.)
Summary
Rewrites the
implement-taskskill's Step 9 from "Hand off to verify" (stale post-mt#1551) to "Drive PR to convergence (IN-REVIEW → merge)". Namesmcp__minsky__session_pr_wait-for-reviewas the default action and explicitly forbids idling without a named wait/poll mechanism.Originating retrospective: 2026-05-07 PR #970/mt#1610 idle-drift incident.
Why
Old Step 9 said "stop working on the session" and "Run
/verify-task mt#Xto verify." Both stale:/verify-taskis a closeout wrapper for the bypass-merge path only — it does not fire on the standardsession_pr_mergepath (which auto-sets DONE).minsky-reviewer[bot]via webhook, not/verify-task.The convergence loop (signals + escape valves) was already documented across 5+ feedback memories. This task names the loop's entrypoint so the agent knows to start running it after PR creation.
Key changes
.claude/skills/implement-task/SKILL.md:mcp__minsky__session_pr_wait-for-reviewnamed as the default mechanism, with notes on thereviewerfilter and thesince=nowdefault behavior (caught me with my pants down on PR refactor(mt#1610): Remove sessiondb configuration block #970 round 1 — wait returnedmatched: falsebecause the bot had already posted before I called).ScheduleWakeupnamed as the alternative for multi-day async waits.feedback_post_pr_convergence_idle_driftbridge memory.pr_watch_createis inert today (stub client + no scheduler + desktop-only OperatorNotify) perfeedback_survey_event_resumption_toolkit_before_proposing_self_poll_or_user_ping— don't recommend until production gaps close.session_pr_merge(auto-DONE); CHANGES_REQUESTED → cascade-defense + class-not-instance per §7; COMMENT → assess.feedback_self_authored_pr_merge_constraints+feedback_bot_pr_convergence_via_bypass), round-N self-reversal (feedback_reviewer_bot_self_reversal_signal), CoT-leakage 2× same HEAD (feedback_reviewer_bot_cot_leakage_forces_bypass), webhook-miss silence after push.feedback_verify_ci_fired_before_bypass_merge./verify-taskas the next step (stale), points at the §9 convergence loop and notes the standard merge path auto-sets DONE.Testing
Skill markdown —
bun run validate-allis N/A per spec acceptance test #4. Verified manually:/implement-taskand reaching Step 9 sees explicitsession_pr_wait-for-reviewinstruction. ✓grep -creturned 6 hits across 5 memory names.) ✓grep -c "Run \\\/verify-task mt#X\` to verify"` returns 0. ✓session_pr_wait-for-reviewrather than idling. ✓bunx prettier --checkclean.Out of scope
/converge-prskill. Inline Step 9 is sufficient until proven insufficient.session_pr_createthat auto-triggers the wait. Same reasoning — escalate only if behavioral fix fails./verify-taskas an audit gate (file separately if discovered).Documentation impact
No update needed — this PR IS the doc update. Skill markdown only.
Concurrency analysis
(N/A — no check-then-act pattern introduced; skill markdown is static content.)
Live verification
(N/A — docs-only change with no live external behavior dependency.)
Follow-up
Bridge memory
feedback_post_pr_convergence_idle_driftwill be marked retired post-merge (referencing this PR as the structural fix that landed).