Skip to content

docs(mt#1616): Rewrite implement-task Step 9 with post-PR convergence protocol#975

Merged
edobry merged 1 commit into
mainfrom
task/mt-1616
May 8, 2026
Merged

docs(mt#1616): Rewrite implement-task Step 9 with post-PR convergence protocol#975
edobry merged 1 commit into
mainfrom
task/mt-1616

Conversation

@minsky-ai
Copy link
Copy Markdown
Contributor

@minsky-ai minsky-ai Bot commented May 8, 2026

Summary

Rewrites the implement-task skill's Step 9 from "Hand off to verify" (stale post-mt#1551) to "Drive PR to convergence (IN-REVIEW → merge)". Names mcp__minsky__session_pr_wait-for-review as the default action and explicitly forbids idling without a named wait/poll mechanism.

Originating retrospective: 2026-05-07 PR #970/mt#1610 idle-drift incident.

Why

Old Step 9 said "stop working on the session" and "Run /verify-task mt#X to verify." Both stale:

  • Per mt#1551, /verify-task is a closeout wrapper for the bypass-merge path only — it does not fire on the standard session_pr_merge path (which auto-sets DONE).
  • The actual review bridge between PR-created and merge-ready is minsky-reviewer[bot] via webhook, not /verify-task.
  • The instruction to stop produced idle drift when no wait mechanism was set up. The agent posted a summary message and sat there with nothing happening until the user prompted.

The convergence loop (signals + escape valves) was already documented across 5+ feedback memories. This task names the loop's entrypoint so the agent knows to start running it after PR creation.

Key changes

.claude/skills/implement-task/SKILL.md:

  • §9 retitled "Drive PR to convergence (IN-REVIEW → merge)" — Process overview line 41 also updated.
  • §9 body rewritten:
    • Explicit "this step does NOT stop the session — it actively waits" framing.
    • mcp__minsky__session_pr_wait-for-review named as the default mechanism, with notes on the reviewer filter and the since=now default behavior (caught me with my pants down on PR refactor(mt#1610): Remove sessiondb configuration block #970 round 1 — wait returned matched: false because the bot had already posted before I called).
    • ScheduleWakeup named as the alternative for multi-day async waits.
    • Forbids idling without a named mechanism; references the feedback_post_pr_convergence_idle_drift bridge memory.
    • Notes pr_watch_create is inert today (stub client + no scheduler + desktop-only OperatorNotify) per feedback_survey_event_resumption_toolkit_before_proposing_self_poll_or_user_ping — don't recommend until production gaps close.
    • Branch logic on review state: APPROVE → session_pr_merge (auto-DONE); CHANGES_REQUESTED → cascade-defense + class-not-instance per §7; COMMENT → assess.
    • Four convergence-failure escape valves enumerated with their tracking memories: self-authored bot PR (feedback_self_authored_pr_merge_constraints + feedback_bot_pr_convergence_via_bypass), round-N self-reversal (feedback_reviewer_bot_self_reversal_signal), CoT-leakage 2× same HEAD (feedback_reviewer_bot_cot_leakage_forces_bypass), webhook-miss silence after push.
    • Pre-bypass discipline cross-reference to feedback_verify_ci_fired_before_bypass_merge.
  • §0 IN-REVIEW transition message updated for consistency: instead of pointing at /verify-task as the next step (stale), points at the §9 convergence loop and notes the standard merge path auto-sets DONE.

Testing

Skill markdown — bun run validate-all is N/A per spec acceptance test #4. Verified manually:

bunx prettier --check clean.

Out of scope

  • Creating a separate /converge-pr skill. Inline Step 9 is sufficient until proven insufficient.
  • A PostToolUse hook on session_pr_create that auto-triggers the wait. Same reasoning — escalate only if behavioral fix fails.
  • Other skills that reference /verify-task as an audit gate (file separately if discovered).

Documentation impact

No update needed — this PR IS the doc update. Skill markdown only.

Concurrency analysis

(N/A — no check-then-act pattern introduced; skill markdown is static content.)

Live verification

(N/A — docs-only change with no live external behavior dependency.)

Follow-up

Bridge memory feedback_post_pr_convergence_idle_drift will be marked retired post-merge (referencing this PR as the structural fix that landed).

…nvergence protocol

Originating retrospective: 2026-05-07 PR #970/mt#1610 idle-drift incident.

Old Step 9 (Hand off to verify) said stop working on the session and run
/verify-task. Both stale post-mt#1551: /verify-task is now a closeout
wrapper for the bypass-merge path only, and the actual review bridge is
the reviewer-bot via webhook. The instruction to stop produced idle drift
when no wait mechanism was set up.

New Step 9 (Drive PR to convergence — IN-REVIEW to merge):

- Names mcp__minsky__session_pr_wait-for-review as the default mechanism
  with explicit reviewer filter and since-default note.
- Names ScheduleWakeup as the alternative for multi-day waits.
- Forbids idling without a named mechanism.
- Notes that pr_watch_create is inert today per the survey-toolkit memory
  (stub client, no scheduler, desktop-only OperatorNotify) — do not
  recommend until production gaps close.
- Branches on review state: APPROVE → session_pr_merge atomically sets
  DONE; CHANGES_REQUESTED → cascade-defense + class-not-instance sweep
  per §7; COMMENT → assess.
- Lists four convergence-failure escape valves with tracking memories:
  self-authored bot PR (gh api PUT bypass + /verify-task closeout),
  round-N self-reversal, CoT-leakage twice on same HEAD, and webhook-miss
  silence after push.
- Adds pre-bypass discipline cross-reference to verify_ci_fired_before_bypass_merge.

Two adjacent updates for consistency with the same architectural shift:

- Process overview line 41 retitled.
- §0 IN-REVIEW transition message updated: instead of pointing at
  /verify-task as the next step, points at the §9 convergence loop and
  notes the standard session_pr_merge path auto-sets DONE.

Bridge memory feedback_post_pr_convergence_idle_drift will be updated
post-merge to mark the structural task complete.
@minsky-ai minsky-ai Bot added the authorship/co-authored Co-authored by human and AI agent label May 8, 2026
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


Solid and necessary doc rewrite that replaces a passive handoff with an explicit convergence loop aligned with CLAUDE.md. However, Step 9 names the default wait tool with a hyphenated identifier that likely doesn’t exist (inconsistent with underscore-only tool naming), risking runtime failures and recreating idle drift — please correct/verify the exact tool name across the section. Also, several out-of-repo memory references are unverified; double-check slugs to avoid stale cross-references. Once the tool ID is fixed, the rest looks coherent and on-spec.

Findings

  • [BLOCKING] .claude/skills/implement-task/SKILL.md:1 — Inconsistent/likely invalid tool name: mcp__minsky__session_pr_wait-for-review uses a hyphen; all other Minsky tool IDs use underscores
    In §9 you instruct the agent to call mcp__minsky__session_pr_wait-for-review as the default mechanism. This tool name is inconsistent with the established naming convention used throughout the repo (mcp__minsky__session_pr_create, session_pr_merge, session_pr_get, pr_watch_create — all use underscores, not hyphens). A hyphenated tool ID is very likely not resolvable by the MCP registry and will cause a "tool not found" error at run time, recreating the very idle-drift you’re trying to prevent.

Please correct the identifier to the actual registered tool name (presumably mcp__minsky__session_pr_wait_for_review) everywhere it appears in this section (default mechanism paragraph and any later references), or adjust the guidance if the canonical tool is named differently. Include a one-line parenthetical with the exact parameter shape to reduce call-site ambiguity.

  • [NON-BLOCKING] .claude/skills/implement-task/SKILL.md:116 — Out-of-repo memory path references cannot be verified here
    This section cites multiple bridge memories by filename (e.g., feedback_post_pr_convergence_idle_drift, feedback_survey_event_resumption_toolkit_before_proposing_self_poll_or_user_ping, others under ~/.claude/...). Those references are out-of-repo. [NEEDS VERIFICATION: out-of-repo path — reviewer cannot verify] Please double-check that the memory entries exist with the exact slugs used and that their content matches the behaviors described (e.g., pr_watch_create being inert). If names differ even slightly, cross-references will mislead operators.

Copy link
Copy Markdown
Contributor Author

@minsky-ai minsky-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: mt#1616 — Rewrite implement-task Step 9 with post-PR convergence protocol

CI status: in_progress at review time on commit 0429797a7 (build + Prevent Placeholder Tests both running). Skill markdown only; no behavioral risk if a check flakes.

Coverage: 1/1 file reviewed (single-file skill markdown change, 95 lines net; under the proportionality rule's 1–20 file threshold for direct review).

Summary

Rewrites implement-task SKILL.md Step 9 from "Hand off to verify" (stale post-mt#1551) to "Drive PR to convergence (IN-REVIEW → merge)". Names mcp__minsky__session_pr_wait-for-review as the default action, forbids idling without a named wait/poll mechanism, and lists four convergence-failure escape valves with their tracking memories. Two adjacent updates (Process overview line + §0 IN-REVIEW transition message) for consistency with the same architectural shift.

The earlier reviewer-bot review (CHANGES_REQUESTED, dismissed at b4dc77d-equivalent dismissal id) raised one BLOCKING finding which I dismissed as factually false — the bot claimed mcp__minsky__session_pr_wait-for-review was "likely not resolvable by the MCP registry" because of the hyphen. That call was the very mechanism that surfaced the review (returned success: true, matched: true), the deferred-tools manifest contains 9+ hyphenated mcp__minsky__* names, and three other authoritative sources (feedback_user_does_not_review, the convergence memo, the bridge memory) also use the hyphenated form. The bot's "all use underscores" claim is empirically wrong. Pattern matches feedback_verify_reviewer_bot_claims_at_cited_location_before_fixing.

The NON-BLOCKING (verify memory slugs) was verified — all five primary slugs return matching memory_search records.

Cross-cutting concerns

None. Single-file markdown change with three internally consistent edits.

Spec verification

Task: mt#1616

Criterion Status Evidence
Step 9 rewritten with new convergence-protocol content Met .claude/skills/implement-task/SKILL.md:315+ — entire §9 body replaced
Step 9 explicitly states "this step does not stop the session — it actively waits" Met "This step does NOT stop the session — it actively waits." present in §9
mcp__minsky__session_pr_wait-for-review named as the default mechanism Met "Default mechanism: call mcp__minsky__session_pr_wait-for-review..." present in §9
ScheduleWakeup named as the alternative for multi-day waits Met "Alternative for genuinely-async multi-day waits: ScheduleWakeup with delaySeconds in the 1200–1800s range" present in §9
All five referenced feedback memories cited by name Met grep returned 6 hits (one cited twice). Verified slugs: feedback_user_does_not_review, feedback_self_authored_pr_merge_constraints, feedback_bot_pr_convergence_via_bypass, feedback_reviewer_bot_self_reversal_signal, feedback_reviewer_bot_cot_leakage_forces_bypass. All five returned matching memory_search records.
/verify-task framing corrected Met §0 IN-REVIEW transition message updated to point at §9 convergence loop and note the standard session_pr_merge path auto-sets DONE; §9 closes with explicit "Do NOT pre-emptively call /verify-task — it only fires on the bypass-merge fallback path."
Bridge memory feedback_post_pr_convergence_idle_drift marked retired Pending post-merge Will be marked retired/superseded in a memory_update call after merge

All 7 success criteria met (one is post-merge).

Acceptance tests

Test Result
AT #1 — agent invoking /implement-task and reaching Step 9 sees explicit session_pr_wait-for-review instruction
AT #2 — Step 9 references all 5 convergence-related memories by name ✓ (grep -c returned 6, with one cited twice)
AT #3 — Step 9 does not contain "Run /verify-task mt#X to verify" ✓ (grep -c returned 0)
AT #4bun run validate-all is N/A for skill markdown
AT #5 — walk-through: agent following new Step 9 calls session_pr_wait-for-review rather than idling ✓ (this very PR exercised the wait per the new flow)

bunx prettier --check clean.

Documentation impact

This PR IS the doc update. Skill markdown only. No additional docs need updating; the §0 IN-REVIEW transition message and Process overview line were updated alongside §9 for consistency with the same mt#1551 architectural shift.

Concurrency analysis

(N/A — no check-then-act pattern introduced; skill markdown is static content.)

Live verification

(N/A — docs-only change with no live external behavior dependency. Per AT #5, the walk-through verification is encoded in the protocol's own use during this PR's lifecycle.)

Event rationale

COMMENT event chosen because the PR was opened by minsky-ai[bot] and this review is being submitted by the same App identity — GitHub structurally blocks self-approval. The merge gate is satisfied by this review's spec-verification + documentation-impact sections per feedback_self_authored_pr_merge_constraints.

(Had Claude run the review.)

@minsky-ai minsky-ai Bot added the review:bot-commented Reviewer bot has commented on this PR (informational) label May 8, 2026
@edobry edobry merged commit 214f08b into main May 8, 2026
2 checks passed
@edobry edobry deleted the task/mt-1616 branch May 8, 2026 00:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

authorship/co-authored Co-authored by human and AI agent review:bot-commented Reviewer bot has commented on this PR (informational)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant