Skip to content

feat(mt#1523): Add TOCTOU/concurrency sweep step to implement-task skill#960

Merged
edobry merged 2 commits into
mainfrom
task/mt-1523
May 6, 2026
Merged

feat(mt#1523): Add TOCTOU/concurrency sweep step to implement-task skill#960
edobry merged 2 commits into
mainfrom
task/mt-1523

Conversation

@minsky-ai
Copy link
Copy Markdown
Contributor

@minsky-ai minsky-ai Bot commented May 6, 2026

Summary

Adds a new mandatory step to the implement-task skill that requires explicit
TOCTOU enumeration when the implementation introduces or modifies a
check-then-act pattern. Three windows must be enumerated (read atomicity,
decision-action gap, stale-read), each with a documented fix-or-accept
decision. The "small window" rationalization is explicitly forbidden with
the canonical framing from feedback_toctou_enumeration_required.

Closes mt#1523.

Numbering note

The spec requested §7a, but that slot is already occupied by mt#1415's
verification-artifact step (added 2026-05-04). The new TOCTOU step lands as
§7b instead — same logical position (between §7 verify and §8 create-PR),
distinct concern. The Process overview at the top of the file lists §7b
alongside §7a so future readers see both.

Why

Two failures from the mt#1483 / PR #928 retrospective shared one root cause:

  1. listCommitsAhead had a two-call inter-process race that survived three
    reviewer-bot rounds — caught only when the user prompted "is there a race
    opportunity?" (N-1 detection).
  2. When prompted to enumerate, I correctly identified three TOCTOU classes
    but rationalized away Class 2 with two flawed arguments ("git push enforces
    FF" — false for the relevant ref; "hours vs seconds is non-material" —
    handwave).

Reviewer-bot does not reliably catch TOCTOU; it catches functional and
structural concerns but has not in practice surfaced multi-call atomicity or
decision-action gaps. The structural fix is a prompt-level enumeration step
in the implementer skill — the right tier per feedback_toctou_enumeration_required.

Key changes

.claude/skills/implement-task/SKILL.md:

  • New ### 7b. TOCTOU / concurrency sweep for check-then-act code between
    §7a (verification artifacts) and §8 (create PR).
  • Step is mandatory when applicable; (N/A — no check-then-act pattern introduced) is the explicit no-op marker for non-applicable PRs.
  • Four trigger conditions covering external-state read+act, hooks/gates/guards,
    precondition+downstream-act, and subprocess+pre-check.
  • Three mandatory windows: read atomicity, decision-action gap, stale-read at
    read time — each with a Mitigation pattern.
  • Forbidden small-window rationalization with the exact canonical framing
    ("if the race is the same SHAPE of bug the code exists to prevent, the size
    of the window is a UX consideration, not a correctness one").
  • Four named Accept rationales: idempotent, FF-conflict-preserving,
    irreducible, automatic recovery — each requiring a code-comment annotation.
  • Three named Mitigate rationales: creates redo, silent worse-state,
    observable to consumer.
  • Reviewer-bot caveat: agent's responsibility, not deferred to review.
  • Non-trivial mitigation requires a follow-up structural task plus a code
    comment naming the residual race / tracking task.
  • PR body must carry either a ## Concurrency analysis subsection or an
    explicit N/A line.
  • Cross-reference to bridge memory feedback_toctou_enumeration_required and
    the originating incident (mt#1483 PR feat(mt#1483): Add branch-freshness PreToolUse hook to block stale-branch commits #928).

Process-overview list (top of file) updated to list §7b.

Concurrency analysis

(N/A — no check-then-act pattern introduced. This is a docs-only edit to a
skill file; the changes are markdown, not code.)

Testing

Verified all 9 spec criteria are encoded in the new section (each via grep):

  • Step is mandatory and lists 4 trigger conditions.
  • Three windows enumerated with mitigation each.
  • Small-window forbidden quote present verbatim.
  • 4 accept rationales + 3 mitigate rationales listed by name.
  • Reviewer-bot caveat present.
  • Follow-up-task instruction present.
  • Bridge memory reference present.
  • Skills are loaded by the harness, not compiled to CLAUDE.md/AGENTS.md
    (verified mt#1415's prior §7a is also not compiled there). The "if
    applicable" qualifier in the SC is satisfied as no-op.

Out of scope

  • TOCTOU analysis in review-pr or auditor skills — separate concern; the
    agent should self-catch before review, not defer.
  • A hook that enforces TOCTOU enumeration — the prompt-level step is the
    right tier.
  • Class-2 mitigation for mt#1483 itself — tracked in mt#1522 (CAS guard on
    origin/main SHA).
  • Updating the bridge memory feedback_toctou_enumeration_required to point
    at this PR — deferred to post-merge per the spec ("once this PR lands").

edobry added 2 commits May 6, 2026 12:01
Adds a new §7b (between §7 verify and §8 create-PR) requiring an explicit
TOCTOU enumeration when the implementation introduces or modifies a
check-then-act pattern.

Note on numbering: the spec requested §7a, but that slot is already
occupied by mt#1415's verification-artifact step (added 2026-05-04). The
new TOCTOU step lands as §7b instead — same logical position (between §7
and §8), distinct concern.

Step content covers:
- When the step applies (4 trigger conditions: external-state read+act,
  hook/gate/guard, precondition+downstream-act, subprocess+pre-check).
- Three mandatory windows: read atomicity, decision-action gap, stale-read.
- Forbidden small-window rationalization with the canonical framing
  (same-shape-of-bug at smaller time scale is still the bug).
- Four Accept rationales: idempotent, FF-conflict-preserving, irreducible,
  automatic recovery.
- Three Mitigate rationales: creates redo, silent worse-state, observable
  to consumer.
- Reviewer-bot caveat: TOCTOU is the agent's responsibility; reviewer-bot
  does not reliably catch multi-call atomicity or decision-action gaps.
- Non-trivial mitigation requires a follow-up structural task + a code
  comment naming the residual race.
- PR body must carry either a Concurrency analysis subsection or an
  explicit N/A line.
- Cross-reference to bridge memory feedback_toctou_enumeration_required
  and the originating incident (mt#1483 PR #928 listCommitsAhead race).

Skills are loaded by the harness, not compiled to CLAUDE.md/AGENTS.md, so
no compile step required — verified that the prior §7a (mt#1415) is also
not compiled.

PR will declare §7b N/A in the Concurrency analysis section: this is a
docs-only skill edit with no check-then-act pattern.
@minsky-ai minsky-ai Bot added the authorship/co-authored Co-authored by human and AI agent label May 6, 2026
Copy link
Copy Markdown
Contributor Author

@minsky-ai minsky-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: Self-review of bot-authored PR (same-identity → COMMENT only)

CI status: pending — 1 in_progress (Prevent Placeholder Tests), 0 passed, 0 failed.

Summary

1-file docs-only change adding §7b (TOCTOU/concurrency sweep) to .claude/skills/implement-task/SKILL.md. All 9 spec criteria verified met by reading the diff line-by-line against the spec text. 0 BLOCKING / 0 NON-BLOCKING findings. Event is COMMENT because PR author (minsky-ai[bot]) is the same App identity submitting the review — APPROVE is structurally blocked by GitHub. The cross-identity reviewer (minsky-reviewer[bot]) is the path to a non-self APPROVE.

Cross-cutting concerns

None. Single-section addition to one skill file.

Spec verification

Task: mt#1523

Criterion Status Evidence
Inserted in .claude/skills/implement-task/SKILL.md between §7 and §8 Met New section at .claude/skills/implement-task/SKILL.md:217 (### 7b.); §7a (verification artifact) at line 173, §8 (Create PR) at line 305. Spec said "§7a" but slot was already occupied by mt#1415; numbering note in PR body explains.
Mandatory + 4 trigger conditions Met Line 219: "Mandatory when this implementation introduces or modifies a check-then-act pattern". Lines 226-229 list exactly 4 trigger conditions (filesystem/API/db/git+env reads, hook/gate/guard, precondition+downstream, subprocess+pre-check).
Three windows with fix-or-accept Met Lines 236-251: read atomicity / decision-action gap / stale-read at read time, each with a Mitigation entry.
"Small window" forbidden with canonical framing Met Lines 255-257 carry the exact spec-required quote: "if the race is the same SHAPE of bug the code exists to prevent, the size of the window is a UX consideration, not a correctness one. A seconds-class instance of an hours-class bug is still the bug."
4 Accept + 3 Mitigate rationales Met Accept (lines 264-272): Idempotent, FF-conflict-preserving, Irreducible, Automatic recovery. Mitigate (lines 280-285): Creates redo, Silent worse-state, Observable to consumer.
Reviewer-bot caveat (agent's responsibility, not deferred to review) Met Lines 219-222: "reviewer-bot does NOT reliably catch TOCTOU... Do not defer this to review."
Non-trivial mitigation → follow-up task + code comment Met Lines 287-290: "file a follow-up structural task... document the residual window in a code comment naming the follow-up task."
Cross-reference to bridge memory feedback_toctou_enumeration_required Met Line 299: "Cross-reference: feedback_toctou_enumeration_required.md is the bridge memory carrying this discipline pre-skill".
Skill updates compiled to CLAUDE.md / AGENTS.md / .cursor/rules/ "if applicable" Met (N/A) Skills are loaded by the harness at invoke time, not compiled into CLAUDE.md/AGENTS.md (verified mt#1415's prior §7a is also not compiled there). The "if applicable" qualifier in the SC is satisfied as no-op. PR body documents the verification.

All 9 criteria met. Acceptance tests:

  • "lock the cache before write" hypothetical → would hit §7b's lock/precondition trigger; three windows enumerable; accept/mitigate rationales applicable. PASS by construction.
  • Future-hook-on-check-then-act → step is mandatory + 4 trigger conditions cover the case. PASS by construction.
  • Compiled-output rendering → N/A per the if-applicable qualifier. PASS.

Documentation impact

No update needed — the skill IS the documentation, and it is updated in this PR. No architecture/theory/contributing/readme changes required (the skill is self-contained guidance for the implementer; it doesn't change the lifecycle, command registry, or user-facing surface).

(Had Claude look into this — AI-assisted review)

@minsky-ai minsky-ai Bot added the review:bot-commented Reviewer bot has commented on this PR (informational) label May 6, 2026
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


The reviewer ran but produced no findings. This is not an approval — the model emitted no submit_finding, submit_inline_comment, or conclude_review calls.

@edobry edobry merged commit 641fc8b into main May 6, 2026
2 checks passed
@edobry edobry deleted the task/mt-1523 branch May 6, 2026 16:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

authorship/co-authored Co-authored by human and AI agent review:bot-commented Reviewer bot has commented on this PR (informational)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant