edobry · edobry · May 6, 2026 · May 6, 2026 · May 6, 2026
diff --git a/.claude/skills/implement-task/SKILL.md b/.claude/skills/implement-task/SKILL.md
@@ -39,6 +39,7 @@ Step 5: Plan the implementation
 Step 6: Develop
 Step 7: Verify implementation
 Step 7a: Ship verification artifact for structural changes (when in scope)
+Step 7b: TOCTOU / concurrency sweep for check-then-act code (mandatory when applicable)
 Step 8: Create PR (IN-PROGRESS → IN-REVIEW)
 Step 9: Hand off to verify
 
@@ -213,6 +214,94 @@ This pattern was established by mt#1399 (smoke test for output-tools wiring —
 - The redacted live-run output from running the artifact, OR
 - A documented override: the artifact has not been run because (a) the target has not been deployed yet, (b) the author lacks live-target access per documented policy, or (c) the target has a rate-limit or maintenance-window constraint. "I read the code carefully" is not a valid override.
 
+### 7b. TOCTOU / concurrency sweep for check-then-act code
+
+**Mandatory** when this implementation introduces or modifies a check-then-act pattern. The
+sweep is the agent's responsibility — reviewer-bot does NOT reliably catch TOCTOU. It
+catches functional and structural concerns; it has not in practice surfaced multi-call
+atomicity or decision-action gaps. **Do not defer this to review.**
+
+**When this step applies (any of):**
+
+- Code reads external state and acts on the read (filesystem, API, database, git refs, env).
+- Code implements a hook, gate, or guard (any precondition-check-then-permit pattern).
+- Code validates a precondition then acts on a downstream resource (lock, ref, file).
+- Code spawns a subprocess after a check the subprocess could invalidate.
+
+If none apply, this step is a no-op — record "(N/A — no check-then-act pattern introduced)"
+in the PR body and proceed.
+
+**Enumerate the three windows. Every time. For each, document a fix-or-accept decision.**
+
+1. **Read atomicity.** Does the check make multiple separate reads of the underlying
+   state? If yes, can the state change between reads (parallel writers, concurrent
+   fetches, sibling processes)?
+
+   - **Mitigation:** collapse to a single atomic read.
+
+2. **Decision-action gap.** Between the moment of decision (allow / proceed) and the
+   action (push / write / spawn), can the underlying state change in a way that
+   invalidates the decision?
+
+   - **Mitigation:** re-check immediately before the action, or compare-and-swap on a
+     captured identifier (CAS / version / SHA).
+
+3. **Stale-read at read time.** Was the data already old when read (remote-tracking ref
+   not recently fetched, cached config, memoized snapshot, in-process cache)?
+   - **Mitigation:** force a fresh read at check time.
+
+**The "small window" rationalization is forbidden.**
+
+> If the race is the same SHAPE of bug the code exists to prevent, the size of the window
+> is a UX consideration, not a correctness one. A seconds-class instance of an hours-class
+> bug is still the bug.
+
+Do not dismiss a window because "the race is unlikely" or "the window is short." Either
+mitigate, or write down an explicit accept-rationale from the list below.
+
+**Accept is valid when ANY of these hold:**
+
+- **Idempotent.** The action produces the same outcome on retry; concurrent execution is
+  safe by construction.
+- **FF-conflict-preserving.** The action's failure mode on conflict is correctness-preserving
+  (e.g., a git push that requires fast-forward — the push rejects, the agent retries against
+  the new state, no silent corruption).
+- **Irreducible.** No locking / CAS primitive is available in the underlying system; the
+  window cannot be closed without external infrastructure.
+- **Automatic recovery.** Post-conflict recovery is automatic and observable (the next
+  invocation correctly re-reads and re-acts on the updated state).
+
+Document the chosen accept-rationale in a code comment at the race site so the next reader
+doesn't have to re-derive it. Naming the rationale is part of the accept; an unannotated
+accept reads as overlooked.
+
+**Mitigate is required when ANY of these hold:**
+
+- **Creates redo.** Race outcome forces the user to redo work (wasted iteration on stale
+  base, replayed action against changed state).
+- **Silent worse-state.** Race outcome silently produces a worse state (agent builds on
+  stale base again, action lands but does the wrong thing, no error surface).
+- **Observable to consumer.** Race outcome is visible to a downstream consumer (reviewer,
+  build system, user-facing log) in a way that creates confusion or work for them.
+
+**If mitigation is non-trivial:** file a follow-up structural task (per the
+meta-retrospective principle that residual races deserve named owners) and document the
+residual window in a code comment naming the follow-up task. A residual race with a
+tracking task is qualitatively different from a residual race that's been forgotten about.
+
+**What goes in the PR body.** Either:
+
+- A "## Concurrency analysis" subsection enumerating the three windows with a one-line
+  fix-or-accept decision per window and the chosen accept-rationale (or fix description).
+- An explicit "(N/A — no check-then-act pattern introduced)" line if this step doesn't
+  apply.
+
+Cross-reference: `feedback_toctou_enumeration_required.md` is the bridge memory carrying
+this discipline pre-skill; once this skill section ships, that memory's job becomes
+historical record + pointer here. The originating incident (mt#1483 PR #928 round-N-1
+detection of `listCommitsAhead` Class-1 race + Class-2 dismissal) is the regression
+example for this section.
+
 ### 8. Create PR (IN-PROGRESS → IN-REVIEW)
 
 **This step owns the IN-PROGRESS → IN-REVIEW transition.**