Skip to content

feat(mt#1707): markdown-aware filtering for block-out-of-band-merge hook#1028

Open
minsky-ai[bot] wants to merge 3 commits into
mainfrom
task/mt-1707
Open

feat(mt#1707): markdown-aware filtering for block-out-of-band-merge hook#1028
minsky-ai[bot] wants to merge 3 commits into
mainfrom
task/mt-1707

Conversation

@minsky-ai
Copy link
Copy Markdown
Contributor

@minsky-ai minsky-ai Bot commented May 11, 2026

Summary

Adds markdown-aware filtering to .claude/hooks/block-out-of-band-merge.ts (mt#1695). Before substring scanning, the hook now elides trigger-phrase matches that appear inside markdown code spans, fenced code blocks, or blockquote lines. Fixes the mt#1701 PR #1021 false-positive class where a docs-only PR documenting Railway field names tripped the hook against itself, forcing the author to paraphrase field names in the PR body to get past the merge gate.

Motivation & Context

mt#1695 shipped the hook with a substring matcher over 6 trigger phrases (out-of-band, post-merge config, Railway config change, serviceInstanceUpdate, rootDirectory, dockerfilePath). It correctly catches coordination-step PRs but cannot distinguish between:

  • Coordination signal: "After merge, the operator must flip rootDirectory on Railway" — REAL trigger; should block.
  • Meta-documentation: "This PR adds a `rootDirectory` field reference to DEPLOY.md" — describing the field, not requesting a flip; should NOT block.
  • Acceptance evidence: `$ grep "rootDirectory" services/reviewer/DEPLOY.md` — a grep example in test output; should NOT block.

The mt#1695 PR #1020 R1 reviewer suggested this filtering as a NON-BLOCKING-but-desirable improvement; the original implementer deferred it as out-of-scope. PR #1021 (mt#1701) was the predicted false-positive case — the author worked around it by paraphrasing field names in the PR body. This PR closes the gap.

Design / Approach

elideMarkdownNonProse(body) runs three passes before scanning:

  1. Fenced code blocks (```...``` on their own lines) — multi-line, processed first so they cannot be later misread as containing inline spans or blockquotes.
  2. Inline code spans (`...`) — single-line, backtick-delimited.
  3. Blockquote lines (lines starting with >).

Each pass replaces matched content with same-length whitespace rather than deleting it. This preserves character positions so indexOf results in the cleaned text remain valid offsets into the original body — excerpts in denial messages continue to slice from body (showing real surrounding context), not from the cleaned text (which would show whitespace).

Optional ## Note-heading exclusion (success criterion #4) was deliberately omitted per the spec's Out of scope note ("may not be needed if code-span filtering catches enough cases") — keeping scope tight.

Key Changes

  • .claude/hooks/block-out-of-band-merge.ts — added elideMarkdownNonProse() (exported) and integrated it into scanForTriggerPhrases() via the scanText variable.
  • .claude/hooks/block-out-of-band-merge.test.ts — 14 new test cases:
    • scanForTriggerPhrases — markdown filtering (mt#1707): 9 cases (code-span, fence, blockquote, bare-prose regression, mt#1681 mixed shape, prose-wins-over-code-span, PR docs(mt#1701): update DEPLOY.md for post-mt#1681 build-context shape #1021-style docs body, multi-line fence content, position preservation).
    • elideMarkdownNonProse: 5 unit cases (no-op on prose, code-span blanking, fence blanking with newlines preserved, blockquote blanking, overall-length invariant).
  • .minsky/rules/hook-files.mdc — new "Markdown-aware filtering (mt#1707)" subsection inside the existing "Out-of-Band Merge Guard" section.
  • AGENTS.md, CLAUDE.md, .cursor/rules/hook-files.mdc — regenerated via bun run minsky rules compile --target=... from the canonical .minsky/rules/hook-files.mdc source.

Concurrency analysis

N/A — no check-then-act pattern introduced. This PR is a pure-function refinement to an existing hook's body-scanning logic. The hook's existing structure (read PR body → scan → permit/deny) is unchanged; only the scan's input transformation is modified.

Testing

$ bun test --preload ./tests/setup.ts --timeout=15000 ./.claude/hooks/block-out-of-band-merge.test.ts
 37 pass
 0 fail
 76 expect() calls
Ran 37 tests across 1 file. [60.00ms]

$ bun test --preload ./tests/setup.ts --timeout=15000 ./.claude/hooks
 745 pass
 0 fail
 1305 expect() calls
Ran 745 tests across 17 files. [149.00ms]

Typecheck clean (mcp__minsky__validate_typecheck → 0 errors). Lint clean (mcp__minsky__validate_lint → 0 errors).

Live verification

Acceptance tests #3 / #4 / #5 from the spec, run against real PR bodies:

$ bun -e 'import { scanForTriggerPhrases } from "./.claude/hooks/block-out-of-band-merge"; ...'
PR #1013 (mt#1681): BLOCK (matches: ["out-of-band","rootDirectory"]) — regression anchor holds
PR #1021 (mt#1701): ALLOW (matches: [])                              — false-positive class fixed
PR #1004:           ALLOW (matches: [])                              — no triggers, no regression

PR #1013 matches only out-of-band + rootDirectory (both appearing in bare prose) rather than all 4 phrases the original substring matcher caught — dockerfilePath and serviceInstanceUpdate appear only in code spans in the actual PR body and are now correctly elided. The block still fires because at least one bare-prose match remains.

Documentation impact

.minsky/rules/hook-files.mdc gets a new "Markdown-aware filtering (mt#1707)" paragraph documenting the three elision passes, the position-preservation property, and the originating false-positive incident (mt#1701 PR #1021). Compiled outputs regenerate from there via the rules compile pipeline.

Scope notes

In scope: the three files named in the spec's Scope section, plus the three compiled-output files regenerated by rules compile.

Out of scope (per spec):

  • Proximity-heuristic filtering ("require near post-merge, must, flip").
  • Changes to the trigger-phrase list itself.
  • A settings.local.json env override (the existing MINSKY_ACK_OOB_MERGE=1 remains the operator escape valve).
  • ## Note-heading exclusion (success criterion build(deps-dev): bump lint-staged from 15.5.2 to 16.0.0 #4, optional; code-span filtering suffices for the observed false-positive class).

Parallel-work disclosure

The session was started with MINSKY_FORCE_PARALLEL=1 because the open PR #1026 (mt#1699) also modifies .minsky/rules/hook-files.mdc. The two changes target structurally independent sections (mt#1707 updates the existing "## Out-of-Band Merge Guard" section; PR #1026 adds a new "## Generated File Edit Guard" section) — textual conflict risk is minimal. PR #1026 is review:bot-approved and mergeable_state: clean at PR-create time.

References

(Had Claude do the implementation.)

edobry added 2 commits May 10, 2026 23:48
Adds elideMarkdownNonProse() pre-pass to scanForTriggerPhrases() that
blanks trigger phrases appearing inside markdown code spans, fenced code
blocks, and blockquote lines before substring matching. Catches the
mt#1701 PR #1021 false-positive class where docs PRs legitimately
reference Railway field names as code-span identifiers.

Same-length whitespace replacement preserves character positions so
excerpt slicing from the original body still shows real surrounding
context to the operator.

- Hook: elideMarkdownNonProse + integration into scanForTriggerPhrases
- Tests: 14 new cases covering code-span / fence / blockquote / mixed /
  position-preservation; existing 23 tests preserved as regression anchors
- Rule doc: .minsky/rules/hook-files.mdc + regenerated CLAUDE.md /
  AGENTS.md / .cursor/rules/hook-files.mdc via rules compile

Live-fire smoke:
  PR #1013 (mt#1681): BLOCK (matches out-of-band + rootDirectory in
                              bare prose) — regression anchor holds
  PR #1021 (mt#1701): ALLOW — false-positive class fixed
  PR #1004: ALLOW — no triggers, no regression
@minsky-ai minsky-ai Bot added the authorship/co-authored Co-authored by human and AI agent label May 11, 2026
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


The markdown-aware filtering is a solid step but has brittle gaps that can reintroduce false positives. Specifically, inline code spans delimited by multiple backticks (CommonMark) aren’t elided, and fenced code blocks are only matched as triple backticks at column 0, missing indented and tilde-fenced variants; both are BLOCKING. Tests also lack coverage for these shapes and CRLF/newline and nested/continued blockquote cases. Please harden the regexes (or parsing approach) to support variable backtick counts and robust fences, and add tests for these scenarios. Minor nits around blockquote handling scope and small perf/doc clarifications are noted as non-blocking.

Findings

  • [BLOCKING] .claude/hooks/block-out-of-band-merge.ts:1 — Inline code-span elision fails for multi-backtick delimiters (CommonMark) — trigger phrases inside are not elided
    The inline code-span regex only matches single-backtick spans: cleaned = cleaned.replace(/[^\n]+/g, blankSameLength);`. Under CommonMark, inline code can be delimited by any number of consecutive backticks (e.g., rootDirectory) to allow backticks inside the code content. Phrases inside such multi-backtick spans will not be elided and can falsely trigger the gate. Example body (expected: ALLOW, actual with current code: BLOCK):
This PR documents the ``rootDirectory`` field and adds examples.

Please update the inline-span handling to support variable-length backtick fences per CommonMark (e.g., a regex that captures N backticks and matches the same N backticks closing, not crossing newlines) and add a test covering this case.

  • [BLOCKING] .claude/hooks/block-out-of-band-merge.ts:1 — Fenced code block regex requires closing fence at start-of-line only — fails to elide indented fences and optional language info; can leak triggers
    The fence regex ^```[\s\S]*?^```$ with the m flag assumes both opening and closing fences start at column 0 and uses only triple backticks. In Markdown, fences can be indented up to 3 spaces and can use backticks or tildes with optional info strings (e.g., ```bash). Also, the $ anchor here combined with `m` doesn't enforce end-of-line reliably across Windows `\r\n` endings.

Cases that will not be elided and may spuriously trigger:

  • Indented fences:

    " bash\n$ grep 'rootDirectory' file\n "

  • Tilde fences:

    "~~~\nserviceInstanceUpdate\n~~~"

  • Fences with trailing spaces or CRLF newlines.

Consider a more robust pattern that allows up to three leading spaces, supports both backtick and tilde fences, optional info strings, and matches through to a closing fence of the same marker length. Add tests for indented and tilde-fenced blocks.

  • [NON-BLOCKING] .claude/hooks/block-out-of-band-merge.ts:156 — Blockquote elision may miss multi-line wrapped quotes and nested markers; considers only the first line
    The blockquote pass uses cleaned = cleaned.replace(/^>[^\n]*$/gm, blankSameLength);, which blanks only single physical lines beginning with >. In Markdown, a single logical blockquote paragraph can be wrapped across multiple lines, each prefixed by > (possibly with leading spaces or multiple > for nesting). If the subsequent wrapped lines drop the > (common in hand-written text), prose containing trigger phrases may remain and be scanned. Consider:
> Perform serviceInstanceUpdate on Railway
  rootDirectory must be updated too.

Your code will elide only the first line; the continuation line remains and may spuriously trigger. If you intend to filter only lines that actually begin with >, clarify in docs/tests. Otherwise, expand matching to allow up to 3 leading spaces and optional repeated > markers for nested quotes, and consider treating contiguous quoted blocks even if some wrapped lines miss the marker.

  • [NON-BLOCKING] .claude/hooks/block-out-of-band-merge.test.ts:1 — Test coverage gaps: no cases for multi-backtick inline spans, indented/tilde-fenced blocks, CRLF handling, or nested/continued blockquotes
    The new tests exercise single-backtick spans, triple-backtick fences at column 0, and simple > quotes. They do not cover several realistic Markdown variants that the implementation currently does not handle:
  • Inline spans delimited by multiple backticks (code) to allow backticks inside content.
  • Fenced code blocks indented by up to 3 spaces; tilde ~~~ fences; fences with info strings and trailing spaces; Windows \r\n newlines.
  • Nested quotes (>>), quotes with leading spaces, and wrapped quote paragraphs across multiple lines.

Please add tests for these shapes alongside any implementation hardening.

Inline comments

  • .claude/hooks/block-out-of-band-merge.ts:147 — Nit: blankSameLength currently preserves newlines but also preserves other whitespace characters inside elided regions. Since you .replace(/\s+/g, " ") when building excerpts, this is fine, but if you ever rely on exact spacing, tabs inside code blocks will be preserved as tabs. If you want strictly spaces for non-newline codepoints, consider replacing /[^\n]/g (as you do) but also normalizing \r specifically for CRLF inputs to avoid leaving stray \r characters unblanked.
  • .claude/hooks/block-out-of-band-merge.ts:197 — Consider precomputing const lowerPhrases = TRIGGER_PHRASES.map(p => p.toLowerCase()) to avoid repeated toLowerCase() per phrase call, and possibly short-circuiting on scanText.includes(...) before computing excerpt. Minor, but it trims a bit of overhead when bodies are large.
  • .cursor/rules/hook-files.mdc:110 — Docs mention inline code spans and fenced blocks with triple backticks. Given the current implementation, inline spans with multi-backtick delimiters and tilde fences are not filtered. Either scope the docs explicitly to the supported subset or extend the implementation to cover the broader CommonMark shapes.

…Mark variants

R1 BLOCKING #1 — inline code-span elision now supports variable-length
backtick runs per CommonMark (rootDirectory, ). Regex
captures the opening run and requires the same closing run not followed
by another backtick.

R1 BLOCKING #2 — fenced-code elision now supports tilde fences, up to 3
spaces of indentation, optional info strings, trailing whitespace, and
CRLF line endings.

R1 NON-BLOCKING #1 — blockquote elision now supports up to 3 leading
spaces and nested >> markers, CRLF-tolerant. Lazy continuation (wrapped
paragraph without > on subsequent lines) remains a documented limitation.

R1 inline nits — precomputed lowercased phrase per iteration; extracted
serviceInstanceUpdate trigger-phrase literal to a named constant to
satisfy the magic-string-duplication lint rule at the new test density.

11 new tests under elideMarkdownNonProse — CommonMark variants (PR #1028
R1): multi-backtick spans (with and without internal backticks), indented
fences, tilde fences, info-string fences, CRLF fences, nested blockquotes,
indented blockquotes, CRLF blockquotes, length-invariant sweep across all
shapes, and the documented lazy-continuation limitation.

Tests: 48/0 (this file), 816/0 (.claude/hooks). Lint clean (0 errors,
0 warnings). Typecheck clean.

Live-fire smoke unchanged:
  PR #1013 (mt#1681): BLOCK ([out-of-band, rootDirectory])
  PR #1021 (mt#1701): ALLOW
  PR #1004:           ALLOW

Rule doc updated to describe CommonMark coverage and the lazy-continuation
limitation; compiled outputs regenerated.
Copy link
Copy Markdown

@minsky-reviewer minsky-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent adversarial review (Chinese-wall)
Reviewer: minsky-reviewer[bot] via openai:gpt-5
Tier: unknown


R2 verification: The prior BLOCKING findings have been addressed. Inline code-span elision now supports variable-length backtick delimiters, and the fenced code block handling was hardened to allow up to 3-space indentation, both backtick and tilde fences, optional info strings, and CRLF endings. Comprehensive tests were added for these CommonMark variants, along with regression checks to ensure bare-prose triggers still fire and position preservation for excerpts remains intact. I find no new critical defects introduced by the fix; the changes are cohesive and well-documented. Approving this PR.

@minsky-ai minsky-ai Bot dismissed minsky-reviewer[bot]’s stale review May 12, 2026 16:15

Stale CHANGES_REQUESTED — R1 BLOCKING findings addressed in commit cb89c4d (CommonMark variants: multi-backtick spans, tilde/indented fences, CRLF, nested blockquotes). Reviewer-bot's R2 APPROVED on cb89c4d confirmed the fixes. Dismissing per standard merge gate (rejections.length === 0 required for isApproved).

Copy link
Copy Markdown
Contributor Author

@minsky-ai minsky-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: markdown-aware filtering for block-out-of-band-merge hook

CI status: pass — 2 checks passed, 0 failed (build, Prevent Placeholder Tests)
Smoke: pass — bun test --preload ./tests/setup.ts --timeout=15000 ./.claude/hooks/block-out-of-band-merge.test.ts — 48/0

Summary

PR implements three regex passes (elideMarkdownNonProse) that blank fenced code blocks, inline code spans, and blockquote lines before trigger-phrase scanning. The same-length whitespace replacement preserves character offsets so excerpts in denial messages continue to slice from the original body. All 48 tests pass. No blocking findings. The R1 BLOCKING gaps (multi-backtick inline spans, tilde/indented fences) are confirmed fixed by dedicated test cases and manual regex validation.

Spec verification

Task: mt#1707

Criterion Status Evidence
1. Accepts trigger phrases in code spans, code fences, blockquotes WITHOUT firing Met elideMarkdownNonProse passes 1–3 blank those contexts; tests at lines 109, 114, 125 confirm; scanForTriggerPhrases(body) returns [] for all three shapes
2. Still fires on bare prose containing trigger phrases; mt#1681 PR #1013 regression anchor holds Met Test at line 134 (bare prose → BLOCK); mt#1681 body shape test at line 141 confirms out-of-band in bare prose still fires even when rootDirectory/dockerfilePath are in code spans
3. PR #1021 (mt#1701) docs-body would PASS without modification Met Test at line 174 (PR #1021-style docs body): code-span field references → scanForTriggerPhrases returns []
4. Optional: heading-based exclusion N/A Explicitly marked optional in spec; implementation omits it; code-span filtering catches the observed false-positive class
5. Tests updated: code span → ALLOW, code fence → ALLOW, blockquote → ALLOW, bare prose → BLOCK, mt#1681 body → BLOCK Met 12 new test cases in describe("scanForTriggerPhrases — markdown filtering (mt#1707)") and describe("elideMarkdownNonProse — CommonMark variants (PR #1028 R1)")
6. CLAUDE.md / .minsky/rules section updated Met .minsky/rules/hook-files.mdc lines 113-135 document all three filter contexts, same-length replacement, and the lazy-continuation known limitation; CLAUDE.md and AGENTS.md updated identically; .cursor/rules/hook-files.mdc also updated

All success criteria met.

Adoption sweep

Symbol / command Consumers found Classification
elideMarkdownNonProse (new export) block-out-of-band-merge.test.ts (imported line 4; describe("elideMarkdownNonProse") at line 335 + R1 variants at line 381) + called internally by scanForTriggerPhrases at line 204 Adopted

No missing consumers.

Documentation impact

Updated in this PR — .minsky/rules/hook-files.mdc, CLAUDE.md, AGENTS.md, and .cursor/rules/hook-files.mdc all carry the "Markdown-aware filtering (mt#1707)" subsection documenting the three filter contexts, the elision mechanism (same-length whitespace replacement), the position-preservation invariant, and the known lazy-continuation limitation.

(Had Claude look into this — AI-assisted review)

@minsky-ai minsky-ai Bot added the review:bot-commented Reviewer bot has commented on this PR (informational) label May 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

authorship/co-authored Co-authored by human and AI agent review:bot-commented Reviewer bot has commented on this PR (informational)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant