fix(BUG-016): refresh claude-mem-heal to detect + patch v13.x cascade pattern (cross-OS)#83
Merged
Conversation
… pattern (cross-OS) User encountered `/mcp Failed to reconnect to plugin:claude-mem:mcp-search: -32000` in this session AFTER BUG-014 (PR #75) restored the install AND BUG-015 (PR #81) shipped the detection layer. Root cause traced to claude-mem-heal silently no-oping against v13.3.0 installs. scripts/claude-mem-heal.{sh,ps1}::Repair-McpJson was authored in PR #57 against v12.7.4's broken pattern (literal `${_R%/}`). v13.0.0+ ships a different broken pattern: `sh -c` with cascading-printf pipe that triggers the same EPIPE race documented in thedotmack/claude-mem#2607. The original detection regex (`${_R%/}`) doesn't match v13.x at all -> heal exits silent, leaving the .mcp.json broken -> MCP server fails to start -> -32000. Changes: 1. scripts/claude-mem-heal.sh::heal_mcp_json: - Detection extended: triggers on EITHER v12.7.4 `${_R%/}` OR v13.x signature (`"sh".*"-c"` OR `while IFS= read`). - Replacement template: canonical race-free form using `done | head -n1` instead of `done` with inner `break`. This consumes the entire producer pipe (no leftover writes -> no EPIPE -- Option A from upstream issue #2607 applied locally). - Log message distinguishes which signature was patched. 2. scripts/claude-mem-heal.ps1::Repair-McpJson: equivalent on Windows. 3. tests/setup-linux.bats: 3 new asserts: - both heal scripts grep `while IFS=` (v13.x signature) - both heal scripts contain `head -n1` (race-free template) - both reference BUG-016 + claude-mem#2607 Empirical validation on user's daily-driver Windows during implementation: 3 affected .mcp.json files (cache 13.3.0 + marketplace junction + marketplace canonical) patched on first heal run (exit 0). Second run silent (idempotent -- no v12/v13 signature left to detect). Spec at specs/BUG-016-claude-mem-heal-v13-refresh/. Diff: +63 prod (heal scripts) + 28 test = 91 LOC. Production diff at spec-gate threshold; spec folder ships per discipline. Lesson candidate (post-merge): "heal scripts must be versioned against the upstream bug class they paper over; when upstream's bug pattern changes, the heal's detection regex MUST be refreshed in the same PR that discovers the new pattern." Companion to BUG-015 (PR #81 detection layer) and BUG-012 (PR #70 junction fix). Pairs with upstream issue thedotmack/claude-mem#2607 where Option A `head -n1` is recommended as the canonical upstream fix for the same pipe race.
6 tasks
mlorentedev
added a commit
that referenced
this pull request
May 21, 2026
…cross-OS) (#84) * fix(BUG-017): extend claude-mem-heal to patch hooks.json EPIPE race (cross-OS) User encountered `UserPromptSubmit operation blocked by hook: printf: write error: Permission denied` on the hive project minutes after BUG-016 (PR #83) merged. BUG-016 closed the same EPIPE race for `.mcp.json` but explicitly deferred `hooks.json` -- the deferral was wrong: same root cause, same symptom class, just a different surface. The upstream `plugin/hooks/hooks.json` ships 6 hooks (Setup, SessionStart x2, UserPromptSubmit, PostToolUse, PreToolUse, Stop) all using the same broken cascade-pipe pattern. When the consumer breaks early, unconsumed producer writes EPIPE on Git Bash Windows. Changes: 1. scripts/claude-mem-heal.sh: - new `heal_hooks_json` function (~12 LOC) - walks both `<dir>/hooks/hooks.json` (cache layout, no `plugin/` subdir) and `<dir>/plugin/hooks/hooks.json` (marketplace layout) - minimal substitution via sed: `break; }; done` -> `}; done | head -n1` - idempotent (skips when broken pattern absent) - one log line per patched file with hook count 2. scripts/claude-mem-heal.ps1: - new `Repair-HooksJson` function (~20 LOC) - equivalent walk + substitution (.Replace() with literal string) - PSScriptAnalyzer + AST clean, ASCII-only 3. tests/setup-linux.bats: 3 new parity asserts - both scripts define the new function - both contain the literal `break; }; done` -> `head -n1` substitution - both walk hooks.json AND plugin/hooks/hooks.json paths - both reference BUG-017 + claude-mem#2607 Empirical (2026-05-21 user's Windows daily-driver): - First run: 14 hook commands patched across 2 files (7 cache + 7 marketplace-via-junction, since BUG-012's `thedotmack` junction aliases to `thedotmack-claude-mem/plugin`) - Second run: silent (idempotent -- no broken pattern left to detect) - Post-patch grep: `break; }; done` count -> 0; `head -n1` count -> 7 per file Spec at specs/BUG-017-claude-mem-heal-hooks-json-race/. 52 LOC of production diff (at threshold) + spec. Lesson (post-merge): "When a bug class spans multiple surfaces of an upstream system, the heal must patch ALL surfaces in the same PR. BUG-016 deferred hooks.json; BUG-017 was needed minutes later because the same user hit the same race on a different surface. Pre-emptively walk all known affected surfaces rather than waiting for the second user report." Pairs with: - BUG-016 (PR #83) -- same pattern fix applied to .mcp.json - BUG-015 (PR #81) -- detection layer surfacing when path resolution fails - Upstream issue thedotmack/claude-mem#2607 -- where this Option A fix is what we recommend for upstream merge. * fix(BUG-017): correct backslash count in bats grep pattern CI test 471 failed because the grep pattern for `hooks\hooks.json` had 4 backslashes (over-escaped in single-quoted bash). Reduced to 2 backslashes -- the correct count for matching a single literal backslash in grep BRE. Bash single-quoted: 'hooks\\hooks\.json' -> 4-char literal `\\` -> grep matches `\` (wrong) 'hooks\hooks\.json' -> 2-char literal `\` -> grep matches `\` (correct) No production code change.
This was referenced May 21, 2026
mlorentedev
added a commit
that referenced
this pull request
May 21, 2026
Move from specs/ to specs/archive/ per SDD lifecycle close (the folder move IS the archive marker; status: archived frontmatter update deferred to per-spec follow-up if needed). This session shipped (today, 2026-05-21): - AI-014-opencode-windows-bootstrap (PR #78) - BUG-014-claude-mem-marketplace-register (PR #75) - BUG-016-claude-mem-heal-v13-refresh (PR #83) - BUG-017-claude-mem-heal-hooks-json-race (PR #84) - BUG-018-userpromptsubmit-continue-directive (PR #85) - REFACTOR-003-diff-check-ps1 (PR #82) Catch-up archive (merged earlier weeks but specs/ folder lingered): - BUG-007-remove-github-plugin-broken (PR #65, 2026-05-19) - BUG-011-mcp-loop-claude-json-guard (PR #69, 2026-05-20) - BUG-012-claude-mem-marketplace-junction (PR #70, 2026-05-20) - SDD-005-github-copilot-instructions-sync (PR #62, 2026-05-19) - SDD-006-vault-integrity-check (PR #63, 2026-05-19) Active specs remaining in specs/ (not yet merged): - REFACTOR-002-paths-in-env-contract (queued, still draft) - WIN-002-windows-smoke-sweep (partial closure via PR #73, full clean-VM sweep still open) 33 file moves total (3 files per spec × 11 specs). Zero content change.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
User encountered `/mcp Failed to reconnect to plugin:claude-mem:mcp-search: -32000` in this session AFTER BUG-014 (PR #75) restored the install AND BUG-015 (PR #81) shipped the detection layer. Root cause: `claude-mem-heal` silently no-ops against v13.3.0 installs.
`Repair-McpJson` was authored in PR #57 against v12.7.4's broken pattern (`${_R%/}` literal). v13.0.0+ ships a different broken pattern — `sh -c` with cascading-printf pipe that triggers the EPIPE race documented in thedotmack/claude-mem#2607. The original detection regex (`${_R%/}`) doesn't match v13.x → heal exits silent → `.mcp.json` stays broken → MCP server fails to start → -32000.
Empirical validation (this branch, user's Windows daily-driver)
```
PS> pwsh -NoProfile -File scripts/claude-mem-heal.ps1 -VerboseOutput
[claude-mem-heal] patched .mcp.json (v13.x cascade -> head -n1 race-free form): C:\Users\Manu.claude\plugins\cache\thedotmack\claude-mem\13.3.0.mcp.json
[claude-mem-heal] zod present in C:\Users\Manu.claude\plugins\cache\thedotmack\claude-mem\13.3.0
[claude-mem-heal] legacy marketplace path already present: ...
[claude-mem-heal] patched .mcp.json (v13.x cascade -> head -n1 race-free form): ...\marketplaces\thedotmack\plugin.mcp.json
[claude-mem-heal] patched .mcp.json (v13.x cascade -> head -n1 race-free form): ...\marketplaces\thedotmack-claude-mem\plugin.mcp.json
$LASTEXITCODE = 0
```
3 `.mcp.json` files patched on first run. Second run silent (idempotent).
Changes
`scripts/claude-mem-heal.sh::heal_mcp_json`:
`scripts/claude-mem-heal.ps1::Repair-McpJson`: equivalent on Windows (PSScriptAnalyzer + AST clean, ASCII-only).
`tests/setup-linux.bats`: 3 new cross-OS parity asserts (v13.x signature detection, `head -n1` template, BUG-016 + #2607 references).
Spec at `specs/BUG-016-claude-mem-heal-v13-refresh/` (proposal + tasks + verification).
Diff
```
scripts/claude-mem-heal.ps1 | 34 ++++++++++++++++++++++++++++------
scripts/claude-mem-heal.sh | 29 +++++++++++++++++++++++------
tests/setup-linux.bats | 28 ++++++++++++++++++++++++++++
```
Test plan
Out of scope
Lesson candidate
"Heal scripts must be versioned against the upstream bug class they paper over; when upstream's bug pattern changes, the heal's detection regex MUST be refreshed in the same PR that discovers the new pattern. Else the heal silently no-ops while users continue hitting the bug."
Companion PRs / issues