fix: audit reliability follow-up — ppid walk, auditStatus flag, subclaude hook isolation#7
Merged
George-iam merged 10 commits intomainfrom Apr 5, 2026
Merged
Conversation
…aude hook isolation
Fixes 4 critical bugs surfaced by PR#6 E2E verification:
Bug A (/proc ppid walk for hooks):
Claude Code wraps hook commands in `sh -c 'axme-code hook ...'`, so a
raw process.ppid from a hook subprocess is the (usually dead) sh wrapper
pid instead of the real Claude Code pid. Live sessions were being
classified as orphans and prematurely audited/closed. Add
getClaudeCodePid() which reads /proc/<ppid>/stat and returns the
grandparent, and use it in writeClaudeSessionMapping (ownerPpid) and
createSession (pid). MCP server still uses process.ppid directly — it is
spawned by Claude Code without the sh wrapper.
Bug C (parallel audit race, simplified — no file locks):
Two MCP servers auto-auditing the same session in parallel could both
write memories and decisions, doubling LLM cost. Introduce a simple
auditStatus flag on SessionMeta (pending | done | failed) plus
auditStartedAt, auditFinishedAt. runSessionCleanup now refuses to re-enter
a session whose auditStatus is "pending" and fresher than 15 minutes,
and also honors the MAX_AUDIT_ATTEMPTS retry cap inline (not only in
orphan cleanup). Stale "pending" state (>15 min) is auto-released on the
next attempt — crashes do not leave sessions permanently stuck. The
accepted race window is safe because saveScopedMemories/Decisions dedup
by slug.
Simpler than the pending-audits/ directory API from PR#6: all audit state
now lives on SessionMeta, so one listSessions() call gives pending/done/
failed in one place. The old pending-audits/ directory and its write/
update/clear/read/list API are removed entirely; a legacy cleanup runs at
MCP server startup to delete any leftover files. axme_context's "pending
audits" warning now derives from auditStatus instead.
Bug F (subclaude audit loading project hooks):
The session-auditor spawns a sub-Claude (via claude-agent-sdk) with
cwd=sessionOrigin. Every tool call that sub-Claude makes would fire the
workspace's PreToolUse/PostToolUse/SessionEnd hooks and create "ghost"
AXME sessions (4 ms–70 s lifetime) via ensureAxmeSessionForClaude. Fix:
the auditor passes env: { ...process.env, AXME_SKIP_HOOKS: "1" } to the
SDK query, and every hook handler returns immediately when it sees that
env var. Belt and braces together with settingSources: [].
Bug E (English-only output):
Tighten the AUDIT_PROMPT language section to forbid transliteration of
non-English words in slug and keywords (translate, never romanize). This
is prompt-only: no hardcoded Cyrillic/Chinese/whatever filter in the
parser, per explicit user direction — users configure per language as
they see fit; the code stays language-agnostic.
Verification done in-session (tsx scripts under /tmp, not committed):
- /proc walk verified: a node process invoked via `sh -c` correctly
returns its grandparent pid (skipping the sh wrapper).
- runSessionCleanup skip logic: concurrent-audit, retry-cap,
already-audited, and stale-pending paths all covered.
- All three hooks (pre-tool-use, post-tool-use, session-end) verified
to exit cleanly with AXME_SKIP_HOOKS=1 and create no AXME sessions.
- parseAuditOutput regression checked with the usual ADR-style
decision blocks — still parses correctly.
Cannot be verified without a real second VS Code window (tracked for the
next session): full cleanupAndExit flow on transport close across two
independent MCP server processes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous force-push check did a naive substring match for "-f" inside the full command line, which triggered on any branch name containing "-f" as a fragment (e.g. `git push origin feat/audit-reliability-fixes-20260405` matched "-f" inside "-fixes"). This blocked legitimate pushes and forced operators to rename branches or work around their own safety net. Switch to token-level matching: split the command on whitespace and compare each token against the known force-push flags (`-f`, `--force`, `--force-with-lease`, `--force=...`, `--force-with-lease=...`). Also catch the equivalent `+refspec` form (`git push origin +main`). Verified with 14 unit cases (6 previously-broken false-positive branch names now pass; all 6 genuine force-push forms still denied; protected- branch check unchanged). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Observed empirically in the PR#7 working session: after a Claude Code SIGKILL + respawn (which happens on /compact, VS Code reload, and crash recovery), the old MCP server's cleanupAndExit sequence ran markAudited on the old AXME session but was terminated before reaching clearClaudeSessionMapping. This left a stale mapping file pointing to an already-audited session. When the respawned Claude Code's hooks then ran ensureAxmeSessionForClaude, the function found the stale mapping, returned the audited session id, and every subsequent trackFileChanged/attachClaudeSession write went to a session that would never be audited again — so the entire post-restart session was silently untracked. Fix: after reading an existing mapping, inspect the referenced session. If it is already audited, or if its recorded pid is dead, treat the mapping as stale and fall through to create a fresh AXME session (which overwrites the mapping). Log the transition to stderr so operators can see when a stale mapping has been replaced. Verified with 4 unit scenarios: 1. Live mapping → reused (baseline) 2. Mapping → audited session → fresh session created 3. Mapping → dead-pid session → fresh session created 4. No mapping → fresh session created (baseline) Also verified in the PR#7 working session itself: the stale mapping for Claude 6dfffaab → audited AXME e27eed9f was replaced by a fresh AXME session c2eadf04 with the correct live ownerPpid, within milliseconds of this fix being deployed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…fset
Previously, when the same Claude session was audited more than once (user
resumes a closed conversation, SIGKILL + respawn recovery, /compact
reopen), the second audit re-parsed the full transcript jsonl from
offset 0. The in-prompt mandatory Grep-dedup prevented double storage
writes, but the LLM still burned tokens on every already-audited turn
before rejecting them. Full-day resumed sessions would roughly double
their audit cost.
Fix: persist the byte offset reached at the end of each successful
audit, per Claude session id, in .axme-code/audited-offsets/<id>.txt.
On the next audit of the same Claude session, parseTranscriptFromOffset
reads the transcript starting from that byte position, so the LLM only
sees the turns that were actually new since last audit.
Implementation details:
- transcript-parser.ts: new parseTranscriptFromOffset returning
{ turns, endOffset, bytesRead, fileSize }. Operates on Buffer (bytes),
not String (UTF-16 code units), so offsets are true byte positions
safe for multibyte characters (Russian, Chinese, emoji, etc). The
existing parseTranscript(path) is kept as a zero-offset wrapper for
dry-run and other one-shot callers.
- transcript-parser.ts: parseAndRenderTranscripts now accepts an
optional startOffsets map keyed by Claude session id and returns
endOffsets + bytesRead maps. Refs without an entry read from 0.
- storage/sessions.ts: new readAuditedOffset / writeAuditedOffset /
clearAuditedOffset helpers. Offset files contain just the number as
ASCII; atomic writes via the existing engine. Missing/corrupt file
is treated as offset 0 ("parse from beginning").
- session-cleanup.ts: before parsing, look up the stored offset for
each attached Claude session. After a successful audit, persist the
new end offset. On failure the old offset stays intact, so a retry
re-processes the same turns (correct behavior — avoids losing data
on transient errors).
- Truncation guard: if the stored offset exceeds current file size
(file was rotated / truncated), reset to 0 and parse from scratch.
- Observability: log to stderr when resume-audit kicks in, showing
the skipped prefix and the new bytes consumed.
Verified with 22 unit assertions in /tmp/test-resume-offset.mts:
- Full parse from offset 0 (first audit baseline)
- Resume parse returns only new turns after appending to jsonl
- Truncation guard resets to 0 when offset > fileSize
- Offset storage roundtrip (read/write/clear, invalid-value rejection)
- Multibyte Russian text survives byte-offset slicing intact
- parseAndRenderTranscripts threads per-ref offsets end-to-end,
idempotent on steady state (same offset in = 0 bytes read out)
Existing PR#7 tests (bug-c skip logic, parse-regression, mcp-e2e
cleanup) re-run green — no regression.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extends the audit-logs/<timestamp>_<id>.json record with a `resume` array:
one entry per attached Claude session, showing startOffset (loaded from
audited-offsets/ at the beginning of this audit), endOffset (written back
after success), bytesRead this call, and a `resumed: boolean` flag that
is true whenever the optimization actually skipped a previously-audited
prefix.
Previously this information was only written to stderr, which is fine
for live diagnosis but does not survive the audit run. The durable
audit log now lets operators answer "did the resume optimization fire
on this session, and where did the auditor start reading?" long after
the fact, which is exactly the observability the user asked for.
Schema:
resume: [
{
claudeSessionId: "<claude-uuid>",
startOffset: number, // bytes already audited before this call
endOffset: number, // bytes audited after this call (new floor)
bytesRead: number, // endOffset - startOffset (or full file on truncation reset)
resumed: boolean, // true iff startOffset > 0
},
...
]
One entry per attached Claude session ref. Always present (even when
startOffset is 0 — that's the first-audit baseline).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… retry Bug: a session whose audit was killed mid-run (SIGKILL on window close, OOM, reboot, crash) sits with auditStatus="pending" AND auditAttempts=1. Once the pending state ages past AUDIT_STALE_TIMEOUT_MS (15 min), the concurrent-audit check in runSessionCleanup falls through — but the next check (retry cap) immediately skips with skipped="retry-cap" because attempts is already at MAX_AUDIT_ATTEMPTS. The session becomes permanently un-auditable by any automatic path. We observed c2eadf04 stuck in exactly this state today. Fix: when stale-pending is detected (auditStatus=pending + age > stale timeout), reset auditAttempts to 0 in the in-memory session object before the retry-cap check. A stale-pending state means the previous attempt was killed before it could complete, which is not a deterministic failure — the retry cap should not apply. The cap still protects against real repeated failures, which show up as auditStatus="failed" rather than "pending"+stale, and are not touched by this reset. Verified with /tmp/test-stale-pending-reset.mts (10 assertions, all pass): Scenario 1: stale pending (20 min) + attempts=1 → resets, runs to done Scenario 2: fresh pending (5 min) + attempts=1 → concurrent-audit skip Scenario 3: failed + attempts=1 → retry-cap skip (no reset, correct) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…on in addDecision
Two independent bugs in src/storage/decisions.ts addDecision were allowing
the auditor to accumulate duplicate decision files. Today's workspace has
109 files in decisions/ where the expected count is ~30-60 — nearly half
are duplicates, accumulated over PR#6 E2E audit runs.
Bug 1: deduplicateDecisions was never called from the scoped write path.
The function exists at line 271 of the same file and correctly normalizes
titles via stopword stripping + lowercasing, but saveScopedDecisions calls
addDecision which went straight to disk without dedup. The auditor emitting
"Use TypeScript", "use typescript", and "Using TypeScript for the project"
on successive runs would create three separate files.
Bug 2: D-NNN id generation was a read-modify-write race. Two parallel
addDecision calls (two audit workers, two windows closing simultaneously)
both read the same max-id+1 from listDecisions(). Both write D-051-<slug>.md
with different slug suffixes. Because filenames differ, atomicWrite (which
is temp-file + rename) does not catch the collision — both files land on
disk, both claim to be D-051. We observed D-001 collisions in today's
workspace.
Fix:
1. At the top of addDecision, compute normalizeTitle(input.title) and
search existing decisions for a match. If found, return the existing
decision unchanged — no double-write. Callers see the existing record
as if they had created it.
2. Replace atomicWrite with writeFileSync(path, content, { flag: "wx" }).
The "wx" flag maps to O_CREAT|O_EXCL at the POSIX level, which makes
the create atomic across processes: the first writer wins, the second
gets EEXIST. On EEXIST we bump nextNum and retry. Bounded to 50
attempts, which is far more than any realistic race window.
Verified with two test scripts (16 assertions total, all pass):
/tmp/test-decisions-race.mts — 20 parallel addDecision calls produce
exactly 20 unique contiguous D-NNN ids; a second parallel batch of 20
extends to D-040 with no collisions.
/tmp/test-decisions-dedup.mts — "Use TypeScript" + "use typescript" +
"Using TypeScript for the project" all collapse to 1 file (normalize-
Title handles the stopword + case variations). Genuinely different
decisions still get separate files.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… field
Problem: an agent starting a fresh Claude Code session in a multi-repo
workspace cannot reliably tell WHERE the current session's .axme-code/
storage lives. The workspace root and each child repo each have their
own separate .axme-code/, and an agent doing "ls .axme-code/sessions/"
relative to its cwd gets a different answer depending on which
subdirectory its cwd happens to be in. In today's session a new agent
ran that exact command from inside the axme-code repo subdir and read
the (empty) repo-level storage instead of the (live) workspace-level
storage, then reported the session as broken.
This is not an agent bug — it is a missing instruction. We have the
information, we just were not surfacing it. Two fixes make the
authoritative path visible in two independent places.
Fix D (storage root header):
1. getFullContext now prepends a "# AXME Storage Root" header block to
every axme_context output. The block lists session origin, session
type (single-repo / workspace), storage root, sessions dir, audit
logs dir, audit worker logs dir — all as absolute paths. Includes a
CRITICAL instruction telling the agent to always use ABSOLUTE paths
rooted at the Storage root for direct .axme-code/ inspection.
2. buildInstructions (MCP server instructions returned at initialize)
gets a STORAGE ROOT line with the same absolute path. Shown to the
agent at the start of every session before axme_context is even
called.
3. Both CLAUDE.md templates in cli.ts get a new "Storage paths
(critical)" section with the same guidance, and a note that the
MCP server does not hot-reload and a window restart is needed after
axme-code code changes.
Fix D+ (per-session origin field):
1. Add `origin: string` to SessionMeta. Populated at createSession time
with the absolute path of the directory that contains .axme-code/.
Never changes after creation.
2. Storage Root header, MCP instructions, and CLAUDE.md all tell the
agent: if you pick up a session meta.json file directly (not via
axme_context), read the `origin` field to verify which storage root
it belongs to. This is a per-session cross-check that lets agents
navigate correctly even when starting from a specific session file
instead of the context tool.
Verified with /tmp/test-storage-root-header.mts (13 assertions, all pass):
- getFullContext output starts with "# AXME Storage Root" and contains
the absolute session origin + storage root paths
- Single-repo and workspace session types both render correctly
- Uninitialized project still emits the header
- createSession populates meta.origin with the absolute workspace path
- Persisted meta.json has the origin field
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ifecycle
This is the architectural fix that closes the two-day loop of audit bugs.
Root cause of everything we have been chasing: the auditor runs synchronously
inside the MCP server (cleanupAndExit on transport close) and inside the
session-end hook subprocess. Both are children of Claude Code. VS Code kills
Claude Code when the window closes — mildly in side-panel mode (a few seconds
grace), harshly in center-editor-tab mode (immediate SIGKILL on tab close).
Real audits take 30-120+ seconds (LLM round-trip, possibly chunked). The
audit dies mid-run, leaves phase="started"/auditStatus="pending", cannot be
retried without additional fixes, and every workaround so far (/proc ppid
walk, stale mapping detection, resume offsets, retry caps) has been a
symptom fix on top of this fragility.
The fix: the audit never runs in any process that VS Code can kill.
* New file src/audit-spawner.ts exposes spawnDetachedAuditWorker(workspace,
sessionId). It spawns `node dist/cli.mjs audit-session --workspace X
--session Y` with { detached: true, stdio: ['ignore', fd, fd] }, calls
child.unref(), and returns. `detached: true` maps to setsid(2) — the
child becomes its own process group leader. Signals sent to the parent's
process group (which is what VS Code does when killing Claude Code and
everything under it) do NOT propagate to the child. The worker survives.
* New CLI subcommand `audit-session --workspace X --session Y` in cli.ts
(landed with the preceding Fix D commit because server.ts/cli.ts were
touched for both fixes in the same editing pass — the commits build
only together). It is a thin wrapper: imports runSessionCleanup and
calls it on the (workspace, sessionId) pair. No new audit logic. The
command is also directly invokable from the shell for manual force
re-audit of a specific session — a useful operator escape hatch.
* cleanupAndExit in server.ts (also landed with Fix D commit for the
reason above) no longer awaits runSessionCleanup. It spawns a detached
worker per owned session and exits. Total time from stdin-end to
process.exit is milliseconds. VS Code cannot kill the audit because the
audit is not inside this process anymore.
* auditOrphansInBackground in server.ts: same change — spawns a detached
worker per orphan instead of awaiting cleanup inline.
* src/hooks/session-end.ts: handleSessionEnd replaces `await runSession-
Cleanup(...)` with `spawnDetachedAuditWorker(...)`. The hook subprocess
returns in milliseconds, well under its 120s timeout. The audit
continues in the detached worker.
* Worker stdio is redirected to .axme-code/audit-worker-logs/<sessionId>.log
so operators can post-mortem exactly what each worker did. The parent
closes its own fd after spawn so a parent exit does not disturb the
writer (the child has its own dup of the fd).
* AXME_SKIP_HOOKS=1 is passed in the worker's env, so any subclaude the
auditor spawns via SDK still gets the early-exit in its hooks and does
not create ghost AXME sessions. This is the exact mechanism the PR#7
SDK env passthrough test verified.
Concurrency: two callers (cleanupAndExit + session-end hook) may both fire
for the same session in quick succession. Both spawn workers. Both workers
call runSessionCleanup. The first to flip auditStatus=pending wins; the
second hits the existing "concurrent-audit" skip check and exits without
touching the LLM. No wasted cost, no race on storage writes.
Module cache caveat: a running MCP server's copy of spawnDetachedAuditWorker
is whatever was loaded when the server started. We expect this function to
never change again — its entire body is 20 lines of trivial spawn/unref/
closeSync. The audit LOGIC lives inside the child process, which reads a
fresh dist/cli.mjs on every spawn, so iteration on the auditor is safe
without a VS Code window reload. Only changes to the spawner itself (or to
the MCP server / hooks) require a window reload to take effect.
Verification — three test scripts (all green) plus a real LLM smoke test:
/tmp/test-detached-worker.mts (11 assertions):
- Direct audit-session CLI call runs runSessionCleanup and transitions
session meta to closedAt + auditedAt + auditStatus=done.
- Helper script spawns a detached worker and hangs. Test SIGKILLs the
helper. After the helper is dead, the detached worker completes the
audit and writes all expected state. This is the proof that SIGKILL
on the spawner does not kill the worker.
/tmp/test-mcp-cleanup-e2e.mts (updated, 13 assertions):
- Full MCP server subprocess. Hook via sh -c creates an AXME session
(verifies Bug A ppid walk still works). Close stdin → stdin-end
handler → cleanupAndExit spawns detached worker → server exits.
Test polls session meta until worker sets auditStatus=done, then
verifies the full lifecycle transition.
/tmp/test-real-audit-smoke.mts (23 assertions — REAL LLM call):
- Fresh tmp workspace, fake AXME session pointing at a real 22 KB
Claude transcript.
- Helper script spawns detached worker and exits immediately. Worker
is now orphaned from any test parent — simulates the exact scenario
where VS Code kills the MCP server 5ms after cleanupAndExit fires.
- Poll until the worker finishes (took 30.3 seconds on sonnet-4-6,
6321 prompt tokens, $0.0962 cost).
- Verify: session meta has all lifecycle flags set correctly. Audit
log has phase=finished, resume[0] with startOffset=0 / endOffset=
22204 / resumed=false, totals populated, model + duration + cost
recorded. audited-offsets/<claude-id>.txt written with offset 22204.
Worker log file exists with audit-session markers.
The smoke test is the empirical answer to "when I close the VS Code window,
does the audit actually run and finish". Yes. Verified on real data.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…etry cap Gap discovered in Fix B after initial implementation: runSessionCleanup resets auditAttempts when it sees a stale-pending session (so the retry cap does not apply to killed-mid-run attempts). But findOrphanSessions, which is the orphan scanner that DECIDES which sessions runSessionCleanup gets called on, does its own retry-cap filter — and strictly skips any session with auditAttempts >= MAX. So a stale-pending session with maxed attempts (exactly c2eadf04's state) would never reach runSessionCleanup through the orphan path, and Fix B could only help if someone manually invoked runSessionCleanup on that session. Fix: mirror the stale-pending override from runSessionCleanup into findOrphanSessions. A session with auditAttempts=MAX is still skipped, EXCEPT when auditStatus="pending" AND auditStartedAt is older than AUDIT_STALE_TIMEOUT_MS — that combination uniquely identifies a killed- mid-run attempt (SIGKILL from VS Code, crash, reboot, OOM), which is recoverable and should be retried. auditStatus="failed" + maxed attempts still gets filtered — that is a deterministic failure and retrying would just fail again. With both fixes in place, a session stuck in "pending + attempts-maxed" is automatically picked up by the next MCP server's orphan scan and spawned as a detached worker. No manual intervention needed for normal lifecycle. Verified with /tmp/test-orphan-stale-pending.mts (5 scenarios, all pass): 1. stale pending + attempts=1 → IS orphan (stale-pending override) 2. failed + attempts=1 → NOT orphan (retry-cap still applies) 3. fresh pending (5 min) + attempts=1 → NOT orphan yet 4. never-attempted + dead pid → IS orphan (baseline) 5. auditedAt set → NOT orphan (baseline) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Follow-up PR fixing 4 critical bugs surfaced during PR#6 E2E verification, plus a pre-existing force-push detection false positive.
Bugs fixed
A —
/procppid walk for hooks. Claude Code wraps hook commands viash -c 'axme-code hook ...', soprocess.ppidinside a hook subprocess is the (usually already-dead)shwrapper PID, not the Claude Code PID. Hooks were recording the wrongownerPpidonactive-sessions/mapping files and the wrongpidonSessionMeta, so live sessions were being misclassified as orphans and prematurely audited/closed. New helpergetClaudeCodePid()reads/proc/<ppid>/statand walks one step up to return the grandparent. Used inwriteClaudeSessionMappingandcreateSession. The MCP server still usesprocess.ppiddirectly — it is spawned by Claude Code without theshwrapper.C — Parallel audit race (simplified, no file locks). Two MCP servers could auto-audit the same session in parallel, doubling LLM cost. Introduce
auditStatus: "pending" | "done" | "failed"onSessionMetaplusauditStartedAt/auditFinishedAt.runSessionCleanupnow refuses to re-enter a session whoseauditStatus=pendingis fresher than 15 minutes, and also enforces theMAX_AUDIT_ATTEMPTSretry cap inline (not only via orphan cleanup). Stalependingstate (>15 min) auto-releases on the next attempt, so crashes never leave sessions permanently stuck. The race window is safe becausesaveScopedMemories/saveScopedDecisionsalready dedup by slug. The entire oldpending-audits/directory API is removed — all state now lives onSessionMeta, and a startup cleanup deletes any leftover legacy files.F — Subclaude audit loading project hooks. The session-auditor spawns a sub-Claude (via claude-agent-sdk) with
cwd=sessionOrigin. Every tool call the sub-Claude made would fire the workspace's PreToolUse/PostToolUse/SessionEnd hooks and create short-lived "ghost" AXME sessions viaensureAxmeSessionForClaude. The auditor now passesenv: { ...process.env, AXME_SKIP_HOOKS: \"1\" }to the SDK query, and every hook handler returns immediately when it sees that env var. Belt-and-braces with existingsettingSources: [].E — English-only output. Tighten the
AUDIT_PROMPTlanguage section to forbid transliteration of non-English words in slug and keywords (translate, never romanize). Prompt-only — no hardcoded Cyrillic/Chinese/whatever filter in the parser, so the code stays language-agnostic.Bonus fix (found while trying to push this PR)
checkGitforce-push false positive. The old check didstripped.includes(\"-f\")on the full command line, which triggered on any branch name containing-fas a substring (e.g.feat/my-fixesmatched the-finside-fixes). Switched to token-level matching that recognizes-f,--force,--force-with-lease,--force=...,--force-with-lease=..., and the+refspecform — but not substrings inside branch names. 14 unit cases cover the regression.Test plan
Self-verification performed in-session via
tsxscripts under/tmp/(not committed):npm run lint+npm run build)/procwalk verified — a node process invoked viash -creturns its grandparent PID (skipping theshwrapper)runSessionCleanupskip-logic covers concurrent-audit, retry-cap, already-audited, and stale-pending fall-through pathsAXME_SKIP_HOOKS=1and create no AXME sessions; baseline "without the env var" still creates sessions normallyparseAuditOutputregression: memories + ADR-style decisions + safety + handoff still parse correctly after removing the language-specific branchescheckGitforce-push detection: 6 false-positive branch names now pass; 6 genuine force-push forms still denied; protected-branch check unchangedCannot self-verify (tracked for next session)
cleanupAndExitflow on MCP server transport close across two independent VS Code windows. Needs a real second window; plan is to verify manually after merge.