feat: scope-aware audit writes + auditor isolation #5
Merged
George-iam merged 4 commits into main on Apr 5, 2026
The session auditor previously wrote every extracted memory, decision, and
safety rule to the session origin (workspacePath), ignoring the "scope" field
the LLM produced. This PR routes each extraction to the right storage level
(workspace-wide vs specific repo) and also fixes a critical bug where the
auditor LLM was behaving as the main Claude Code agent instead of as an
isolated auditor.
## Scope routing
- New saveScopedSafetyRule() in storage/safety.ts. Routes safety rules by
scope the same way saveScopedMemories() / saveScopedDecisions() do: "all"
goes to session origin, [repo] goes to that repo, multi-repo fans out.
- New loadMergedSafetyRules() in storage/safety.ts. Union-merges workspace-
level base rules with a specific repo's override rules. Stricter always
wins on conflicts (any deny wins, any allow-deny intersection is deny).
- PreToolUse hook now loads merged rules. For file-based tools it walks up
from the file path to the containing .git directory, loads that repo's
rules, and merges with workspace rules. For Bash it uses merged workspace
+ session-origin rules.
- saveScopedDecisions() changed to accept Omit<Decision, "id"> and generate
a fresh sequential id per target path via addDecision(). Previously it
required a caller-supplied id, which broke the audit->save pipeline.
- saveScopedMemories() stopped double-writing to workspace root for
multi-repo scoped memories. Memory is now written only to the listed
repos. Only "all"-scoped memories go to session origin.
- session-cleanup.ts now detects workspace vs single-repo session, passes
workspace structure to the auditor, and uses saveScoped* for all writes.
- Handoff still written to session origin (one handoff per AXME session).
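The routing rules above can be sketched as a single target-resolution step. This is a minimal illustration, not the code in storage/: the `Scope` and `WorkspaceLayout` types and the `resolveScopeTargets` name are hypothetical, and falling back to the session origin when no listed repo is known is an assumption the PR does not state.

```typescript
// Hypothetical scope shape: "all", or an explicit list of repo names.
type Scope = "all" | string[];

// Hypothetical view of the workspace; not the repo's actual types.
interface WorkspaceLayout {
  sessionOrigin: string;      // path the session was started from
  repos: Map<string, string>; // repo name -> repo root path
}

// Returns every storage root an extracted item should be written to.
function resolveScopeTargets(scope: Scope, ws: WorkspaceLayout): string[] {
  // "all" goes to the session origin only.
  if (scope === "all") return [ws.sessionOrigin];

  // A repo list fans out to each listed repo, without also writing
  // to the workspace root.
  const targets: string[] = [];
  for (const name of scope) {
    const repoPath = ws.repos.get(name);
    if (repoPath !== undefined && targets.indexOf(repoPath) === -1) {
      targets.push(repoPath);
    }
  }

  // Assumption: if no listed repo is known, keep the item at the
  // session origin instead of dropping it.
  return targets.length > 0 ? targets : [ws.sessionOrigin];
}
```

A caller would then loop over the returned paths and invoke the per-path writer (for decisions, the step where a fresh id is generated per target).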
## Auditor context for scope determination
- runSessionAudit() now accepts a WorkspaceInfo object with the full list
of repos. The auditor needs this to know which scope values are valid.
- buildWorkspaceContext() formats this list plus a filesChanged-by-repo
breakdown and embeds it in the prompt so the auditor can correlate which
repos were actually touched in this session.
- buildExistingContext() now scans both workspace root .axme-code/ AND
every per-repo .axme-code/ for existing decisions/memories, so the
dedup check catches items at either level.
- Prompt v4 includes an explicit scope determination section with rules
(universal -> "all", repo-specific -> [repo], multi-repo -> list) and
the correct output-format markers for scope.
- parseAuditOutput() now parses scope from DECISIONS and SAFETY sections
(it was already parsed for MEMORIES).
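To illustrate the filesChanged-by-repo breakdown that buildWorkspaceContext() embeds in the prompt, here is a rough sketch. The `RepoInfo` shape, both function names, and the output wording are assumptions for illustration only; the real prompt text is not shown in this PR. Paths are assumed to be absolute and POSIX-style.

```typescript
// Hypothetical repo descriptor; path is absolute with no trailing slash.
interface RepoInfo {
  name: string;
  path: string;
}

const WORKSPACE_ROOT_KEY = "(workspace root)";

// Groups changed files under the repo that contains them.
function groupFilesByRepo(files: string[], repos: RepoInfo[]): Map<string, string[]> {
  // Longest path first, so a nested repo wins over its parent.
  const byDepth = [...repos].sort((a, b) => b.path.length - a.path.length);
  const grouped = new Map<string, string[]>();
  for (const file of files) {
    const owner = byDepth.find((r) => file.startsWith(r.path + "/"));
    const key = owner ? owner.name : WORKSPACE_ROOT_KEY;
    const bucket = grouped.get(key) ?? [];
    bucket.push(file);
    grouped.set(key, bucket);
  }
  return grouped;
}

// Formats the repo list and per-repo counts as plain prompt text.
function formatWorkspaceContext(repos: RepoInfo[], grouped: Map<string, string[]>): string {
  const names = repos.map((r) => r.name).join(", ");
  const lines: string[] = [`Valid scopes: "all" or a list drawn from [${names}]`];
  grouped.forEach((paths, repo) => {
    lines.push(`${repo}: ${paths.length} file(s) changed`);
  });
  return lines.join("\n");
}
```

Listing the valid repo names next to the per-repo change counts gives the auditor both the set of legal scope values and evidence of which repos the session touched.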
## Auditor isolation (critical bug fix)
Initial dry-run returned an empty extraction with 12 tool calls, 332s, and
$2.30 cost. Inspecting the auditor's own Claude Agent SDK session transcript
revealed the auditor's first thinking step was "I'm picking up where I left
off — I need to rerun the scope-dryrun test, verify scope routing, clean up
the test file, then commit and push." The auditor thought IT was the main
Claude Code agent continuing the user's work.
Root causes:
1. SDK query inherited the project's .mcp.json, so the auditor had access
to the axme_context MCP tool. It called axme_context and received the
full project context, cementing the illusion of being the main agent.
2. The default claude_code system prompt preset tells the model "you are
Claude Code helping the user with software engineering tasks". Our
audit instructions, passed as a user message, were overridden by this.
3. cwd was the active workspace with an open branch, reinforcing "I'm
doing normal work here".
Fixes (all in runSessionAudit queryOpts):
- systemPrompt: custom AUDIT_SYSTEM_PROMPT that explicitly states "You are
the AXME Code session auditor. You are NOT Claude Code. You are NOT
continuing any user's work. The transcript is HISTORY — not a task."
- settingSources: [] — do not inherit project settings. The auditor runs
in isolation from .mcp.json, .claude/settings.json, hooks.
- mcpServers: {} — no MCP servers attached. No axme_context, no external
tools, only the three filesystem tools we explicitly allow.
- disallowedTools extended with ToolSearch to prevent the auditor from
trying to dynamically fetch Bash or other blocked tools.
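The four fixes can be pictured as one options object. The sketch below uses a local interface that mirrors only the option names listed above; it does not import the Claude Agent SDK, the `buildIsolatedAuditOptions` helper is hypothetical, and the system prompt string is abbreviated from the quote above.

```typescript
// Local mirror of the option names from the fix list; not the SDK's own type.
interface AuditQueryOptions {
  systemPrompt: string;
  settingSources: string[];
  mcpServers: Record<string, unknown>;
  disallowedTools: string[];
}

// Abbreviated from the quoted prompt; the real constant is longer.
const AUDIT_SYSTEM_PROMPT =
  "You are the AXME Code session auditor. You are NOT Claude Code. " +
  "You are NOT continuing any user's work.";

// Hypothetical helper: builds options that isolate the auditor from the project.
function buildIsolatedAuditOptions(baseDisallowed: string[]): AuditQueryOptions {
  return {
    // Replaces the default preset instead of competing with it from a user message.
    systemPrompt: AUDIT_SYSTEM_PROMPT,
    // Empty list: no .mcp.json, .claude/settings.json, or hooks are inherited.
    settingSources: [],
    // No MCP servers, so tools such as axme_context are unreachable.
    mcpServers: {},
    // ToolSearch is blocked so blocked tools cannot be fetched dynamically.
    disallowedTools: Array.from(new Set(baseDisallowed.concat(["ToolSearch"]))),
  };
}
```

For example, `buildIsolatedAuditOptions(["Bash"])` yields a `disallowedTools` list of `["Bash", "ToolSearch"]`; which tools were already on the blocked list before this PR is not stated, so `"Bash"` here is only an example.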
## Verification (dry-run on session 1df5d43d)
Before the fix: 332s, $2.30, 0 memories, 0 decisions, 12 tool calls reading
source files and attempting ToolSearch for Bash.
After the fix: 72s, $1.60, 3 memories (all scope="all" -> workspace), 3
decisions (all scope=[axme-code] -> per-repo), 0 tool calls (existing
context in the prompt was sufficient for dedup), full handoff.
Extracted items:
- MEMORIES (scope=all, routed to workspace/.axme-code/memory/):
  - give-one-recommendation-not-options
  - use-git-c-instead-of-cd
  - use-exact-file-names-not-vague-terms
- DECISIONS (scope=[axme-code], routed to axme-code/.axme-code/decisions/):
  - axme-code session ID is self-generated, stored in .axme-code/active-session
  - Hook commands embed absolute --workspace path at setup time
  - axme-code session + worklog + filesChanged storage is workspace-level
Universal communication/workflow feedback -> workspace. Repo-specific
architecture decisions -> that repo's storage. Routing is correct.
## Files changed
| File | Change |
|---|---|
| src/storage/safety.ts | +saveScopedSafetyRule, +loadMergedSafetyRules, +unionMergeSafety, export SafetyRuleType |
| src/storage/decisions.ts | saveScopedDecisions now accepts Omit<Decision, "id"> and uses addDecision for fresh ids |
| src/storage/memory.ts | saveScopedMemories no longer double-writes to workspace for multi-repo scopes |
| src/hooks/pre-tool-use.ts | Merged rule loading per-file, containing-repo walk from file path |
| src/agents/session-auditor.ts | Custom system prompt, mcpServers={}, settingSources=[], workspace context builder, prompt v4 with scope rules, parse scope in decisions and safety |
| src/session-cleanup.ts | Uses saveScoped*, passes workspaceInfo to auditor, routes writes by scope |
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds optional model parameter to runSessionAudit (defaults to claude-opus-4-6) and keeps the scope-dryrun.mts test script in the repo for future verification.
## Sonnet vs Opus comparison on session 1df5d43d
Ran the auditor on the same transcript with both models to verify whether Sonnet could be a cheaper default.
**Opus 4.6**: 72s, $1.60
- Identified role correctly: "Looking back at this session, user gave me three key pieces of feedback"
- Emitted 3 memories (scope=all -> workspace), 3 decisions (scope=[axme-code] -> per-repo), full handoff
- Parser extracted all items cleanly
**Sonnet 4.6**: 86s, $0.72
- IGNORED the custom AUDIT_SYSTEM_PROMPT
- Took the [USER]/[ASSISTANT] markers in the rendered transcript as a chat template
- Produced 14k chars of "conversation continuation" instead of markers: answered questions from the transcript, cited parts, and ended with a full "Session 45 prompt" as if it was writing a handoff message live
- Even wrote tool_use-like mentions "[Edit: WORKLOG.md] [Write: HANDOFF.md] [Bash: git commit]" as fake text inside the conversation continuation (tools were disabled so no actual calls happened)
- Zero structured markers -> parser returned empty result
The difference is role adherence under chat-template pressure. Opus holds the custom "you are NOT Claude Code, you are NOT continuing work" instruction even when staring at [USER]/[ASSISTANT]-marked text. Sonnet does not.
Keeping Opus as the default. Saving 55% cost by downgrading to Sonnet produces empty audits, which is strictly worse than paying more for correct ones.
Possible future work: XML-wrap the transcript sections to avoid the chat-template trigger, then retry Sonnet.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…et default
The previous [USER] / [ASSISTANT] chat-style markers in the rendered transcript triggered the model's chat-continuation pattern-matching. Sonnet took them literally as a live chat template and wrote a 14k-char "conversation continuation" instead of emitting extraction markers: it answered questions from the transcript as if it was a participant. Opus held the custom system prompt but not deterministically; the pattern-matching pull is strong and any model could fail on any session.
## Fix
- transcript-parser.ts renderConversation() now emits XML-wrapped data inside <session_transcript> with <user_message>, <assistant_message>, <assistant_thinking>, <assistant_tool_calls> tags. XML is the Anthropic-recommended format for structured data in prompts and does not pattern-match as chat.
- escapeXml() for content so transcript text with &, <, > does not break the outer tags.
- session-auditor.ts system prompt updated: explicitly states the transcript is XML structured data, NOT a conversation the agent is part of, NOT something to respond to. Also mandates that the first characters of the response be "###MEMORIES###".
- session-auditor.ts user message includes a one-line reminder before the transcript block: "structured XML data. It is HISTORY. You are not a participant."
- The worklog fallback (used when no transcript is attached) is also wrapped in <session_worklog_events> so the model always sees structured data, never raw chat lines.
## Default model: Sonnet
With the XML wrap, Sonnet now works correctly. Opus is overkill for this task. Default model changed back to claude-sonnet-4-6.
## Verification on session 1df5d43d
| Prompt | Model | Time | Cost | Role ok | Output ok |
|---|---|---|---|---|---|
| chat-marker | opus-4-6 | 72s | $1.60 | ✓ | markers, 3 mem / 3 dec |
| chat-marker | sonnet-4-6 | 86s | $0.72 | ✗ | chat continuation, 0 / 0 |
| XML wrap | sonnet-4-6 | 77s | $0.93 | ✓ | markers, 1 mem / 2 dec |
Sonnet + XML: correct markers, correct scope routing (all universal memories -> workspace, all repo-specific decisions -> axme-code), uses Glob+Grep tools properly to dedup candidates against existing storage. The thinking blocks in the auditor's own transcript show it correctly identified itself as "analyzing this transcript" and "extracting memories" rather than "continuing the user's work".
Sonnet extracted fewer items than Opus on the same transcript (1 vs 3 memories, 2 vs 3 decisions); it is more conservative about what counts as a meaningful correction. That is calibration, not a bug.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
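A minimal sketch of the XML wrapping this commit describes. The tag names and the `escapeXml` / `renderConversation` function names come from the commit message; the `TranscriptTurn` shape and the indentation of the output are assumptions, and the real transcript-parser.ts may differ.

```typescript
// Tag names taken from the commit message.
const TAGS = {
  user: "user_message",
  assistant: "assistant_message",
  thinking: "assistant_thinking",
  tool_calls: "assistant_tool_calls",
} as const;

// Hypothetical turn shape; the real parser's types are not shown in the PR.
interface TranscriptTurn {
  kind: keyof typeof TAGS;
  text: string;
}

// Escapes &, <, > so transcript text cannot close or open the outer tags.
// "&" must be replaced first, otherwise the other entities get double-escaped.
function escapeXml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Renders turns as structured data rather than [USER]/[ASSISTANT] chat lines.
function renderConversation(turns: TranscriptTurn[]): string {
  const body = turns
    .map((t) => `  <${TAGS[t.kind]}>${escapeXml(t.text)}</${TAGS[t.kind]}>`)
    .join("\n");
  return `<session_transcript>\n${body}\n</session_transcript>`;
}
```

With escaping in place, a user message that itself contains `</session_transcript>` is emitted as `&lt;/session_transcript&gt;` and cannot terminate the wrapper early.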
The session auditor model is now user-configurable through .axme-code/config.yaml with the new auditor_model field. Default is claude-sonnet-4-6 (enough for the audit task once the transcript is XML-wrapped). Users can override to claude-opus-4-6 for more conservative extraction, or claude-haiku-4-5 for cheaper runs.
## Changes
- types.ts: new DEFAULT_AUDITOR_MODEL constant and auditorModel field on ProjectConfig. Keeps the general "model" field for engineer / reviewer / tester agents separate from the auditor model, since the two have different requirements.
- storage/config.ts: parseConfig reads auditor_model from yaml, formatConfig writes it with a comment explaining its purpose.
- session-cleanup.ts: reads config via readConfig(workspacePath) and passes config.auditorModel to runSessionAudit.
- session-auditor.ts: default model constant imported from types, removed hardcoded "claude-sonnet-4-6" string.
## Backward compat
Legacy config.yaml files without auditor_model continue to work; parseConfig falls back to DEFAULT_AUDITOR_MODEL (Sonnet). Smoke test verified four cases:
1. Missing config file -> Sonnet default
2. Legacy yaml without auditor_model field -> Sonnet default
3. Explicit auditor_model in yaml -> honored
4. writeConfig round-trip -> field persisted with comment
New axme-code setup runs will write the field automatically via the DEFAULT_PROJECT_CONFIG spread that init.ts already uses.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
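The backward-compatible fallback can be sketched as below. This assumes the YAML has already been parsed into a plain object (or is `null` when the file is missing); `resolveAuditorModel` is a hypothetical helper, not the actual parseConfig, and treating a blank string as unset is an assumption.

```typescript
// Default named in the commit message.
const DEFAULT_AUDITOR_MODEL = "claude-sonnet-4-6";

// raw: the already-parsed YAML mapping, or null when config.yaml is missing.
function resolveAuditorModel(raw: Record<string, unknown> | null): string {
  const value = raw?.["auditor_model"];
  // Assumption: a non-string or blank value is treated the same as a missing field.
  if (typeof value === "string" && value.trim() !== "") {
    return value.trim();
  }
  return DEFAULT_AUDITOR_MODEL;
}
```

This covers the first three smoke-test cases: a missing file, a legacy file without the field, and an explicit override. The write round-trip is not modelled here.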
## Summary
Two fixes in one PR:
1. Scope-aware writes. The session auditor was extracting a `scope` field on every memory/decision/safety rule but session-cleanup ignored it and wrote everything to workspacePath. This PR routes each extraction to the correct storage level (workspace vs specific repo).
2. Auditor isolation (critical bug fix). Dry-run verification revealed the auditor was behaving as the main Claude Code agent continuing the user's work instead of performing an audit. Root cause: SDK inherited project `.mcp.json` (auditor had `axme_context` tool, loaded full workspace context), `claude_code` system prompt preset told it to help the user, and cwd pointed at an active workspace with an open branch. Fixed by using a custom system prompt, disabling MCP servers, and not inheriting project settings.
## Scope routing
Memories (`saveScopedMemories`): `scope=all` -> session origin. `scope=[repo]` -> each listed repo only. No more double-writing to workspace root for multi-repo scoped memories.
Decisions (`saveScopedDecisions`): now accepts `Omit<Decision, "id">` and generates a fresh sequential id per target path via `addDecision`. `scope=all` -> session origin, `scope=[repo]` -> each listed repo.
Safety rules (new `saveScopedSafetyRule`): same scope semantics. New `loadMergedSafetyRules` union-merges workspace-level base rules with a specific repo's override (stricter wins).
PreToolUse hook now walks up from file paths to the containing `.git` directory and loads merged rules for that repo + workspace. Bash uses workspace + session-origin merged rules.
Handoff stays at the session origin (one handoff per AXME session).
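The "stricter wins" merge for safety rules can be illustrated as follows. The `SafetyRules` shape (flat `deny` and `allow` string lists) and the `unionMergeSketch` name are assumptions made for this sketch; the real rule types in storage/safety.ts are not shown in this PR.

```typescript
// Hypothetical rule shape: flat lists of denied and allowed patterns.
interface SafetyRules {
  deny: string[];
  allow: string[];
}

// Union-merges workspace base rules with a repo's override rules.
function unionMergeSketch(base: SafetyRules, override: SafetyRules): SafetyRules {
  // Any deny wins: the merged deny list is the union of both levels.
  const deny = Array.from(new Set(base.deny.concat(override.deny)));
  const denied = new Set(deny);
  // Anything allowed at one level but denied at the other stays denied.
  const allow = Array.from(new Set(base.allow.concat(override.allow))).filter(
    (rule) => !denied.has(rule),
  );
  return { deny, allow };
}
```

Under this sketch a repo can add restrictions on top of the workspace rules but cannot loosen a workspace-level deny, which matches the behaviour described for `loadMergedSafetyRules`.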
## Auditor isolation
## Auditor context for scope determination
`runSessionAudit` now accepts a `WorkspaceInfo` object with the repo list. `buildWorkspaceContext` embeds that list plus a filesChanged-by-repo breakdown so the auditor can correlate touched files with scope. `buildExistingContext` now scans both workspace root and every per-repo `.axme-code/` for existing items, so dedup catches items at either level.
Prompt v4 includes an explicit scope determination section with rules and accept/reject examples.
## Verification (dry-run on session 1df5d43d)
Before auditor isolation fix: 332s, $2.30, 0 memories, 0 decisions, 12 tool calls (reading source files, attempting `ToolSearch` for Bash). Auditor's first thinking step was `"I'm picking up where I left off - I need to rerun the scope-dryrun test, verify scope routing, clean up the test file, then commit and push."` - it thought it was the main agent.
After auditor isolation fix: 72s, $1.60, 3 memories (all scope=all -> workspace), 3 decisions (all scope=[axme-code] -> per-repo), 0 tool calls, full handoff. First thinking: `"Looking back at this session, the user gave me three key pieces of feedback..."` - correctly identifies session as history.
Routing verification:
Universal communication/workflow feedback routes to workspace. Repo-specific architecture decisions route to that repo. Auditor made no file modifications (disallowedTools verified).
## Test plan (post-merge, manual)
## Not in this PR