feat: scope-aware audit writes + auditor isolation by George-iam · Pull Request #5 · AxmeAI/axme-code

George-iam · 2026-04-05T09:14:50Z

Summary

Two fixes in one PR:

1. Scope-aware writes. The session auditor was extracting a `scope` field on every memory/decision/safety rule but session-cleanup ignored it and wrote everything to workspacePath. This PR routes each extraction to the correct storage level (workspace vs specific repo).

2. Auditor isolation (critical bug fix). Dry-run verification revealed the auditor was behaving as the main Claude Code agent continuing the user's work instead of performing an audit. Root causes: the SDK inherited the project `.mcp.json` (so the auditor had the `axme_context` tool and loaded the full workspace context), the `claude_code` system prompt preset told it to help the user, and cwd pointed at an active workspace with an open branch. Fixed by using a custom system prompt, disabling MCP servers, and not inheriting project settings.

Scope routing

Memories (`saveScopedMemories`): `scope=all` -> session origin. `scope=[repo]` -> each listed repo only. No more double-writing to workspace root for multi-repo scoped memories.
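The scope semantics above can be sketched as a small resolver. This is a minimal illustration, not the code in this PR: the `Scope` type, the `resolveScopeTargets` name, and the assumption that each repo lives directly under the workspace root are all hypothetical.

```typescript
import * as path from "node:path";

// Hypothetical shape: "all" or an explicit list of repo names.
type Scope = "all" | string[];

/**
 * Map a scope value to the directories an item should be written to.
 * "all" -> the session origin only; a repo list -> each listed repo only.
 */
function resolveScopeTargets(
  scope: Scope,
  sessionOrigin: string,
  workspaceRoot: string,
): string[] {
  if (scope === "all") return [sessionOrigin];
  // De-duplicate so a repo listed twice is written once.
  const uniqueRepos = Array.from(new Set(scope));
  // Assumption: each repo is a direct child of the workspace root.
  return uniqueRepos.map((repo) => path.join(workspaceRoot, repo));
}
```

Note that a repo-scoped item never resolves to the workspace root here, which matches the "no more double-writing" behavior described above.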

Decisions (`saveScopedDecisions`): now accepts `Omit<Decision, "id">` and generates a fresh sequential id per target path via `addDecision`. `scope=all` -> session origin, `scope=[repo]` -> each listed repo.
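Why the id has to be generated per target path is easier to see in a sketch. The version below uses an in-memory map as a stand-in for file storage; the `D-001` id format, the `store` map, and the function bodies are assumptions for illustration, and only the names `addDecision`, `saveScopedDecisions`, and the `Omit<Decision, "id">` input type come from the description above.

```typescript
interface Decision {
  id: string;
  title: string;
}
type NewDecision = Omit<Decision, "id">;

// In-memory stand-in for per-path decision storage (assumption: real storage is on disk).
const store = new Map<string, Decision[]>();

/** Append a decision at one target path, assigning the next sequential id for that path. */
function addDecision(targetPath: string, input: NewDecision): Decision {
  const existing = store.get(targetPath) ?? [];
  // Next id is one past the highest numeric id already stored at this path.
  const maxId = existing.reduce((max, d) => Math.max(max, Number(d.id.slice(2))), 0);
  const decision: Decision = { ...input, id: `D-${String(maxId + 1).padStart(3, "0")}` };
  store.set(targetPath, [...existing, decision]);
  return decision;
}

/** Fan one id-less decision out to every target; each path numbers it independently. */
function saveScopedDecisions(targets: string[], input: NewDecision): Decision[] {
  return targets.map((target) => addDecision(target, input));
}
```

Because each path keeps its own sequence, the same decision can legitimately receive different ids in different repos, which is why a caller-supplied id does not fit this model.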

Safety rules (new `saveScopedSafetyRule`): same scope semantics. New `loadMergedSafetyRules` union-merges workspace-level base rules with a specific repo's override (stricter wins).
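A minimal sketch of a "stricter wins" union merge follows. The `SafetyRules` shape (flat allow and deny lists of patterns) and the function name `mergeStricterWins` are assumptions made for the example; the real rule format in `storage/safety.ts` may differ.

```typescript
// Hypothetical rule shape: lists of allowed and denied patterns.
interface SafetyRules {
  allow: string[];
  deny: string[];
}

/** Union-merge two rule sets so that a deny at either level always wins. */
function mergeStricterWins(base: SafetyRules, override: SafetyRules): SafetyRules {
  // Any deny from either level survives.
  const deny = new Set<string>([...base.deny, ...override.deny]);
  // An allow survives only if no level denies the same pattern.
  const allow = new Set<string>(
    [...base.allow, ...override.allow].filter((pattern) => !deny.has(pattern)),
  );
  return { allow: Array.from(allow), deny: Array.from(deny) };
}
```

With this rule a repo can tighten workspace policy but cannot loosen it: a repo-level allow for a pattern the workspace denies is dropped from the merged result.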

PreToolUse hook now walks up from file paths to the containing `.git` directory and loads merged rules for that repo + workspace. Bash uses workspace + session-origin merged rules.
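The walk-up step can be sketched with Node's standard `fs` and `path` modules. The function name `findContainingRepo` is hypothetical, and the sketch only checks that a `.git` entry exists; it does not distinguish a `.git` directory from a `.git` file.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

/** Walk up from a file path until a directory containing `.git` is found. */
function findContainingRepo(filePath: string): string | null {
  let dir = path.dirname(path.resolve(filePath));
  while (true) {
    if (fs.existsSync(path.join(dir, ".git"))) return dir;
    const parent = path.dirname(dir);
    if (parent === dir) return null; // reached the filesystem root without a match
    dir = parent;
  }
}
```

A hook following this approach would load the returned repo's rules and merge them with the workspace rules, falling back to workspace rules alone when the result is `null`.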

Handoff stays at the session origin (one handoff per AXME session).

Auditor isolation

`systemPrompt`: custom `AUDIT_SYSTEM_PROMPT` that explicitly states the auditor is NOT Claude Code, is NOT continuing work, and must only emit the structured output.
`settingSources: []` - do not inherit `.mcp.json`, `.claude/settings.json`, or hooks.
`mcpServers: {}` - no `axme_*` tools attached.
`disallowedTools` extended with `ToolSearch` to prevent dynamic tool lookup.
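Put together, the four options above might look like the object below. This is a sketch, not the actual `queryOpts` from `runSessionAudit`: the prompt text is shortened, the surrounding object shape is assumed, and the `disallowedTools` list shows only the two tool names this PR mentions by name.

```typescript
// Shortened placeholder; the real AUDIT_SYSTEM_PROMPT is longer.
const AUDIT_SYSTEM_PROMPT =
  "You are the AXME Code session auditor. You are NOT Claude Code. " +
  "You are NOT continuing any user's work.";

// Assumed object shape; the option names mirror the four items listed above.
const auditQueryOptions = {
  // Custom prompt instead of the `claude_code` preset.
  systemPrompt: AUDIT_SYSTEM_PROMPT,
  // Empty list: inherit no `.mcp.json`, `.claude/settings.json`, or hooks.
  settingSources: [] as string[],
  // Empty map: no MCP servers, so no `axme_*` tools.
  mcpServers: {} as Record<string, unknown>,
  // Partial list (assumption): Bash is blocked, and ToolSearch is added so
  // the auditor cannot look blocked tools up dynamically.
  disallowedTools: ["Bash", "ToolSearch"],
};
```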

Auditor context for scope determination

`runSessionAudit` now accepts a `WorkspaceInfo` object with the repo list. `buildWorkspaceContext` embeds that list plus a filesChanged-by-repo breakdown so the auditor can correlate touched files with scope. `buildExistingContext` now scans both workspace root and every per-repo `.axme-code/` for existing items, so dedup catches items at either level.
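The filesChanged-by-repo breakdown can be illustrated with a small grouping function. The name `groupFilesByRepo`, the `(workspace)` bucket label, and the assumption that repos are direct children of the workspace root are all hypothetical.

```typescript
import * as path from "node:path";

/** Group changed files by the repo (top-level workspace directory) they belong to. */
function groupFilesByRepo(
  workspaceRoot: string,
  repos: string[],
  filesChanged: string[],
): Map<string, string[]> {
  const byRepo = new Map<string, string[]>();
  for (const file of filesChanged) {
    const rel = path.relative(workspaceRoot, file);
    const top = rel.split(path.sep)[0];
    // Files outside every known repo are grouped under the workspace itself.
    const key = repos.includes(top) ? top : "(workspace)";
    byRepo.set(key, [...(byRepo.get(key) ?? []), rel]);
  }
  return byRepo;
}
```

Embedding a breakdown like this in the prompt gives the auditor concrete evidence for choosing between `scope=all` and a repo list.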

Prompt v4 includes an explicit scope determination section with rules and accept/reject examples.

Verification (dry-run on session 1df5d43d)

Before auditor isolation fix: 332s, $2.30, 0 memories, 0 decisions, 12 tool calls (reading source files, attempting `ToolSearch` for Bash). Auditor's first thinking step was `"I'm picking up where I left off - I need to rerun the scope-dryrun test, verify scope routing, clean up the test file, then commit and push."` - it thought it was the main agent.

After auditor isolation fix: 72s, $1.60, 3 memories (all scope=all -> workspace), 3 decisions (all scope=[axme-code] -> per-repo), 0 tool calls, full handoff. First thinking: `"Looking back at this session, the user gave me three key pieces of feedback..."` - correctly identifies session as history.

Routing verification:

| Item | Scope | Routes to |
|---|---|---|
| give-one-recommendation-not-options | all | workspace/.axme-code/memory/ |
| use-git-c-instead-of-cd | all | workspace/.axme-code/memory/ |
| use-exact-file-names-not-vague-terms | all | workspace/.axme-code/memory/ |
| axme-code-session-id-self-generated | [axme-code] | axme-code/.axme-code/decisions/ |
| hook-commands-embed-workspace-path | [axme-code] | axme-code/.axme-code/decisions/ |
| axme-code-session-storage-workspace-level | [axme-code] | axme-code/.axme-code/decisions/ |

Universal communication/workflow feedback routes to workspace. Repo-specific architecture decisions route to that repo. Auditor made no file modifications (disallowedTools verified).

Test plan (post-merge, manual)

Restart VS Code so new MCP server picks up the build
Make a few tool calls in a fresh session, close VS Code
Check new session has correct scope routing:
- Universal lessons landed in workspace/.axme-code/memory/
- Repo-specific decisions landed in that-repo/.axme-code/decisions/
- Safety rules scoped correctly
- Handoff written to session origin
Verify PreToolUse hook correctly merges workspace + repo safety rules

Not in this PR

Minor: transcript-parser filters `` only when it starts a user message. Reminders embedded mid-message leak through. Auditor still works but output is slightly noisier. Small follow-up.

The session auditor previously wrote every extracted memory, decision, and safety rule to the session origin (workspacePath), ignoring the "scope" field the LLM produced. This PR routes each extraction to the right storage level (workspace-wide vs specific repo) and also fixes a critical bug where the auditor LLM was behaving as the main Claude Code agent instead of as an isolated auditor.

## Scope routing

- New saveScopedSafetyRule() in storage/safety.ts. Routes safety rules by scope the same way saveScopedMemories() / saveScopedDecisions() do: "all" goes to session origin, [repo] goes to that repo, multi-repo fans out.
- New loadMergedSafetyRules() in storage/safety.ts. Union-merges workspace-level base rules with a specific repo's override rules. Stricter always wins on conflicts (any deny wins, any allow-deny intersection is deny).
- PreToolUse hook now loads merged rules. For file-based tools it walks up from the file path to the containing .git directory, loads that repo's rules, and merges with workspace rules. For Bash it uses merged workspace + session-origin rules.
- saveScopedDecisions() changed to accept Omit<Decision, "id"> and generate a fresh sequential id per target path via addDecision(). Previously it required a caller-supplied id, which broke the audit->save pipeline.
- saveScopedMemories() stopped double-writing to workspace root for multi-repo scoped memories. Memory is now written only to the listed repos. Only "all"-scoped memories go to session origin.
- session-cleanup.ts now detects workspace vs single-repo session, passes workspace structure to the auditor, and uses saveScoped* for all writes.
- Handoff still written to session origin (one handoff per AXME session).

## Auditor context for scope determination

- runSessionAudit() now accepts a WorkspaceInfo object with the full list of repos. The auditor needs this to know which scope values are valid.
- buildWorkspaceContext() formats this list plus a filesChanged-by-repo breakdown and embeds it in the prompt so the auditor can correlate which repos were actually touched in this session.
- buildExistingContext() now scans both workspace root .axme-code/ AND every per-repo .axme-code/ for existing decisions/memories, so the dedup check catches items at either level.
- Prompt v4 includes an explicit scope determination section with rules (universal -> "all", repo-specific -> [repo], multi-repo -> list) and the correct output-format markers for scope.
- parseAuditOutput() now parses scope from DECISIONS and SAFETY sections (it was already parsed for MEMORIES).

## Auditor isolation (critical bug fix)

Initial dry-run returned an empty extraction with 12 tool calls, 332s, and $2.30 cost. Inspecting the auditor's own Claude Agent SDK session transcript revealed the auditor's first thinking step was "I'm picking up where I left off - I need to rerun the scope-dryrun test, verify scope routing, clean up the test file, then commit and push." The auditor thought IT was the main Claude Code agent continuing the user's work.

Root causes:

1. SDK query inherited the project's .mcp.json, so the auditor had access to the axme_context MCP tool. It called axme_context and received the full project context, cementing the illusion of being the main agent.
2. The default claude_code system prompt preset tells the model "you are Claude Code helping the user with software engineering tasks". Our audit instructions, passed as a user message, were overridden by this.
3. cwd was the active workspace with an open branch, reinforcing "I'm doing normal work here".

Fixes (all in runSessionAudit queryOpts):

- systemPrompt: custom AUDIT_SYSTEM_PROMPT that explicitly states "You are the AXME Code session auditor. You are NOT Claude Code. You are NOT continuing any user's work. The transcript is HISTORY - not a task."
- settingSources: [] - do not inherit project settings. The auditor runs in isolation from .mcp.json, .claude/settings.json, hooks.
- mcpServers: {} - no MCP servers attached. No axme_context, no external tools, only the three filesystem tools we explicitly allow.
- disallowedTools extended with ToolSearch to prevent the auditor from trying to dynamically fetch Bash or other blocked tools.

## Verification (dry-run on session 1df5d43d)

Before the fix: 332s, $2.30, 0 memories, 0 decisions, 12 tool calls reading source files and attempting ToolSearch for Bash.

After the fix: 72s, $1.60, 3 memories (all scope="all" -> workspace), 3 decisions (all scope=[axme-code] -> per-repo), 0 tool calls (existing context in the prompt was sufficient for dedup), full handoff.

Extracted items:

- MEMORIES (scope=all, routed to workspace/.axme-code/memory/):
  - give-one-recommendation-not-options
  - use-git-c-instead-of-cd
  - use-exact-file-names-not-vague-terms
- DECISIONS (scope=[axme-code], routed to axme-code/.axme-code/decisions/):
  - axme-code session ID is self-generated, stored in .axme-code/active-session
  - Hook commands embed absolute --workspace path at setup time
  - axme-code session + worklog + filesChanged storage is workspace-level

Universal communication/workflow feedback -> workspace. Repo-specific architecture decisions -> that repo's storage. Routing is correct.

## Files changed

| File | Change |
|---|---|
| src/storage/safety.ts | +saveScopedSafetyRule, +loadMergedSafetyRules, +unionMergeSafety, export SafetyRuleType |
| src/storage/decisions.ts | saveScopedDecisions now accepts Omit<Decision, "id"> and uses addDecision for fresh ids |
| src/storage/memory.ts | saveScopedMemories no longer double-writes to workspace for multi-repo scopes |
| src/hooks/pre-tool-use.ts | Merged rule loading per-file, containing-repo walk from file path |
| src/agents/session-auditor.ts | Custom system prompt, mcpServers={}, settingSources=[], workspace context builder, prompt v4 with scope rules, parse scope in decisions and safety |
| src/session-cleanup.ts | Uses saveScoped*, passes workspaceInfo to auditor, routes writes by scope |

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Adds optional model parameter to runSessionAudit (defaults to claude-opus-4-6) and keeps the scope-dryrun.mts test script in the repo for future verification.

## Sonnet vs Opus comparison on session 1df5d43d

Ran the auditor on the same transcript with both models to verify whether Sonnet could be a cheaper default.

**Opus 4.6**: 72s, $1.60

- Identified role correctly: "Looking back at this session, user gave me three key pieces of feedback"
- Emitted 3 memories (scope=all -> workspace), 3 decisions (scope=[axme-code] -> per-repo), full handoff
- Parser extracted all items cleanly

**Sonnet 4.6**: 86s, $0.72

- IGNORED the custom AUDIT_SYSTEM_PROMPT
- Took the [USER]/[ASSISTANT] markers in the rendered transcript as a chat template
- Produced 14k chars of "conversation continuation" instead of markers: answered questions from the transcript, cited parts, and ended with a full "Session 45 prompt" as if it was writing a handoff message live
- Even wrote tool_use-like mentions "[Edit: WORKLOG.md] [Write: HANDOFF.md] [Bash: git commit]" as fake text inside the conversation continuation (tools were disabled so no actual calls happened)
- Zero structured markers -> parser returned empty result

The difference is role adherence under chat-template pressure. Opus holds the custom "you are NOT Claude Code, you are NOT continuing work" instruction even when staring at [USER]/[ASSISTANT]-marked text. Sonnet does not.

Keeping Opus as the default. Saving 55% cost by downgrading to Sonnet produces empty audits, which is strictly worse than paying more for correct ones.

Possible future work: XML-wrap the transcript sections to avoid the chat-template trigger, then retry Sonnet.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…et default

The previous [USER] / [ASSISTANT] chat-style markers in the rendered transcript triggered the model's chat-continuation pattern-matching. Sonnet took them literally as a live chat template and wrote a 14k-char "conversation continuation" instead of emitting extraction markers: it answered questions from the transcript as if it was a participant. Opus held the custom system prompt but not deterministically; the pattern-matching pull is strong and any model could fail on any session.

## Fix

- transcript-parser.ts renderConversation() now emits XML-wrapped data inside <session_transcript> with <user_message>, <assistant_message>, <assistant_thinking>, <assistant_tool_calls> tags. XML is the Anthropic-recommended format for structured data in prompts and does not pattern-match as chat.
- escapeXml() for content so transcript text with &, <, > does not break the outer tags.
- session-auditor.ts system prompt updated: explicitly states the transcript is XML structured data, NOT a conversation the agent is part of, NOT something to respond to. Also mandates that the first characters of the response be "###MEMORIES###".
- session-auditor.ts user message includes a one-line reminder before the transcript block: "structured XML data. It is HISTORY. You are not a participant."
- The worklog fallback (used when no transcript is attached) is also wrapped in <session_worklog_events> so the model always sees structured data, never raw chat lines.

## Default model: Sonnet

With the XML wrap, Sonnet now works correctly. Opus is overkill for this task. Default model changed back to claude-sonnet-4-6.

## Verification on session 1df5d43d

| Prompt | Model | Time | Cost | Role ok | Output ok |
|---|---|---|---|---|---|
| chat-marker | opus-4-6 | 72s | $1.60 | ✓ | markers, 3 mem / 3 dec |
| chat-marker | sonnet-4-6 | 86s | $0.72 | ✗ | chat continuation, 0 / 0 |
| XML wrap | sonnet-4-6 | 77s | $0.93 | ✓ | markers, 1 mem / 2 dec |

Sonnet + XML: correct markers, correct scope routing (all universal memories -> workspace, all repo-specific decisions -> axme-code), uses Glob+Grep tools properly to dedup candidates against existing storage. The thinking blocks in the auditor's own transcript show it correctly identified itself as "analyzing this transcript" and "extracting memories" rather than "continuing the user's work".

Sonnet extracted fewer items than Opus on the same transcript (1 vs 3 memories, 2 vs 3 decisions): it is more conservative about what counts as a meaningful correction. That is calibration, not a bug.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
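The XML wrapping described in the Fix section of this commit can be sketched as follows. The `Turn` shape is a simplifying assumption (the real parser also handles thinking and tool-call blocks), and the function bodies are illustrative rather than copied from transcript-parser.ts; only the tag names and the function names come from the commit message.

```typescript
// Hypothetical, simplified turn shape.
interface Turn {
  role: "user" | "assistant";
  text: string;
}

/** Escape the characters that would otherwise break the wrapping tags. */
function escapeXml(text: string): string {
  // `&` must be replaced first so already-produced entities are not double-escaped.
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

/** Render turns as XML data rather than as chat-style [USER]/[ASSISTANT] lines. */
function renderConversation(turns: Turn[]): string {
  const body = turns
    .map((turn) => {
      const tag = turn.role === "user" ? "user_message" : "assistant_message";
      return `  <${tag}>${escapeXml(turn.text)}</${tag}>`;
    })
    .join("\n");
  return `<session_transcript>\n${body}\n</session_transcript>`;
}
```

Escaping matters here because a transcript of this very session would contain literal `</user_message>` strings, which would otherwise close the wrapper early.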

The session auditor model is now user-configurable through .axme-code/config.yaml with the new auditor_model field. Default is claude-sonnet-4-6 (enough for the audit task once the transcript is XML-wrapped). Users can override to claude-opus-4-6 for more exhaustive extraction (Opus extracted more items than Sonnet on the same transcript), or claude-haiku-4-5 for cheaper runs.

## Changes

- types.ts: new DEFAULT_AUDITOR_MODEL constant and auditorModel field on ProjectConfig. Keeps the general "model" field for engineer / reviewer / tester agents separate from the auditor model, since the two have different requirements.
- storage/config.ts: parseConfig reads auditor_model from yaml, formatConfig writes it with a comment explaining its purpose.
- session-cleanup.ts: reads config via readConfig(workspacePath) and passes config.auditorModel to runSessionAudit.
- session-auditor.ts: default model constant imported from types, removed hardcoded "claude-sonnet-4-6" string.

## Backward compat

Legacy config.yaml files without auditor_model continue to work: parseConfig falls back to DEFAULT_AUDITOR_MODEL (Sonnet). Smoke test verified four cases:

1. Missing config file -> Sonnet default
2. Legacy yaml without auditor_model field -> Sonnet default
3. Explicit auditor_model in yaml -> honored
4. writeConfig round-trip -> field persisted with comment

New axme-code setup runs will write the field automatically via the DEFAULT_PROJECT_CONFIG spread that init.ts already uses.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
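The fallback behavior in the backward-compat cases can be sketched without a YAML library. This is an assumption-heavy illustration: the real parseConfig in storage/config.ts handles the full config, while this version only looks for a top-level `auditor_model:` line, and the reduced `ProjectConfig` shape is hypothetical.

```typescript
const DEFAULT_AUDITOR_MODEL = "claude-sonnet-4-6";

// Reduced config shape for the example; the real ProjectConfig has more fields.
interface ProjectConfig {
  auditorModel: string;
}

/** Read `auditor_model` from flat YAML text, falling back to the default. */
function parseConfig(yamlText: string | null): ProjectConfig {
  // Matches only a top-level `auditor_model: value` line, ignoring quotes
  // and any trailing `# comment`.
  const match = yamlText?.match(/^auditor_model:\s*["']?([^"'\s#]+)["']?/m);
  return { auditorModel: match ? match[1] : DEFAULT_AUDITOR_MODEL };
}
```

Passing `null` models the missing-file case, and a YAML string without the field models a legacy config; both resolve to the default.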

George-iam and others added 4 commits April 5, 2026 09:14

George-iam merged commit 0c0a561 into main Apr 5, 2026

George-iam deleted the feat/scoped-audit-writes-20260405 branch April 7, 2026 08:01

George-iam merged 4 commits into main from feat/scoped-audit-writes-20260405
