Skip to content

feat: scope-aware audit writes + auditor isolation#5

Merged
George-iam merged 4 commits intomainfrom
feat/scoped-audit-writes-20260405
Apr 5, 2026
Merged

feat: scope-aware audit writes + auditor isolation#5
George-iam merged 4 commits intomainfrom
feat/scoped-audit-writes-20260405

Conversation

@George-iam
Copy link
Copy Markdown
Contributor

Summary

Two fixes in one PR:

1. Scope-aware writes. The session auditor was extracting a `scope` field on every memory/decision/safety rule but session-cleanup ignored it and wrote everything to workspacePath. This PR routes each extraction to the correct storage level (workspace vs specific repo).

2. Auditor isolation (critical bug fix). Dry-run verification revealed the auditor was behaving as the main Claude Code agent continuing the user's work instead of performing an audit. Root cause: SDK inherited project `.mcp.json` (auditor had `axme_context` tool, loaded full workspace context), `claude_code` system prompt preset told it to help the user, and cwd pointed at an active workspace with an open branch. Fixed by using a custom system prompt, disabling MCP servers, and not inheriting project settings.

Scope routing

Memories (`saveScopedMemories`): `scope=all` -> session origin. `scope=[repo]` -> each listed repo only. No more double-writing to workspace root for multi-repo scoped memories.

Decisions (`saveScopedDecisions`): now accepts `Omit<Decision, "id">` and generates a fresh sequential id per target path via `addDecision`. `scope=all` -> session origin, `scope=[repo]` -> each listed repo.

Safety rules (new `saveScopedSafetyRule`): same scope semantics. New `loadMergedSafetyRules` union-merges workspace-level base rules with a specific repo's override (stricter wins).

PreToolUse hook now walks up from file paths to the containing `.git` directory and loads merged rules for that repo + workspace. Bash uses workspace + session-origin merged rules.

Handoff stays at the session origin (one handoff per AXME session).

Auditor isolation

  • `systemPrompt`: custom `AUDIT_SYSTEM_PROMPT` that explicitly states the auditor is NOT Claude Code, is NOT continuing work, and must only emit the structured output.
  • `settingSources: []` - do not inherit `.mcp.json`, `.claude/settings.json`, or hooks.
  • `mcpServers: {}` - no `axme_*` tools attached.
  • `disallowedTools` extended with `ToolSearch` to prevent dynamic tool lookup.

Auditor context for scope determination

`runSessionAudit` now accepts a `WorkspaceInfo` object with the repo list. `buildWorkspaceContext` embeds that list plus a filesChanged-by-repo breakdown so the auditor can correlate touched files with scope. `buildExistingContext` now scans both workspace root and every per-repo `.axme-code/` for existing items, so dedup catches items at either level.

Prompt v4 includes an explicit scope determination section with rules and accept/reject examples.

Verification (dry-run on session 1df5d43d)

Before auditor isolation fix: 332s, $2.30, 0 memories, 0 decisions, 12 tool calls (reading source files, attempting `ToolSearch` for Bash). Auditor's first thinking step was `"I'm picking up where I left off - I need to rerun the scope-dryrun test, verify scope routing, clean up the test file, then commit and push."` - it thought it was the main agent.

After auditor isolation fix: 72s, $1.60, 3 memories (all scope=all -> workspace), 3 decisions (all scope=[axme-code] -> per-repo), 0 tool calls, full handoff. First thinking: `"Looking back at this session, the user gave me three key pieces of feedback..."` - correctly identifies session as history.

Routing verification:

Item Scope Routes to
give-one-recommendation-not-options all workspace/.axme-code/memory/
use-git-c-instead-of-cd all workspace/.axme-code/memory/
use-exact-file-names-not-vague-terms all workspace/.axme-code/memory/
axme-code-session-id-self-generated [axme-code] axme-code/.axme-code/decisions/
hook-commands-embed-workspace-path [axme-code] axme-code/.axme-code/decisions/
axme-code-session-storage-workspace-level [axme-code] axme-code/.axme-code/decisions/

Universal communication/workflow feedback routes to workspace. Repo-specific architecture decisions route to that repo. Auditor made no file modifications (disallowedTools verified).

Test plan (post-merge, manual)

  • Restart VS Code so new MCP server picks up the build
  • Make a few tool calls in a fresh session, close VS Code
  • Check new session has correct scope routing:
    • Universal lessons landed in workspace/.axme-code/memory/
    • Repo-specific decisions landed in that-repo/.axme-code/decisions/
    • Safety rules scoped correctly
    • Handoff written to session origin
  • Verify PreToolUse hook correctly merges workspace + repo safety rules

Not in this PR

  • Minor: transcript-parser filters `` only when it starts a user message. Reminders embedded mid-message leak through. Auditor still works but output is slightly noisier. Small follow-up.

George-iam and others added 4 commits April 5, 2026 09:14
The session auditor previously wrote every extracted memory, decision, and
safety rule to the session origin (workspacePath), ignoring the "scope" field
the LLM produced. This PR routes each extraction to the right storage level
(workspace-wide vs specific repo) and also fixes a critical bug where the
auditor LLM was behaving as the main Claude Code agent instead of as an
isolated auditor.

## Scope routing

- New saveScopedSafetyRule() in storage/safety.ts. Routes safety rules by
  scope the same way saveScopedMemories() / saveScopedDecisions() do: "all"
  goes to session origin, [repo] goes to that repo, multi-repo fans out.
- New loadMergedSafetyRules() in storage/safety.ts. Union-merges workspace-
  level base rules with a specific repo's override rules. Stricter always
  wins on conflicts (any deny wins, any allow-deny intersection is deny).
- PreToolUse hook now loads merged rules. For file-based tools it walks up
  from the file path to the containing .git directory, loads that repo's
  rules, and merges with workspace rules. For Bash it uses merged workspace
  + session-origin rules.
- saveScopedDecisions() changed to accept Omit<Decision, "id"> and generate
  a fresh sequential id per target path via addDecision(). Previously it
  required a caller-supplied id, which broke the audit->save pipeline.
- saveScopedMemories() stopped double-writing to workspace root for
  multi-repo scoped memories. Memory is now written only to the listed
  repos. Only "all"-scoped memories go to session origin.
- session-cleanup.ts now detects workspace vs single-repo session, passes
  workspace structure to the auditor, and uses saveScoped* for all writes.
- Handoff still written to session origin (one handoff per AXME session).

## Auditor context for scope determination

- runSessionAudit() now accepts a WorkspaceInfo object with the full list
  of repos. The auditor needs this to know which scope values are valid.
- buildWorkspaceContext() formats this list plus a filesChanged-by-repo
  breakdown and embeds it in the prompt so the auditor can correlate which
  repos were actually touched in this session.
- buildExistingContext() now scans both workspace root .axme-code/ AND
  every per-repo .axme-code/ for existing decisions/memories, so the
  dedup check catches items at either level.
- Prompt v4 includes an explicit scope determination section with rules
  (universal -> "all", repo-specific -> [repo], multi-repo -> list) and
  the correct output-format markers for scope.
- parseAuditOutput() now parses scope from DECISIONS and SAFETY sections
  (it was already parsed for MEMORIES).

## Auditor isolation (critical bug fix)

Initial dry-run returned an empty extraction with 12 tool calls, 332s, and
$2.30 cost. Inspecting the auditor's own Claude Agent SDK session transcript
revealed the auditor's first thinking step was "I'm picking up where I left
off — I need to rerun the scope-dryrun test, verify scope routing, clean up
the test file, then commit and push." The auditor thought IT was the main
Claude Code agent continuing the user's work.

Root causes:
1. SDK query inherited the project's .mcp.json, so the auditor had access
   to the axme_context MCP tool. It called axme_context and received the
   full project context, cementing the illusion of being the main agent.
2. The default claude_code system prompt preset tells the model "you are
   Claude Code helping the user with software engineering tasks". Our
   audit instructions, passed as a user message, were overridden by this.
3. cwd was the active workspace with an open branch, reinforcing "I'm
   doing normal work here".

Fixes (all in runSessionAudit queryOpts):
- systemPrompt: custom AUDIT_SYSTEM_PROMPT that explicitly states "You are
  the AXME Code session auditor. You are NOT Claude Code. You are NOT
  continuing any user's work. The transcript is HISTORY — not a task."
- settingSources: [] — do not inherit project settings. The auditor runs
  in isolation from .mcp.json, .claude/settings.json, hooks.
- mcpServers: {} — no MCP servers attached. No axme_context, no external
  tools, only the three filesystem tools we explicitly allow.
- disallowedTools extended with ToolSearch to prevent the auditor from
  trying to dynamically fetch Bash or other blocked tools.

## Verification (dry-run on session 1df5d43d)

Before the fix: 332s, $2.30, 0 memories, 0 decisions, 12 tool calls reading
source files and attempting ToolSearch for Bash.

After the fix: 72s, $1.60, 3 memories (all scope="all" -> workspace), 3
decisions (all scope=[axme-code] -> per-repo), 0 tool calls (existing
context in the prompt was sufficient for dedup), full handoff.

Extracted items:
- MEMORIES (scope=all, routed to workspace/.axme-code/memory/):
  - give-one-recommendation-not-options
  - use-git-c-instead-of-cd
  - use-exact-file-names-not-vague-terms
- DECISIONS (scope=[axme-code], routed to axme-code/.axme-code/decisions/):
  - axme-code session ID is self-generated, stored in .axme-code/active-session
  - Hook commands embed absolute --workspace path at setup time
  - axme-code session + worklog + filesChanged storage is workspace-level

Universal communication/workflow feedback -> workspace. Repo-specific
architecture decisions -> that repo's storage. Routing is correct.

## Files changed

| File | Change |
|---|---|
| src/storage/safety.ts | +saveScopedSafetyRule, +loadMergedSafetyRules, +unionMergeSafety, export SafetyRuleType |
| src/storage/decisions.ts | saveScopedDecisions now accepts Omit<Decision, "id"> and uses addDecision for fresh ids |
| src/storage/memory.ts | saveScopedMemories no longer double-writes to workspace for multi-repo scopes |
| src/hooks/pre-tool-use.ts | Merged rule loading per-file, containing-repo walk from file path |
| src/agents/session-auditor.ts | Custom system prompt, mcpServers={}, settingSources=[], workspace context builder, prompt v4 with scope rules, parse scope in decisions and safety |
| src/session-cleanup.ts | Uses saveScoped*, passes workspaceInfo to auditor, routes writes by scope |

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds optional model parameter to runSessionAudit (defaults to claude-opus-4-6)
and keeps the scope-dryrun.mts test script in the repo for future verification.

## Sonnet vs Opus comparison on session 1df5d43d

Ran the auditor on the same transcript with both models to verify whether
Sonnet could be a cheaper default.

**Opus 4.6**: 72s, \$1.60
- Identified role correctly: "Looking back at this session, user gave me three key pieces of feedback"
- Emitted 3 memories (scope=all -> workspace), 3 decisions (scope=[axme-code] -> per-repo), full handoff
- Parser extracted all items cleanly

**Sonnet 4.6**: 86s, \$0.72
- IGNORED the custom AUDIT_SYSTEM_PROMPT
- Took the [USER]/[ASSISTANT] markers in the rendered transcript as a chat template
- Produced 14k chars of "conversation continuation" instead of markers:
  answered questions from the transcript, cited parts, and ended with
  a full "Session 45 prompt" as if it was writing a handoff message live
- Even wrote tool_use-like mentions "[Edit: WORKLOG.md] [Write: HANDOFF.md]
  [Bash: git commit]" as fake text inside the conversation continuation
  (tools were disabled so no actual calls happened)
- Zero structured markers -> parser returned empty result

The difference is role adherence under chat-template pressure. Opus holds
the custom "you are NOT Claude Code, you are NOT continuing work" instruction
even when staring at [USER]/[ASSISTANT]-marked text. Sonnet does not.

Keeping Opus as the default. Saving 55% cost by downgrading to Sonnet
produces empty audits, which is strictly worse than paying more for
correct ones. Possible future work: XML-wrap the transcript sections to
avoid the chat-template trigger, then retry Sonnet.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…et default

The previous [USER] / [ASSISTANT] chat-style markers in the rendered
transcript triggered the model's chat-continuation pattern-matching.
Sonnet took them literally as a live chat template and wrote a 14k-char
"conversation continuation" instead of emitting extraction markers —
it answered questions from the transcript as if it was a participant.
Opus held the custom system prompt but not deterministically; the
pattern-matching pull is strong and any model could fail on any session.

## Fix

- transcript-parser.ts renderConversation() now emits XML-wrapped data
  inside <session_transcript> with <user_message>, <assistant_message>,
  <assistant_thinking>, <assistant_tool_calls> tags. XML is the
  Anthropic-recommended format for structured data in prompts and
  does not pattern-match as chat.
- escapeXml() for content so transcript text with &, <, > does not
  break the outer tags.
- session-auditor.ts system prompt updated: explicitly states the
  transcript is XML structured data, NOT a conversation the agent
  is part of, NOT something to respond to. Also mandates that the
  first characters of the response be "###MEMORIES###".
- session-auditor.ts user message includes a one-line reminder before
  the transcript block: "structured XML data. It is HISTORY. You are
  not a participant."
- The worklog fallback (used when no transcript is attached) is also
  wrapped in <session_worklog_events> so the model always sees
  structured data, never raw chat lines.

## Default model: Sonnet

With the XML wrap, Sonnet now works correctly. Opus is overkill for
this task. Default model changed back to claude-sonnet-4-6.

## Verification on session 1df5d43d

| Prompt | Model | Time | Cost | Role ok | Output ok |
|---|---|---|---|---|---|
| chat-marker | opus-4-6 | 72s | \$1.60 | ✓ | markers, 3 mem / 3 dec |
| chat-marker | sonnet-4-6 | 86s | \$0.72 | ✗ | chat continuation, 0 / 0 |
| XML wrap | sonnet-4-6 | 77s | \$0.93 | ✓ | markers, 1 mem / 2 dec |

Sonnet + XML: correct markers, correct scope routing (all universal
memories -> workspace, all repo-specific decisions -> axme-code),
uses Glob+Grep tools properly to dedup candidates against existing
storage. The thinking blocks in the auditor's own transcript show
it correctly identified itself as "analyzing this transcript" and
"extracting memories" rather than "continuing the user's work".

Sonnet extracted fewer items than Opus on the same transcript (1 vs 3
memories, 2 vs 3 decisions) — it is more conservative about what
counts as a meaningful correction. That is calibration, not a bug.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The session auditor model is now user-configurable through
.axme-code/config.yaml with the new auditor_model field. Default is
claude-sonnet-4-6 (enough for the audit task once the transcript is
XML-wrapped). Users can override to claude-opus-4-6 for more
conservative extraction, or claude-haiku-4-5 for cheaper runs.

## Changes

- types.ts: new DEFAULT_AUDITOR_MODEL constant and auditorModel field
  on ProjectConfig. Keeps the general "model" field for engineer /
  reviewer / tester agents separate from the auditor model, since
  the two have different requirements.
- storage/config.ts: parseConfig reads auditor_model from yaml,
  formatConfig writes it with a comment explaining its purpose.
- session-cleanup.ts: reads config via readConfig(workspacePath) and
  passes config.auditorModel to runSessionAudit.
- session-auditor.ts: default model constant imported from types,
  removed hardcoded "claude-sonnet-4-6" string.

## Backward compat

Legacy config.yaml files without auditor_model continue to work —
parseConfig falls back to DEFAULT_AUDITOR_MODEL (Sonnet). Smoke test
verified four cases:
1. Missing config file -> Sonnet default
2. Legacy yaml without auditor_model field -> Sonnet default
3. Explicit auditor_model in yaml -> honored
4. writeConfig round-trip -> field persisted with comment

New axme-code setup runs will write the field automatically via the
DEFAULT_PROJECT_CONFIG spread that init.ts already uses.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@George-iam George-iam merged commit 0c0a561 into main Apr 5, 2026
@George-iam George-iam deleted the feat/scoped-audit-writes-20260405 branch April 7, 2026 08:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant