docs: commit real shadow-mode audit log + analyzer #17
Conversation
… tests

Splits the classification heuristics out of `tools/claude-code-hook.mjs` into a pure, dependency-free module at `tools/lib/classify.mjs`. The hook becomes a thin stdin → classify → propose → log layer.

Adds `tools/lib/classify.test.mjs` with 107 table-driven vitest cases covering every branch: git push / force / force-with-lease; rm -rf flag permutations (plus false-positive guards for `git commit -m "rm -rf"`); git read/write/reset/destructive buckets; package-manager invocations; gh CLI; MCP read/write detection; null/undefined tool names; fallback.

Writing the tests surfaced two classification gaps in the current heuristics (kept as-is in this refactor, documented as KNOWN GAP):

1. `npx vitest` falls through to `shell:exec:generic` — the regex only catches `npx <test|typecheck|lint|build>`.
2. Notion's `notion-search` / `notion-fetch` are misclassified as write because the read-verb detector expects the op name to *start* with read/list/search/fetch/..., but Notion double-namespaces ops with a `notion-` prefix.

Both gaps are tracked in Kanbi task `oTphpxViAqvBobM3WMUz` (classification table re-tuning, after real shadow-mode data is collected).

`vitest.config.ts` include pattern extended to pick up `tools/**/*.test.mjs`. Full suite: 197 tests pass (was 90). Kanbi: `SRxlAOrknM6VaVxxavIa`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixes 8 issues raised by reviewers on the classify.mjs refactor.

P1 (fail-open regression): classify is now dynamically imported inside main() — a broken classify.mjs falls into the fail-open path instead of crashing the hook at module load time.

Classification fixes:

- rm: regex now handles arbitrary flag permutations (`rm -f -v -r`)
- git clean: detects `-fd` / `-df` / `-fdx` via lookahead over the tail
- read-query: bails out on shell redirection (`echo x > ~/.bashrc`)
- gh api: `-X POST`, `-f`, `-F`, `--field`, `--raw-field` classify as write
- npm/pnpm/yarn ci: added to pkg:mutate bucket
- wget: classified alongside curl as shell:net:curl
- MCP lookahead: `(?![a-z])` prevents `listen` matching `list`; dropped `/i` flag so camelCase boundaries (`getBoard`) still resolve as read

Tests added for each fix; full suite is 226 passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Captures 121 real PreToolUse hook observations from the author's own Claude Code sessions (2026-04-19 17:04 → 2026-04-20 16:36 UTC) with the username masked.

Includes:

- docs/samples/shadow-mode-claude-code.jsonl — raw audit log
- docs/samples/README.md — how it was captured, summary numbers, observations (95% auto-approved, 5% escalated; every escalation is a genuinely dangerous action: push-force, rm -rf, git push)
- scripts/analyze-audit-log.mjs — summary analyzer (totals, escalation breakdown, top action types by volume)
- docs/dogfooding.md updated with a pointer to the sample + analyzer

This is source material for:

- ADR-003 (execution modes) — real examples of shadow-mode output
- English blog post "The Missing Layer" — data-backed argument
- Dev Summit talk — quantitative slide
- IRBANK PoC proposal — "we've already dogfooded this"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
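The analyzer's core aggregation can be sketched roughly as below. The field names (`decision_status`, `type`) are assumptions inferred from the log shape this PR describes, not the actual schema of the committed JSONL.

```javascript
// Sketch of the summary aggregation behind analyze-audit-log.mjs.
// ASSUMPTION: entries carry `decision_status` and `type` fields; the
// real schema in shadow-mode-claude-code.jsonl may differ.
function summarize(entries) {
  const counts = { approved: 0, escalated: 0, denied: 0 };
  const byType = new Map();
  for (const e of entries) {
    if (e.decision_status in counts) counts[e.decision_status] += 1;
    byType.set(e.type, (byType.get(e.type) ?? 0) + 1);
  }
  // Top action types by volume, descending.
  const top = [...byType.entries()].sort((a, b) => b[1] - a[1]);
  return { total: entries.length, counts, top };
}
```

Feeding the 121-entry sample through an aggregation like this is what yields the 95% / 5% split quoted above.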
📝 Walkthrough

Extracted classification logic into a new library, refactored the Claude Code hook to dynamically import it, added comprehensive tests, and published shadow-mode audit-log samples with analysis tooling and documentation.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Claude as Claude Code (client)
    participant Hook as claude-code-hook.mjs
    participant Classify as tools/lib/classify.mjs
    participant Tegata as tegata.propose()/audit log
    Claude->>Hook: PreToolUse (toolName, toolInput)
    Hook->>Classify: dynamic import (on first use)
    Hook->>Classify: classify(toolName, toolInput)
    Classify-->>Hook: {type, riskScore}
    Hook->>Tegata: tegata.propose(...) -> decision
    Tegata-->>Hook: decision_status / decision_tier
    Hook->>Tegata: append entry to ~/.claude/tegata-audit.jsonl
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed
Code Review
This pull request introduces a new audit log analyzer script and refactors the Claude Code hook by moving its classification logic into a dedicated, testable module. It also adds comprehensive documentation and sample audit data to support dogfooding. The review feedback suggests enhancing the analyzer's scalability by using streams for large log files, handling potential division-by-zero errors in percentage calculations, and refining the git clean heuristic to recognize the --force flag.
```js
const lines = readFileSync(path, "utf8")
  .split("\n")
  .filter((l) => l.trim().length > 0);
```
Reading the entire audit log into memory using readFileSync and split('\n') can lead to performance issues or memory exhaustion if the log file grows very large (e.g., over months of usage). For a more scalable approach, consider using a readable stream with the node:readline module to process the file line-by-line.
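A minimal sketch of that streaming approach, assuming the same one-JSON-object-per-line format (the function name is illustrative):

```javascript
// Stream the audit log line-by-line instead of readFileSync + split("\n"),
// so memory use stays bounded even if the JSONL file grows large.
import { createReadStream } from "node:fs";
import { createInterface } from "node:readline";

async function readEntries(path) {
  const rl = createInterface({
    input: createReadStream(path, "utf8"),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });
  const entries = [];
  for await (const line of rl) {
    if (line.trim().length === 0) continue; // skip blank lines
    entries.push(JSON.parse(line));
  }
  return entries;
}
```

Downstream aggregation can then consume `entries` exactly as before, or fold inside the loop to avoid materializing the array at all.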
```js
const firstTs = entries[0]?.ts;
const lastTs = entries[entries.length - 1]?.ts;
```

```js
const pct = (n) => ((n / total) * 100).toFixed(1);
```
The pct function will return NaN if the audit log is empty (total === 0), leading to confusing output like Approved: 0 (NaN%). It's safer to handle the zero-total case explicitly.
```diff
-const pct = (n) => ((n / total) * 100).toFixed(1);
+const pct = (n) => total === 0 ? "0.0" : ((n / total) * 100).toFixed(1);
```
```js
  return { type: "shell:git:reset-hard", riskScore: 85 };
if (
  /^git\s+branch\s+-D\b/.test(c) ||
  /^git\s+clean\b(?=.*(?:^|\s)-[a-z]*f[a-z]*)/.test(c)
```
The git clean heuristic currently only detects the short flag -f. It should also account for the long flag --force, as it is equally destructive and commonly used.
```diff
-  /^git\s+clean\b(?=.*(?:^|\s)-[a-z]*f[a-z]*)/.test(c)
+  /^git\s+clean\b(?=.*(?:^|\s)(?:-[a-z]*f[a-z]*|--force)\b)/.test(c)
```
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 42c0d50290
```js
  /^git\s+clean\b(?=.*(?:^|\s)-[a-z]*f[a-z]*)/.test(c)
)
  return { type: "shell:git:destructive", riskScore: 75 };
```
Exclude dry-run git clean from destructive classification
The new git clean regex marks any command containing -f as shell:git:destructive, so git clean -nfd is treated as destructive even though git clean -h documents -n, --dry-run as a non-deleting preview. In enforce mode this can escalate/block harmless inspection commands and add avoidable friction, so the classifier should ignore destructive labeling when -n/--dry-run is present.
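A sketch of the combined rule (require an explicit force flag, suppress on dry-run); the helper name and exact patterns here are illustrative, not the shipped classify.mjs code:

```javascript
// Only treat `git clean` as destructive when an explicit force flag is
// present AND no dry-run flag suppresses it. Illustrative sketch only.
function isDestructiveGitClean(c) {
  if (!/^git\s+clean\b/.test(c)) return false;
  // -f anywhere in a short-flag cluster, or the long --force form.
  const force = /(?:^|\s)(?:-[a-z]*f[a-z]*|--force)(?=\s|$)/.test(c);
  // -n anywhere in a short-flag cluster, or the long --dry-run form.
  const dryRun = /(?:^|\s)(?:-[a-z]*n[a-z]*|--dry-run)(?=\s|$)/.test(c);
  return force && !dryRun;
}
```

Under this rule `git clean -nfd` stays a harmless preview while `git clean -fdx` and `git clean --force` still escalate.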
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/samples/README.md`:
- Around line 66-69: The README overstates the sample by claiming "`git push`
alone is correctly escalated" without noting an earlier preserved pre-fix entry;
update the text around the snippet that discusses "riskScore 71",
"escalateAbove: 70" and the sample lines showing "risk_score: 70" and "approved"
to qualify that the correct escalation applies to later entries (post-fix) and
that the earlier `risk_score: 70`/`approved` line is preserved history; edit the
sentence to say "later in the dataset" or explicitly call out the pre-fix entry
as preserved history so readers see both behaviors.
- Around line 23-24: The README incorrectly references SHADOW_MODE; update the
documentation to describe the actual hook switch: TEGATA_HOOK_ENFORCE (i.e.,
shadow mode is the default unless TEGATA_HOOK_ENFORCE === "1"). Replace the
example and explanation that mention SHADOW_MODE with the real variable name and
behavior, and optionally note that tools/claude-code-hook.mjs uses
TEGATA_HOOK_ENFORCE to toggle enforcement so readers won’t expect a capture-time
SHADOW_MODE env var to have effect.
In `@scripts/analyze-audit-log.mjs`:
- Around line 25-50: The script currently computes percentages and prints a time
range without guarding for empty input (variables: entries, total, pct, firstTs,
lastTs), so when total === 0 you get division-by-zero NaN% and undefined
timestamps; fix this by checking total === 0 before computing/printing: have
pct(n) return "0.0" or "N/A" when total is zero (avoid dividing), and format
firstTs/lastTs to a safe fallback (e.g., "n/a" or "no entries") when entries[0]
or entries[entries.length - 1] is undefined, then use those safe values in the
console.log calls so the report prints sensible values for empty logs.
In `@tools/lib/classify.mjs`:
- Around line 45-50: The regex in tools/lib/classify.mjs that checks the command
string variable c for recursive deletes currently only matches lowercase r;
update the pattern used in the if that returns { type:
"shell:fs:delete-recursive", riskScore: 85 } so it accepts both lowercase and
uppercase R (e.g., allow -r or -R within option clusters and long form) — modify
the regexp so the r check is case-insensitive (or explicitly includes R) while
preserving existing option-matching logic for -f, other flags, and sudo
handling.
- Around line 27-31: The git clean heuristic currently matches any short flag
containing "f" but doesn't handle long "--force" and incorrectly treats dry-run
cases like "-n" or "--dry-run" as destructive; update the regex used in
tools/lib/classify.mjs (the /^git\s+clean\b(?=.*(?:^|\s)-[a-z]*f[a-z]*)/.test(c)
check) to require an explicit force flag (either -f or --force) and also ensure
it rejects dry-run flags (like -n or --dry-run) — i.e., change the lookahead to
assert presence of (?:\s-(?:[^\s]*\bf\b[^\s]*)| --force\b) and assert absence of
(?:\s-(?:[^\s]*\bn\b[^\s]*)| --dry-run\b) so only truly destructive clean
invocations classify as "shell:git:destructive".
- Around line 51-60: The current check in classify.mjs (the if that tests the
command string c against /^(ls|cat|...)\b/ and redirection) only inspects the
leading token and misses compound commands (e.g., "echo ok && git push"); update
the condition so it only treats truly simple read-only invocations as low-risk
by requiring the whole command string to match a safe-pattern (no shell control
operators like &&, ||, ;, |, `$(...)`, backticks, or backgrounding &) and no
redirections; modify the expression around c (the existing /^(ls|cat|...)\b/
test) to validate the full command form (anchor to end) and add a negative test
for shell metacharacters so compound commands fall through to the
generic/escalation path.
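One way to realize that last suggestion, sketched with an illustrative command list and metacharacter set (not the shipped regex):

```javascript
// Classify as a low-risk read only when the WHOLE command is a single
// simple invocation of a read-only tool: no control operators (&&, ||, ;),
// no pipes, no redirection, no $(...) or backtick substitution, no
// embedded newlines. Illustrative sketch, not the shipped classify.mjs.
const SHELL_META = /[;&|><`\n]|\$\(/;
const READ_CMDS = /^(?:ls|cat|head|tail|grep|pwd|which|echo)\b/;

function isSimpleRead(c) {
  return READ_CMDS.test(c) && !SHELL_META.test(c.trim());
}
```

Anything failing the check falls through to the generic bucket, so the tail of a compound command is never silently trusted.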
📒 Files selected for processing (8)

- docs/dogfooding.md
- docs/samples/README.md
- docs/samples/shadow-mode-claude-code.jsonl
- scripts/analyze-audit-log.mjs
- tools/claude-code-hook.mjs
- tools/lib/classify.mjs
- tools/lib/classify.test.mjs
- vitest.config.ts
```text
4. `SHADOW_MODE=1` — all tool calls were allowed regardless of verdict;
   the log records what Tegata _would_ have blocked in enforce mode.
```
Use the real hook switch here.
tools/claude-code-hook.mjs does not read SHADOW_MODE; shadow mode is simply the default when TEGATA_HOOK_ENFORCE !== "1". As written, this implies a capture-time env var that has no effect.
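A sketch of the toggle as described; only the env-var name and the shadow-by-default behavior come from this PR, the surrounding decision handling is illustrative:

```javascript
// Shadow mode is the default; enforcement only activates when
// TEGATA_HOOK_ENFORCE === "1". The decision shape here is illustrative.
function applyDecision(
  decision,
  enforce = process.env.TEGATA_HOOK_ENFORCE === "1",
) {
  if (!enforce) {
    // Shadow mode: always allow, but record what enforce mode would do.
    return { allowed: true, wouldBlock: decision.status === "denied" };
  }
  return { allowed: decision.status !== "denied", wouldBlock: false };
}
```

This is why a capture-time `SHADOW_MODE` variable has no effect: absent the enforce flag, the hook already logs without blocking.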
- classify: `git clean` no longer classifies as destructive without explicit `-f`/`--force`; `--dry-run`/`-n` always suppresses.
- classify: `rm -R` / `-Rf` / `-fR` (POSIX uppercase) now match the recursive-delete bucket.
- classify: read-query prefix no longer wins over shell composition — `echo ok && git push`, `cat x; rm y`, `ls | grep`, `$(...)`, backticks, and embedded newlines all fall through to the generic bucket so the tail of the command is not silently trusted.
- analyze-audit-log: print an empty-log summary instead of dividing by zero when the JSONL file has no entries.
- docs/samples/README: replace the stray `SHADOW_MODE=1` reference with the actual env var (`TEGATA_HOOK_ENFORCE`), and qualify the "git push alone is escalated" claim by acknowledging the pre-PR-15 rows preserved in the sample log.

Tests: +24 cases covering git-clean dry-run / force, rm uppercase permutations, and shell-composition bail-outs. 245 total, all passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 2
🧹 Nitpick comments (1)
docs/samples/README.md (1)
38-38: Optional wording trim for readability. "every single one" is a bit heavy; "all" is cleaner.
Suggested tweak

```diff
-Escalations (every single one is a genuinely dangerous action):
+Escalations (all are genuinely dangerous actions):
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@docs/samples/README.md` at line 38, replace the heavy phrasing in the header line "Escalations (every single one is a genuinely dangerous action):" with a trimmed, clearer variant such as "Escalations (all are genuinely dangerous actions):" to improve readability; locate that exact string in the README and update it accordingly while preserving intent and punctuation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/samples/README.md`:
- Around line 31-36: The two fenced code blocks in README.md are missing
language identifiers (triggering markdownlint MD040); update the first block
that shows the stats ("Total: 121 ... Denied: 0") to use a language tag such as
text (e.g., change the opening ``` to ```text) and update the second block
containing the sed command (the block with "sed
's|/Users/ren/|/Users/<user>/|g'") to use a shell language tag such as bash
(change ``` to ```bash) so both fenced blocks include appropriate language
identifiers.
- Around line 63-66: Replace the absolute claim "**Zero false positives /
negatives (by inspection).**" with a softened phrasing such as "No obvious false
positives/negatives in this sample by manual inspection." Update the surrounding
sentence in the README.md section where this heading appears so it reads as a
qualitative observation rather than a definitive metric, keeping the rest of the
paragraph intact.
---
📒 Files selected for processing (4)

- docs/samples/README.md
- scripts/analyze-audit-log.mjs
- tools/lib/classify.mjs
- tools/lib/classify.test.mjs
✅ Files skipped from review due to trivial changes (1)
- scripts/analyze-audit-log.mjs
```
Total: 121
Approved: 115 (95.0%) [auto-pass]
Escalated: 6 (5.0%) [human/senior review required]
Denied: 0
```
Add language identifiers to fenced code blocks (markdownlint MD040).
Both fenced blocks are missing language tags.
Suggested fix
Suggested fix

````diff
-```
+```text
 Total: 121
 Approved: 115 (95.0%) [auto-pass]
 Escalated: 6 (5.0%) [human/senior review required]
 Denied: 0
 ```
````

and, for the sed block:

````diff
-```
+```bash
 sed 's|/Users/ren/|/Users/<user>/|g'
 ```
````
Also applies to: 104-106
🧰 Tools

🪛 markdownlint-cli2 (0.22.0)

[warning] 31-31: Fenced code blocks should have a language specified (MD040, fenced-code-language)
> - **Zero false positives / negatives (by inspection).** Every
>   escalation was a real write to a remote or a recursive delete; no
>   benign reads were caught. No genuinely destructive action slipped
>   through.
Soften the absolute “zero false positives / negatives” claim.
This reads as a definitive metric, but the document later frames the dataset as qualitative (Lines 130-134). Reword to “no obvious false positives/negatives in this sample by manual inspection” to avoid overclaiming.
Suggested wording

```diff
-- **Zero false positives / negatives (by inspection).** Every
+- **No obvious false positives / negatives in this sample (manual inspection).** Every
   escalation was a real write to a remote or a recursive delete; no
   benign reads were caught. No genuinely destructive action slipped
   through.
```
Summary

- Captures 121 real `PreToolUse` hook observations from the author's own Claude Code sessions (2026-04-19 → 2026-04-20, username masked) as `docs/samples/shadow-mode-claude-code.jsonl`.
- `scripts/analyze-audit-log.mjs` — a small analyzer that prints totals, escalation breakdown, and top action types by volume. Runs on the committed sample by default.
- `docs/samples/README.md` explaining capture method, numbers, observations, known data warts, and privacy handling.
- Updates `docs/dogfooding.md` with pointers to both the sample and the analyzer.

Why

Closes Kanbi `oTphpxViAqvBobM3WMUz`. Material for ADR-003 (execution modes), the English blog post "The Missing Layer", the Dev Summit talk, and the IRBANK PoC proposal — all of which benefit from being able to point at real data instead of a hypothetical.

Key numbers (from `node scripts/analyze-audit-log.mjs`)

- `shell:git:push-force` × 2 (riskScore 95)
- `shell:fs:delete-recursive` × 2 (riskScore 85)
- `shell:git:push` × 2 (riskScore 71) — `git push` alone crosses `escalateAbove: 70` thanks to the strict `>` fix from PR #15 (feat(dogfooding): Claude Code PreToolUse hook that runs Tegata on self).
- `unknown:ToolSearch:exec` appears 5× — Claude Code's deferred-tool lookup isn't in the classifier yet. Exactly the kind of data the dogfooding loop was designed to surface.

Known warts preserved in the data

Left in deliberately — they're evidence of real problems:

- `session_id` variance (test / unknown / post-format / UUID / smoke-test) → motivates ADR-004.
- Missing `proposal_id` / `decision_reason` / `decision_ts` → schema drift before PR #15.
- A `smoke-test` entry with `cwd: /tmp` → cleaning it would bias the sample.

See `docs/samples/README.md` for the full writeup.

Test plan

- `node scripts/analyze-audit-log.mjs` runs and matches the numbers in the README
- `pnpm run lint` passes
- No `/Users/ren/` leakage in the sample (grep clean)

🤖 Generated with Claude Code