feat(dogfooding): Claude Code PreToolUse hook that runs Tegata on self #15

Merged
renasami merged 3 commits into main from feat/claude-code-dogfooding on Apr 20, 2026

Conversation

renasami (Owner) commented Apr 19, 2026

Summary

  • Adds tools/claude-code-hook.mjs — a zero-config PreToolUse hook that classifies every Claude Code tool call into an ActionType + riskScore and routes it through tegata.propose(). Decisions stream into ~/.claude/tegata-audit.jsonl.
  • Shadow mode by default: the hook exits 0 regardless of verdict, only records what Tegata would have done. Flip TEGATA_HOOK_ENFORCE=1 to have denied/escalated decisions block tool calls with exit 2.
  • Imports Tegata from the repo's built dist/ — exercises the same public API shipped to npm as tegata@preview. All error paths fall through to exit 0; a broken local build can never derail the host agent.
  • docs/dogfooding.md documents the setup, the classification table (Read=5, Edit=40, git push=70, git push --force=95, rm -rf=85, MCP read=10, MCP write=40, ...), and how to query the audit log.
  • Small README pointer to the new doc.
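As a rough illustration of the classification step described above, the sketch below maps a tool call to an action type and risk score. The scores are the ones listed in the summary, and the two regular expressions are the ones quoted later in the review threads. The function name `classify`, the `fs:*` and `shell:rm-recursive` type strings, and the `null` fallback are assumptions made for illustration and may not match tools/claude-code-hook.mjs.

```javascript
// Hypothetical classifier: tool call -> { type, riskScore }.
// Only the shell:git:* type names appear in the PR; the others are placeholders.
function classify(toolName, toolInput = {}) {
  if (toolName === "Read") return { type: "fs:read", riskScore: 5 };
  if (toolName === "Edit") return { type: "fs:edit", riskScore: 40 };

  if (toolName === "Bash") {
    const c = (toolInput.command ?? "").trim();
    // Force pushes are checked first so they are not swallowed by the plain push rule.
    if (/^git\s+push\b.*(--force|-f\b|--force-with-lease)/.test(c))
      return { type: "shell:git:push-force", riskScore: 95 };
    if (/^git\s+push\b/.test(c)) return { type: "shell:git:push", riskScore: 70 };
    // Recursive-delete pattern as first submitted in this PR.
    if (/\brm\s+-rf?\b/.test(c)) return { type: "shell:rm-recursive", riskScore: 85 };
  }

  // Anything else is outside this sketch.
  return null;
}

console.log(classify("Bash", { command: "git push --force origin main" }));
```

The ordering of the Bash rules matters: the more specific force-push pattern has to run before the generic `git push` pattern.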

Motivation

An authorization SDK whose author doesn't use it on their own agent carries no weight. Running Tegata on every local Claude Code session also seeds real usage data for the v0.2 TegataReporter design, catches API pain before v0.1.0 GA freezes it, and provides concrete numbers for the first English blog post ("The Missing Layer in Multi-Agent Systems").

Verification

  • pnpm run build produces a working dist/
  • pnpm run typecheck green
  • pnpm run lint green (prettier + eslint)
  • Manually invoked hook with 8 canonical tool inputs (Read, Edit, Write, Bash git-status / rm-rf / push-force / pnpm-test, Kanbi MCP read/write) — all exit 0, audit log contains expected classifications
  • Hook is already registered in the author's local .claude/settings.local.json (gitignored) and firing on every tool call of this very PR's authoring session — confirmed by fresh entries in ~/.claude/tegata-audit.jsonl with the current session_id
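To check the audit log after a manual run like the one above, a small script can tally decisions by status. This is only a sketch: the field name `status` and the sample records are assumptions, since the exact JSONL schema is documented in docs/dogfooding.md rather than in this description.

```javascript
// Count audit records per decision status from JSONL text.
// Assumes each record carries a top-level `status` field (hypothetical schema).
function summarizeAudit(jsonlText) {
  const counts = {};
  for (const line of jsonlText.split("\n")) {
    if (!line.trim()) continue; // skip blank lines
    let rec;
    try {
      rec = JSON.parse(line);
    } catch {
      continue; // skip partially written or malformed lines
    }
    const key = rec?.status ?? "unknown";
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}

// Illustrative records, not real log output.
const sample = [
  '{"type":"shell:git:push-force","riskScore":95,"status":"escalated"}',
  '{"type":"shell:git:push-force","riskScore":95,"status":"escalated"}',
  '{"type":"shell:git:push","riskScore":70,"status":"denied"}',
  "not json",
].join("\n");

console.log(summarizeAudit(sample));
```

In practice the text would come from reading ~/.claude/tegata-audit.jsonl; the parsing is kept tolerant because the file is appended to while sessions are running.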

Next steps (out of scope for this PR)

  • Collect ~1 week of shadow-mode data, write a follow-up analysis
  • Surface any API pain points as pre-GA issues
  • Swap in TEGATA_HOOK_ENFORCE=1 once classification is stable

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Dogfooding mode to intercept and classify Claude Code tool invocations into actions with risk scores.
    • Shadow (audit-only) and enforce modes; enforce can block denied/escalated calls and emit diagnostics.
  • Documentation

    • New dogfooding guide covering setup, registration, operational modes, default classifications/thresholds, and audit-log analysis.

Adds tools/claude-code-hook.mjs, a zero-config PreToolUse hook that
classifies every Claude Code tool call (Bash, Edit, Write, Read, MCP
servers, subagents, ...) into an ActionType + riskScore and routes it
through tegata.propose(). Decisions are appended to
~/.claude/tegata-audit.jsonl.

Default shadow mode never blocks — it only records what Tegata *would*
have done, so the hook is safe to turn on right away. Flip
TEGATA_HOOK_ENFORCE=1 to have denied/escalated decisions block tool
calls via exit code 2.

The hook imports Tegata from the repo's built dist/ so it exercises the
same public API shipped to npm as tegata@preview. All error paths fall
through to exit 0 — a broken local build can never derail the host
agent.

Motivation: an authorization SDK whose author doesn't use it on their
own agent carries no weight. Running it on every Claude Code session
also seeds real usage data for the v0.2 TegataReporter design and for
the first English blog post.

docs/dogfooding.md explains the setup, the classification table, and
how to query the audit log.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai bot commented Apr 19, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 060327d9-ed1d-403d-98c4-9bc2424130d5

📥 Commits

Reviewing files that changed from the base of the PR and between 44fe8ab and 274d926.

📒 Files selected for processing (2)
  • docs/dogfooding.md
  • tools/claude-code-hook.mjs

📝 Walkthrough

Adds a dogfooding workflow: a Claude Code PreToolUse Node.js hook classifies tool invocations into Tegata actions with risk scores, calls tegata.propose(), appends decisions to an audit JSONL, and supports shadow (default) and enforce modes that can block tool calls.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Documentation: README.md, docs/dogfooding.md | New dogfooding documentation: setup/build steps, hook registration (project & machine), classification defaults and heuristics, audit log JSONL schema and jq examples, and runtime behavior/robustness notes for shadow vs enforce modes. |
| Hook Implementation: tools/claude-code-hook.mjs | New executable Node.js hook: reads stdin JSON, classifies tool calls (Bash heuristics, filesystem/web/agent/mcp mappings) into Tegata Action with riskScore, imports dist/index.js, calls tegata.propose({ proposer: "claude-code", action, params }), appends a JSONL audit record to ~/.claude/tegata-audit.jsonl, uses shadow (default) or enforce mode (TEGATA_HOOK_ENFORCE=1) to decide exit behavior (exit 2 + stderr on denied/escalated), and fails open on internal errors. |

Sequence Diagram

sequenceDiagram
    actor CT as Claude Code Tool
    participant H as claude-code-hook
    participant T as Tegata
    participant L as Audit Log
    participant E as stderr/exit

    CT->>H: Invoke hook (stdin JSON: tool_name, tool_input, session_id, cwd)
    H->>H: Parse JSON & classify into Action (type, riskScore)
    H->>T: tegata.propose({ proposer: "claude-code", action, params })
    T->>T: Evaluate proposal (allowed / denied / escalated, tier, reason)
    T-->>H: Return decision object
    H->>L: Append JSONL audit record (~/.claude/tegata-audit.jsonl)
    alt TEGATA_HOOK_ENFORCE != "1" (SHADOW)
        H->>E: exit 0
    else TEGATA_HOOK_ENFORCE = "1" (ENFORCE)
        alt decision.status == "denied" or "escalated"
            H->>E: write decision.reason to stderr
            H->>E: exit 2
        else
            H->>E: exit 0
        end
    end
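The exit branch at the bottom of the diagram can be sketched as a small pure function. The name `exitCodeFor` and the injected `env` parameter are illustrative assumptions; the environment variable, the status values, and the exit codes are the ones shown in the diagram.

```javascript
// Decide the hook's exit code from a Tegata decision.
// Shadow mode (default) always returns 0; enforce mode returns 2 for denied/escalated.
function exitCodeFor(decision, env = process.env) {
  const enforce = env.TEGATA_HOOK_ENFORCE === "1";
  if (!enforce) return 0;
  const blocked = decision.status === "denied" || decision.status === "escalated";
  return blocked ? 2 : 0;
}

console.log(exitCodeFor({ status: "denied" }, {})); // shadow mode
console.log(exitCodeFor({ status: "denied" }, { TEGATA_HOOK_ENFORCE: "1" })); // enforce mode
```

Passing the environment in as a parameter keeps the decision testable without mutating `process.env`.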

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I peek at stdin, classify each tread,
Tegata whispers verdicts in my thread.
I stamp the log in JSON, neat and true,
Shadow lets it pass — enforce stops what’s due.
A rabbit's hop, auditing what you do.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding a Claude Code PreToolUse hook that runs Tegata for self-monitoring. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0b9f6b894b


Comment thread tools/claude-code-hook.mjs Outdated
const c = (cmd ?? "").trim();
if (/^git\s+push\b.*(--force|-f\b|--force-with-lease)/.test(c))
return { type: "shell:git:push-force", riskScore: 95 };
if (/^git\s+push\b/.test(c)) return { type: "shell:git:push", riskScore: 70 };

P1: Raise git push risk above enforcement threshold

In enforce mode, this hook creates Tegata with escalateAbove: 70, but plain git push is classified with riskScore: 70. Tegata escalates only when riskScore > escalateAbove, so a normal push is approved and not blocked, which contradicts the stated intent that pushes should cross the default threshold and weakens enforcement for a high-impact action.
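The boundary case described here comes down to a strict comparison. The snippet below models only that comparison as the comment describes it; it is not Tegata's implementation, and the helper name `escalates` is made up for the example.

```javascript
// Threshold the review says the hook configures.
const escalateAbove = 70;

// Strict comparison: a score equal to the threshold does not escalate.
const escalates = (riskScore) => riskScore > escalateAbove;

console.log(escalates(70)); // plain git push as first classified
console.log(escalates(71)); // one point higher crosses the threshold
```

Under this rule a score of exactly 70 stays below the gate, so either the score has to move above 70 or the threshold has to drop below it.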

Comment thread tools/claude-code-hook.mjs Outdated
/^git\s+(commit|add|checkout|merge|rebase|stash|tag|fetch|pull)\b/.test(c)
)
return { type: "shell:git:write", riskScore: 40 };
if (/\brm\s+-rf?\b/.test(c))

P2: Match destructive rm flag permutations

The recursive-delete detector only matches rm -r/rm -rf (/\brm\s+-rf?\b/), so equivalent invocations like rm -fr or rm -r -f are misclassified as generic shell commands. In enforce mode those variants stay below escalation risk and can execute without the intended high-risk gate.
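A quick way to see which invocations the quoted pattern catches is to run it against a few sample commands. The sample list is illustrative and not taken from the PR's test inputs.

```javascript
// Pattern quoted in the review thread above.
const original = /\brm\s+-rf?\b/;

const samples = [
  "rm -rf build",
  "rm -fr build",
  "rm -rfv build",
  "rm --recursive build",
  "rm -r -f build",
  'git commit -m "rm -rf"',
];

for (const cmd of samples) {
  console.log(original.test(cmd) ? "match" : "miss ", cmd);
}
```

With this pattern, `rm -fr`, `rm -rfv`, and `rm --recursive` are missed. `rm -r -f` does appear to be caught, through its leading `rm -r`, and the unanchored pattern also fires on the `git commit` message that merely mentions `rm -rf`.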

gemini-code-assist bot left a comment

Code Review

This pull request introduces dogfooding support for Tegata by integrating it with Claude Code via a PreToolUse hook. It includes a new documentation guide, a hook script (tools/claude-code-hook.mjs) that classifies and evaluates tool calls, and updates to the README. Feedback was provided to improve the robustness of the regex used for identifying destructive shell commands to avoid false positives and better handle various flag combinations.

Comment thread tools/claude-code-hook.mjs Outdated
/^git\s+(commit|add|checkout|merge|rebase|stash|tag|fetch|pull)\b/.test(c)
)
return { type: "shell:git:write", riskScore: 40 };
if (/\brm\s+-rf?\b/.test(c))

medium

The regex for rm -rf is not anchored to the start of the command, which can lead to false positives if the string appears in arguments (e.g., git commit -m "rm -rf"). Additionally, it is quite restrictive regarding flag order and combinations (e.g., it won't match rm -fr or rm -rfv).

Suggested change
- if (/\brm\s+-rf?\b/.test(c))
+ if (/^rm\s+-[a-z]*r[a-z]*f?\b/.test(c))

coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tools/claude-code-hook.mjs`:
- Around line 178-190: The audit JSON currently omits decision.reason and other
decision metadata; update the appendFileSync payload to include decision.reason
(and any other relevant fields from the decision object) alongside
decision.status and decision.tier so shadow-mode logs explain why calls were
denied/escalated; locate the appendFileSync call that writes to AUDIT_PATH (uses
sessionId, cwd, toolName, type, riskScore, decision.status, decision.tier,
SHADOW_MODE) and add decision.reason (and any decision.* fields needed) into the
object before JSON.stringify.
- Around line 149-155: The import of the local bundle uses the POSIX path string
in distEntry which fails on Windows; change the import to use a file:// URL by
converting distEntry to a file URL (e.g., via pathToFileURL or new URL) before
calling await import(...). Update the code around the variables here and
distEntry where ({ Tegata } = await import(distEntry)) is attempted so it
imports the file URL instead of the raw absolute path and preserve the existing
try/catch.
- Around line 203-210: Replace the async stderr write + immediate exit with a
synchronous write that includes the decision.reason so the block message is not
lost: instead of calling process.stderr.write(...) then process.exit(2) in the
branch that checks decision.status === "denied" || decision.status ===
"escalated", build a single string containing toolName, type, riskScore,
decision.status and decision.reason and write it synchronously with
fs.writeSync(process.stderr.fd, message) before calling process.exit(2); update
the code around the decision/status handling to use fs.writeSync and include the
decision.reason field.
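The second and third findings can be sketched together as follows. This is a minimal illustration under assumptions: the relative `dist/index.js` location, the helper names `loadTegata`, `blockMessage`, and `reportBlock`, and the message wording are all hypothetical. `pathToFileURL` and `fs.writeSync` are the Node.js calls the comments refer to.

```javascript
import { writeSync } from "node:fs";
import path from "node:path";
import { pathToFileURL } from "node:url";

// Assumed location of the built bundle, resolved from the current directory.
const distEntry = path.resolve("dist", "index.js");
// Convert the absolute path to a file:// URL so dynamic import also works on Windows.
const distUrl = pathToFileURL(distEntry).href;

async function loadTegata() {
  try {
    const { Tegata } = await import(distUrl);
    return Tegata;
  } catch {
    return null; // fail open: a missing or broken build must not break the host agent
  }
}

// Build one string that carries the decision reason along with the classification.
function blockMessage({ toolName, type, riskScore, decision }) {
  return `tegata: ${decision.status} ${toolName} (${type}, risk ${riskScore}): ${decision.reason}\n`;
}

// Synchronous write, so the text is emitted before a following process.exit(2).
function reportBlock(info) {
  writeSync(process.stderr.fd, blockMessage(info));
}

console.log(distUrl);
```

In the hook itself, `reportBlock(...)` would be followed immediately by `process.exit(2)`; the exit call is left out here so the snippet can run on its own.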

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7ae3abd0-ecf4-4129-a8ee-0a9c87677669

📥 Commits

Reviewing files that changed from the base of the PR and between d664bd3 and 0b9f6b8.

📒 Files selected for processing (3)
  • README.md
  • docs/dogfooding.md
  • tools/claude-code-hook.mjs

Comment thread tools/claude-code-hook.mjs (3 threads)
renasami and others added 2 commits April 21, 2026 00:55
- Raise `git push` riskScore 70→71 so it actually crosses the default
  `escalateAbove: 70` (which uses strict `>`) in enforce mode.
- Harden `rm -rf` detector: anchor to start of command, match flag
  permutations (`-fr`, `-rfv`, `-r -f`, `--recursive`), and handle
  `sudo`. Avoids false positives like `git commit -m "rm -rf ..."`.
- Windows: convert `distEntry` to a `file://` URL via `pathToFileURL()`
  before `await import()`. Absolute filesystem paths like `C:\...` are
  rejected by Node's ESM loader.
- Audit log now includes `decision.reason`, `proposal_id`, and
  `decision_ts` — shadow-mode data can explain *why* a call was denied
  or escalated.
- Enforce-mode stderr write: swap `process.stderr.write()` for
  `fs.writeSync(process.stderr.fd, ...)` so the block message is fully
  flushed before `process.exit(2)`. Also includes `decision.reason`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
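One way to get the behavior listed for the hardened `rm -rf` detector is to scan tokens instead of using a single regular expression. This is a swapped-in technique for illustration, not the code from the commit; the function name `isRecursiveRm` is hypothetical, and it only handles a bare `sudo` prefix with no options.

```javascript
// Token-based check for recursive rm, anchored to the start of the command.
function isRecursiveRm(cmd) {
  const tokens = (cmd ?? "").trim().split(/\s+/);
  if (tokens[0] === "sudo") tokens.shift(); // bare sudo only; sudo flags are not handled
  if (tokens[0] !== "rm") return false; // rm must be the command itself, not an argument
  return tokens.slice(1).some(
    // Long form, or any short-flag cluster containing r or R (-rf, -fr, -rfv, -r).
    (t) => t === "--recursive" || /^-[A-Za-z]*[rR][A-Za-z]*$/.test(t)
  );
}

for (const cmd of ["rm -fr build", "sudo rm -rf /tmp/x", 'git commit -m "rm -rf ..."']) {
  console.log(isRecursiveRm(cmd), cmd);
}
```

Because the check looks at the first token, a commit message that mentions `rm -rf` is no longer flagged, while flag order and clustering stop mattering.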
- Clarify safeExit() fail-open semantics: internal errors (bad stdin,
  missing dist) silently allow the tool call even in enforce mode.
  Tegata's own denied/escalated verdicts still block, via a separate
  path. Drops the unused `code` parameter.
- docs/dogfooding.md: note ~100–300 ms per-call hook overhead and the
  unbounded growth of ~/.claude/tegata-audit.jsonl with rotation hints.

Addresses review comments M2 / L1 / L4 on PR #15.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
renasami merged commit cf5e90a into main on Apr 20, 2026
1 check passed
renasami deleted the feat/claude-code-dogfooding branch on April 20, 2026 16:04