feat: failproofai audit — count past stupid agent behavior #377
…oss sessions
Scans transcripts from all 7 supported CLIs (Claude/Codex/Copilot/Cursor/OpenCode/Pi/Gemini) and reports how often the agent did wasteful or risky things, both via replay through the existing 39 builtin policies and via 8 new audit-only detectors (redundant-cd-cwd, prefer-edit-over-read-cat, prefer-edit-over-sed-awk, prefer-write-over-heredoc, sleep-polling-loop, find-from-root, git-commit-no-verify, reread-after-edit).
Output: ANSI table to stdout + sectioned markdown report to ./failproofai-audit.md. Per-transcript results cached at ~/.failproofai/cache/audit/ keyed by (mtime, size, engineVersion, detectorVersion) so the cache invalidates when policy or detector code changes. Skips warn-repeated-tool-calls in replay (its per-session sidecar mutation would pollute real user data).
Refactor: extracts per-CLI tool-name + tool-input canonicalization from src/hooks/handler.ts into src/hooks/tool-name-canonicalize.ts so the audit replay and the live handler share one implementation. Adds lib/claude-sessions.ts for Claude transcript discovery (mirroring the existing per-CLI lib/<cli>-sessions.ts helpers).
Tests: +36 unit + integration tests under __tests__/audit/: detector positive/negative cases, replay through real builtins, and an end-to-end fixture-transcript run via runAudit(). All 1659 tests pass; lint + tsc clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
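The cache-invalidation rule in this commit message (a cached result is reused only while all four key fields still match) can be sketched as a pure comparison. This is a minimal sketch under stated assumptions: the `CacheKey` and `CacheEntry` shapes and the `isCacheHit` helper are hypothetical names for illustration, not the actual contents of `src/audit/cache.ts`.

```typescript
// Hypothetical shapes: the commit message names the four key fields,
// but the real types in src/audit/cache.ts may be organized differently.
interface CacheKey {
  mtimeMs: number;         // transcript file modification time
  size: number;            // transcript file size in bytes
  engineVersion: string;   // fingerprint of the builtin policy code
  detectorVersion: string; // fingerprint of the audit-only detectors
}

interface CacheEntry<T> {
  key: CacheKey;
  result: T;
}

// A cached entry is reusable only if every component of the key matches.
function isCacheHit<T>(entry: CacheEntry<T> | undefined, current: CacheKey): boolean {
  if (entry === undefined) return false;
  const k = entry.key;
  return (
    k.mtimeMs === current.mtimeMs &&
    k.size === current.size &&
    k.engineVersion === current.engineVersion &&
    k.detectorVersion === current.detectorVersion
  );
}
```

Comparing the version fingerprints alongside the file stats is what lets a policy or detector change invalidate results for transcripts that themselves never changed.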
Adds docs/cli/audit.mdx documenting flags, the 8 audit-only detectors, output formats (table / markdown / JSON), and the read-only / no-mutation guarantees. Registers the page in the English CLI nav in docs/docs.json. Translations will follow in the next translation pass (per the team's existing translation workflow — see #371). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
📝 Walkthrough
This PR implements the beta …
Changes
Audit Command Implementation
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 11
🧹 Nitpick comments (1)
__tests__/audit/detectors.test.ts (1)
85-97: ⚡ Quick win
Add a lowercase heredoc delimiter case for coverage.
Please add a case like `cat <<'eof' > out.md` so this detector behavior is locked by tests.
As per coding guidelines, "Always add unit tests for new behaviour. Place tests in tests/."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@__tests__/audit/detectors.test.ts` around lines 85 - 97, Add a unit test to cover lowercase heredoc delimiters: inside the "prefer-write-over-heredoc" suite (in __tests__/audit/detectors.test.ts) add an it case that calls preferWriteOverHeredoc.detect with bash("cat <<'eof' > out.md\nhello\neof") and expects a non-null result; use the same test style/helpers (preferWriteOverHeredoc.detect and bash) as the existing tests so the detector behavior for lowercase delimiters is asserted.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@bin/failproofai.mjs`:
- Around line 467-484: The collectMulti function currently allows flags like
--cli/--project/--policy with zero following values; modify collectMulti (which
iterates subArgs and returns { values, consumed }) to validate that at least one
value was collected for the given flag and fail fast if not: after populating
out and consumed, if out.length === 0 throw an Error (or return a clear failure)
with a message including the flag (e.g. `Flag ${flag} requires at least one
value`) so callers cannot proceed with an empty filter.
- Line 533: The current ternary sets limit using parseInt(limitRes.value, 10)
which can produce NaN, 0, or negatives; change this to explicitly parse and
validate a positive integer: take limitRes.value, parse it to an integer, check
Number.isInteger(parsed) and parsed > 0, and only then assign to the limit field
(otherwise print a clear validation error and exit or set limit to undefined
depending on desired behavior). Update the logic around the limit assignment
(referencing limitRes and the limit property) so invalid values are rejected
with a helpful message instead of passing NaN/0/negative through to the audit.
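The `--limit` validation requested above could look like the following. A minimal sketch, assuming a hypothetical `parsePositiveInt` helper; the real flag parsing lives in `bin/failproofai.mjs` and may be structured differently. It is deliberately stricter than the `parseInt` plus `Number.isInteger` check the comment suggests, because `parseInt("10abc", 10)` returns `10`.

```typescript
// Hypothetical helper for illustration; not the project's actual parser.
function parsePositiveInt(raw: string, flag: string): number {
  // Require digits only, so inputs like "10abc", "1.5", or "-3" are rejected
  // up front instead of being partially parsed.
  if (!/^\d+$/.test(raw.trim())) {
    throw new Error(`Flag ${flag} requires a positive integer, got "${raw}"`);
  }
  const parsed = Number.parseInt(raw, 10);
  // Reject 0 and values too large to represent exactly.
  if (!Number.isSafeInteger(parsed) || parsed <= 0) {
    throw new Error(`Flag ${flag} requires a positive integer, got "${raw}"`);
  }
  return parsed;
}
```

Throwing here keeps the failure at the argument-parsing boundary, so `NaN`, `0`, or a negative number never reaches the audit itself.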
In `@CHANGELOG.md`:
- Around line 5-6: The CHANGELOG entry is too verbose and missing the PR number;
replace the multi-line paragraph with a single concise one-line entry that
includes the PR number and a short description referencing the new command and
key changes (e.g. "Add `failproofai audit` command with 8 audit-only detectors
and shared tool-name canonicalization via tool-name-canonicalize (PR
#<your-PR-number>)"); ensure you mention "failproofai audit", the new audit-only
detectors (e.g. redundant-cd-cwd, prefer-edit-over-read-cat, etc.) only if
needed for clarity but keep the line short and include the actual PR number for
this change.
In `@docs/docs.json`:
- Line 45: The CLI "audit" nav entry ("cli/audit") is only present in the
English section and must be added to every other language's CLI pages array to
keep navigation consistent; edit docs.json and insert "cli/audit" into each
non-English language CLI group (zh, ja, ko, es, pt-BR, de, fr, ru, hi, tr, vi,
it, ar, he) at the same position as in English (after "list-policies" and before
"hook"), using the same path ("cli/audit") if localized files don't yet exist so
the structure mirrors English for future translations.
In `@src/audit/cache.ts`:
- Around line 98-99: The cache file is created with writeFileSync(cachePath,
JSON.stringify(entry), "utf-8") then tightened with chmodSync(cachePath, 0o600),
which leaves a window where the file can have looser permissions; change
creation to atomically set secure mode at creation (e.g., pass a mode option or
use open/write with mode 0o600) so the file is created with 0o600 from the
start, keep the try/catch around chmodSync as a best-effort fallback, and update
references to cachePath, writeFileSync, and chmodSync in src/audit/cache.ts
accordingly.
In `@src/audit/cli-adapters/shared.ts`:
- Around line 52-56: The current truncation uses String.length and slice, which
counts UTF-16 code units and can exceed the intended UTF-8 byte cap; update the
truncation logic where toolResultText is assigned (the block checking
block.result?.content and assigning toolResultText) to measure UTF-8 bytes using
Buffer.byteLength(..., "utf8") (or TextEncoder) and truncate the string so its
UTF-8 byte length does not exceed AUDIT_TOOL_RESULT_MAX_BYTES, e.g.
iterate/backtrack on characters until Buffer.byteLength(candidate, "utf8") <=
AUDIT_TOOL_RESULT_MAX_BYTES and then assign that truncated candidate to
toolResultText.
In `@src/audit/detectors/prefer-write-over-heredoc.ts`:
- Around line 21-24: The heredoc regex in the if-statement currently only
matches uppercase delimiters (/[A-Z_]+/) so lowercase delimiters like <<'eof'
are missed; update the regex used in the detector (the if(...) test) to accept
lowercase letters and digits (e.g. use /<<-?\s*['"]?[A-Za-z0-9_]+['"]?\s*>\s*\S/
or simply add the case-insensitive flag like
/<<-?\s*['"]?[A-Z_]+['"]?\s*>\s*\S/i) so all valid heredoc delimiters are
detected, then keep the existing summary extraction
(cmd.replace(...).slice(...)) unchanged.
In `@src/audit/detectors/sleep-polling-loop.ts`:
- Around line 21-25: The detector currently drops fractional sleep durations
because the regex only captures the integer part and the code uses parseInt;
update the regex in sleep-polling-loop.ts so the numeric capture includes
decimals (e.g., change (\d+)(?:\.\d+)? to capture the full number like
(\d+(?:\.\d+)?)) and replace parseInt(match[1], 10) with parseFloat(match[1]) so
fractional values like 0.5 are preserved, then compute seconds using that float
when applying the unit multipliers (the match variable, numeric capture, and the
seconds calculation should all use the float value).
In `@src/audit/index.ts`:
- Around line 81-85: The catch that calls ADAPTERS[meta.cli].streamEvents(meta)
is swallowing failures (returning empty) so the audit errors counter never
increments; update that catch to increment the shared errors counter (errors++)
and then rethrow the caught error (throw err) instead of silently returning
empty; apply the same change to the other similar catch block later (the one
around the second streamEvents call) so all transcript stream failures are
accounted for and propagated.
In `@src/audit/report.ts`:
- Around line 146-148: The markdown table cells are interpolated raw in the
out.push call, so values like r.name (and r.source/firstSeen/lastSeen) can break
the table if they contain |, backticks, or newlines; create/use a small
sanitizer (e.g., escapeMarkdownCell) and apply it to each cell before
interpolation in the out.push line (escape pipe characters to \|, backticks to
\`, and collapse or replace newlines) so the table row generated by out.push(`|
\`${...}\` | ${...} | ... |`) remains well-formed.
In `@src/hooks/tool-name-canonicalize.ts`:
- Around line 50-57: The function canonicalizeToolInput currently treats any
object-like rawInput (including arrays) as a Record and remaps keys; update the
guard so it returns rawInput unchanged if it's an array (e.g., check
Array.isArray(rawInput)) and only proceeds when rawInput is a plain record
object; keep the existing logic that picks perToolMap from
OPENCODE_TOOL_INPUT_MAP or PI_TOOL_INPUT_MAP using toolName and then maps keys
using perToolMap[k] inside the loop.
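The array guard described for `canonicalizeToolInput` can be illustrated with a small stand-alone version. A sketch under stated assumptions: `isPlainRecord`, `remapKeys`, and `EXAMPLE_MAP` are hypothetical names, and a `Map` stands in for the per-tool key maps the review mentions (`OPENCODE_TOOL_INPUT_MAP`, `PI_TOOL_INPUT_MAP`), whose real shape is not shown in this thread.

```typescript
// Hypothetical stand-in for a per-tool key map.
const EXAMPLE_MAP = new Map<string, string>([["filePath", "file_path"]]);

// True only for non-null, non-array objects.
function isPlainRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function remapKeys(rawInput: unknown, perToolMap: Map<string, string>): unknown {
  // Arrays and primitives pass through unchanged instead of being
  // flattened into an object with numeric keys.
  if (!isPlainRecord(rawInput)) return rawInput;
  const out: Record<string, unknown> = {};
  for (const [k, v] of Object.entries(rawInput)) {
    out[perToolMap.get(k) ?? k] = v; // unmapped keys keep their original name
  }
  return out;
}
```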
---
Nitpick comments:
In `@__tests__/audit/detectors.test.ts`:
- Around line 85-97: Add a unit test to cover lowercase heredoc delimiters:
inside the "prefer-write-over-heredoc" suite (in
__tests__/audit/detectors.test.ts) add an it case that calls
preferWriteOverHeredoc.detect with bash("cat <<'eof' > out.md\nhello\neof") and
expects a non-null result; use the same test style/helpers
(preferWriteOverHeredoc.detect and bash) as the existing tests so the detector
behavior for lowercase delimiters is asserted.
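The lowercase-delimiter case can be checked against a pattern of the kind the review proposes. A minimal sketch, assuming the delimiter class `[A-Za-z_][A-Za-z0-9_]*`; `HEREDOC_TO_FILE` and `looksLikeHeredocWrite` are hypothetical names, and the real detector in `src/audit/detectors/prefer-write-over-heredoc.ts` may apply additional checks.

```typescript
// Matches "<<DELIM > target" with an optionally quoted delimiter of any case.
const HEREDOC_TO_FILE = /<<-?\s*['"]?[A-Za-z_][A-Za-z0-9_]*['"]?\s*>\s*\S/;

// Hypothetical wrapper around the pattern for illustration.
function looksLikeHeredocWrite(cmd: string): boolean {
  return HEREDOC_TO_FILE.test(cmd);
}
```

Note that this pattern only covers the form where the redirect follows the delimiter (`cat <<'eof' > out.md`); a command written as `cat > out.md <<'eof'` would need a separate alternative.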
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5f691585-8cb1-4a7d-a7d5-58d59a820e24
📒 Files selected for processing (33)
- CHANGELOG.md
- __tests__/audit/detectors.test.ts
- __tests__/audit/index.test.ts
- __tests__/audit/replay.test.ts
- bin/failproofai.mjs
- docs/cli/audit.mdx
- docs/docs.json
- lib/claude-sessions.ts
- src/audit/cache.ts
- src/audit/cli-adapters/claude.ts
- src/audit/cli-adapters/codex.ts
- src/audit/cli-adapters/copilot.ts
- src/audit/cli-adapters/cursor.ts
- src/audit/cli-adapters/gemini.ts
- src/audit/cli-adapters/index.ts
- src/audit/cli-adapters/opencode.ts
- src/audit/cli-adapters/pi.ts
- src/audit/cli-adapters/shared.ts
- src/audit/detectors/find-from-root.ts
- src/audit/detectors/git-commit-no-verify.ts
- src/audit/detectors/index.ts
- src/audit/detectors/prefer-edit-over-read-cat.ts
- src/audit/detectors/prefer-edit-over-sed-awk.ts
- src/audit/detectors/prefer-write-over-heredoc.ts
- src/audit/detectors/redundant-cd-cwd.ts
- src/audit/detectors/reread-after-edit.ts
- src/audit/detectors/sleep-polling-loop.ts
- src/audit/index.ts
- src/audit/replay.ts
- src/audit/report.ts
- src/audit/types.ts
- src/hooks/handler.ts
- src/hooks/tool-name-canonicalize.ts
• bin/failproofai.mjs:
- Reject bare `--cli`/`--project`/`--policy` (was silently widening scope)
- Reject `--limit` values that aren't positive integers (NaN/0/negative)
• CHANGELOG.md: condense the audit Features entry to one line + PR number
(#377) per CLAUDE.md changelog convention.
• docs/docs.json: register `cli/audit` in all 13 non-English language nav
sections (was only in English) so nav structure mirrors English ahead of
the translation pass.
• src/audit/cache.ts: pass `mode: 0o600` to writeFileSync at file-creation
time so there's no permission-leak window between create and chmod.
• src/audit/cli-adapters/shared.ts: cap tool_result_text by UTF-8 BYTE
length (was using String.prototype.length, which counts UTF-16 code
units and let through up to 3× the intended budget for non-ASCII). Safe
UTF-8 boundary walk so multi-byte sequences aren't split.
• src/audit/detectors/prefer-write-over-heredoc.ts: heredoc delimiter
regex now accepts `[A-Za-z_][A-Za-z0-9_]*` instead of only `[A-Z_]+`
so `cat <<'eof' > file` is detected.
• src/audit/detectors/sleep-polling-loop.ts: use parseFloat so
`sleep 0.5m` (= 30s, exactly at the threshold) is no longer dropped.
• src/audit/index.ts: let scanOneTranscript propagate streamEvents errors
so the orchestrator's outer try/catch correctly increments `errors`
(was silently returning an empty result, making audit reliability
stats wrong).
• src/audit/report.ts: escape `|`, `\`, and CR/LF in markdown table cells
so a policy name containing `|` doesn't break the table layout.
• src/hooks/tool-name-canonicalize.ts: guard canonicalizeToolInput against
Array inputs (was flattening them into a plain object with numeric keys
— pre-existing handler.ts behavior, hardened on extraction).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
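The UTF-8 byte cap with a safe boundary walk, as described for `src/audit/cli-adapters/shared.ts`, can be sketched as follows. This is an illustration under stated assumptions rather than the project's code: `truncateUtf8` is a hypothetical name, and it uses the standard `TextEncoder` / `TextDecoder` globals instead of whatever the actual implementation uses.

```typescript
const utf8Encoder = new TextEncoder();
const utf8Decoder = new TextDecoder();

// Returns the longest prefix of `text` whose UTF-8 encoding fits in
// `maxBytes`, never cutting through a multi-byte sequence.
function truncateUtf8(text: string, maxBytes: number): string {
  const bytes = utf8Encoder.encode(text);
  if (bytes.length <= maxBytes) return text;
  let end = Math.max(0, maxBytes);
  // If the first dropped byte is a continuation byte (10xxxxxx), the cut
  // falls inside a character; back up to that character's lead byte.
  while (end > 0 && (bytes[end] & 0xc0) === 0x80) end--;
  return utf8Decoder.decode(bytes.subarray(0, end));
}
```

Measuring the encoded bytes matters because a single UTF-16 code unit can expand to as many as three UTF-8 bytes, so a cap based on `String.prototype.length` can overshoot a byte budget substantially for non-ASCII text.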
Mintlify's `validate` step (the `docs` CI job) rejects nav references to pages that don't exist on disk. The previous commit pre-registered `<locale>/cli/audit` in all 13 non-English language sections to mirror the English nav ahead of translations landing — but the translated pages don't exist yet, so Mintlify failed with 14 "file does not exist" warnings (treated as build errors). The project's translation workflow lands the translated pages and the matching nav entries together in a dedicated translation-pass commit (see #371), so registering the nav slots upstream of the actual files breaks CI. Reverting the 13 language entries; the English nav entry stays. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/audit/report.ts (1)
152-159: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Escape backticks in subsection header and consider escaping example metadata.
At line 154, `r.name` is used directly inside backticks without escaping. If a policy name contains backticks, the markdown heading will be malformed. Additionally, `e.cwd` and `e.timestamp` at line 157 could contain characters that interfere with the italic markdown.
Suggested fix
```diff
     if (r.examples.length === 0) continue;
-    out.push(`### \`${r.name}\` — examples`);
+    out.push(`### \`${escapeBackticks(r.name)}\` — examples`);
     out.push("");
     for (const e of r.examples) {
-      out.push(`- \`${escapeBackticks(e.example)}\` _(${e.cwd || "?"}, ${e.timestamp})_`);
+      const safeCwd = (e.cwd || "?").replace(/_/g, "\\_");
+      const safeTs = e.timestamp.replace(/_/g, "\\_");
+      out.push(`- \`${escapeBackticks(e.example)}\` _(${safeCwd}, ${safeTs})_`);
     }
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/audit/report.ts` around lines 152 - 159, The subsection header uses r.name inside backticks unescaped and example metadata e.cwd/e.timestamp are inserted raw; update the out.push that builds the header and the out.push that builds each example to run r.name, e.cwd, and e.timestamp through the existing escapeBackticks (or a small escape function for markdown/italic safety) so any backticks or characters that would break the ``` `...` ``` or italic markup are escaped; locate the loop over rows (for (const r of rows)), the header construction using r.name, and the example line building using e.cwd/e.timestamp and apply the escaping there (reuse escapeBackticks or add a new escapeMarkdown helper and apply it to r.name, e.cwd, and e.timestamp).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/he/cli/environment-variables.mdx`:
- Around line 2-40: The Hebrew locale file currently contains Arabic content
(e.g., frontmatter title "متغيرات البيئة" and section headings like "## لوحة
المعلومات", "## تسجيل السجلات", "## التحليلات", "## موجه التشغيل الأول", "##
نموذج اللغة (لتقييم السياسة)"); replace those Arabic strings with the correct
Hebrew translations while preserving all environment variable keys,
descriptions, table structures, and default values (e.g., keep `PORT`,
`CLAUDE_PROJECTS_PATH`, `FAILPROOFAI_LOG_LEVEL=info|warn|error`,
`FAILPROOFAI_LLM_BASE_URL`, etc.), and ensure the file remains valid
MDX/frontmatter so the Hebrew copy delivers the same semantic content as the
original English source.
In `@docs/i18n/README.ar.md`:
- Line 96: The Arabic README contains an outdated policy count string ("30 سياسة
مدمجة تصبح نشطة فورًا."); update that text to reflect the new total (change "30"
to "39") so it matches the rest of the PR/docs and the referenced built-in
policies count.
In `@docs/i18n/README.de.md`:
- Line 94: Update the outdated policy count in the German README by replacing
the phrase "30 integrierte Richtlinien" with "39 integrierte Richtlinien" so the
text "30 integrierte Richtlinien werden sofort aktiviert." accurately reads "39
integrierte Richtlinien werden sofort aktiviert."; locate the occurrence by
searching for the exact string "30 integrierte Richtlinien" in README.de.md and
adjust any other nearby references if present.
In `@docs/i18n/README.it.md`:
- Line 102: The table entry for the policy key `block-push-master` mistranslates
the git action; replace the phrase "Pressioni dirette su `main` / `master`" with
a clearer technical term such as "Push diretti su `main` / `master`" (or
equivalent) so the blocked git action is unambiguous.
In `@docs/i18n/README.tr.md`:
- Line 146: Fix the typo in the Turkish README sentence by replacing
“engellendığını” with “engellendiğini” in the docs/i18n/README.tr.md file (the
dashboard visibility sentence on the shown diff) so the phrase reads correctly.
In `@docs/it/introduction.mdx`:
- Line 25: Update the Italian article agreement in the sentence "Scrivi le tue
regole in JavaScript con un semplice API allow / deny / instruct." by replacing
"un semplice API" with "una semplice API" so the line reads "Scrivi le tue
regole in JavaScript con una semplice API allow / deny / instruct."; locate and
edit that exact string in docs/it/introduction.mdx.
In `@docs/ja/introduction.mdx`:
- Around line 53-54: The inline comments after the shell commands are in English
on a Japanese page; replace the English comments in the two lines containing
"failproofai policies --install # enable policies (or skip — `failproofai`
will offer to set them up on first run)" and "failproofai #
launch the dashboard" with concise Japanese translations (e.g.,
「ポリシーを有効にする(スキップ可 — 初回起動時に設定を案内します)」 and 「ダッシュボードを起動」) so the page is fully
localized.
In `@docs/tr/cli/environment-variables.mdx`:
- Around line 28-32: Change the awkward noun form in the first-run prompt
section: rename the header "İlk çalıştırma istemesi" to "İlk çalıştırma istemi"
and update the table description text "İlk `failproofai` çağrısında politikaları
yüklemeyi teklif eden istememi atlayın" to use the matching noun form "istemi"
(e.g., "İlk `failproofai` çağrısında politikaları yüklemeyi teklif eden istemi
atlayın") so the wording is consistent and natural; ensure the
`FAILPROOFAI_NO_FIRST_RUN=1` variable line remains unchanged except for the
wording swap.
In `@src/audit/detectors/sleep-polling-loop.ts`:
- Around line 21-25: The detector's regex and number parsing miss explicit "s"
and leading-dot fractions; update the regex to capture fractional seconds and
the "s" unit (e.g. use /\bsleep\s+(\d*\.?\d+)(s|m|h|d)?\b/i) so match[1] will
include decimals like ".5" or "30.5" and match[2] can be "s"; keep using
parseFloat(match[1]) to get n, keep the existing unit-to-seconds logic (unit ===
"m" ? n*60 : unit === "h" ? n*3600 : unit === "d" ? n*86400 : n) and ensure
variable names match (match, n, unit, seconds) in sleep-polling-loop.ts.
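The regex and unit handling proposed in this comment can be exercised in isolation. A minimal sketch, assuming the pattern quoted above; `sleepSeconds` and `UNIT_SECONDS` are hypothetical names, and the real detector in `src/audit/detectors/sleep-polling-loop.ts` goes on to compare the result against its own threshold.

```typescript
// Captures the full number (including ".5" and "30.5") and an optional unit.
const SLEEP_RE = /\bsleep\s+(\d*\.?\d+)(s|m|h|d)?\b/i;

const UNIT_SECONDS: Record<string, number> = { s: 1, m: 60, h: 3600, d: 86400 };

// Returns the sleep duration in seconds, or null if the command has no sleep.
function sleepSeconds(cmd: string): number | null {
  const match = SLEEP_RE.exec(cmd);
  if (match === null) return null;
  const n = Number.parseFloat(match[1]);
  // A missing suffix means seconds, same as an explicit "s".
  const unit = (match[2] ?? "s").toLowerCase();
  return n * (UNIT_SECONDS[unit] ?? 1);
}
```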
---
Outside diff comments:
In `@src/audit/report.ts`:
- Around line 152-159: The subsection header uses r.name inside backticks
unescaped and example metadata e.cwd/e.timestamp are inserted raw; update the
out.push that builds the header and the out.push that builds each example to run
r.name, e.cwd, and e.timestamp through the existing escapeBackticks (or a small
escape function for markdown/italic safety) so any backticks or characters that
would break the ``` `...` ``` or italic markup are escaped; locate the loop over
rows (for (const r of rows)), the header construction using r.name, and the
example line building using e.cwd/e.timestamp and apply the escaping there
(reuse escapeBackticks or add a new escapeMarkdown helper and apply it to
r.name, e.cwd, and e.timestamp).
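The escaping concerns raised for `src/audit/report.ts` (table cells that contain `|`, backslashes, or newlines, and metadata that contains `_` inside an italic span) can be sketched with two small helpers. These are hypothetical names for illustration; the report module's existing `escapeBackticks` helper is referenced in the review but its implementation is not shown in this thread.

```typescript
// Hypothetical helper: makes a value safe to interpolate into a table cell.
function escapeMarkdownCell(value: string): string {
  return value
    .replace(/\\/g, "\\\\")       // backslashes first, so later escapes are not doubled
    .replace(/\|/g, "\\|")        // a raw pipe would start a new table column
    .replace(/\r\n|\r|\n/g, " "); // a table row must stay on one line
}

// Hypothetical helper: keeps "_" in a cwd or timestamp from closing
// an _italic_ span early.
function escapeUnderscores(value: string): string {
  return value.replace(/_/g, "\\_");
}
```

Escaping backslashes before pipes is the important ordering detail: done the other way round, the backslash added in front of each pipe would itself be doubled.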
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a6263605-fb2e-4f2c-9220-1ee326b12f61
📒 Files selected for processing (53)
- CHANGELOG.md
- bin/failproofai.mjs
- docs/ar/cli/environment-variables.mdx
- docs/ar/introduction.mdx
- docs/cli/environment-variables.mdx
- docs/de/cli/environment-variables.mdx
- docs/de/introduction.mdx
- docs/es/cli/environment-variables.mdx
- docs/es/introduction.mdx
- docs/fr/cli/environment-variables.mdx
- docs/fr/introduction.mdx
- docs/he/cli/environment-variables.mdx
- docs/he/introduction.mdx
- docs/hi/cli/environment-variables.mdx
- docs/hi/introduction.mdx
- docs/i18n/README.ar.md
- docs/i18n/README.de.md
- docs/i18n/README.es.md
- docs/i18n/README.fr.md
- docs/i18n/README.he.md
- docs/i18n/README.hi.md
- docs/i18n/README.it.md
- docs/i18n/README.ja.md
- docs/i18n/README.ko.md
- docs/i18n/README.pt-br.md
- docs/i18n/README.ru.md
- docs/i18n/README.tr.md
- docs/i18n/README.vi.md
- docs/i18n/README.zh.md
- docs/introduction.mdx
- docs/it/cli/environment-variables.mdx
- docs/it/introduction.mdx
- docs/ja/cli/environment-variables.mdx
- docs/ja/introduction.mdx
- docs/ko/cli/environment-variables.mdx
- docs/ko/introduction.mdx
- docs/pt-br/cli/environment-variables.mdx
- docs/pt-br/introduction.mdx
- docs/ru/cli/environment-variables.mdx
- docs/ru/introduction.mdx
- docs/tr/cli/environment-variables.mdx
- docs/tr/introduction.mdx
- docs/vi/cli/environment-variables.mdx
- docs/vi/introduction.mdx
- docs/zh/cli/environment-variables.mdx
- docs/zh/introduction.mdx
- src/audit/cache.ts
- src/audit/cli-adapters/shared.ts
- src/audit/detectors/prefer-write-over-heredoc.ts
- src/audit/detectors/sleep-polling-loop.ts
- src/audit/index.ts
- src/audit/report.ts
- src/hooks/tool-name-canonicalize.ts
✅ Files skipped from review due to trivial changes (24)
- docs/cli/environment-variables.mdx
- docs/vi/introduction.mdx
- docs/introduction.mdx
- docs/hi/introduction.mdx
- docs/ru/introduction.mdx
- docs/i18n/README.zh.md
- docs/hi/cli/environment-variables.mdx
- docs/zh/introduction.mdx
- docs/es/introduction.mdx
- docs/de/introduction.mdx
- docs/es/cli/environment-variables.mdx
- docs/i18n/README.pt-br.md
- CHANGELOG.md
- docs/it/cli/environment-variables.mdx
- docs/ko/introduction.mdx
- docs/tr/introduction.mdx
- docs/i18n/README.ja.md
- docs/i18n/README.hi.md
- docs/i18n/README.ko.md
- docs/i18n/README.vi.md
- docs/fr/introduction.mdx
- docs/pt-br/introduction.mdx
- docs/i18n/README.ru.md
- docs/ru/cli/environment-variables.mdx
Replaces the developer-facing audit table with a GTM-oriented report:
• Headline call-out box: "Your agent did N wasteful or risky things in
your last X days. M of those would've been caught if more policies
were on."
• Two sections instead of a flat list: "✓ ALREADY PROTECTED" (builtins
the user has enabled that fired) vs "○ SLIPPING THROUGH" (unenabled
builtins + audit-only detectors). Each row is now a human-readable
title ("Tried to read files outside your project") with a one-line
impact ("Stops the agent from peeking at neighboring repos…"), a
relative timestamp ("Last seen 2h ago · 6 projects"), and a per-row
install CTA. The bottom NEXT block lists a single copy-pasteable
`failproofai policies --install ...` command covering every unenabled
builtin, plus the shareable report path and a star link.
• Honors NO_COLOR / FORCE_COLOR=0.
Markdown report rewritten with a TL;DR block aimed at someone who's never
heard of failproofai (so the report is shareable in Slack/PRs), separate
"Already protected" and "Slipping through" tables, an "install everything
in one command" code block, and an Examples appendix grouped by issue.
Authoring:
• New optional `displayTitle` + `impact` fields on `BuiltinPolicyDefinition`
(src/hooks/policy-types.ts) and `Detector` (src/audit/types.ts).
• Copy authored for all 39 builtins + all 8 detectors in-place on each
definition (single source of truth — the dashboard can use the same
copy later).
• AuditCount carries `displayTitle`, `impact`, `enabledInConfig`, and
a pre-built `installHint` so renderers stay pure.
Orchestrator:
• readMergedHooksConfig() called once per audit run to drive the
enabled/unenabled split.
Telemetry (PostHog, src/audit/telemetry.ts):
• Four new events plug into the existing trackHookEvent surface:
audit_started, audit_pattern_detected (per AuditCount),
audit_install_cta_shown (when there are unenabled builtins),
audit_completed (with enabled vs unenabled vs detector hit splits —
the GTM funnel signal).
• Strict privacy: only slugs, counts, booleans, bucketed ages, and CLI
tags ever leave the box. Never transcript paths, cwds, project names,
session IDs, or example commands.
• Honors FAILPROOFAI_TELEMETRY_DISABLED=1 via the inherited contract.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
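The `NO_COLOR` / `FORCE_COLOR=0` handling mentioned in this commit message could reduce to a check like the one below. A sketch only: `colorEnabled` is a hypothetical name, the environment is passed in as a parameter to keep the function testable, and the actual renderer may also consider other signals (for example whether stdout is a TTY) that this thread does not describe.

```typescript
// Hypothetical helper; decides whether ANSI color codes should be emitted.
function colorEnabled(env: Record<string, string | undefined>): boolean {
  // NO_COLOR convention: any non-empty value disables color.
  const noColor = env["NO_COLOR"];
  if (noColor !== undefined && noColor !== "") return false;
  // FORCE_COLOR=0 is treated as an explicit opt-out as well.
  if (env["FORCE_COLOR"] === "0") return false;
  return true;
}
```

In a Node.js entry point this would typically be called as `colorEnabled(process.env)`.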
• docs/cli/audit.mdx — title "Audit past sessions (beta)" + Mintlify
<Note> callout naming beta status and that flags / output / detector
catalog may change before the next stable cut.
• bin/failproofai.mjs — `audit (beta)` in the top-level COMMANDS list
and the `audit --help` page leads with a "NOTE: This command is in
beta. ..." block before USAGE.
• src/audit/report.ts — table header shows the `[beta]` tag next to
the audit command name on every run, and the generated markdown
report's intro has a "_Generated by failproofai audit (beta)_" line.
Also fixes a small wording bug in the table header — was "your the last
1d" (double-article from joining a stray "your" prefix with the "the
last …" sinceLabel), now reads "the last 1d".
No behavior change beyond the cosmetic wording fix.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/audit/report.ts`:
- Around line 217-218: The footer is printing a hardcoded path string via the
lines.push call instead of the configured report path; update the second-to-last
lines.push (the one that currently inserts "./failproofai-audit.md") to use the
audit/report module's configured report path variable (the same variable the CLI
--report <path> populates, e.g., reportPath or reportFile) so the printed
shareable report path matches the actual output location, preserving the
ANSI.cyan and ANSI.reset wrappers and leaving the surrounding text unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9570b90b-7772-4454-951b-c45cbc5f6f75
📒 Files selected for processing (16)
- bin/failproofai.mjs
- docs/cli/audit.mdx
- src/audit/detectors/find-from-root.ts
- src/audit/detectors/git-commit-no-verify.ts
- src/audit/detectors/prefer-edit-over-read-cat.ts
- src/audit/detectors/prefer-edit-over-sed-awk.ts
- src/audit/detectors/prefer-write-over-heredoc.ts
- src/audit/detectors/redundant-cd-cwd.ts
- src/audit/detectors/reread-after-edit.ts
- src/audit/detectors/sleep-polling-loop.ts
- src/audit/index.ts
- src/audit/report.ts
- src/audit/telemetry.ts
- src/audit/types.ts
- src/hooks/builtin-policies.ts
- src/hooks/policy-types.ts
✅ Files skipped from review due to trivial changes (2)
- src/hooks/builtin-policies.ts
- docs/cli/audit.mdx
Reopening to re-trigger CI on commits 0c70ff4 + 99a9651 (the previous push hit the old redirected remote and GitHub Actions never registered the new tip).
…vious push via the old redirected remote URL)
The "📄 Shareable report: …" line at the bottom of `failproofai audit`'s table output was hardcoded to `./failproofai-audit.md`, ignoring the actual `--report <path>` value. Now mirrors `opts.reportPath` (and is suppressed entirely when `--no-report` is passed). CodeRabbit finding on PR #377. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CodeRabbit posted 10 findings on this push. Triage: In-scope (this PR's code) — fixed:
Out of scope — pre-existing translation issues from the upstream translation-sync commit (
These files weren't touched by this PR — they landed via the auto-translation bot's most recent run on top of |
# Conflicts: # CHANGELOG.md
Summary
Adds a new `failproofai audit` CLI command (beta) that scans past agent transcripts on the user's machine (Claude / Codex / Copilot / Cursor / OpenCode / Pi / Gemini, all 7 supported CLIs) and reports how often the agent did things failproofai is built to stop. The output is a quantified retrospective: "in the last 30 days, your agent ran `env` 428 times, force-pushed 12 times, and `cat src/foo.ts`'d 207 times instead of using Read."
Each tool-use event is run through two parallel detection paths:
- Replay through the builtin policies via `evaluatePolicies()`, so the audit always reflects what today's policy engine would catch, even on sessions from before failproofai was installed.
- Audit-only detectors: `redundant-cd-cwd`, `prefer-edit-over-read-cat`, `prefer-edit-over-sed-awk`, `prefer-write-over-heredoc`, `sleep-polling-loop`, `find-from-root`, `git-commit-no-verify`, `reread-after-edit`.
GTM-oriented report (latest commits)
The report renderer was rewritten so the first run feels like a "moment of truth" instead of a developer table:
- A single copy-pasteable `failproofai policies --install ...` command covering every unenabled builtin, plus the shareable report path and a star link.
- Honors `NO_COLOR` / `FORCE_COLOR=0`.
To support this, `BuiltinPolicyDefinition` (`src/hooks/policy-types.ts`) and `Detector` (`src/audit/types.ts`) gained optional `displayTitle` ("Tried to push to main branch") and `impact` ("Direct pushes to a protected branch bypass review.") fields. All 39 builtins and all 8 detectors had copy authored in-place: a single source of truth, so the dashboard can use the same copy later.
Beta marker
Marked across all surfaces:
- `docs/cli/audit.mdx` title: "Audit past sessions (beta)" + Mintlify `<Note>` callout
- `bin/failproofai.mjs`: `audit (beta)` in the top-level help; `audit --help` page leads with a NOTE block
- Table header shows the `[beta]` tag on every run; markdown report intro line names it as beta
PostHog telemetry (per the GTM ask)
`src/audit/telemetry.ts` wires four new events into the existing `trackHookEvent` surface:
- `audit_started`: fired in `runAudit()`
- `audit_pattern_detected`: the `enabled_in_config` boolean is the key conversion signal
- `audit_install_cta_shown`
- `audit_completed`: fired in `runAudit()`
Strict privacy contract: never send transcript paths, decoded project folder names, cwds, example commands, file paths, sessionIds, or tool inputs. Only slugs (public), counts, booleans, bucketed ages, and CLI tags. Honors `FAILPROOFAI_TELEMETRY_DISABLED=1` via the inherited contract.
What's in the diff
- `src/audit/`: types, orchestrator, replay engine, cache, report renderers (text/markdown/json), new telemetry module.
- `src/audit/cli-adapters/`: 7 per-CLI adapters + shared `LogEntry[] → NormalizedToolEvent[]` converter. Each delegates discovery to the existing `lib/<cli>-projects.ts` + `lib/<cli>-sessions.ts` helpers.
- `src/audit/detectors/`: 8 detectors, one file each, each with authored `displayTitle` + `impact`.
- `src/hooks/tool-name-canonicalize.ts`: extracted from `src/hooks/handler.ts` so the audit replay and the live handler share one canonicalization implementation. Pure refactor.
- `src/hooks/builtin-policies.ts` + `src/hooks/policy-types.ts`: `displayTitle` + `impact` added to all 39 builtin policy definitions.
- `lib/claude-sessions.ts`: new transcript-discovery helper for Claude.
- `bin/failproofai.mjs`: `"audit"` registered in `SUBCOMMANDS`, full dispatch with flag parsing + help + beta marker.
- `docs/cli/audit.mdx` + nav entry in `docs/docs.json` + beta callout. Translations follow the workflow of "[auto] update translations" #371.
- `CHANGELOG.md`: Features entry under 0.0.11-beta.2 (with `(beta)` annotation + PR number).
- `__tests__/audit/`: 36 tests across detectors, replay, and end-to-end orchestrator.
Design notes
- Cache: `~/.failproofai/cache/audit/<sha1>.json`, keyed by `(mtime, size, engineVersion, detectorVersion)`. `engineVersion` is a sha1 of every builtin policy's name + function source, so changing one regex invalidates the cache automatically. Created with `mode: 0o600` at file-creation time.
- `warn-repeated-tool-calls` is the one builtin that writes to disk; replay skips it so audits don't pollute user data.
- `require-*-before-stop` policies match only `Stop` events and `execSync` against live git, so they never fire on replayed `PreToolUse` / `PostToolUse` events.
- `PostToolUse` synthesis. When a transcript captured a `tool_result`, the replay emits a second synthetic event with `tool_response` (truncated to a fixed UTF-8-aware byte budget) so `sanitize-*` policies register hits.
CodeRabbit fixes (commit `218c02d`)
Addressed 10 of 11 review comments from the initial PR push:
- Reject bare `--cli` / `--project` / `--policy` (was silently widening scope) and validate `--limit` is a positive integer.
- `mode: 0o600` at cache-file creation.
- Cap `tool_response` text by UTF-8 BYTE length (was UTF-16 code units).
- `sleep-polling-loop` uses `parseFloat` so `sleep 0.5m` (= 30s) counts.
- Propagate `streamEvents` errors so the orchestrator increments the `errors` stat.
- Escape `|` / `\\` / CR/LF so a policy name with `|` doesn't break the table.
- `canonicalizeToolInput` no longer flattens array inputs into a numeric-keyed object.
The 11th suggestion (pre-register `<locale>/cli/audit` in 13 language nav sections) was reverted in commit `db05635`: Mintlify's `validate` step rejects nav refs to non-existent pages, and the project's translation workflow lands translated pages and nav entries together (per #371).
Test plan
- `bun run lint` clean (0 errors; 1 pre-existing warning unrelated)
- `bunx tsc --noEmit` clean
- `bun install` (which triggers the Next.js prepare build) succeeds
- `bunx vitest run`: 1685 tests pass (1623 pre-existing + 36 audit + 26 from rebased upstream)
- Bare `--cli` and `--limit 0` both reject with clear errors
- Run `failproofai audit --since 30d` on a real install and visually verify the table + report
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
- `failproofai audit`: scans local agent transcripts across supported CLIs, runs builtin policies + audit-only detectors, outputs ANSI table/JSON/markdown report, and uses per-transcript caching.
Tests
Documentation
Release-prep status (commit `e92815e`):
- Merged `origin/main` into the branch (resolves a `CHANGELOG.md` conflict with PRs "Expand PostHog telemetry to close audited coverage gaps" #376 and "feat: first-run install prompt on bare `failproofai`" #378 that landed on `main` while this PR was in review).
- `bun install` build passes, `bunx tsc --noEmit` clean, `bun run lint` 0 errors, 1685 tests pass.
- `MERGEABLE`, awaiting CI on the new merge commit.
- … `luv-cut-*` branch (`block-version-bumps` enforces this on feature branches). The audit Features entry is already correctly placed under `## 0.0.11-beta.2 — 2026-05-21` matching the current `package.json`.