Skip to content

fix: merge latest dev updates into main#24

Merged
nadav-node9 merged 57 commits intomainfrom
dev
Mar 21, 2026
Merged

fix: merge latest dev updates into main#24
nadav-node9 merged 57 commits intomainfrom
dev

Conversation

@nadav-node9
Copy link
Contributor

Auto-generated PR

Merge latest dev changes into main to trigger a release.

⚠️ Important: When you click Squash and Merge, ensure the commit message starts with:

  • fix: to publish a Patch release (0.0.X)
  • feat: to publish a Minor release (0.X.0)
    If it starts with chore:, no NPM package will be published!

nadavis and others added 30 commits March 14, 2026 16:41
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- native.ts: add extractContext + formatArgs with matchedField/matchedWord
  tracing for "Context Sniper" popup — shows dangerous word in context
- core.ts: extend evaluatePolicy return with matchedField/matchedWord;
  per-field scan after dangerous word found; pass through authorizeHeadless
- daemon/index.ts: gate SSE broadcast and browser open on browser config flag
- LICENSE/package.json/README.md: MIT → Apache-2.0
- .github/workflows/ai-review.yml: add paths-ignore to prevent self-modification
- scripts/ai-review.mjs: upgrade to claude-sonnet-4-6, max_tokens 2048

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Wrap diff in <diff>...</diff> markers with untrusted-content notice to mitigate prompt injection
- Surface truncation note in posted PR comment when diff exceeds MAX_DIFF_CHARS
- Downgrade API errors to warning comments + exit 0 so Anthropic outages don't block PRs
- Pin @anthropic-ai/sdk@0.78.0 and @octokit/rest@22.0.1 to prevent supply-chain drift
- Add explicit permissions block (contents: read, pull-requests: write)
- Exclude dependabot[bot] from triggering review
- Add fetch-depth: 0 to checkout step

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… main

- ai-review.yml: replace AUTO_PR_TOKEN with GITHUB_TOKEN (permissions block
  already scopes it correctly — no broad PAT needed)
- ai-review.yml: add --ignore-scripts to npm install to block malicious
  postinstall hooks from transitive dependencies
- sync-dev.yml: new workflow — after every push to main, merge main back into
  dev so release-bot version bumps don't cause recurring README conflicts on
  the next dev -> main PR

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Move @anthropic-ai/sdk and @octokit/rest into devDependencies and switch the
ai-review workflow from bare npm install to npm ci --ignore-scripts. This
locks all transitive dependencies to the committed lockfile, eliminating
supply-chain drift on every CI run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- context-sniper.ts (new): shared RiskMetadata type, smartTruncate,
  extractContext (returns {snippet, lineIndex}), computeRiskMetadata
- native.ts: import from context-sniper, use .snippet on extractContext calls
- core.ts: add tier to evaluatePolicy returns; compute riskMetadata once in
  authorizeHeadless; pass it to initNode9SaaS, askDaemon, notifyDaemonViewer
- daemon/index.ts: store and broadcast riskMetadata in PendingEntry
- daemon/ui.html: renderPayload() uses riskMetadata for intent badge, tier,
  file path, annotated snippet, matched-word highlight; falls back to raw args

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When the Node9 cloud responds with shadowMode:true (org is in shadow
mode), print a yellow warning to stderr instead of blocking the agent.
The developer sees exactly why it was flagged without being interrupted.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…er, and integration tests

- Add Zod v3 schema validation (config-schema.ts) with clear error messages for
  bad config.json — catches literal newlines, invalid regex, unknown keys, bad enums
- Fix silent JSON parse fallback in tryLoadConfig: bad config now warns to stderr
  instead of silently using DEFAULT_CONFIG (which had cloud:true causing unexpected
  browser/cloud popups when config was invalid)
- Fix auditLocalAllow fire-and-forget killed by process.exit: audit mode path now
  awaits the POST so SaaS receives the event before the process exits
- Gate all auditLocalAllow calls on approvers.cloud so cloud:false (privacy mode)
  never sends data to SaaS
- Fix double browser windows when cloud+browser both enabled: RACER 3 (browser)
  now skips when cloudEnforced, preventing duplicate daemon /check entries
- Fix calledFromDaemon guard on terminal status messages to prevent duplicate output
- Add check.integration.test.ts: 20 end-to-end tests spawning real node9 check
  subprocess with isolated HOME dirs and in-process mock SaaS server

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…plied

Previously tryLoadConfig warned about invalid fields (e.g. mode:"bad-mode")
but still returned the raw object, letting them override valid values from
higher-priority config layers. A project-level node9.config.json with
mode:"bad-mode" would override the global mode:"audit", bypassing audit mode
and triggering the full cloud approval race unexpectedly.

sanitizeConfig() now drops top-level keys that fail Zod validation so invalid
project configs cannot corrupt the effective merged config.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…l test, improve comments

- Remove unreliable 200ms sleep from audit+cloud test: auditLocalAllow is
  awaited before process.exit so the POST is done by the time the subprocess
  closes; if it ever races here it would be a production bug too
- Add task* wildcard test: task_drop_all_tables must be fast-pathed to allow
  (documents the intentional security trade-off of user-configured wildcards)
- Expand runCheck docstring explaining why cwd=tmpHome is needed alongside
  HOME=tmpHome (avoids inheriting the repo's own node9.config.json)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…noredTools precedence tests

- Add `let resolved = false` guard in runCheckAsync to prevent double-resolve
  when child.kill() is called on timeout (close event fires after kill)
- Fix mockServer.close() in afterEach to return a Promise (was fire-and-forget)
- Document NODE9_TESTING=1 behavior in file header comment
- Add runCheck/runCheckAsync raw string support for malformed payload testing
- Add section 10: malformed JSON payload tests (non-JSON, empty, partial JSON)
- Add ignoredTools precedence test: task* wildcard + dangerous word in input
  documents that ignoredTools fast-path bypasses dangerousWords (by design)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The CLI intentionally exits 0 on unparseable JSON (fail-open policy):
a transient serialization error must not block the AI session mid-flight.
The test was asserting the opposite. Updated all three malformed-payload
tests to verify graceful failure (no crash/stack trace) rather than an
error exit code.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
core.ts:
- fix review-git-push regex: literal space → \s+ so "git  push" can't bypass
- fix getConfig(): environments block was always hardcoded {} and never merged
  from global/project config files; now applyLayer() accumulates environments
  correctly so strict-mode env overrides actually work

examples/node9.config.json.example:
- remove dangerousWords that caused false positives; keep only mkfs + shred
  (catastrophic, unambiguous — everything else handled by smartRules)
- add enterplanmode/enterworktree/exitworktree to ignoredTools
- add execute_query, query, mcp__postgres__*, mcp__github__* to toolInspection
- fix allow-readonly-bash regex: "npm run(build|test)" → "npm run (build|test)"
  (was matching "runbuild"/"runtest" instead of "run build"/"run test")
- remove smartRules already covered by built-in defaults:
  review-delete-without-where, block-force-push, block-drop-database, review-sudo
- remove "push"/"git" rules entries (match tool *names*, never fire for bash)
- remove non-functional environments block (was silently ignored until above fix)
- add approvalTimeoutMs:30000, version:"1.0", expanded snapshot.tools + ignorePaths

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ule bypass, tests

core.ts:
- move mergedEnvironments declaration before applyLayer closure so the captured
  variable is clearly in scope (was hoisted at runtime but misleading to read)
- replace unsafe Partial<EnvironmentConfig> spread with field-level validation:
  only copies requireApproval when it is a boolean, ignoring any other input
- add version guard in tryLoadConfig: warns to stderr when a config file declares
  a version other than "1.0" so future schema changes can be caught early

examples/node9.config.json.example:
- add notMatches (&&|\|\||;\s*\S) condition to allow-readonly-bash rule so
  chained commands like "cat /etc/passwd && rm -rf /" are NOT fast-allowed
- tighten review-secrets-write file_path regex: add (^|[/\\]) path separator
  anchor so "notmy.env" no longer matches — only actual dotenv files do

tests (advanced_policy.test.ts):
- 6 new tests for allow-readonly-bash: verifies safe commands are allowed,
  &&/||/; chaining is rejected, pipe-only chains remain allowed
- 4 new tests for environments merge: project overrides global, type-unsafe
  values are dropped, multiple envs merge independently

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ets detection

- allow-readonly-bash: add notMatches for $() and backtick to prevent
  command substitution bypass
- allow-install-devtools: add notMatches guard for -g/--global flags
- review-secrets-write: change to conditionMode any and add path/filename
  field checks alongside file_path
- Add review-command-substitution rule (catches $() and backtick)
- Add review-global-install rule (catches npm/yarn/pnpm -g/--global)
- Add tests: $() bypass, backtick bypass, npm install -g, --global,
  review-secrets-write multi-field (path, filename), version mismatch
  warning/rejection paths

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…n guard, secrets rule rename

- Restore rm and drop to dangerousWords (security regression from previous cleanup)
- Fix allow-readonly-bash notMatches: change ;\s*\S to bare ; so a
  trailing semicolon with no following content cannot bypass the guard
- Rename review-secrets-write → flag-secrets-access to reflect that it
  covers reads as well as writes; update reason string to match
- Add .env.bak test: dotfile backup IS flagged, notmy.env.bak is NOT
- Update all test fixtures to use ; instead of ;\s*\S to stay consistent
  with the corrected example config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix block-rm-rf-home: use two-condition AND (conditionMode:'all') instead
  of a single complex regex that never matched rm -rf ~/foo in practice
- Narrow review-write-etc to write-indicative ops (tee, cp, mv, >, install)
  to avoid approval fatigue on read-only commands like cat/grep /etc/*
- Use Set for shield.dangerousWords in disable (O(1) vs O(n) lookup)
- Add shields.test.ts: 31 tests covering alias resolution, readActiveShields
  validation/corruption, atomic writes, and regex correctness including
  known bypass patterns

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Contributor

🤖 Claude Code Review

Code Review: Node9 Proxy — Shields Feature

Correctness & Edge Cases

disable has a scoping bug: remaining is declared twice — once inside the if (config.policy...) block and again outside it. The inner remaining is used for protectedWords, which is correct, but this shadowing is confusing and error-prone if the block structure changes.

readActiveShields is not used in enable/disable for config writes: The flow reads active shields, mutates config, writes config, then reads active shields again for writeActiveShields. There's a TOCTOU window between these two reads if concurrent CLI invocations race. The code acknowledges this with a comment, but it's worth flagging: the active list read after writeRawConfig could already be stale.

readRawConfig calls process.exit(1) on parse errors — this is a side effect inside what looks like a pure read function. The function signature returning Record<string, unknown> | null doesn't signal the exit behavior, which will surprise callers and makes unit testing impossible without mocking process.exit.

Security Issues

No validation of shield.smartRules or shield.dangerousWords before merging into config. The enable command trusts SHIELDS data directly, which is fine since it's compiled-in — but the existing config's smartRules array entries are cast with as Array<{ name?: string }> without validation. A malformed config could survive the filter and persist corrupt rule objects.

writeRawConfig writes config.json as pretty-printed JSON with no schema validation. An attacker with write access to ~/.node9/config.json could inject arbitrary policy rules — but this is a pre-existing concern, not introduced here.

The readActiveShields whitelist filter is good — unknown names are filtered out. The test correctly verifies this.

Test Coverage Gaps

shield enable and shield disable CLI commands have no tests. The test file only covers shields.ts exports. The config read/write/merge logic in cli.ts is untested, including the dangerousWords deduplication and smartRule prefix-filtering logic — which is where bugs are most likely.

The "disable word-protection" test in shields.test.ts is a tautology — it uses the same shield as both the "words to remove" set and the "protected" set, so the filter always returns everything. It doesn't actually test the cross-shield word protection logic from cli.ts.

No test for enable idempotency — calling enable twice on the same shield should not duplicate rules.

TypeScript Type Safety

as Array<{ name?: string }> and as string[] casts in cli.ts bypass runtime safety. These should at minimum be guarded with Array.isArray() checks before casting, consistent with how readActiveShields handles this in shields.ts.

Performance

No concerns on the critical path — shields are loaded at startup, not per-request.

Minor

The diff is truncated mid-rule (github:block-force-push), so the full SHIELDS definition can't be fully reviewed. The regex git\s+push.*(--force|-f\b|--force-with-lease) should be verified — \b inside a character class or string passed to RegExp constructor needs double-escaping; confirm it's \\b in the source.

⚠️ Note: This diff exceeded 20,000 characters and was truncated. The review above covers only the first portion of the changes.


Automated review by Claude Sonnet

… file permissions, test correctness

- Fix `remaining` declared twice in disable handler (hoisted before the if block)
- Replace unsafe `as T | undefined` casts with Array.isArray() guards in enable/disable
- Add { mode: 0o600 } to writeActiveShields and writeRawConfig (consistent with credentials/config writes)
- Remove existsSync before mkdirSync in writeActiveShields (recursive:true is idempotent, existsSync added a TOCTOU window)
- Fix tautological word-protection test: use concrete sets to exercise both the "survives" and "removed" branches
- Add second word-protection test: words unique to disabled shield are fully removed
- Add undefined guard in regex test helpers (fixes TS2769 typecheck error)
- Add enable idempotency tests for smartRules and dangerousWords deduplication

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Contributor

🤖 Claude Code Review

Code Review: Node9 Proxy — Shield Templates Feature

Correctness & Edge Cases

TOCTOU in enable/disable flow: readActiveShields() and writeActiveShields() are called separately from readRawConfig()/writeRawConfig(). Two files are updated non-atomically. A crash between the two writes leaves them inconsistent — e.g., active shields list says postgres is enabled but config.json has no rules for it. These should be a single atomic write or at minimum the active shields list should be derived from config.json at read time rather than stored separately.

disable skips writeRawConfig when policy is absent: If config.policy doesn't exist, writeRawConfig is never called but writeActiveShields still runs. The active shields file now claims the shield is disabled while config was never confirmed consistent. Low risk today, but fragile.

readRawConfig calls process.exit(1) from a library-style function: This is called from CLI action handlers, but the function is defined as a general utility. Any future non-CLI consumer (e.g., daemon importing shields logic) would get a hard process exit on a bad config file. Throw instead and let the caller decide.

Security Issues

Unvalidated smartRules round-trip: When merging rules, the existing policy.smartRules array is read from the config file and spread directly with only name filtering. If the config was tampered with, malicious rule objects bypass the shield definitions entirely. The existing rules should be re-validated against known-good SmartRule shape before merging.

dangerousWords read from config without validation: existingWords is cast as string[] but individual elements aren't type-checked. A non-string element (e.g., from a corrupted/malicious config) gets spread into the policy. Add element-level filtering (same pattern used correctly in readActiveShields).

File mode 0o600 only on the temp file: renameSync preserves the mode of the temp file, which is correct. But if the destination already exists with broader permissions (e.g., 0o644 from a previous tool), rename replaces it — this is actually fine. No issue here, just confirm this is intentional.

Race Conditions / Async

Last-writer-wins acknowledged but not mitigated: The comment admits there's no advisory lock. For a security tool where two concurrent shield enable calls could silently drop rules from each other, this deserves at minimum a retry-with-backoff or a file lock (e.g., proper-lockfile). The random-suffix tmp file helps with write corruption but not with read-modify-write races.

TypeScript Type Safety

config.policy cast chain is unsafe: config.policy as Record<string, unknown> then individual fields cast as Array<{name?: string}> and string[] — none of these are validated. A Zod schema or manual validation function at the config boundary would be appropriate for a security tool.

The diff truncates shields.ts at line 271 — the actual SmartRule regex patterns can't be reviewed. The regex tests in the test file look reasonable but the source regexes are unverifiable from this diff.

Test Coverage Gaps

  • No test for the concurrent write race (expected given complexity, but worth noting)
  • No test for writeRawConfig when ~/.node9/ doesn't exist yet
  • No test covering the TOCTOU scenario between readRawConfig and writeActiveShields
  • enable and disable CLI actions are untested end-to-end (only unit-testing the underlying utilities)

Minor

process.exit(1); return; pattern after resolveShieldName fails — the return is unreachable but harmless. Remove for clarity.


Bottom line: The utility functions and tests are well-structured. The two critical issues to fix before merge are the TOCTOU between the two file writes and the unvalidated rule/word round-trip from disk — both are meaningful security gaps for a tool whose entire value proposition is being a trusted security layer.

⚠️ Note: This diff exceeded 20,000 characters and was truncated. The review above covers only the first portion of the changes.


Automated review by Claude Sonnet

…n mutation

Shields are now applied as a dedicated layer inside getConfig(): readActiveShields()
reads ~/.node9/shields.json, maps each name to the in-memory SHIELDS catalog, and
appends their smartRules/dangerousWords to the runtime policy.

enable/disable now only write shields.json — config.json is never touched.
This eliminates the two-file TOCTOU, the merge/unmerge complexity, and the
read-modify-write race condition. Shield rules also update automatically when
the catalog changes in a new binary release.

Deleted: SHIELD_CONFIG_PATH, readRawConfig, writeRawConfig, all merge/unmerge logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Contributor

🤖 Claude Code Review

Code Review: Node9 Proxy — Shield Templates Feature

Correctness & Edge Cases

core.ts — dangerousWords deduplication missing
The shield layer deduplicates smartRules by name but blindly pushes dangerousWords without any deduplication. If two active shields share a word, or if the user's config already contains that word, it will appear multiple times. The test file even validates dedup logic for words — but core.ts doesn't implement it.

core.tsgetConfig() is called on every tool check
readActiveShields() does a synchronous fs.readFileSync on every single policy evaluation. This is a new disk read on the critical path for every AI tool call. The shields file should be cached (with invalidation on write, or TTL-based).

cli.ts — redundant non-null assertions
After if (!name) guards, name! is used repeatedly. The ! is harmless but noisy — name is already narrowed to string. Same for getShield(name!)! — the ! on getShield suppresses a legitimate null check unnecessarily since the catalog is guaranteed to have it post-resolve, but it's still a code smell.

Security Issues

readActiveShields() — no path validation on shield names
The test confirms unknown names are filtered out, which is good. However, the filtering relies on resolveShieldName or a known-names check in the implementation (not shown fully). Confirm the implementation validates against the catalog and doesn't use the raw string from disk as a lookup key in any filesystem operation.

Shield rules match on field: 'sql' with tool: '*'
Matching any tool on a sql field means these rules fire even for tools that would never have a sql field. This is benign (they'd just not match), but it's imprecise and could cause confusion when debugging why a rule didn't trigger.

block-drop-column regex
ALTER\s+TABLE.*DROP\s+COLUMN with .* is greedy and can span across a multi-statement string. Consider [\s\S]*? with anchoring or multiline mode — though for blocking this is fail-safe.

Race Conditions / Async

writeActiveShields atomic rename + concurrent reads
The atomic write (temp file + rename) is good practice. However, since readActiveShields in getConfig() is on the hot path and unsynchronized, a concurrent enable/disable during an active tool check is a benign TOCTOU — acceptable, but worth a comment.

TypeScript Type Safety

listShields return type
Not visible in the diff, but listShields() in tests is called without type checking the return — ensure it's typed as ShieldDefinition[], not any.

Test: matches() helper
c.value === undefined check is good, but c.flags could also be undefined — new RegExp(c.value, undefined) is valid JS but worth being explicit.

Test Coverage Gaps

  • No test for the core.ts shield layer integration (i.e., that getConfig() actually merges shield rules correctly).
  • No test for dangerousWords deduplication in getConfig() — only the CLI disable logic is tested.
  • shield status and shield enable/disable CLI commands have zero test coverage.
  • No test for concurrent readActiveShields during a write.

Minor

README diff is fine — cosmetic roadmap update only.

Summary: The main actionable issues are: (1) missing dangerousWords dedup in core.ts, (2) synchronous disk read on every tool-call hot path, and (3) no integration test for the getConfig() shield merge.

⚠️ Note: This diff exceeded 20,000 characters and was truncated. The review above covers only the first portion of the changes.


Automated review by Claude Sonnet

- github shield: remove block-force-push (exact duplicate of built-in block-force-push)
  Rename review-delete-branch to review-delete-branch-remote, narrow it to only
  cover `git push --delete` (the part not already caught by built-in review-git-destructive)
- filesystem shield: remove mkfs from dangerousWords (already in built-in DANGEROUS_WORDS)

Shields should only add coverage beyond what built-ins already provide.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Contributor

🤖 Claude Code Review

Code Review: Node9 Proxy — Shields Feature

Correctness & Logic Issues

getConfig() — dangerousWords deduplication missing
Smart rules are deduplicated by name, but dangerousWords are pushed unconditionally. If two active shields share a word, or if the user's config already contains it, duplicates accumulate on every getConfig() call. Since getConfig() is on the critical path for every tool call, this is a correctness and performance issue. Apply the same Set-based dedup used for sandboxPaths just below.

getConfig() is called on every tool call — readActiveShields() does a synchronous disk read each time
Shields are re-read from disk on every invocation. There's no caching. For a high-frequency agent this adds latency and unnecessary I/O on the hot path. Even a process-lifetime cache with a file-watcher invalidation would be better.

Non-null assertions after null guard in cli.ts

const name = resolveShieldName(service);
if (!name) { ... process.exit(1); }
const shield = getShield(name!)!; // redundant ! — already narrowed

The ! operators are noise after the guard, but more importantly getShield(name!)! double-asserts. If getShield somehow returns null (catalog bug), this silently crashes. Assert explicitly or handle the case.

Security Issues

Shield rules applied after user config "cannot be overridden" — but dangerousWords can be
The comment says shields override user config, but a user config that clears dangerousWords: [] would be merged first, then shield words appended. However, a user config smartRules entry with the same name as a shield rule would be kept (shield rule skipped due to dedup). That means a compromised project config can neutralize shield smart rules by pre-seeding their names. This is a meaningful security gap for a tool claiming to be an "execution security layer."

readActiveShields() trusts file content after filtering — but only validates against known shield names
This is good. No issue here.

Atomic write in writeActiveShields uses crypto (imported in shields.ts) for the temp filename?
crypto is imported but its usage isn't shown in the diff. If it's used for the temp file name, that's fine. If it's unused, remove it.

Race Conditions / Async

writeActiveShields is synchronous and atomic (rename), which is correct. No issue.

readActiveShields in getConfig() on concurrent tool calls — multiple simultaneous calls could read a partially-written file between writeFileSync and renameSync. The atomic rename mitigates this for writes, but readers between the two calls see the old state. This is acceptable given OS-level rename atomicity on POSIX.

TypeScript Type Safety

getShield(name!)! double non-null assertion — noted above. Minor but indicative.

condition.value === undefined check in tests throws an error — this is a test-only concern, but if value is typed as optional in SmartRule, the production evaluation path should also handle undefined gracefully rather than relying on tests to catch it.

Test Coverage Gaps

  • No test for getConfig() shield layer integration — the most critical path.
  • No test for the dangerousWords deduplication (because it doesn't exist yet).
  • No test for shield name collision: user rule with same name as a shield rule suppressing the shield rule.
  • writeActiveShields test verifies atomic write mechanics but not that the file is read back correctly end-to-end.

Minor

  • resolveShieldName alias 'git''github' is ambiguous; git is a local tool, github is a hosted service. This could confuse users enabling the wrong shield.
  • README diff looks fine.

Summary: The core logic is sound and the test suite is well-structured, but there are three actionable issues before merge: dangerousWords dedup in getConfig(), the per-call disk read on the hot path, and the shield-name-collision security bypass.

⚠️ Note: This diff exceeded 20,000 characters and was truncated. The review above covers only the first portion of the changes.


Automated review by Claude Sonnet

Replace the flat list of 6 config concepts with a clear two-layer model:
- Layer 1 (always on): core smart rules protect git, SQL, shell for everyone
- Layer 2 (shields): opt-in per-service protection via node9 shield enable

Changes:
- Add "How Protection Works" section with Layer 1/Layer 2 tables
- Quick Start now shows shield enable as step 2 (primary onboarding action)
- Rename "Configuration" to "Custom Rules (Advanced)" — signals power-user territory
- Remove rules array from config examples (legacy)
- Remove "built-in default smart rule" section (internal detail)
- Mark Shield Templates as done [x] in roadmap
- Simplify node9 explain output example

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Contributor

🤖 Claude Code Review

Code Review: Node9 Proxy — Shields Feature

README Changes

These are documentation-only cleanup changes. The reorganization is clear and the new Layer 1/Layer 2 framing is a genuine usability improvement. No issues.

One minor concern: removing the "Resolution Waterfall" section loses documentation about the 5-tier precedence model, which is security-relevant behavior. Users need to understand that cloud locks cannot be bypassed locally. Consider preserving this elsewhere rather than deleting it entirely.


src/__tests__/shields.test.ts

The test file is cut off mid-diff, so this review is necessarily incomplete — the actual shields.ts implementation is not included. That's a significant gap; the implementation is what matters for security correctness.

Positive observations:

  • Atomic write test (writeActiveShields) correctly verifies the rename pattern
  • Input validation tests for corrupted/malformed shield state files are thorough and security-relevant
  • Case-insensitive alias resolution is tested
  • Idempotency tests for rule merging are good

Issues:

  1. Incomplete diff reviewed. The shields.ts source isn't included. The idempotency test for smart rule deduplication (lines ~170–185) tests logic that's inline in the test itself, not the actual implementation. This test cannot catch bugs in the real enable function — it's testing a local simulation. This needs to test the actual enableShield() function.

  2. readActiveShields trust boundary. The test correctly validates that unknown names are filtered — good. But verify the implementation uses strict equality against the known shield list, not something like includes on a user-controlled array.

  3. writeActiveShields test doesn't verify mkdirSync is called. If the config directory doesn't exist and mkdirSync is never asserted, a real-world failure path (first-time setup) goes untested.

  4. Missing: concurrent write test. This is a security tool. If two processes call writeActiveShields simultaneously, the atomic rename mitigates corruption, but there's no test verifying the temp file path is unique (e.g., uses PID or random suffix) rather than a fixed name.

  5. Missing: shield rule content tests. The AWS and GitHub shield rules (blocking S3 deletion, IAM changes, etc.) have no regex correctness tests in this diff. Given the security stakes, each shield's rules should have positive and negative match tests — similar to the filesystem block at the bottom.

  6. readFileSyncSpy mock leakage risk. vi.clearAllMocks() is called in beforeEach, but the spy on os.homedir is set at module scope before the import. If test order changes or modules are cached differently, the mock may not apply. Prefer vi.mock('os', ...) at the top level.

  7. No test for readActiveShields throwing a non-ENOENT error (e.g., permission denied). The implementation should likely re-throw or handle EACCES differently from ENOENT — a permissions error on the shields file is a security signal, not a "no shields active" state.


Summary

The README changes are fine. The test file has structural gaps (no implementation diff, simulated deduplication logic, missing edge cases for non-ENOENT errors and concurrent writes) that need addressing before merge. The missing shields.ts implementation is the biggest blocker — cannot approve security-critical logic without reviewing it.

⚠️ Note: This diff exceeded 20,000 characters and was truncated. The review above covers only the first portion of the changes.


Automated review by Claude Sonnet

…esGlob ops

- Remove legacy `rules` (action/allowPaths/blockPaths) from Config, schema, and both evaluators
- Add `matchesGlob` / `notMatchesGlob` condition operators using picomatch
- Move block-rm-rf-home into built-in DEFAULT_CONFIG smartRules (evaluated first)
- Add ADVISORY_SMART_RULES (allow-rm-safe-paths, review-rm) appended last in getConfig()
  so user-defined rules can override default rm behaviour
- Pattern `(^|&&|\|\||;)\s*rm\b` covers chained commands without false-positives on `docker rm`
- Update README operators table with matchesGlob / notMatchesGlob
- Update all affected tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Contributor

🤖 Claude Code Review

Code Review

README.md Changes

Purely documentation — looks good. The restructuring is cleaner and more user-friendly. The new matchesGlob/notMatchesGlob operators added to the conditions table are a notable addition worth flagging below.


src/__tests__/advanced_policy.test.ts

Issue — Silent test assumptions: Several tests now rely entirely on built-in advisory rules firing without any config mock, but beforeEach sets up fs.existsSync as returning false by default. If evaluatePolicy short-circuits config loading when the file doesn't exist, these tests only pass because the built-in rules happen to be correct — there's no assertion that the reason or rule name is the expected built-in one. A future refactor that changes built-in rule priority could silently break these tests without them failing on the right property.

Recommendation: Assert result.rule or result.reason in addition to result.decision so tests are pinned to the specific built-in rule, not just the outcome.

Issue — Overlapping assertion in the last test block:

expect((await evaluatePolicy('Bash', { command: 'rm -rf dist/' })).decision).toBe('allow');

This runs with the rm .env block config still active. The config mocks .env specifically, so dist/ should fall through to advisory rules — but this isn't documented and the mock scope isn't reset between assertions in the same test. Fragile.


src/__tests__/gemini_integration.test.ts

Issue — rules: [] removal: Removing rules: [] from the mock config base is fine if the rules field is no longer part of the schema, but if evaluatePolicy in core.ts still reads a rules key, this silently changes test behavior rather than surfacing a type error. There should be a TypeScript compile-time check that rules no longer exists on the config type.


src/__tests__/shields.test.ts (truncated)

The diff cuts off at line 239. Cannot review this file. If shields interact with the race engine or daemon endpoints, missing coverage there is a security gap — shield bypass during race conditions would be a critical flaw.


Cross-cutting Concerns

matchesGlob/notMatchesGlob operators (documented but no code diff shown): These are advertised in the README but no implementation is visible in this diff. If the glob matching uses a user-supplied pattern fed into something like minimatch or micromatch, there's a potential ReDoS risk on the critical path. Needs review when the implementation lands.

No test for shield-vs-Layer1 interaction ordering: The README states Layer 1 is "always on" and shields are additive, but there's no test asserting that a shield rule cannot downgrade a Layer 1 block to review or allow. This is a security correctness gap.

Missing negative test: There's no test verifying that a project smartRule with verdict: "allow" cannot override a built-in block rule. That would be a policy bypass vulnerability.


Summary

The test migration from rules-based to smartRules-based policy is directionally correct but introduces fragile implicit dependencies on built-in rule behavior. The truncated shields test file and missing operator implementation are the biggest unknowns. The security-critical ordering invariant (Layer 1 cannot be downgraded) needs explicit test coverage before merge.

⚠️ Note: This diff exceeded 20,000 characters and was truncated. The review above covers only the first portion of the changes.


Automated review by Claude Sonnet

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Contributor

🤖 Claude Code Review

Code Review

README.md

Changes look good — cleaner structure, better onboarding flow. The new matchesGlob/notMatchesGlob operators are documented but I don't see them implemented in this diff. If they're not yet in core.ts, documenting them now is misleading for a security tool where users will write rules expecting these to work.

The removal of the Resolution Waterfall section is a regression in documentation. Users need to understand the 5-tier precedence model to reason about whether their local rules can be overridden — this is security-critical information, not cosmetic.


src/__tests__/advanced_policy.test.ts

Critical gap: The test 'a project smartRule can block rm on a sensitive path before advisory rules fire' asserts both block and allow in a single it block with a single mocked config. The config only includes the rm.*\.env block rule, but the second assertion relies on allow-rm-safe-paths being a built-in rule. If the built-in rule is ever removed or renamed, this test still passes the first assertion and silently fails the contract. These should be separate test cases.

Regex escaping concern: The condition 'rm.*\\.env' in the test config is passed as a JSON string in code. Verify that core.ts compiles this with new RegExp(value, flags) and not eval — the diff doesn't show this, but it's the right place to check.

Missing negative test: There's no test verifying that a built-in advisory rule (verdict: review) cannot be escalated to block by additive merging. This is the core security invariant of the waterfall model.


src/__tests__/gemini_integration.test.ts

The removal of rules: [] from the mock base config is fine if the rules field is fully deprecated. But if core.ts still reads policy.rules, removing it from the mock means tests no longer verify that an empty rules array doesn't accidentally trigger a match. Low risk, but worth confirming rules is dead code.


src/__tests__/shields.test.ts

The file is truncated — it ends at impo (mid-import). This is the most significant issue in the diff. A partial test file will cause the test suite to fail to compile entirely, blocking CI. This should not be merged.


Summary of Blocking Issues

  1. shields.test.ts is truncated — will break the build.
  2. matchesGlob/notMatchesGlob operators are documented but not shown implemented — do not ship docs for unimplemented security features.
  3. Resolution Waterfall documentation removed — restore it; precedence rules are load-bearing security knowledge for users writing configs.

⚠️ Note: This diff exceeded 20,000 characters and was truncated. The review above covers only the first portion of the changes.


Automated review by Claude Sonnet

- Add tests proving built-in block rules (block-rm-rf-home, block-force-push)
  cannot be bypassed by a user-defined allow rule
- Restore Configuration Precedence section to README with 5-tier waterfall
  and note that built-in blocks always fire before user rules

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Contributor

🤖 Claude Code Review

Code Review

README.md

Changes are documentation-only reorganization. Content is clearer and better structured. One minor concern: the new "Configuration Precedence" section says "first-match-wins: built-in defaults → global → project → shields → advisory defaults" — this order implies built-in defaults fire first, which should mean they can't be overridden by project config. But earlier you say "Higher tiers win" with env vars at tier 1. These two statements appear contradictory and could confuse users trying to understand whether project rules can override built-ins. Clarify which direction the precedence actually flows in the source.

The removal of the undo example output and the Resolution Waterfall section loses genuinely useful documentation without explanation. At minimum the waterfall concept (now the "Configuration Precedence" table) survives, but the undo diff example was concrete and helpful.

src/__tests__/advanced_policy.test.ts

Correctness concern: The test 'a project smartRule can block rm on a sensitive path before advisory rules fire' sets up a mock config with a block rule for .env, then in the same test asserts rm -rf dist/ returns allow. The dist/ assertion is testing advisory built-in behavior but shares the mocked config from the .env block rule setup — this is fine only if smartRules don't interfere with the allow-rm-safe-paths advisory rule. This assumption should be explicit in a comment, and ideally tested in isolation.

Coverage gap: There's no test asserting that a project-level allow rule cannot override a Layer 1 built-in block. The README claims this invariant ("a user config cannot bypass Layer 1 protection") but the new core.test.ts section is cut off in the diff at the comment // ── Layer 1 security invariant ─────────────────── — we can't verify those tests actually exist. This is the most critical behavioral guarantee of the system and needs visible test coverage.

Missing negative case: The old test 'blocks "rm -rf .env" using explicit blockPaths' verified a block verdict. The replacement downgrades the expectation to review. If .env deletion is now only review by default (not block), that's a security regression that should be explicitly called out and justified, not quietly changed via a test rename.

src/__tests__/core.test.ts

The diff is truncated — the new Layer 1 invariant tests are not visible. Cannot confirm they exist or are correct. This is a red flag for a security-critical PR; the full diff should be required before merging.

Summary

The main concerns in priority order:

  1. Truncated test diff — Layer 1 invariant tests must be visible and reviewed.
  2. .env block → review regression — needs explicit justification.
  3. README precedence contradiction — "higher tiers win" vs "first-match-wins from defaults up" need reconciliation.
  4. Mixed-concern test assertions in the dist/ case.

Do not merge until the core.test.ts diff is complete.

⚠️ Note: This diff exceeded 20,000 characters and was truncated. The review above covers only the first portion of the changes.


Automated review by Claude Sonnet

DLP Engine (src/dlp.ts):
- 7 built-in patterns: AWS key, GitHub token, Slack, OpenAI, Stripe, PEM, Bearer
- Recursive scanner with depth limit (5) and string length cap (100 KB)
- JSON-in-string detection for agents that stringify nested objects
- maskSecret() — only redacted sample stored, full secret never leaves dlp.ts
- severity: 'block' for known high-confidence patterns, 'review' for Bearer

Core integration (src/core.ts):
- DLP check runs before ignoredTools fast-path and audit mode
- Hard block for 'block' patterns; 'review' falls through to race engine
- DLP step 0 in explainPolicy waterfall
- dlp config merged per-layer in getConfig() with enabled/scanIgnoredTools

CLI (src/cli.ts):
- DLP-specific negotiation message (rotate the key, use env vars, don't retry)
- chalk.bgRed.white.bold alarm banner when blockedByLabel includes 'DLP'

Cursor fix (src/setup.ts):
- Remove hooks.json writing — Cursor does not support this format
- Print clear warning that native hook mode is pending Cursor support
- Only MCP proxy wrapping is configured

Shields fix (src/shields.ts):
- Treat empty shields.json as missing (suppress spurious parse warnings in tests)

Tests: 353 passing (22 new DLP tests, fake secrets split via concatenation)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Contributor

🤖 Claude Code Review

Code Review

README Changes (Documentation Only)

Clean restructuring — clearer onboarding flow, better separation of Layer 1 vs Layer 2 concerns. Minor note: the new "Configuration Precedence" table says "first-match-wins: built-in defaults → global → project → shields" but elsewhere implies higher tiers win. That ordering is counterintuitive and worth a clarifying sentence to avoid user confusion.

The removal of the node9 init step from the quickstart could leave users without a ~/.node9/config.json, causing silent failures. Verify the binary handles the missing file gracefully before removing this from docs.


advanced_policy.test.ts

Correctness concern — incomplete test migration:
The new test 'allows "rm -rf node_modules" via built-in allow-rm-safe-paths rule' relies entirely on a built-in rule named allow-rm-safe-paths existing and firing correctly. There's no assertion on the reason field or rule name, so if the built-in rule is renamed or removed, this test still passes vacuously (it only checks decision === 'allow'). Add expect(result.rule).toContain('allow-rm-safe-paths') or similar.

Compounding assertion in one test:

expect((await evaluatePolicy('Bash', { command: 'rm -rf dist/' })).decision).toBe('allow');

This is a second scenario inside 'a project smartRule can block rm...'. Split it — mixing a "project rule blocks" scenario with an "advisory allows" scenario makes failure diagnosis harder.

Missing mock teardown between assertions:
The mock config is set up once but two evaluatePolicy calls are made. If evaluatePolicy has internal caching, the second call may or may not re-read the config. Verify the config cache is reset between calls, or add explicit vi.clearAllMocks() between them.


core.test.ts (partial diff)

The diff is cut off at the // ── Layer 1 security invariant ─────────────────── comment. This is the most security-critical addition and it's entirely missing from the review. Cannot assess the invariant tests without seeing them. If these tests are meant to assert that Layer 1 rules cannot be bypassed by user config, their absence from the diff is a significant gap — reviewer cannot confirm correctness.


Security-Specific Concerns

  1. matchesGlob / notMatchesGlob operators are documented but not visible in the test diff. Glob matching in security rules is a known footgun — **/node_modules/** won't match node_modules/ (no trailing **). Confirm the glob library used handles edge cases and add tests for boundary patterns.

  2. New notMatches regex description (flags optional) — ensure the engine doesn't throw on a notMatches condition with no flags key vs. an explicit flags: undefined. A thrown exception here would likely default to allow, which is dangerous.

  3. The evaluation order comment ("built-in defaults → global → project → shields") means a project allow rule fires after built-in block rules — good. But it also means a project block rule fires before shield block rules. Confirm this is intentional and document it explicitly.


Summary

The README restructuring is solid. The test migration has three actionable issues: missing rule-name assertions, compound assertions in single tests, and the truncated Layer 1 invariant tests are the biggest gap. No source code logic changes are visible in this diff, so the core security engine itself cannot be reviewed here.

⚠️ Note: This diff exceeded 20,000 characters and was truncated. The review above covers only the first portion of the changes.


Automated review by Claude Sonnet

…rity

advanced_policy.test.ts:
- Add ruleName assertion to allow-rm-safe-paths test (reviewer: vacuous test)
- Split compound test into two focused its (block vs allow scenarios)
- Add name field to project smartRule fixture so ruleName is assertable

core.ts:
- Return ruleName on allow verdict (was only returned for block/review)

core.test.ts:
- Add matchesGlob tests: path matching, boundary patterns, notMatchesGlob
- Add notMatches-no-flags tests: no throw + correct evaluation
- Fix notMatchesGlob test: add explicit block rule so fallthrough is observable

README.md:
- Add clarifying note: settings override order (Tier 1 wins) and smart rules
  evaluation order (defaults first) run in opposite directions — plus note that
  project block fires before shield block by design

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Contributor

🤖 Claude Code Review

Code Review: Node9 Proxy PR

README Changes (Low Risk)

The documentation restructuring is clean and generally improves clarity. A few observations:

Correctness concern — contradictory evaluation order explanation:
The new "Configuration Precedence" note describes two opposing orderings in the same section and acknowledges they "run in opposite directions." This is confusing enough that it could cause users to misconfigure security rules believing their project-level allow rule will fire before built-in blocks. Consider a clearer diagram or concrete example rather than a prose caveat.

Missing documentation:
The node9 init command remains in the CLI reference table but the quick-start section no longer mentions it. Minor inconsistency.

matchesGlob / notMatchesGlob operators are added to the docs but there's no corresponding code change visible. If these operators aren't implemented yet, documenting them in a security tool is a risk — users may rely on them and get a silent non-match instead of an error.


Test Changes (advanced_policy.test.ts)

Correctness — implicit reliance on built-in rules:
The new tests remove mock configs and assert against built-in advisory rule names like 'allow-rm-safe-paths' and 'review-rm'. This tightly couples tests to internal rule names. If a built-in rule is renamed or reordered, tests fail without any policy regression — they become maintenance noise rather than security signal.

Missing negative assertion:
The test 'advisory allow-rm-safe-paths still fires after a project block rule (safe path)' is truncated in the diff — it's cut off mid-test. Cannot review it. This is a gap; if this test is incomplete or missing, the evaluation-order guarantee (project block → advisory allow) is untested.

Security gap — no test for evaluation order inversion:
There's no test verifying that a project-level allow rule cannot override a built-in block rule. The README explicitly promises this guarantee ("built-in block rules always fire before any user-defined allow rules"), but nothing in this diff tests the adversarial case.

Missing test cases for new matchesGlob operator:
If matchesGlob is being shipped, it needs tests covering: glob matching, non-matching, and invalid patterns (does it throw or return false?).

vi.spyOn(fs, 'readFileSync') without restore:
The beforeEach clears mocks, but spying on readFileSync mid-test without explicit cleanup can leak across tests if a test throws before completion. Prefer vi.mocked or ensure afterEach restores.


Summary

The main risks are:

  1. Undocumented/unimplemented operators (matchesGlob) — don't document what isn't shipped and tested in a security tool.
  2. No adversarial test for the core security guarantee (user config cannot bypass Layer 1 blocks).
  3. Truncated test needs to be completed before merge.

⚠️ Note: This diff exceeded 20,000 characters and was truncated. The review above covers only the first portion of the changes.


Automated review by Claude Sonnet

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Contributor

🤖 Claude Code Review

Review

README Changes

These are documentation-only changes — reorganization, simplification, and new shield/roadmap content. The changes are clear and accurate. One minor concern: the new configuration precedence note states "built-in block rules always fire before any user-defined allow rules," but this contradicts the evaluation order description immediately following it ("project-level block rule fires before shield block rules"). The two orderings are explained, but the prose is dense enough that a user could easily misread the precedence. Worth tightening.

Test Changes

Critical: Tests assert implementation details that aren't tested

The new tests assert result.ruleName (e.g., 'allow-rm-safe-paths', 'block-rm-env'), but there's no corresponding test that verifies evaluatePolicy actually returns a ruleName field. If EvaluateResult doesn't include ruleName in its type, these assertions silently pass (they'd check undefined === 'allow-rm-safe-paths' → false and fail, actually — but the point is this field should have a dedicated type-safety test).

Missing: Truncated diff

The diff is cut off mid-test (it('advisory allow-rm-safe-paths still fires after a project block rule...'). Can't review completeness of the new test suite.

Test isolation concern

Tests 1–3 ("No config needed — built-in advisory rule covers...") rely on built-in rules being active with no mocked config. But beforeEach resets mocks — it's not shown what fs.existsSync returns by default after reset. If it returns true unexpectedly and readFileSync throws, these tests could produce misleading failures. Each config-free test should explicitly mock fs.existsSync to return false (no project config).

Coverage gap

No test for the rule evaluation ordering boundary: a project smartRule with verdict: 'allow' attempting to bypass a built-in block rule. This is the most security-critical invariant in the system ("a user config cannot bypass Layer 1 protection") and has zero coverage here.

Security gap in test semantics

The regex in the new test — 'rm.*\\.env' — would match rm-env-backup.sh. For a security tool, test inputs should also verify that overly-broad patterns are detectable. Not a bug in the test itself, but a gap worth noting.

Summary

  • Fix test isolation: explicitly mock fs.existsSync to false for config-free tests
  • Add a test asserting a user allow rule cannot override a built-in block
  • Verify ruleName is in the EvaluateResult type definition
  • Clarify the README configuration precedence prose — the two-direction ordering explanation will confuse users
  • Get the full diff; the truncated test is unreviewed

⚠️ Note: This diff exceeded 20,000 characters and was truncated. The review above covers only the first portion of the changes.


Automated review by Claude Sonnet

… 1 invariant

README.md:
- Rewrite configuration precedence note as two distinct scannable paragraphs:
  settings (table/tier order) vs smart rules (concatenated list, first-match-wins)
- Add code block showing the exact evaluation order for smart rules
- Remove the dense single-paragraph explanation that reviewers found confusing

advanced_policy.test.ts:
- Add explicit comment that config-free tests rely on beforeEach existsSpy=false
- Add ruleName assertions to "reviews rm" tests (not just decision)
- Add Layer 1 bypass invariant test: user allow-all rule must not override
  built-in block-force-push — the most security-critical coverage gap identified

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Contributor

🤖 Claude Code Review

Code Review

README Changes

Documentation-only changes. The restructuring is clearer and the new "Shields" concept is well-explained. The configuration precedence table accurately reflects the rule concatenation model described. No issues.


src/__tests__/advanced_policy.test.ts

Correctness / Security

  1. Layer 1 invariant test is the most important test here — verify it actually fails without the invariant. The test asserts block-force-push wins over a user allow rule. This is only meaningful if you've confirmed the test fails when Layer 1 ordering is removed. As written, if evaluatePolicy has a bug where user rules run first, this test would still pass if block-force-push happens to match before '.*' by coincidence (e.g., alphabetical rule ordering). Consider adding a complementary negative test or a comment confirming the failure mode was verified.

  2. ruleName field contract is untested elsewhere. These tests now rely on result.ruleName being populated correctly, but there's no test that ruleName is undefined/absent on a default allow (no rule matched). If evaluatePolicy returns ruleName inconsistently, silent test passes could mask bugs.

  3. Diff is truncated. The last test ('a project smartRule can block rm on a sensitive path before advisory rules fire') is cut off mid-body. Cannot review it — this is a gap.

  4. beforeEach default assumption is fragile. The comment says "existsSpy returns false" but the actual beforeEach setup is not shown in the diff. If a future contributor changes beforeEach, the built-in-only tests silently break. These tests should explicitly call vi.mocked(fs.existsSync).mockReturnValue(false) themselves rather than relying on inherited state.

  5. Case sensitivity inconsistency. The Layer 1 invariant test uses 'bash' (lowercase) while others use 'Bash' (capitalized). If evaluatePolicy is case-sensitive on tool names, the invariant test may not match the same rule set. This should be consistent, and the tool name normalization behavior should be explicitly tested.

Missing coverage:

  • No test for matchesGlob / notMatchesGlob operators (newly documented in README).
  • No test for the notMatches operator with the optional flags field (documented as optional, but is absence handled gracefully?).
  • No test verifying shield rules slot correctly between project config and advisory defaults in the concatenation order.

Performance: No concerns — these are unit tests.


Summary

The documentation is a net improvement. The test changes move in the right direction (testing invariants over implementation details), but the truncated test, case inconsistency on tool names, and reliance on implicit beforeEach state are concrete issues to fix before merging. The Layer 1 invariant test needs a comment or companion negative test to prove it's actually load-bearing.

⚠️ Note: This diff exceeded 20,000 characters and was truncated. The review above covers only the first portion of the changes.


Automated review by Claude Sonnet

@github-actions
Copy link
Contributor

🤖 Claude Code Review

Review

README Changes

Documentation-only changes. The new structure (Layer 1 / Layer 2 / Shields model, config precedence table, matchesGlob/notMatchesGlob operators) is clearer and better organized than what it replaces. No issues.


src/__tests__/advanced_policy.test.ts

Security-critical test gap — the most important finding:

The Layer 1 invariant test is the right instinct but is incomplete. It only verifies one built-in block rule (block-force-push). There's no test verifying that a user-supplied allow rule also cannot bypass other Layer 1 blocks (e.g., curl | bash, rm -rf /home). A single example doesn't prove the invariant holds structurally — consider a parameterized test across all built-in block rules, or at least 2–3 more cases.

Missing negative case for matchesGlob / notMatchesGlob:
The README documents two new operators (matchesGlob, notMatchesGlob), but there are zero tests for them. For a security tool, untested condition operators are a correctness and security risk — a broken glob match could silently fail open.

ruleName assertion dependency:
Several tests assert on result.ruleName (e.g., 'allow-rm-safe-paths', 'review-rm'). This is good for precision, but it also means any internal rename of a built-in rule will silently break user-facing behavior before breaking tests. Consider a comment noting these names are part of the stable API, or export them as constants.

Diff is truncated:
The diff cuts off mid-test (a project smartRule can block rm on a sensitive path before advisory rules fire). Can't fully review this case — the body is missing. If it's verifying that project rules can tighten but not loosen Layer 1, that's essential and needs to be confirmed complete.

beforeEach isolation assumption:
The comment says existsSpy returns false by default. Two tests within the same describe block then call vi.mocked(fs.existsSync).mockReturnValue(true) without a corresponding afterEach reset. If test ordering changes, this could leak mock state. Prefer mockReturnValueOnce or explicit afterEach resets.

No timeout test:
approvalTimeoutMs: 0 (wait forever) vs a nonzero value is a critical path behaviour. There's no test verifying that a timeout actually resolves to deny rather than hanging. This is both a correctness and DoS concern.


Summary

The README restructure is a clear improvement. The test changes move in the right direction — asserting ruleName and adding the Layer 1 bypass invariant test are both good. The critical gaps are: (1) incomplete Layer 1 invariant coverage, (2) zero tests for the new glob operators, (3) potential mock state leak between tests, and (4) the truncated test body that can't be reviewed. No source code changes are present in this diff, so no daemon/race engine/type safety issues to flag.

⚠️ Note: This diff exceeded 20,000 characters and was truncated. The review above covers only the first portion of the changes.


Automated review by Claude Sonnet

@nadav-node9 nadav-node9 merged commit 8ad4bfa into main Mar 21, 2026
8 checks passed
nadav-node9 pushed a commit that referenced this pull request Mar 21, 2026
## [1.0.14](v1.0.13...v1.0.14) (2026-03-21)

### Bug Fixes

* merge latest dev updates into main ([#24](#24)) ([8ad4bfa](8ad4bfa)), closes [hi#confidence](https://github.com/hi/issues/confidence)
@nadav-node9
Copy link
Contributor Author

🎉 This PR is included in version 1.0.14 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants