bon-338: pre-exec security hooks (W4.2 default hook set)#15
Conversation
…ties locked, 52 canonical rule IDs)
…ine = 1651 pass, 20 xfail blind-spots, 1 xpass)
The reconciling Sage's C1-C7 pattern test files import CANONICAL_DENY_RULE_IDS from tests.unit.test_security_patterns_module. Without tests/__init__.py, main-session pytest fails collection with ModuleNotFoundError: No module named 'tests'. Empty __init__.py turns tests/ into a package and restores the cross-file import pattern the reconciling Sage authored. Confirmed: 1651 passed, 20 xfailed (intentional blind-spot corpus), 1 xpassed (fullwidth NFKC) — matches Warrior's internal report byte-for-byte.
Antawari
left a comment
There was a problem hiding this comment.
Superpowered code-reviewer — SECURITY-FIRST lens
Verdict: APPROVE (posted as comment — GH rejects self-approval). Implementation is security-grade across the ten dimensions I traced. No fail-open vector surfaced; all high-priority bypass classes are closed or honestly tombstoned as xfails. Three non-blocking findings below.
Security dimensions — fixed (what I verified)
| # | Dimension | Status | Evidence |
|---|---|---|---|
| 1 | normalize() IFS + NFKC + backslash-NL |
OK | _IFS_RE covers $IFS, ${IFS}, $IFS$[0-9]; _BACKSLASH_NEWLINE_RE collapses \\\n; NFKC proven to fold fullwidth rm -rf / into ASCII and hit C1.1 (live re-run, security_hooks.py:107-124). |
| 2 | Unwrap completeness (sudo/bash-c/sh-c/timeout/nohup/env/xargs/watch/find-exec/chain/substitution) | OK | 9 regex peelers + chain-split + _extract_substitutions nested-paren walker; handles interleaved flags (bash -x -c '...') per unwrap tests. |
| 3 | Depth exhaustion | OK | _unwrap() loop's else: clause emits _UNWRAP_EXHAUSTED_SENTINEL; hook body at security_hooks.py:561 turns it into _infra.unwrap-exhausted DENY + event. Control flow is single-path fail-closed. |
| 4 | Keyword prefilter completeness | OK | All 52 canonical rules have a trigger substring in _PREFILTER_KEYWORDS; prefilter is SKIPPED entirely when user passes extra_deny_patterns (security_hooks.py:584). Critical: user patterns never silently bypass. |
| 5 | Broken user regex → DENY | OK | _compile_user_patterns called inside the try: (security_hooks.py:554); invalid regex raises → outer except → _infra.error DENY with event. Verified via test_invalid_regex_denies, test_unbalanced_paren_regex_denies, test_invalid_backref_regex_denies. |
| 6 | except Exception discipline |
OK | Grep of security_hooks.py + security_patterns.py shows zero except BaseException and zero bare except:. CancelledError re-raised explicitly (line 632). |
| 7 | Pattern TP/FP spot-check (12 rules, live re-run) | OK | C1.1, C1.5, C1.8, C2.3, C2.6, C4.1, C4.7, C5.1, C6.2, C7.2, C7.5, C7.7 all match canonical TPs and reject rm -rf node_modules, git push --force-with-lease, > /dev/null, git restore --staged, cat ~/.ssh/id_rsa.pub, chown -R foo /home/x. |
| 8 | Bus exception during emit | OK | _safe_emit catches Exception, logs, decision still returned (security_hooks.py:500-505). Test test_bus_emit_raises_still_denies + ValueError variant both pass. |
| 9 | Hook return contract | OK | hookEventName, permissionDecision, permissionDecisionReason keys — all exact camelCase (security_hooks.py:469-486); value is literal "deny"/"allow", not "DENY" or "block". |
| 10 | Matcher scope vs body narrow | OK | matcher="Bash|Write|Edit" unanchored AND body re-checks tool_name in ("Bash","Write","Edit") (security_hooks.py:542). BashTool, myBash, bash lower — all pass through. |
| 11 | Circular import | OK | protocols.py → dispatch.security_hooks → dispatch/__init__.py → sdk_backend.py (runtime) → no reverse edge to protocols (all protocol imports in dispatch are TYPE_CHECKING-guarded). Safe. |
| 12 | test_no_dispatch_imports carve-out |
SURGICAL | Whitelist is exact string "from bonfire.dispatch.security_hooks import SecurityHooksConfig" — any other from bonfire.dispatch still rejected. |
| 13 | DenyRule immutability |
OK | @dataclass(frozen=True, slots=True) at security_patterns.py:24. |
Findings
Non-security S1 (suggestion): _compile_user_patterns runs on EVERY hook call (security_hooks.py:554). For a dispatch with non-empty extra_deny_patterns fired N times per session, that's N × len(patterns) re.compile calls. Recommend caching at factory time — compile once in build_preexec_hook, then reuse. Defer broken-regex detection to factory-build: if compile fails, the factory still returns a hook that always emits _infra.error DENY (preserves fail-closed). v0.2 material.
Non-security S2 (suggestion): C5.1 regex ^\s*sudo\s+(?!(?:-n\s+)?(?:-l\b|--list\b)) whitelists sudo -l, sudo -n -l, sudo --list — but MISSES the combined short-flag form sudo -nl (equivalent to sudo -n -l). Verified via live regex run. Minor: produces a spurious WARN, not a false DENY. v0.2 fix.
Security S3 (observation, NOT a bug): rm -rf .git hits C1.1 DENY (.git not in whitelist). This is the intended "fail-closed" posture — accidental repo annihilation is exactly what the hook should block. Calling out so Anta knows the behavior when an agent tries to nuke a sub-checkout.
Trade-offs accepted
- 20 xfails — each cites a Scout-2/338 §5 blind spot with v0.2 disposition (tree-sitter-bash, confusables lib, entropy detection near
| sh). Discipline is honest; no bug hiding behind a marker. - 1 xpass on fullwidth NFKC — documented as "DOES catch" in blindspot #6.
- Category 6 (Obfuscation) is WARN-only per Sage D4 — base64-pipe-to-shell (C6.2) raises a WARN, not DENY. Encoded payload escape is a known blind spot; v0.2 AST work closes it.
- Write/Edit scans
file_pathonly, notcontentper Scout-2/338 §5.12 — write-then-execute is an accepted tombstone. - Cyrillic unicode lookalike is an explicit xfail (C6.6 covers only the NFKC-foldable range) per ambiguity #4.
- Event count 28 → 29 is asserted both mechanically (len check) and semantically (per-category count +
"security"in distinct-categories set). Locked.
Great work. The 1398-line Sage doc paid off — this is the cleanest security keystone I've reviewed.
— superpowered code-reviewer, BON-338 dual Wizard pass
Antawari
left a comment
There was a problem hiding this comment.
Verdict
APPROVE — MERGE
Wave 4 keystone PR closes the W4.2 default security hook set trust-triangle leg. 1651/1651 tests pass locally (570 canonical + 1081 baseline), 20 xfails match the Scout-2/338 §5 adversarial blind-spot corpus, 1 xpass is the heredoc dd of=/dev/sda catch (incidental-but-safe — the command IS denied via C1.2 matching the body). Sage D1..D12 all satisfied. All six reconciled ambiguities locked correctly.
Sage decision coverage
| ID | Requirement | Status | Evidence |
|---|---|---|---|
| D1 | dispatch/security_hooks.py + __all__ |
OK | security_hooks.py:54-57 |
| D2 | dispatch/security_patterns.py 3 exports |
OK | security_patterns.py:21 |
| D3 | DenyRule frozen slotted 4 fields |
OK | security_patterns.py:24-40 |
| D4 | 37 DENY + 15 WARN = 52; CANONICAL_RULE_IDS | OK | runtime verified DENY=37 WARN=15 total=52 |
| D5 | build_preexec_hook factory signature |
OK | security_hooks.py:513-519 |
| D6 | 5-stage pipeline | OK | security_hooks.py:530-630 |
| D7 | Fail-CLOSED on exception, no fail-open knob | OK | outer except Exception → deny (:636) |
| D8 | SecurityHooksConfig frozen, 3 fields |
OK | security_hooks.py:87-98 (enabled, extra_deny_patterns, emit_denial_events) |
| D9 | DispatchOptions.security_hooks w/ factory |
OK | protocols.py:74 |
| D10 | SecurityDenied (no Event suffix); 29 total |
OK | events.py:302-313; registry 29 entries |
| D11 | hooks={"PreToolUse": [HookMatcher(matcher="Bash|Write|Edit", ...)]} |
OK | security_hooks.py:690; sdk_backend.py:116-118 |
| D12 | No BON-337 leakage (role/tool_policy/disallowed_tools) | OK | grep clean across dispatch/security_* |
| A#1 | CANONICAL_RULE_IDS slug scheme literal, 52 slugs | OK | test_security_patterns_module.py:59-126 |
| A#2 | depth-6 → _infra.unwrap-exhausted DENY |
OK | security_hooks.py:560-578 |
| A#3 | any exception → _infra.error DENY |
OK | security_hooks.py:636-656 |
| A#4 | C6.6 NFKC range narrow; Cyrillic xfail | OK | security_patterns.py:475-477; test_security_hooks_blindspots.py:178-186 |
| A#5 | matcher literal Bash|Write|Edit unanchored |
OK | security_hooks.py:690 |
| A#6 | WARN wire format prefix + allow | OK | security_hooks.py:613-628 |
Security-correctness review: except Exception (not BaseException) is used consistently with an explicit except asyncio.CancelledError: raise on every critical path — _safe_emit, inner hook body, outer error-emit. CancelledError propagation correct for 3.12+. Circular-import carve-out verified: security_hooks.py imports security_patterns + events + optional SDK; no back-reference to bonfire.protocols anywhere. No bypass vectors found in unwrap/normalize/match.
Findings
W-01 (Non-blocking, cosmetic) — PR description mis-identifies the XPASS
The PR body claims 1 xpass — fullwidth NFKC catches rm -rf /. Actual XPASS per pytest output is test_security_hooks_blindspots.py::TestBlindSpotHeredoc::test_heredoc_sh — the sh <<'EOT'\ndd if=/dev/zero of=/dev/sda\nEOT heredoc. The fullwidth test at test_fullwidth_rm_caught (line 173) is NOT marked xfail, it's a normal pass. Safe behavior: the XPASSing heredoc still denies (C1.2 dd regex fires on the substring inside the heredoc body even without heredoc-specific unwrap). Recommend the author either unmark test_heredoc_sh as xfail or update the PR description; not a merge blocker.
W-02 (Non-blocking, defense-in-depth) — _extract_command silent "" on malformed payload
File: src/bonfire/dispatch/security_hooks.py:393-414
For non-dict tool_input, non-str/non-bytes command, or unknown tools, _extract_command returns "", which the hook body then treats as "nothing to scan → allow." Given the matcher is scoped to Bash|Write|Edit and the SDK provides well-formed payloads, this is low risk. Per the fail-closed doctrine, a Pydantic-level shape assertion (or a DENY when the payload is an unexpected shape but the tool IS Bash/Write/Edit) would tighten the invariant. Follow-up nit, not a blocker.
W-03 (Non-blocking, scope documented) — Write/Edit content not scanned
File: src/bonfire/dispatch/security_hooks.py:401-404
Only file_path is scanned for Write/Edit; content is deliberately out of scope per Scout-2/338 §5.12. An agent could Write(file_path="/etc/sudoers.d/allow", content=...) and the C5.3 regex won't fire (no > redirect prefix). This is documented and deferred to v0.2; recording here for future W-series review.
Non-blocking notes
tests/__init__.pywas added (empty) to fix cross-file canonical-fixture imports — Sage-blessed canonical pattern. Good.test_no_dispatch_importscarve-out for the singlefrom bonfire.dispatch.security_hooks import SecurityHooksConfigimport atprotocols.py:27is clearly documented intest_protocols.py:802-820. Pydantic-runtime-type requirement justifies it. Circular-import risk verified absent.- 1265 lines of implementation + 570 canonical tests + a 1398-line Sage decision doc is the largest single-ticket audit trail in Bonfire history. Quality matches.
- Runtime smoke:
rm -rf $HOME,sudo rm -rf /,git reset --hard,curl ...| bash, fork bomb all DENY; benignecho okpasses. Fail-closed discipline verified.
Trust-triangle W4.2 leg closed.
…entory Wave 4 keystone PRs #14 (bon-337 role) and #15 (bon-338 security_hooks) both add a field to DispatchOptions. Single conflict in test_has_exactly_eight_fields resolved by locking the set to both fields (10 total) and updating the length assertion. protocols.py and sdk_backend.py auto-merged cleanly. Full suite: 1868 passed + 20 xfailed (Scout-2/338 §5 blind-spots) + 1 xpassed.
`bonfire init . && bonfire persona set falcor && bonfire scan .`
exits 1 on step 3 because `persona set` mutates the init stub
(adds `persona = "<name>"` under `[bonfire]`), and the
`_is_init_stub` predicate in
`src/bonfire/onboard/config_generator.py:51` only recognizes
the exact bare-stub shape `[bonfire]\n`. After persona-set,
the file has a persona key — `_is_init_stub` returns False —
`bonfire scan` refuses to overwrite (preserving the W7.M
overwrite-defense for fully-configured files), and the
documented quickstart flow exits 1 on step 3.
New file `tests/unit/test_init_persona_scan_composability.py`
adds 8 tests (4 RED + 4 GREEN canaries).
RED (pin the bug):
- test_recognizes_persona_falcor_after_init
- test_recognizes_persona_minimal_after_init
- test_recognizes_persona_with_trailing_whitespace
- test_init_then_persona_set_then_scan_all_succeed
GREEN canaries (pin the W7.M overwrite-defense + narrow-widening
boundary so the predicate cannot drift to "any single key"):
- test_rejects_fully_configured_config
- test_rejects_hand_added_name_key
- test_rejects_stub_plus_added_section
- test_rejects_persona_plus_extra_key
Design choice: narrow widening (recognize the persona-stub
shape `[bonfire]\npersona = "<basic-string>"` SPECIFICALLY)
rather than broad widening ("any single key under [bonfire]").
The narrow form preserves the existing
`test_init_scan_composability.py` test #15 canary that pins
`[bonfire]\nname = "hand-tuned"\n` (a single-key user edit)
must still trigger the overwrite refusal — the broad form
would regress it. If future waves need other programmatic
stub mutations to be recognized, the right shape is a
stub-variants registry, not further widening of this
predicate.
Cross-lane (OL-3): the parallel W8.G work touches the
`[bonfire.tools]` section emitter in the same file
(`config_generator.py`); these tests deliberately construct
no `[bonfire.tools]` section and assert nothing about
tools-section emission, so the two contracts compose.
Neighbour regression sweep: pre-existing
`test_init_scan_composability.py` (15 tests),
`test_scan_overwrite_guard.py`, and `test_persona_cli.py`
all 37 still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
src/bonfire/dispatch/security_patterns.py— 37 DENY + 15 WARN regex rules across 7 categories (C1 destructive-fs, C2 destructive-git, C3 pipe-to-shell, C4 exfiltration, C5 priv-esc WARN, C6 obfuscation WARN, C7 system-integrity) with canonical slug IDs locked viaCANONICAL_RULE_IDSfixture.src/bonfire/dispatch/security_hooks.py— 5-stage PreToolUse hook pipeline (normalize → structural unwrap → keyword prefilter → category match → decide). Fail-CLOSED on any exception with_infra.erroremission; fail-CLOSED past unwrap depth 5 with_infra.unwrap-exhausted. Structural unwrap covers sudo/bash-c/sh-c/timeout/nohup/env/xargs/watch/find-exec + pipes/substitution.SecurityDeniedevent (house-style, noEventsuffix) — registered inBonfireEventUnion+EVENT_REGISTRY(28→29). AddSecurityHooksConfig(frozen Pydantic, 3 fields) +DispatchOptions.security_hooksfield. Wirehooks={"PreToolUse": [HookMatcher(matcher="Bash|Write|Edit", ...)]}onClaudeAgentOptions.Test plan
except Exception(notBaseException) soasyncio.CancelledErrorpropagates correctly_infra.erroremission on hook exception path;_infra.unwrap-exhaustedon depth-5 exhaustionreason=\"WARN: \"+rule.message,permissionDecision=\"allow\"Sage decision
`docs/audit/sage-decisions/bon-338-unified-sage-2026-04-18.md` — 1398 lines, 12 canonical decisions, 6 reconciled ambiguities locked, full CANONICAL_RULE_IDS slug table.
Architectural note for Wizard
One pre-existing guard relaxed per Sage D9:
tests/unit/test_protocols.py::test_no_dispatch_importsnow permits `protocols.py` to import `SecurityHooksConfig` from `dispatch.security_hooks`. Circular-import risk noted — if `dispatch.security_hooks` ever imports from `protocols` we loop. Current code avoids this; Wizard should verify.Trust-triangle gate
Closes the W4.2 default security hook set trust-triangle leg. Pairs with #bon-337 for the v0.1.0 release gate.
Linear
Closes BON-338.
🤖 Generated with Claude Code