bon-337: per-role tool allow-lists (W1.5.3 floor + W4.1) by Antawari · Pull Request #14 · BonfireAI/bonfire

Antawari · 2026-04-19T00:50:30Z

Summary

Add ToolPolicy Protocol + DefaultToolPolicy 8-role floor matrix in src/bonfire/dispatch/tool_policy.py. Roles: scout/knight/warrior/prover/sage/bard/wizard/herald with tool lists lifted from private V1 axioms.
Thread tool_policy: ToolPolicy | None = None kwarg through StageExecutor and PipelineEngine; three-tier ratchet (empty role → permissive, unmapped role → strict empty, mapped → floor).
Set both tools= (presence layer) AND allowed_tools= (approval layer) on ClaudeAgentOptions for belt-and-suspenders enforcement (Scout-1/337 §7.3). Add DispatchOptions.role: str with Field(strict=True) to block Pydantic int→str coercion.

Test plan

217 canonical Sage-reconciled tests pass across 5 files (tool_policy, dispatch_options_role, engine_executor_tool_policy, engine_pipeline_tool_policy, sdk_backend_tool_presence)
1298/1298 full unit suite — no regressions against v0.1 baseline of 1081 tests
test_rejects_compiler_kwarg remains green (Sage D3 lock from BON-334 preserved)
test_has_exactly_eight_fields updated 8→9 for the new role field

Sage decision

`docs/audit/sage-decisions/bon-337-unified-sage-2026-04-18.md` — 724 lines, 8 canonical decisions, 5 reconciled ambiguities locked by the reconciling Sage.

Trust-triangle gate

Closes the W1.5.3 default allow-list floor + W4.1 user-configurable allow-list trust-triangle legs from `docs/release-policy.md:41-43`. Pairs with #bon-338 (default security hook set, W4.2) to complete the v0.1.0 release gate.

Linear

Closes BON-337.

🤖 Generated with Claude Code

…es locked)

…1 baseline = 1298 pass)

Antawari

code-reviewer (superpowered) review

Verdict: APPROVE (posted as comment because GitHub blocks author self-approval; structured table per feedback_wizard_findings_on_pr.md)

This PR is a textbook-clean execution of the unified Sage decision. The diff is surgical (five source files, one-line SDK seam, one-line protocols field, additive kwarg on both engine constructors, a 56-line new tool_policy.py), every canonical Sage decision is independently verifiable, and the 1298/1298 suite is green with test_rejects_compiler_kwarg preserved even under the new tool_policy= kwarg.

Plan adherence (Sage D1..D8 + ambiguity locks)

Ref	Requirement	Evidence	Status
D1	Module at `src/bonfire/dispatch/tool_policy.py`; no `bonfire/security/`	`src/bonfire/dispatch/tool_policy.py:1-57`; `git diff --name-only` shows no `security/` path	Met
D2	`@runtime_checkable` Protocol, `tools_for(role: str) -> list[str]`, `list[str]` return	`tool_policy.py:20-33`; `get_type_hints` confirms `{'role': str, 'return': list[str]}` and `_is_runtime_protocol=True`	Met
D3	`DefaultToolPolicy` + 8-role `_FLOOR` byte-for-byte; `list(self._FLOOR.get(role, []))` wrap	`tool_policy.py:36-56`; `test_tool_policy.py:225-240` parametrized byte-check; fresh-copy verified	Met
D4	`DispatchOptions.role: str = ""`, strict typing, below `permission_mode`	`protocols.py:68` (`role: str = Field(default="", strict=True)`); `DispatchOptions(role=42)` raises `ValidationError`	Met + strict lock
D5	`tool_policy: ToolPolicy \| None = None` kwarg on both; `TYPE_CHECKING` import; `__slots__` alphabetical	`executor.py:33, 77, 86`; `__slots__` slot `"_tool_policy"` between `"_project_root"` and `"_vault_advisor"` at 62-64; `pipeline.py:53, 102, 111`	Met
D6	Ratchet `if self._tool_policy is None or not stage.role: role_tools: list[str] = []`	`executor.py:261-264`; `pipeline.py:493-496`	Met
D7	`tools=list(options.tools)` directly before `allowed_tools=options.tools`	`sdk_backend.py:105-106` order exact; `disallowed_tools` absent	Met
D8	`ToolPolicy` NOT re-exported; `bonfire.protocols.__all__` unchanged	`protocols.py:29-36`; `TestProtocolExportDiscipline` enforces	Met
AMBIG #2	`role=None` raises	`test_dispatch_options_role.py:182-183` green	Met
AMBIG #3	`test_has_exactly_eight_fields` updated 8->9	`test_protocols.py:620-633`	Met (name stale, see F1)
AMBIG #4	`role=42` rejected via `strict=True`	`protocols.py:68` + `test_dispatch_options_role.py:187-188`	Met
AMBIG #5	`tools_for` defensive for hashable non-str, raises for unhashable	`test_tool_policy.py:463-485`	Met
Cross-lane	No `_compiler`, no `get_role_tools`, no `/Projects/bonfire/` imports	grep-verified clean	Met
Scope	Only 5 source files touched	`git diff --stat v0.1...HEAD`	Met

Findings

F1 - Minor (nit): tests/unit/test_protocols.py:620 - method name test_has_exactly_eight_fields is stale; docstring says "nine" and assert is == 9. Sage AMBIG #3 only required the value update (done, green). Recommend: rename to test_has_exactly_nine_fields in a trivial follow-up. Non-blocking. Note tests/unit/test_dispatch_options_role.py:17 and docs/audit/sage-decisions/bon-332-sage-20260418T004817Z.md reference the old name - a rename should grep those.
F2 - Minor (nit): src/bonfire/dispatch/sdk_backend.py:62 still carries def __init__(self, *, compiler: Any | None = None) as forward-compat dead-param. This is the correct behavior for BON-337 (Sage section 5 NOT-touched list). Flagging so a future reviewer does not mistake it for an oversight: the compiler kwarg on ClaudeSDKBackend.__init__ is distinct from the blocked compiler kwarg on StageExecutor.__init__ (BON-334 D3).
F3 - Minor (nit): src/bonfire/engine/pipeline.py:493-496 - the ratchet is inline-duplicated between executor.py:261-264 and here, as Sage D6 explicitly forbids helper extraction. Both mirrors are identical line-for-line (verified by the parametrized TestThreeTierRatchetBackendObserved in both test files). No action - duplication is intentional per Sage lockdown.
F4 - Minor (nit): tests/unit/test_tool_policy.py:463-485 TestRoleArgumentTypeBoundary relies on CPython dict.get semantics for unhashable inputs. Since Pydantic strict=True upstream catches every non-str before it reaches tools_for, these cases are defense-in-depth only. Future: guard with if not isinstance(role, str): return [] for belt-and-suspenders parity. Non-blocking.
F5 - Observation: options.role is carried on DispatchOptions but not consumed by sdk_backend.py in this ticket. TestRoleNotConsumedBySdkBackend asserts role does NOT land on ClaudeAgentOptions - exactly as D7-footer prescribes. Clean seam for BON-338 to consume via hooks=.

Accepted trade-offs

Inline ratchet duplication between executor and pipeline (Sage D6 locks against extraction). Accepted - pre-existing duplication, refactor is not BON-337 scope.
disallowed_tools deferred (Sage section 6 open #1) - TestDisallowedToolsNotSet enforces the absence, matches release-policy scope.
Bard Bash omission (Sage D3 intent) - TestBardBashOmission locks it; future Bard handler owns its own shell.
217-test canonical RED instead of the original 198: Sage merged both Knight lanes' adversarial tests, so 217 is the correct reconciled number.

Verified live: full unit suite 1298/1298 green after pip install -e . against the worktree; test_rejects_compiler_kwarg still rejects compiler= and compiler+tool_policy together; role=42/role=True/role=None all raise ValidationError; DefaultToolPolicy().tools_for("scout").append(...) does not contaminate subsequent calls.

LGTM. Ship it and gate BON-338 onto the same constructor as planned.

Generated with Claude Code

Antawari

Wizard review (regular lens)

Verdict: APPROVE

Ran the full unit suite locally on bon-337-warrior: 1298/1298 passing (~4.3s). The 217 canonical BON-337 tests across 5 sibling files pass in 1.12s. Every Sage decision D1–D8 and every reconciled ambiguity #1–#5 is honored byte-for-byte. Ultrathought through correctness, security, backward compat, typing, maintainability, and coverage — no blockers surfaced.

Sage decision coverage

Decision	Status	Notes
D1 — new module `src/bonfire/dispatch/tool_policy.py`, `__all__` = `["DefaultToolPolicy", "ToolPolicy"]`	PASS	File at `src/bonfire/dispatch/tool_policy.py:17` with exact `__all__`. No `src/bonfire/security/` created.
D2 — `@runtime_checkable Protocol` with `tools_for(role: str) -> list[str]`	PASS	`tool_policy.py:20-33`. Runtime-checked live: `isinstance(DefaultToolPolicy(), ToolPolicy)` is `True`.
D3 — `DefaultToolPolicy._FLOOR` 8-role dict, exact tool lists	PASS	`tool_policy.py:44-53`. Byte-for-byte match with Sage §1 D3. Bard's Bash omission preserved (scout/337 §4).
D4 — `DispatchOptions.role: str` with `strict=True`, default `""`	PASS	`protocols.py:68` — `role: str = Field(default="", strict=True)`. `None`/`int`/`bool`/`list`/`dict` all raise `ValidationError` (verified).
D5 — `tool_policy` kwarg on both constructors; `_tool_policy` in `StageExecutor.__slots__`; `PipelineEngine` has no `__slots__`	PASS	`executor.py:56-65` slot tuple includes `"_tool_policy"` in alphabetical order; `pipeline.py` has no `__slots__` (verified).
D6 — three-tier ratchet, exact variable name `role_tools`, both dispatch sites	PASS	`executor.py:261-264` and `pipeline.py:493-496`. Identical guard `if self._tool_policy is None or not stage.role`. `role_tools` is the exact variable name at both sites.
D7 — `tools=list(options.tools)` before existing `allowed_tools=options.tools`; no `disallowed_tools`; no `role=` forwarding	PASS	`sdk_backend.py:105-106` two adjacent lines. `role` correctly NOT forwarded (asserted in `test_sdk_backend_tool_presence.py:426`).
D8 — `ToolPolicy` NOT in `bonfire.protocols.__all__`	PASS	`protocols.py:29-36` unchanged at v0.1 set. Verified live.
Ambig #1 — sibling test files, not appended	PASS	Five new canonical files: `test_tool_policy.py`, `test_dispatch_options_role.py`, `test_engine_executor_tool_policy.py`, `test_engine_pipeline_tool_policy.py`, `test_sdk_backend_tool_presence.py`.
Ambig #2 — `DispatchOptions(role=None)` raises `ValidationError`	PASS	Verified live.
Ambig #3 — `test_has_exactly_eight_fields` updated 8→9	PASS	`test_protocols.py:620-633` — set comparison includes `"role"`, assertion on `len == 9`. Test name kept; docstring updated. Purely cosmetic naming — see non-blocking note.
Ambig #4 — `DispatchOptions(role=42)` raises `ValidationError`	PASS	`strict=True` on the field blocks Pydantic int→str coercion. Verified live.
Ambig #5 — `DefaultToolPolicy.tools_for` graceful on non-str (returns `[]`)	PASS	`dict.get(role, [])` short-circuits on any hashable non-str. Covered by `test_tool_policy.py:463-472` (None, int, tuple). Unhashable (list/dict) raises `TypeError` — documented acceptable per Sage §AMBIG #5.

Findings

None blocking. All D1–D8 and Ambig #1–#5 verified both by code inspection and live runtime checks on the checked-out bon-337-warrior branch.

Notable strengths worth calling out:

Ratchet discipline — the exact role_tools name + identical guard clause lets both dispatch sites be diffed line-for-line; future refactors will stay aligned. executor.py:261 ↔ pipeline.py:493.
Strict coercion guard — Field(default="", strict=True) at protocols.py:68 is the minimum-blast-radius knob; no validator hand-roll required.
SDK adjacency — tools=list(...) at sdk_backend.py:105 correctly uses list(...) to hand the SDK a fresh list even though the Pydantic model is frozen; belt-and-suspenders honored.
No scope creep — diff stays inside the eight locations promised in Sage §2 File Manifest. No touch of bonfire/security/, no new Security* events, no HookMatcher import, no bonfire.pydantic_ai_backend.py edits. Clean decoupling from BON-338.

Non-blocking notes (tech debt / follow-ups)

Test name drift — test_has_exactly_eight_fields (test_protocols.py:620) now asserts nine. Docstring was updated but the name reads as a lie. Consider renaming to test_field_inventory or test_has_exactly_nine_fields in a trailing-edge housekeeping ticket. Non-blocking for v0.1.0.
Sage §6 deferred — disallowed_tools field, per-arg tool scoping (e.g. Write(path)), and AgentRole ↔ workflow-string reconciliation remain deferred per Sage §6. File as tech-debt tickets.
Bard handler blocker for Wave 5+ — Bard's tool row intentionally omits Bash per Scout-3/337 §4. Once the Bard handler lands (out of scope here), it will need to invoke gh in-process via a handler path that bypasses the backend dispatch ratchet. This PR correctly surfaces that as a future concern, not a gap to plug now.
PipelineEngine slot discipline — No __slots__ is consistent with Sage D5 and the current file, but worth noting: StageExecutor gains memory discipline while PipelineEngine stays dict-shaped. Not wrong, just asymmetric; a follow-up could harmonize.
Pydantic-AI backend untouched — pydantic_ai_backend.py neither reads options.role nor options.tools. Structural protocol means the kwargs pass silently. If a user swaps backends, the new allow-list surface becomes a no-op — possibly surprising. A doc note in the backend docstring that "this backend does not honor tools/role" would help when BON-338 lands.

Full local unit suite: 1298 passed in 4.30s. Canonical BON-337 subset: 217 passed in 1.12s. Green to merge.

🤖 Generated with Claude Code

…entory Wave 4 keystone PRs #14 (bon-337 role) and #15 (bon-338 security_hooks) both add a field to DispatchOptions. Single conflict in test_has_exactly_eight_fields resolved by locking the set to both fields (10 total) and updating the length assertion. protocols.py and sdk_backend.py auto-merged cleanly. Full suite: 1868 passed + 20 xfailed (Scout-2/338 §5 blind-spots) + 1 xpassed.

Antawari added 3 commits April 18, 2026 17:44

bon-337 sage: canonical RED contract (198 tests, 5 files, 5 ambiguiti…

e4057f7

…es locked)

bon-337 warrior GREEN: per-role tool allow-lists (217 canonical + 108…

5bedf3c

…1 baseline = 1298 pass)

bon-337 docs: scout reports (3) + unified sage decision (724 lines)

9700f95

Antawari commented Apr 19, 2026

View reviewed changes

Antawari merged commit 646df88 into v0.1 Apr 19, 2026

Antawari deleted the bon-337-warrior branch April 19, 2026 02:15

Antawari mentioned this pull request May 15, 2026

Harden dispatch layer: Write/Edit sensitive-path deny + tools_for() accepts either vocabulary #90

Merged

8 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bon-337: per-role tool allow-lists (W1.5.3 floor + W4.1)#14

bon-337: per-role tool allow-lists (W1.5.3 floor + W4.1)#14
Antawari merged 3 commits into
v0.1from
bon-337-warrior

Antawari commented Apr 19, 2026

Uh oh!

Antawari left a comment

Uh oh!

Antawari left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Antawari commented Apr 19, 2026

Summary

Test plan

Sage decision

Trust-triangle gate

Linear

Uh oh!

Antawari left a comment

Choose a reason for hiding this comment

code-reviewer (superpowered) review

Plan adherence (Sage D1..D8 + ambiguity locks)

Findings

Accepted trade-offs

Uh oh!

Antawari left a comment

Choose a reason for hiding this comment

Wizard review (regular lens)

Sage decision coverage

Findings

Non-blocking notes (tech debt / follow-ups)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant