bon-337: per-role tool allow-lists (W1.5.3 floor + W4.1)#14
Conversation
Antawari
left a comment
There was a problem hiding this comment.
code-reviewer (superpowered) review
Verdict: APPROVE (posted as comment because GitHub blocks author self-approval; structured table per feedback_wizard_findings_on_pr.md)
This PR is a textbook-clean execution of the unified Sage decision. The diff is surgical (five source files, one-line SDK seam, one-line protocols field, additive kwarg on both engine constructors, a 56-line new tool_policy.py), every canonical Sage decision is independently verifiable, and the 1298/1298 suite is green with test_rejects_compiler_kwarg preserved even under the new tool_policy= kwarg.
Plan adherence (Sage D1..D8 + ambiguity locks)
| Ref | Requirement | Evidence | Status |
|---|---|---|---|
| D1 | Module at src/bonfire/dispatch/tool_policy.py; no bonfire/security/ |
src/bonfire/dispatch/tool_policy.py:1-57; git diff --name-only shows no security/ path |
Met |
| D2 | @runtime_checkable Protocol, tools_for(role: str) -> list[str], list[str] return |
tool_policy.py:20-33; get_type_hints confirms {'role': str, 'return': list[str]} and _is_runtime_protocol=True |
Met |
| D3 | DefaultToolPolicy + 8-role _FLOOR byte-for-byte; list(self._FLOOR.get(role, [])) wrap |
tool_policy.py:36-56; test_tool_policy.py:225-240 parametrized byte-check; fresh-copy verified |
Met |
| D4 | DispatchOptions.role: str = "", strict typing, below permission_mode |
protocols.py:68 (role: str = Field(default="", strict=True)); DispatchOptions(role=42) raises ValidationError |
Met + strict lock |
| D5 | tool_policy: ToolPolicy | None = None kwarg on both; TYPE_CHECKING import; __slots__ alphabetical |
executor.py:33, 77, 86; __slots__ slot "_tool_policy" between "_project_root" and "_vault_advisor" at 62-64; pipeline.py:53, 102, 111 |
Met |
| D6 | Ratchet if self._tool_policy is None or not stage.role: role_tools: list[str] = [] |
executor.py:261-264; pipeline.py:493-496 |
Met |
| D7 | tools=list(options.tools) directly before allowed_tools=options.tools |
sdk_backend.py:105-106 order exact; disallowed_tools absent |
Met |
| D8 | ToolPolicy NOT re-exported; bonfire.protocols.__all__ unchanged |
protocols.py:29-36; TestProtocolExportDiscipline enforces |
Met |
| AMBIG #2 | role=None raises |
test_dispatch_options_role.py:182-183 green |
Met |
| AMBIG #3 | test_has_exactly_eight_fields updated 8->9 |
test_protocols.py:620-633 |
Met (name stale, see F1) |
| AMBIG #4 | role=42 rejected via strict=True |
protocols.py:68 + test_dispatch_options_role.py:187-188 |
Met |
| AMBIG #5 | tools_for defensive for hashable non-str, raises for unhashable |
test_tool_policy.py:463-485 |
Met |
| Cross-lane | No _compiler, no get_role_tools, no /Projects/bonfire/ imports |
grep-verified clean | Met |
| Scope | Only 5 source files touched | git diff --stat v0.1...HEAD |
Met |
Findings
-
F1 - Minor (nit):
tests/unit/test_protocols.py:620- method nametest_has_exactly_eight_fieldsis stale; docstring says "nine" and assert is== 9. Sage AMBIG #3 only required the value update (done, green). Recommend: rename totest_has_exactly_nine_fieldsin a trivial follow-up. Non-blocking. Notetests/unit/test_dispatch_options_role.py:17anddocs/audit/sage-decisions/bon-332-sage-20260418T004817Z.mdreference the old name - a rename should grep those. -
F2 - Minor (nit):
src/bonfire/dispatch/sdk_backend.py:62still carriesdef __init__(self, *, compiler: Any | None = None)as forward-compat dead-param. This is the correct behavior for BON-337 (Sage section 5 NOT-touched list). Flagging so a future reviewer does not mistake it for an oversight: thecompilerkwarg onClaudeSDKBackend.__init__is distinct from the blockedcompilerkwarg onStageExecutor.__init__(BON-334 D3). -
F3 - Minor (nit):
src/bonfire/engine/pipeline.py:493-496- the ratchet is inline-duplicated betweenexecutor.py:261-264and here, as Sage D6 explicitly forbids helper extraction. Both mirrors are identical line-for-line (verified by the parametrizedTestThreeTierRatchetBackendObservedin both test files). No action - duplication is intentional per Sage lockdown. -
F4 - Minor (nit):
tests/unit/test_tool_policy.py:463-485TestRoleArgumentTypeBoundaryrelies on CPythondict.getsemantics for unhashable inputs. Since Pydanticstrict=Trueupstream catches every non-str before it reachestools_for, these cases are defense-in-depth only. Future: guard withif not isinstance(role, str): return []for belt-and-suspenders parity. Non-blocking. -
F5 - Observation:
options.roleis carried onDispatchOptionsbut not consumed bysdk_backend.pyin this ticket.TestRoleNotConsumedBySdkBackendassertsroledoes NOT land onClaudeAgentOptions- exactly as D7-footer prescribes. Clean seam for BON-338 to consume viahooks=.
Accepted trade-offs
- Inline ratchet duplication between executor and pipeline (Sage D6 locks against extraction). Accepted - pre-existing duplication, refactor is not BON-337 scope.
disallowed_toolsdeferred (Sage section 6 open #1) -TestDisallowedToolsNotSetenforces the absence, matches release-policy scope.- Bard Bash omission (Sage D3 intent) -
TestBardBashOmissionlocks it; future Bard handler owns its own shell. - 217-test canonical RED instead of the original 198: Sage merged both Knight lanes' adversarial tests, so 217 is the correct reconciled number.
Verified live: full unit suite 1298/1298 green after pip install -e . against the worktree; test_rejects_compiler_kwarg still rejects compiler= and compiler+tool_policy together; role=42/role=True/role=None all raise ValidationError; DefaultToolPolicy().tools_for("scout").append(...) does not contaminate subsequent calls.
LGTM. Ship it and gate BON-338 onto the same constructor as planned.
Generated with Claude Code
Antawari
left a comment
There was a problem hiding this comment.
Wizard review (regular lens)
Verdict: APPROVE
Ran the full unit suite locally on bon-337-warrior: 1298/1298 passing (~4.3s). The 217 canonical BON-337 tests across 5 sibling files pass in 1.12s. Every Sage decision D1–D8 and every reconciled ambiguity #1–#5 is honored byte-for-byte. Ultrathought through correctness, security, backward compat, typing, maintainability, and coverage — no blockers surfaced.
Sage decision coverage
| Decision | Status | Notes |
|---|---|---|
D1 — new module src/bonfire/dispatch/tool_policy.py, __all__ = ["DefaultToolPolicy", "ToolPolicy"] |
PASS | File at src/bonfire/dispatch/tool_policy.py:17 with exact __all__. No src/bonfire/security/ created. |
D2 — @runtime_checkable Protocol with tools_for(role: str) -> list[str] |
PASS | tool_policy.py:20-33. Runtime-checked live: isinstance(DefaultToolPolicy(), ToolPolicy) is True. |
D3 — DefaultToolPolicy._FLOOR 8-role dict, exact tool lists |
PASS | tool_policy.py:44-53. Byte-for-byte match with Sage §1 D3. Bard's Bash omission preserved (scout/337 §4). |
D4 — DispatchOptions.role: str with strict=True, default "" |
PASS | protocols.py:68 — role: str = Field(default="", strict=True). None/int/bool/list/dict all raise ValidationError (verified). |
D5 — tool_policy kwarg on both constructors; _tool_policy in StageExecutor.__slots__; PipelineEngine has no __slots__ |
PASS | executor.py:56-65 slot tuple includes "_tool_policy" in alphabetical order; pipeline.py has no __slots__ (verified). |
D6 — three-tier ratchet, exact variable name role_tools, both dispatch sites |
PASS | executor.py:261-264 and pipeline.py:493-496. Identical guard if self._tool_policy is None or not stage.role. role_tools is the exact variable name at both sites. |
D7 — tools=list(options.tools) before existing allowed_tools=options.tools; no disallowed_tools; no role= forwarding |
PASS | sdk_backend.py:105-106 two adjacent lines. role correctly NOT forwarded (asserted in test_sdk_backend_tool_presence.py:426). |
D8 — ToolPolicy NOT in bonfire.protocols.__all__ |
PASS | protocols.py:29-36 unchanged at v0.1 set. Verified live. |
| Ambig #1 — sibling test files, not appended | PASS | Five new canonical files: test_tool_policy.py, test_dispatch_options_role.py, test_engine_executor_tool_policy.py, test_engine_pipeline_tool_policy.py, test_sdk_backend_tool_presence.py. |
Ambig #2 — DispatchOptions(role=None) raises ValidationError |
PASS | Verified live. |
Ambig #3 — test_has_exactly_eight_fields updated 8→9 |
PASS | test_protocols.py:620-633 — set comparison includes "role", assertion on len == 9. Test name kept; docstring updated. Purely cosmetic naming — see non-blocking note. |
Ambig #4 — DispatchOptions(role=42) raises ValidationError |
PASS | strict=True on the field blocks Pydantic int→str coercion. Verified live. |
Ambig #5 — DefaultToolPolicy.tools_for graceful on non-str (returns []) |
PASS | dict.get(role, []) short-circuits on any hashable non-str. Covered by test_tool_policy.py:463-472 (None, int, tuple). Unhashable (list/dict) raises TypeError — documented acceptable per Sage §AMBIG #5. |
Findings
None blocking. All D1–D8 and Ambig #1–#5 verified both by code inspection and live runtime checks on the checked-out bon-337-warrior branch.
Notable strengths worth calling out:
- Ratchet discipline — the exact
role_toolsname + identical guard clause lets both dispatch sites be diffed line-for-line; future refactors will stay aligned.executor.py:261↔pipeline.py:493. - Strict coercion guard —
Field(default="", strict=True)atprotocols.py:68is the minimum-blast-radius knob; no validator hand-roll required. - SDK adjacency —
tools=list(...)atsdk_backend.py:105correctly useslist(...)to hand the SDK a fresh list even though the Pydantic model is frozen; belt-and-suspenders honored. - No scope creep — diff stays inside the eight locations promised in Sage §2 File Manifest. No touch of
bonfire/security/, no newSecurity*events, noHookMatcherimport, nobonfire.pydantic_ai_backend.pyedits. Clean decoupling from BON-338.
Non-blocking notes (tech debt / follow-ups)
- Test name drift —
test_has_exactly_eight_fields(test_protocols.py:620) now asserts nine. Docstring was updated but the name reads as a lie. Consider renaming totest_field_inventoryortest_has_exactly_nine_fieldsin a trailing-edge housekeeping ticket. Non-blocking for v0.1.0. - Sage §6 deferred —
disallowed_toolsfield, per-arg tool scoping (e.g.Write(path)), andAgentRole↔ workflow-string reconciliation remain deferred per Sage §6. File as tech-debt tickets. - Bard handler blocker for Wave 5+ — Bard's tool row intentionally omits
Bashper Scout-3/337 §4. Once the Bard handler lands (out of scope here), it will need to invokeghin-process via a handler path that bypasses the backend dispatch ratchet. This PR correctly surfaces that as a future concern, not a gap to plug now. PipelineEngineslot discipline — No__slots__is consistent with Sage D5 and the current file, but worth noting:StageExecutorgains memory discipline whilePipelineEnginestays dict-shaped. Not wrong, just asymmetric; a follow-up could harmonize.- Pydantic-AI backend untouched —
pydantic_ai_backend.pyneither readsoptions.rolenoroptions.tools. Structural protocol means the kwargs pass silently. If a user swaps backends, the new allow-list surface becomes a no-op — possibly surprising. A doc note in the backend docstring that "this backend does not honortools/role" would help when BON-338 lands.
Full local unit suite: 1298 passed in 4.30s. Canonical BON-337 subset: 217 passed in 1.12s. Green to merge.
🤖 Generated with Claude Code
…entory Wave 4 keystone PRs #14 (bon-337 role) and #15 (bon-338 security_hooks) both add a field to DispatchOptions. Single conflict in test_has_exactly_eight_fields resolved by locking the set to both fields (10 total) and updating the length assertion. protocols.py and sdk_backend.py auto-merged cleanly. Full suite: 1868 passed + 20 xfailed (Scout-2/338 §5 blind-spots) + 1 xpassed.
Summary
ToolPolicyProtocol +DefaultToolPolicy8-role floor matrix insrc/bonfire/dispatch/tool_policy.py. Roles: scout/knight/warrior/prover/sage/bard/wizard/herald with tool lists lifted from private V1 axioms.tool_policy: ToolPolicy | None = Nonekwarg throughStageExecutorandPipelineEngine; three-tier ratchet (empty role → permissive, unmapped role → strict empty, mapped → floor).tools=(presence layer) ANDallowed_tools=(approval layer) onClaudeAgentOptionsfor belt-and-suspenders enforcement (Scout-1/337 §7.3). AddDispatchOptions.role: strwithField(strict=True)to block Pydantic int→str coercion.Test plan
test_rejects_compiler_kwargremains green (Sage D3 lock from BON-334 preserved)test_has_exactly_eight_fieldsupdated 8→9 for the newrolefieldSage decision
`docs/audit/sage-decisions/bon-337-unified-sage-2026-04-18.md` — 724 lines, 8 canonical decisions, 5 reconciled ambiguities locked by the reconciling Sage.
Trust-triangle gate
Closes the W1.5.3 default allow-list floor + W4.1 user-configurable allow-list trust-triangle legs from `docs/release-policy.md:41-43`. Pairs with #bon-338 (default security hook set, W4.2) to complete the v0.1.0 release gate.
Linear
Closes BON-337.
🤖 Generated with Claude Code