
bon-337: per-role tool allow-lists (W1.5.3 floor + W4.1) #14

Merged
Antawari merged 3 commits into v0.1 from bon-337-warrior on Apr 19, 2026

@Antawari
Contributor

Summary

  • Add ToolPolicy Protocol + DefaultToolPolicy 8-role floor matrix in src/bonfire/dispatch/tool_policy.py. Roles: scout/knight/warrior/prover/sage/bard/wizard/herald with tool lists lifted from private V1 axioms.
  • Thread tool_policy: ToolPolicy | None = None kwarg through StageExecutor and PipelineEngine; three-tier ratchet (empty role → permissive, unmapped role → strict empty, mapped → floor).
  • Set both tools= (presence layer) AND allowed_tools= (approval layer) on ClaudeAgentOptions for belt-and-suspenders enforcement (Scout-1/337 §7.3). Add DispatchOptions.role: str with Field(strict=True) to block Pydantic int→str coercion.
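For readers without the diff open, the first bullet can be sketched roughly as follows. This is a minimal illustration, not the contents of tool_policy.py: the tool names in `_FLOOR` are placeholders (the real floor matrix is not reproduced in this conversation), and only two of the eight roles are shown.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ToolPolicy(Protocol):
    """Maps a role name to the tools that role may use."""

    def tools_for(self, role: str) -> list[str]: ...


class DefaultToolPolicy:
    """Floor matrix keyed by role name (placeholder contents)."""

    # Assumption: illustrative tool names only; the real lists differ.
    _FLOOR: dict[str, list[str]] = {
        "scout": ["Read", "Grep"],
        "bard": ["Read", "Write"],  # no "Bash", mirroring the Bard omission
    }

    def tools_for(self, role: str) -> list[str]:
        # list(...) hands back a fresh copy so callers cannot mutate the floor.
        return list(self._FLOOR.get(role, []))


policy = DefaultToolPolicy()
scout_tools = policy.tools_for("scout")
scout_tools.append("Bash")  # mutates the caller's copy only
```

Because the Protocol is `@runtime_checkable`, `isinstance(policy, ToolPolicy)` succeeds without `DefaultToolPolicy` inheriting from it, and a later `policy.tools_for("scout")` is unaffected by the `append` above.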

Test plan

  • 217 canonical Sage-reconciled tests pass across 5 files (tool_policy, dispatch_options_role, engine_executor_tool_policy, engine_pipeline_tool_policy, sdk_backend_tool_presence)
  • 1298/1298 full unit suite — no regressions against v0.1 baseline of 1081 tests
  • test_rejects_compiler_kwarg remains green (Sage D3 lock from BON-334 preserved)
  • test_has_exactly_eight_fields updated 8→9 for the new role field

Sage decision

`docs/audit/sage-decisions/bon-337-unified-sage-2026-04-18.md` — 724 lines, 8 canonical decisions, 5 reconciled ambiguities locked by the reconciling Sage.

Trust-triangle gate

Closes the W1.5.3 default allow-list floor + W4.1 user-configurable allow-list trust-triangle legs from `docs/release-policy.md:41-43`. Pairs with #bon-338 (default security hook set, W4.2) to complete the v0.1.0 release gate.

Linear

Closes BON-337.

🤖 Generated with Claude Code

Contributor Author

@Antawari Antawari left a comment


code-reviewer (superpowered) review

Verdict: APPROVE (posted as comment because GitHub blocks author self-approval; structured table per feedback_wizard_findings_on_pr.md)

This PR is a textbook-clean execution of the unified Sage decision. The diff is surgical (five source files, one-line SDK seam, one-line protocols field, additive kwarg on both engine constructors, a 56-line new tool_policy.py), every canonical Sage decision is independently verifiable, and the 1298/1298 suite is green with test_rejects_compiler_kwarg preserved even under the new tool_policy= kwarg.

Plan adherence (Sage D1..D8 + ambiguity locks)

| Ref | Requirement | Evidence | Status |
|---|---|---|---|
| D1 | Module at src/bonfire/dispatch/tool_policy.py; no bonfire/security/ | src/bonfire/dispatch/tool_policy.py:1-57; git diff --name-only shows no security/ path | Met |
| D2 | @runtime_checkable Protocol, tools_for(role: str) -> list[str], list[str] return | tool_policy.py:20-33; get_type_hints confirms {'role': str, 'return': list[str]} and _is_runtime_protocol=True | Met |
| D3 | DefaultToolPolicy + 8-role _FLOOR byte-for-byte; list(self._FLOOR.get(role, [])) wrap | tool_policy.py:36-56; test_tool_policy.py:225-240 parametrized byte-check; fresh-copy verified | Met |
| D4 | DispatchOptions.role: str = "", strict typing, below permission_mode | protocols.py:68 (role: str = Field(default="", strict=True)); DispatchOptions(role=42) raises ValidationError | Met + strict lock |
| D5 | tool_policy: ToolPolicy \| None = None kwarg on both; TYPE_CHECKING import; __slots__ alphabetical | executor.py:33, 77, 86; __slots__ slot "_tool_policy" between "_project_root" and "_vault_advisor" at 62-64; pipeline.py:53, 102, 111 | Met |
| D6 | Ratchet if self._tool_policy is None or not stage.role: role_tools: list[str] = [] | executor.py:261-264; pipeline.py:493-496 | Met |
| D7 | tools=list(options.tools) directly before allowed_tools=options.tools | sdk_backend.py:105-106 order exact; disallowed_tools absent | Met |
| D8 | ToolPolicy NOT re-exported; bonfire.protocols.__all__ unchanged | protocols.py:29-36; TestProtocolExportDiscipline enforces | Met |
| AMBIG #2 | role=None raises | test_dispatch_options_role.py:182-183 green | Met |
| AMBIG #3 | test_has_exactly_eight_fields updated 8->9 | test_protocols.py:620-633 | Met (name stale, see F1) |
| AMBIG #4 | role=42 rejected via strict=True | protocols.py:68 + test_dispatch_options_role.py:187-188 | Met |
| AMBIG #5 | tools_for defensive for hashable non-str, raises for unhashable | test_tool_policy.py:463-485 | Met |
| Cross-lane | No _compiler, no get_role_tools, no /Projects/bonfire/ imports | grep-verified clean | Met |
| Scope | Only 5 source files touched | git diff --stat v0.1...HEAD | Met |
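The D6 ratchet can be pictured with the sketch below. It is written as a standalone function purely for illustration (the PR keeps the logic inline at both dispatch sites, and Sage D6 forbids extracting a helper). `FloorPolicy`, `resolve_role_tools`, the tool names, and the `else` branch are assumptions; only the guard condition and the `role_tools` name come from the table above.

```python
from typing import Optional


class FloorPolicy:
    """Hypothetical stand-in for a ToolPolicy implementation."""

    _FLOOR = {"scout": ["Read", "Grep"]}  # placeholder tool names

    def tools_for(self, role: str) -> list[str]:
        return list(self._FLOOR.get(role, []))


def resolve_role_tools(tool_policy: Optional[FloorPolicy], role: str) -> list[str]:
    # Tier 1: no policy wired in, or the stage carries no role.
    if tool_policy is None or not role:
        role_tools: list[str] = []
    else:
        # Tier 2 (unmapped role -> []) and tier 3 (mapped role -> floor)
        # are both delegated to the policy. This branch is an assumption.
        role_tools = tool_policy.tools_for(role)
    return role_tools
```

Note that tier 1 and tier 2 both yield an empty list in this sketch; how the backend tells "permissive" apart from "strict empty" downstream is not shown in this conversation.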

Findings

  1. F1 - Minor (nit): tests/unit/test_protocols.py:620 - method name test_has_exactly_eight_fields is stale; docstring says "nine" and assert is == 9. Sage AMBIG #3 only required the value update (done, green). Recommend: rename to test_has_exactly_nine_fields in a trivial follow-up. Non-blocking. Note tests/unit/test_dispatch_options_role.py:17 and docs/audit/sage-decisions/bon-332-sage-20260418T004817Z.md reference the old name - a rename should grep those.

  2. F2 - Minor (nit): src/bonfire/dispatch/sdk_backend.py:62 still carries def __init__(self, *, compiler: Any | None = None) as forward-compat dead-param. This is the correct behavior for BON-337 (Sage section 5 NOT-touched list). Flagging so a future reviewer does not mistake it for an oversight: the compiler kwarg on ClaudeSDKBackend.__init__ is distinct from the blocked compiler kwarg on StageExecutor.__init__ (BON-334 D3).

  3. F3 - Minor (nit): src/bonfire/engine/pipeline.py:493-496 - the ratchet is inline-duplicated between executor.py:261-264 and here, as Sage D6 explicitly forbids helper extraction. Both mirrors are identical line-for-line (verified by the parametrized TestThreeTierRatchetBackendObserved in both test files). No action - duplication is intentional per Sage lockdown.

  4. F4 - Minor (nit): tests/unit/test_tool_policy.py:463-485 TestRoleArgumentTypeBoundary relies on CPython dict.get semantics for unhashable inputs. Since Pydantic strict=True upstream catches every non-str before it reaches tools_for, these cases are defense-in-depth only. Future: guard with if not isinstance(role, str): return [] for belt-and-suspenders parity. Non-blocking.

  5. F5 - Observation: options.role is carried on DispatchOptions but not consumed by sdk_backend.py in this ticket. TestRoleNotConsumedBySdkBackend asserts role does NOT land on ClaudeAgentOptions - exactly as D7-footer prescribes. Clean seam for BON-338 to consume via hooks=.
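To make F4 concrete, here is a small sketch of the boundary it describes and of the suggested guard. The `FLOOR` contents and both function names are hypothetical; the behavior shown is plain `dict.get` semantics.

```python
FLOOR = {"scout": ["Read", "Grep"]}  # placeholder tool names


def tools_for_unguarded(role):
    # Hashable non-str keys simply miss and fall back to the default [];
    # unhashable keys (list, dict) make dict.get raise TypeError.
    return list(FLOOR.get(role, []))


def tools_for_guarded(role):
    # F4's suggested guard: any non-str role yields [] without touching the dict.
    if not isinstance(role, str):
        return []
    return list(FLOOR.get(role, []))


def raises_type_error(fn, value) -> bool:
    try:
        fn(value)
    except TypeError:
        return True
    return False
```

With the guard in place, `tools_for_guarded(["scout"])` returns an empty list, whereas the unguarded version raises `TypeError` for the same input.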

Accepted trade-offs

  • Inline ratchet duplication between executor and pipeline (Sage D6 locks against extraction). Accepted - pre-existing duplication, refactor is not BON-337 scope.
  • disallowed_tools deferred (Sage section 6 open #1) - TestDisallowedToolsNotSet enforces the absence, matches release-policy scope.
  • Bard Bash omission (Sage D3 intent) - TestBardBashOmission locks it; future Bard handler owns its own shell.
  • 217-test canonical RED instead of the original 198: Sage merged both Knight lanes' adversarial tests, so 217 is the correct reconciled number.

Verified live: full unit suite 1298/1298 green after pip install -e . against the worktree; test_rejects_compiler_kwarg still rejects compiler= and compiler+tool_policy together; role=42/role=True/role=None all raise ValidationError; DefaultToolPolicy().tools_for("scout").append(...) does not contaminate subsequent calls.
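The strict-field checks above can be reproduced in isolation with a sketch like this one. It assumes Pydantic v2 and uses a hypothetical one-field model, `DispatchOptionsSketch`, rather than the real DispatchOptions.

```python
from pydantic import BaseModel, Field, ValidationError


class DispatchOptionsSketch(BaseModel):
    """Hypothetical one-field stand-in for DispatchOptions."""

    # strict=True makes the field accept only real str instances.
    role: str = Field(default="", strict=True)


def is_rejected(value) -> bool:
    """Return True if constructing the model with role=value fails validation."""
    try:
        DispatchOptionsSketch(role=value)
    except ValidationError:
        return True
    return False
```

Under these assumptions, `is_rejected(42)`, `is_rejected(True)`, and `is_rejected(None)` all hold, while `"scout"` and the default empty string are accepted.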

LGTM. Ship it and gate BON-338 onto the same constructor as planned.

Generated with Claude Code

Contributor Author

@Antawari Antawari left a comment


Wizard review (regular lens)

Verdict: APPROVE

Ran the full unit suite locally on bon-337-warrior: 1298/1298 passing (~4.3s). The 217 canonical BON-337 tests across 5 sibling files pass in 1.12s. Every Sage decision D1–D8 and every reconciled ambiguity #1–#5 is honored byte-for-byte. Reviewed for correctness, security, backward compatibility, typing, maintainability, and coverage; no blockers surfaced.

Sage decision coverage

| Decision | Status | Notes |
|---|---|---|
| D1: new module src/bonfire/dispatch/tool_policy.py, __all__ = ["DefaultToolPolicy", "ToolPolicy"] | PASS | File at src/bonfire/dispatch/tool_policy.py:17 with exact __all__. No src/bonfire/security/ created. |
| D2: @runtime_checkable Protocol with tools_for(role: str) -> list[str] | PASS | tool_policy.py:20-33. Runtime-checked live: isinstance(DefaultToolPolicy(), ToolPolicy) is True. |
| D3: DefaultToolPolicy._FLOOR 8-role dict, exact tool lists | PASS | tool_policy.py:44-53. Byte-for-byte match with Sage §1 D3. Bard's Bash omission preserved (scout/337 §4). |
| D4: DispatchOptions.role: str with strict=True, default "" | PASS | protocols.py:68 has role: str = Field(default="", strict=True). None/int/bool/list/dict all raise ValidationError (verified). |
| D5: tool_policy kwarg on both constructors; _tool_policy in StageExecutor.__slots__; PipelineEngine has no __slots__ | PASS | executor.py:56-65 slot tuple includes "_tool_policy" in alphabetical order; pipeline.py has no __slots__ (verified). |
| D6: three-tier ratchet, exact variable name role_tools, both dispatch sites | PASS | executor.py:261-264 and pipeline.py:493-496. Identical guard if self._tool_policy is None or not stage.role. role_tools is the exact variable name at both sites. |
| D7: tools=list(options.tools) before existing allowed_tools=options.tools; no disallowed_tools; no role= forwarding | PASS | sdk_backend.py:105-106 two adjacent lines. role correctly NOT forwarded (asserted in test_sdk_backend_tool_presence.py:426). |
| D8: ToolPolicy NOT in bonfire.protocols.__all__ | PASS | protocols.py:29-36 unchanged at v0.1 set. Verified live. |
| Ambig #1: sibling test files, not appended | PASS | Five new canonical files: test_tool_policy.py, test_dispatch_options_role.py, test_engine_executor_tool_policy.py, test_engine_pipeline_tool_policy.py, test_sdk_backend_tool_presence.py. |
| Ambig #2: DispatchOptions(role=None) raises ValidationError | PASS | Verified live. |
| Ambig #3: test_has_exactly_eight_fields updated 8→9 | PASS | test_protocols.py:620-633: set comparison includes "role", assertion on len == 9. Test name kept; docstring updated. Purely cosmetic naming; see non-blocking note. |
| Ambig #4: DispatchOptions(role=42) raises ValidationError | PASS | strict=True on the field blocks Pydantic int→str coercion. Verified live. |
| Ambig #5: DefaultToolPolicy.tools_for graceful on non-str (returns []) | PASS | dict.get(role, []) short-circuits on any hashable non-str. Covered by test_tool_policy.py:463-472 (None, int, tuple). Unhashable (list/dict) raises TypeError; documented acceptable per Sage §AMBIG #5. |

Findings

None blocking. All D1–D8 and Ambig #1–#5 verified both by code inspection and live runtime checks on the checked-out bon-337-warrior branch.

Notable strengths worth calling out:

  1. Ratchet discipline: the exact role_tools name + identical guard clause lets both dispatch sites be diffed line-for-line; future refactors will stay aligned. executor.py:261 ↔ pipeline.py:493.
  2. Strict coercion guard: Field(default="", strict=True) at protocols.py:68 is the minimum-blast-radius knob; no validator hand-roll required.
  3. SDK adjacency: tools=list(...) at sdk_backend.py:105 correctly uses list(...) to hand the SDK a fresh list even though the Pydantic model is frozen; belt-and-suspenders honored.
  4. No scope creep: diff stays inside the eight locations promised in Sage §2 File Manifest. No touch of bonfire/security/, no new Security* events, no HookMatcher import, no bonfire.pydantic_ai_backend.py edits. Clean decoupling from BON-338.
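Strength 3 rests on a general Python point: freezing an object blocks attribute reassignment but not in-place mutation of a list it holds, so copying with `list(...)` is what actually isolates the consumer. The sketch below uses a frozen dataclass as a stdlib stand-in for the frozen Pydantic model; `OptionsSketch` and the tool names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OptionsSketch:
    """Stand-in for a frozen options model carrying a tools list."""

    tools: list[str] = field(default_factory=list)


options = OptionsSketch(tools=["Read", "Grep"])

# Fresh copy, as in tools=list(options.tools): later mutation stays local.
fresh = list(options.tools)
fresh.append("Bash")
tools_after_copy_mutation = list(options.tools)

# Passing the same list object shares state despite frozen=True.
alias = options.tools
alias.append("Write")
tools_after_alias_mutation = list(options.tools)
```

Here `tools_after_copy_mutation` still holds only the two original entries, while `tools_after_alias_mutation` has picked up "Write" through the shared reference.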

Non-blocking notes (tech debt / follow-ups)

  1. Test name drift: test_has_exactly_eight_fields (test_protocols.py:620) now asserts nine. Docstring was updated but the name reads as a lie. Consider renaming to test_field_inventory or test_has_exactly_nine_fields in a trailing-edge housekeeping ticket. Non-blocking for v0.1.0.
  2. Sage §6 deferred: disallowed_tools field, per-arg tool scoping (e.g. Write(path)), and AgentRole ↔ workflow-string reconciliation remain deferred per Sage §6. File as tech-debt tickets.
  3. Bard handler blocker for Wave 5+: Bard's tool row intentionally omits Bash per Scout-3/337 §4. Once the Bard handler lands (out of scope here), it will need to invoke gh in-process via a handler path that bypasses the backend dispatch ratchet. This PR correctly surfaces that as a future concern, not a gap to plug now.
  4. PipelineEngine slot discipline: No __slots__ is consistent with Sage D5 and the current file, but worth noting: StageExecutor gains memory discipline while PipelineEngine stays dict-shaped. Not wrong, just asymmetric; a follow-up could harmonize.
  5. Pydantic-AI backend untouched: pydantic_ai_backend.py reads neither options.role nor options.tools. Because the backend protocol is structural, the kwargs pass through silently. If a user swaps backends, the new allow-list surface becomes a no-op, which may be surprising. A note in the backend docstring that "this backend does not honor tools/role" would help when BON-338 lands.

Full local unit suite: 1298 passed in 4.30s. Canonical BON-337 subset: 217 passed in 1.12s. Green to merge.

🤖 Generated with Claude Code

@Antawari Antawari merged commit 646df88 into v0.1 Apr 19, 2026
@Antawari Antawari deleted the bon-337-warrior branch April 19, 2026 02:15
Antawari added a commit that referenced this pull request Apr 19, 2026
…entory

Wave 4 keystone PRs #14 (bon-337 role) and #15 (bon-338 security_hooks) both
add a field to DispatchOptions. Single conflict in test_has_exactly_eight_fields
resolved by locking the set to both fields (10 total) and updating the length
assertion. protocols.py and sdk_backend.py auto-merged cleanly.

Full suite: 1868 passed + 20 xfailed (Scout-2/338 §5 blind-spots) + 1 xpassed.