refactor(py): split 4,129-line lib.rs into 15 focused modules + thin Python layer by bkrabach · Pull Request #48 · microsoft/amplifier-core

bkrabach · 2026-03-15T13:45:22Z

Summary

Three-phase refactor of the Python bindings in bindings/python/src/, followed by three rounds of expert review (architecture, security, code quality) with all findings resolved.

Phase 1: Mechanical File Split

Split monolithic lib.rs (4,129 lines) into 11 focused module files
lib.rs reduced to ~234-line router with mod declarations + #[pymodule] registration
Zero behavior change — purely mechanical file reorganization

Phase 2: Coordinator Decomposition

Decomposed the ~900-line PyCoordinator into coordinator/ subdirectory with 4 sub-modules:
- mod.rs (struct definition + lifecycle)
- mount_points.rs (module storage and retrieval)
- capabilities.rs (extensibility surface)
- hook_dispatch.rs (hook result routing)

Phase 3: Thin the Python Layer

Added PyContextManagerBridge to bridges.rs (completing the bridge trio)
Implemented process_hook_result() in Rust (coordinator/hook_dispatch.rs)
Patched PySession::new() to construct RustCoordinator directly
Deleted _rust_wrappers.py (247 lines of Python wrapper code now in Rust)
Updated __init__.py and coordinator.py re-exports
Consolidated duplicated mount-point validation logic

Review Fixes (3 rounds × 3 reviewers)

Security: Unified approval decision parsing via shared is_approval_granted() (fail-closed allowlist). Content sanitization via _sanitize_for_llm with fail-closed fallback. Size limits and token budgets validated against post-sanitization content. Timeout defaults routed through same fail-closed logic.
Architecture: UTC timezone for audit timestamps. Diagnostic logging in try_model_dump fallback. Documented approval vocabulary constraints.
Quality: Removed DRY violations (_raw helpers, redundant is_none() guards). Fixed version string to env!("CARGO_PKG_VERSION"). Removed dead "agents" match arms. Reconciled unsafe impl Send/Sync across all 4 bridges. Phase-based section labels. Documented loader param, infallible unwrap, and cleanup divergence.

File Structure (After)

bindings/python/src/
├── lib.rs (234 lines — pure router)
├── helpers.rs (43), bridges.rs (421), session.rs (676)
├── hooks.rs (284), cancellation.rs (185), errors.rs (256)
├── retry.rs (205), module_resolver.rs (90), wasm.rs (898)
├── coordinator/
│   ├── mod.rs (~470), mount_points.rs (~430)
│   ├── capabilities.rs (155), hook_dispatch.rs (~550)
└── tests/mod.rs

Python-side Changes

_rust_wrappers.py deleted (247 lines)
ModuleCoordinator now aliases directly to RustCoordinator
process_hook_result() routing logic moved to Rust
Duplicated mount-point checks removed from _session_exec.py

Motivation

Maintainability: 4,129-line monolith was unmaintainable. Follows the Node binding split pattern (PR refactor: split Node bindings lib.rs into focused modules #46).
DRY across bindings: Moving process_hook_result to Rust means the routing logic is available to any future binding without reimplementation.
Thinner Python layer: Eliminates _rust_wrappers.py entirely — one less layer between consumers and the Rust kernel.
Security hardening: All approval paths now fail-closed via shared allowlist. Content sanitization enforced. Audit timestamps UTC.

Backward Compatibility

Verified by exploration of amplifier-foundation and amplifier-app-cli:

ModuleCoordinator importable from amplifier_core and amplifier_core.coordinator ✅
HookResult importable from all 3 paths (., .models, .hooks) ✅
session.config dict mutation works ✅
coordinator.session_state dict mutation works ✅
No external code ever referenced _rust_wrappers or process_hook_result directly ✅

Test Results

Suite	Result
`cargo check -p amplifier-core-py`	✅ Clean
`cargo clippy -p amplifier-core-py`	✅ Zero warnings
`maturin develop`	✅ Built + installed
`pytest tests/ -m "not slow"`	✅ 549 passed (517 original + 32 new)

Expert Review Summary

Reviewer	Rounds	Final Verdict
Zen-Architect	3	All concerns resolved
Security-Guardian	3	✅ APPROVED — Risk: 🟢 LOW
Code-Quality-Reviewer	3	All concerns resolved

Design Document

See docs/plans/2026-03-14-python-bindings-split-design.md

Covers the plan to split bindings/python/src/lib.rs (4,129 lines) into ~14 focused modules, decompose PyCoordinator into a subdirectory with 4 sub-modules, and move process_hook_result() from Python into the Rust binding layer — eliminating _rust_wrappers.py entirely. Key decisions: - Approach A: smarter binding layer, kernel stays pure mechanism - New PyContextManagerBridge completes the bridge trio - Coordinator decomposed into mod.rs + mount_points/capabilities/hook_dispatch - Duplicated config/mount-point validation canonicalized in Rust - Full backward compatibility preserved (external contract confirmed)

…_dump)

…iderBridge, PyDisplayServiceBridge)

…cellation.rs

…, compute_delay)

…rom_path)

… load_and_mount_wasm)

…b.rs

- Replace wildcard re-exports (pub(crate) use module::*) with explicit re-exports for each module's public items - Group all mod declarations together (alphabetically ordered) - Group all pub(crate) use re-exports together after mod declarations - Fix unused import 'use crate::PyCoordinator' in session.rs cargo check, cargo check --tests, and cargo clippy all pass clean.

Split the monolithic 4,129-line lib.rs into: - helpers.rs (~30 lines) — wrap_future_as_coroutine, try_model_dump - bridges.rs (~330 lines) — PyHookHandlerBridge, PyApprovalProviderBridge, PyDisplayServiceBridge - cancellation.rs (~180 lines) — PyCancellationToken - errors.rs (~255 lines) — PyProviderError - retry.rs (~200 lines) — PyRetryConfig, classify_error_message, compute_delay - hooks.rs (~270 lines) — PyUnregisterFn, PyHookRegistry - session.rs (~660 lines) — PySession - module_resolver.rs (~90 lines) — resolve_module, load_wasm_from_path - coordinator.rs (~900 lines) — PyCoordinator (single file; Phase 2 decomposes to subdir) - wasm.rs (~880 lines) — 6 PyWasm* classes, NullContextManager, load_and_mount_wasm - lib.rs (~230 lines) — router: mod decls, re-exports, #[pymodule], tests Zero behavior change. All 517 Python tests pass. All 14 Rust tests pass.

…hase-1 cleanup) - errors.rs: removed duplicate section banner (lines 7-9) - retry.rs: removed duplicate section banner (lines 8-10) - hooks.rs: removed spurious #[allow(clippy::too_many_arguments)] from PyHookRegistry::new() All tests pass: cargo test (14/14), clippy (0 warnings), pytest (336/336).

…module

…ts.rs

…s.rs

…tion in to_dict()

…esult

- Add hook_dispatch.rs with #[pymethods] impl PyCoordinator containing process_hook_result() method that routes HookResult actions to subsystems - Implement call_approval_system() async helper for approval flow - Add injection_size_limit_raw() and injection_budget_raw() helpers in mod.rs - Preserve exact behavioral semantics from _rust_wrappers.py: - ask_user returns early (short-circuits steps 3 and 4) - user_message fires on field being truthy, not on action field - size limit violation is hard error (ValueError) - budget overage is soft warning (logs but continues) - ephemeral skips context injection but still counts tokens - _handle_user_message is synchronous - token estimate: len(content) // 4 Co-authored-by: Amplifier <amplifier@sourcegraph.com>

…ok_dispatch - Removed unused _timeout_err_cls parameter from call_approval_system async helper - Extracted private make_hook_result helper to replace 5 near-identical hook result construction blocks - All early-return sites in approval flow now use the centralized helper 🤖 Generated with Amplifier Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

- Extract shared FakeApproval(response) class at module level, replacing two identical local classes in test_ask_user_approved and test_ask_user_denied - Rename local FakeApproval in timeout test to TimeoutApproval to eliminate shadowing of the module-level class - Add process_hook_result signature to RustCoordinator in _engine.pyi, resolving 9 Pyright reportAttributeAccessIssue false positives All 9 tests still pass. Ruff and Pyright clean.

…_rust_wrappers import

…iminate shadowing

- Replace 'from ._rust_wrappers import ModuleCoordinator' with 'from ._engine import RustCoordinator as ModuleCoordinator' - Also fix coordinator.py which had a matching _rust_wrappers import (required for the smoke test to pass, since _rust_wrappers is deleted) - Add TDD test: tests/test_module_coordinator_import.py Co-authored-by: Amplifier <amplifier@myco.io>

Add TDD tests verifying coordinator.py imports ModuleCoordinator from amplifier_core._engine (RustCoordinator), not the deleted _rust_wrappers module. - coordinator.py already contains the correct _engine import (from prior task) - Added tests/test_coordinator_reexport_engine.py with 5 targeted tests: * ModuleCoordinator importable from amplifier_core.coordinator * __all__ exposes ModuleCoordinator * ModuleCoordinator is RustCoordinator from _engine (identity check) * Source does not reference deleted _rust_wrappers module * Source contains the canonical _engine import line All 5 new tests pass. Full suite: 534 passed, 4 pre-existing failures (test_session.py / test_cancellation_resilience.py mock-attribute incompatibility with Rust extension type — not introduced by this task, confirmed pre-existing before this commit via git stash verification).

- Removed redundant mount-point presence checks (ValueError/RuntimeError) from run_orchestrator() function - These validations are duplicated in Rust's PySession::execute() which already validates before calling run_orchestrator() - Updated implementation to use 'or {}' fallbacks for orchestrator, context, and providers - Removed ~15 lines of redundant error handling code - Added comprehensive test file (test_session_exec.py) verifying run_orchestrator does not raise for missing mount points - All 537 tests pass with these changes 🤖 Generated with Amplifier Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>

…estore spec test

…_python_stubs - cargo fmt: reformat bridges.rs and hook_dispatch.rs per rustfmt conventions - test_python_stubs: update test_coordinator_reexport to import from _engine instead of deleted _rust_wrappers (Phase 3 deleted that module) Co-authored-by: Amplifier <amplifier@amplified.dev>

…red parser Add is_approval_granted() to helpers.rs — only 'allow', 'allow once', and 'allow always' (case-insensitive) return true. Unknown or unexpected strings are treated as denial. Replace the old case-sensitive 'decision == "Deny"' check in hook_dispatch.rs (fail-OPEN: anything other than exact 'Deny' was silently allowed) and the '.to_lowercase().contains("approve")' check in bridges.rs (fail-CLOSED but inconsistent) with the single shared function. Both paths now agree on semantics, and hook_dispatch.rs emits log::warn! when an unexpected decision string is received so operators can diagnose misbehaving approval systems.

…k warning capabilities.rs line 82: replace .unwrap() after get_item() with .ok_or_else() that returns a descriptive PyRuntimeError. This was the only naked unwrap() across the 15-file split. Also add log::warn! in the else branch of the _collect_helper import so that a missing helper module produces a visible diagnostic instead of silently degrading to sync-only callback collection.

Replaces the hardcoded '1.0.0' string in lib.rs with env!("CARGO_PKG_VERSION") so the version exposed to Python (amplifier_core._engine.__version__) always matches the version in Cargo.toml (currently 1.2.2) without manual updates.

'agents' is never initialized as a mount point in PyCoordinator::new() — the mount_points dict contains only 'orchestrator', 'providers', 'tools', 'context', 'hooks', and 'module-source-resolver'. Remove 'agents' from the three match arms in mount(), get(), and unmount() so that any code attempting to mount to 'agents' correctly hits the 'Unknown mount point' error rather than silently succeeding with a key that will never be populated.

… and session Add an explanatory block comment in PySession::cleanup()'s async block clarifying WHY it intentionally swallows all exceptions (including BaseException subclasses like KeyboardInterrupt and SystemExit), in contrast to PyCoordinator::cleanup() which re-raises fatal non-Exception BaseExceptions after all cleanup functions have run. This addresses the DRY observation from code review: the two loops look similar but have opposing semantics for fatal errors. The comment makes the divergence explicit so future readers don't 'fix' session.rs to match coordinator.rs and accidentally break the session lifecycle guarantee.

…are-dict input, ask_user message suppression Four new test cases addressing gaps identified in code review: 1. test_inject_context_ephemeral_skips_add_message_but_counts_tokens - ephemeral=True: add_message NOT called, but token counter IS incremented 2. test_ask_user_timeout_allow_default - Approval timeout with approval_default='allow' returns action='continue' (companion to the existing timeout+deny test) 3. test_bare_dict_hook_result_raises_attribute_error - Plain dict passed as result raises AttributeError promptly (not silent bug) 4. test_ask_user_does_not_process_user_message - When action='ask_user', the user_message field is NOT routed to the display system (ask_user returns early before user_message processing) Also update test_inject_context_size_limit_exceeded to match the new error message ('characters' instead of 'bytes' after the Fix 3 char-count change).

… vocabulary PyApprovalProviderBridge was presenting ["approve", "deny"] to the Python approval system, but is_approval_granted() only matches "allow", "allow once", "allow always". This made the bridge permanently unable to approve — a conforming approval system returning "approve" would always be rejected. Change options to ["Allow", "Deny"] and default to "Deny", matching the vocabulary already used in hook_dispatch.rs:338-339 and accepted by is_approval_granted().

Previously, if _sanitize_for_llm failed (import error, runtime error), the hook injection would proceed with raw unsanitized content and emit a warning. This is the definition of a fail-open security fallback. Now: sanitization failure emits log::error! at SECURITY level and returns PyValueError, rejecting the injection entirely. If sanitization is not available, the content cannot be trusted and must not be injected.

…val_granted hook_dispatch.rs:428 used `if approval_default == "deny"` (exact, case-sensitive). If a HookResult model permitted "Deny" or "DENY" as the default, they fell through to the else branch and continued (allowed). This is the same class of bug as the original S1 finding. Replace the case-sensitive string comparison with is_approval_granted() so the timeout path uses identical fail-closed semantics: only "allow", "allow once", "allow always" (case-insensitive) grant access. Everything else — including "deny", "Deny", empty string, or unexpected values — denies.

…nst sanitized content - Add truncate_for_log() helper — caps prompt/message text at 200 chars in all log messages and HookResult.reason fields (S6, N5). Hook-controlled strings can no longer flood audit logs or embed sensitive data in reason payloads. - Downgrade user_message fallback log from info! to debug! (no display_system configured — not an operator-level event, and now truncated as well). - Restructure inject_context Phase A to sanitize FIRST, then validate: A.1 _sanitize_for_llm (fail closed on error — unchanged) A.2 size limit check against *sanitized* char_count (was pre-sanitization) A.3 budget check against *sanitized* token count (was pre-sanitization) A.4 budget counter increment — only after sanitization succeeds (N4, NEW-1) A.5 build message dict A.6 audit log Previously the size limit and budget counter were evaluated against raw hook-provided content before any sanitization ran. - Add NOTE comment on approval_options: custom option strings must contain an allow-family value for is_approval_granted() to match; unknown strings are denied (fail-closed by design). (NC-NEW-1) - Replace numbered section comments (Section 1 / 3 / 4 / 1d / 2) with Phase A–E labels and add phase-ordering rationale in the method doc-comment, matching the actual sync-before-async execution model. (Q8)

session.rs: - Add inline comment explaining why loader param is accepted but not used (API compat with Python AmplifierSession signature; _session_init.py retrieves the loader from the coordinator config). - Add runtime log::warn! when a non-None loader is passed so callers get immediate feedback instead of silent ignoring. (Q9) session.rs + coordinator/mod.rs: - Add // Safety comment on the .unwrap() call in parent_id getter: IntoPyObject<&String> has Error = Infallible — unwrap cannot panic. Consistent with the codebase's no-unwrap standard; intent is now explicit rather than requiring the reader to know the Infallible rule. (NEW-3)

The loader, approval_system, and display_system property getters had a pattern of: let obj = self.xxx_obj.bind(py); if obj.is_none() { py.None() } else { self.xxx_obj.clone_ref(py) } clone_ref() on a Py<PyAny> that holds Python None already returns a clone of the None reference — the branch is dead code and every reader must verify that clone_ref(None) == None to convince themselves it's correct. Replace all three with the direct form: self.xxx_obj.clone_ref(py) This is shorter, less surprising, and identical in behavior. (Q10)

The test was already correctly skipped with a brief reason string. This commit replaces that one-liner with a full explanation of WHY the Rust engine handles the scenario differently: - Python CancellationToken.trigger_callbacks() can propagate KeyboardInterrupt/SystemExit out to the caller via pytest.raises(). - The Rust version wraps trigger_callbacks inside a pyo3_async_runtimes::tokio::future_into_py Tokio task; BaseExceptions raised inside that task are caught by the async bridge before they can surface as a Python exception — preventing them from tearing down the Tokio runtime (intentional design). - The coordinator-level BaseException re-raise (first_fatal in mod.rs) IS still in place, but that's a different code path from the CancellationToken itself. - Adds a TODO for adding dedicated coordinator.cleanup() BaseException test coverage.

bkrabach and others added 30 commits March 14, 2026 15:41

chore: clean baseline before python bindings split

cde65c4

refactor(py): extract helpers.rs (wrap_future_as_coroutine, try_model…

27a0fa0

…_dump)

refactor(py): extract bridges.rs (PyHookHandlerBridge, PyApprovalProv…

45e9b60

…iderBridge, PyDisplayServiceBridge)

refactor(py): extract cancellation.rs (PyCancellationToken)

8aa7e63

fix(py): remove orphaned HashSet import and stale clippy allow in can…

508470a

…cellation.rs

refactor(py): extract errors.rs (PyProviderError)

08a93f7

refactor(py): extract retry.rs (PyRetryConfig, classify_error_message…

a54c0fb

…, compute_delay)

style(py): remove scaffolding placeholder comment from retry.rs

1a87961

refactor(py): extract hooks.rs (PyUnregisterFn, PyHookRegistry)

1bad339

refactor(py): extract session.rs (PySession)

6b21cf1

refactor(py): extract module_resolver.rs (resolve_module, load_wasm_f…

e64655d

…rom_path)

refactor(py): extract coordinator.rs (PyCoordinator, ~900 lines)

42c858d

refactor(py): extract wasm.rs (6 PyWasm* classes, NullContextManager,…

e009ac7

… load_and_mount_wasm)

fix(py): fix test compile error and remove stale unused imports in li…

8cc4361

…b.rs

refactor(py): convert coordinator.rs to coordinator/mod.rs directory …

468069c

…module

refactor(py): extract mount-point methods into coordinator/mount_poin…

4ef34f2

…ts.rs

refactor(py): extract capability methods into coordinator/capabilitie…

9b7a3ef

…s.rs

refactor(py): add placeholder coordinator/hook_dispatch.rs for Phase 3

03b85f0

refactor(py): clean up coordinator/mod.rs after method extraction

fd60bbe

style(py): clean up coordinator/mod.rs comments after code review

3216252

style(py): add clarifying comments for unwrap_or_default() key extrac…

d410a2a

…tion in to_dict()

feat: add PyContextManagerBridge to bridges.rs

54ebd9d

test: add failing tests for PyContextManagerBridge via process_hook_r…

8bb2a57

…esult

test: add comprehensive tests for process_hook_result action branches

e04402c

bkrabach and others added 29 commits March 14, 2026 21:54

style: replace per-test import guards with module-level importorskip

b0588bd

feat: add fatal-exception re-raise to cleanup() in coordinator/mod.rs

e55efb2

feat: construct RustCoordinator directly in PySession::new(), remove …

dad4553

…_rust_wrappers import

style: rename inner kwargs to coord_kwargs in coordinator block to el…

91e457e

…iminate shadowing

chore: delete _rust_wrappers.py (247 lines, logic now in Rust)

e50d4ef

fix: use patch.object at class level for RustCoordinator.cleanup mocks

206471b

test: add backward compatibility tests for Phase 3

bcf21c1

fix: add session_state backward-compat alias to RustCoordinator and r…

e50a4c6

…estore spec test

fix: add diagnostic logging to try_model_dump fallback

e76e8ed

fix: use UTC timezone for audit timestamps in hook context injection

851a566

refactor: remove duplicate _raw helpers, call public getters directly

ed09eac

bkrabach merged commit 9593bbe into main Mar 16, 2026
2 of 6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(py): split 4,129-line lib.rs into 15 focused modules + thin Python layer#48

refactor(py): split 4,129-line lib.rs into 15 focused modules + thin Python layer#48
bkrabach merged 59 commits intomainfrom
refactor/python-bindings-split

bkrabach commented Mar 15, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

bkrabach commented Mar 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Phase 1: Mechanical File Split

Phase 2: Coordinator Decomposition

Phase 3: Thin the Python Layer

Review Fixes (3 rounds × 3 reviewers)

File Structure (After)

Python-side Changes

Motivation

Backward Compatibility

Test Results

Expert Review Summary

Design Document

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

bkrabach commented Mar 15, 2026 •

edited

Loading