refactor(ocap-kernel): Extract VatManager and SubclusterManager from Kernel class#651
rekmarks
left a comment
Looking at this PR and its sibling #653, I'm left with some questions that I don't have easy answers to. The Kernel.ts we're left with after #653 (ref) is much shorter than on current main (ref). It maintains its existing interface, and the primary difference is that the body of each method now calls into a method that lives on one of the new manager classes. In other words, the refactor amounts to turning the Kernel into a facade.
Normally, I understand a facade to be something you erect on top of a messy agglomeration of stuff to present a single interface to some consumer. Here, we already have a single interface for our consumers in the form of the Kernel class, and are instead dividing its internals into separate components.
The goal of this refactor is to improve maintainability. It seems that the refactor is a net gain in tests, and I imagine it may be difficult to unit test the Kernel as thoroughly as we have unit tested its constituent components in this PR. However, I wonder if coordinating the new sub-components of the kernel will be more work than keeping them in a single place? I have seen this kind of "fat" orchestrator class grow beyond all reason, but I don't believe we are there yet with the kernel.
I'm curious as to how you view the tradeoffs here @sirtimid.
@rekmarks I would argue this isn't a facade pattern, but rather a separation-of-concerns refactor following the Single Responsibility Principle. We're not hiding complexity, we're organizing it better. We didn't think "Kernel is hard to use, let's add a simple interface." We thought "Kernel is doing too many things in one place, let's organize it better." Here's why it's worth it:
Clearer Responsibilities
Instead of one 600+ line file doing everything, we now have smaller, focused files:
Concrete Benefits
Why Refactor Now?
While 600 lines isn't "unreasonably large" yet, waiting until it gets worse makes refactoring harder. The code was already naturally grouped (all remote methods together, all service methods together); we just made those boundaries official. Early refactoring prevents the technical debt that comes from waiting until it's "bad enough."
Addressing Coordination Overhead
You raised a valid concern about coordination complexity. In practice, it's minimal:
The coordination is minimal, managers share Changes stay localized:
The coordination "cost" is essentially zero, just one function call, while the benefits (focused testing, safer changes, clearer boundaries) are immediate.
The Public API Benefit
Most importantly, the Kernel now clearly defines our public API. Before, it mixed public methods with internal implementation details. Now, anyone reading The refactor makes the codebase more intentional: public API in Kernel, implementation in managers. What do you think works better for the team?
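To make the "just one function call" claim concrete, here is a minimal sketch of the delegation shape being discussed. The class bodies, the array-backed storage, and the method signatures are simplified assumptions for illustration; they are not the actual ocap-kernel implementation.

```typescript
// Hypothetical, simplified types: the real VatManager tracks far more state.
type VatId = string;

class VatManager {
  private vatIds: VatId[] = [];

  launchVat(vatId: VatId): void {
    this.vatIds.push(vatId);
  }

  terminateVat(vatId: VatId): void {
    const index = this.vatIds.indexOf(vatId);
    if (index === -1) {
      throw new Error(`Vat not found: ${vatId}`);
    }
    this.vatIds.splice(index, 1);
  }

  getVatIds(): VatId[] {
    // Return a copy so callers cannot mutate the manager's internal list.
    return this.vatIds.slice();
  }
}

// The Kernel keeps its public interface; each method body is one delegating call.
class Kernel {
  private readonly vatManager: VatManager;

  constructor(vatManager: VatManager = new VatManager()) {
    this.vatManager = vatManager;
  }

  launchVat(vatId: VatId): void {
    this.vatManager.launchVat(vatId);
  }

  terminateVat(vatId: VatId): void {
    this.vatManager.terminateVat(vatId);
  }

  getVatIds(): VatId[] {
    return this.vatManager.getVatIds();
  }
}
```

Because the manager is passed in through the constructor in this sketch, it can be unit tested on its own, and the Kernel can be tested against a stub.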
rekmarks
left a comment
Alright, let's see how this plays out!
      throw new SubclusterNotFoundError(subclusterId);
    }
    const vatIdsToTerminate = this.#kernelStore.getSubclusterVats(subclusterId);
    for (const vatId of vatIdsToTerminate.reverse()) {
Bug: Array Mutation in Subcluster Management
The terminateSubcluster and reloadSubcluster methods directly mutate arrays (like vatIdsToTerminate and subcluster.vats) by calling .reverse(). This mutates what might be internal kernel store data, potentially corrupting state and causing unexpected behavior for other parts of the system.
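A minimal sketch of the issue and one possible fix follows. `FakeKernelStore` is a hypothetical stand-in that deliberately returns its internal array; whether the real `getSubclusterVats` returns a reference or a fresh copy is not confirmed here, so this shows the worst case the report warns about.

```typescript
// Hypothetical store that hands out its internal array (assumed worst case).
class FakeKernelStore {
  private readonly subclusterVats: string[] = ['v1', 'v2', 'v3'];

  getSubclusterVats(): string[] {
    return this.subclusterVats;
  }
}

// Array.prototype.reverse() reverses in place, so this also reorders the
// array held inside the store.
function terminationOrderMutating(store: FakeKernelStore): string[] {
  return store.getSubclusterVats().reverse();
}

// Copying first (slice) leaves the store's array untouched.
function terminationOrderSafe(store: FakeKernelStore): string[] {
  return store.getSubclusterVats().slice().reverse();
}
```

On runtimes that support ES2023, `toReversed()` is another way to get a reversed copy without mutating the source array.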
Ref: #632
Split the monolithic Kernel class into focused manager classes to improve separation of concerns and maintainability. Created VatManager to handle all vat lifecycle operations (launch, terminate, restart, ping) and SubclusterManager for subcluster operations (launch, reload, terminate). This refactoring reduces the Kernel class from handling 15+ responsibilities to focusing on core coordination, while delegating vat and subcluster management to specialized classes. Added comprehensive test coverage for both new manager classes.
Note
Split vat and subcluster lifecycle logic into new managers, refactoring Kernel to delegate to them and adding comprehensive tests with updated coverage thresholds.
- VatManager and SubclusterManager (e.g., initializeAllVats, launchSubcluster, terminateSubcluster, reloadSubcluster, restartVat, terminateVat, getVats/Ids, pin/unpin, pingVat, collectGarbage, reload). Removes direct vat map management and related helpers.
- vats/VatManager.ts: Handles vat lifecycle (run/launch/restart/terminate/pin/unpin/ping, reap, collect garbage, initialize from store).
- vats/SubclusterManager.ts: Handles subcluster lifecycle (launch/reload/terminate, membership queries, reload all subclusters).
- vats/VatManager.test.ts and vats/SubclusterManager.test.ts with extensive coverage of manager behaviors.
- Kernel.test.ts to align with delegation and behavior changes (e.g., restartVat now returns the new handle and preserves vatId).
- vitest.config.ts for packages/ocap-kernel/**.
Written by Cursor Bugbot for commit e12bf61.