Today, cluster tests in tests/unit/cluster/* run in a single process with InMemoryTransport. Real multi-node tests are missing: multiple ActorSystem instances over a real transport, in one process or in worker threads, deterministically coordinated. Akka's MultiNodeSpec is the prior art.
Strategy: worker-thread-based. We already have MessageChannelTransport (in src/cluster/transports/) for worker-thread mesh. The harness builds on that:
- Spawn N worker threads, each with its own ActorSystem.
- Wire them up via MessageChannel for cluster transport.
- Provide barrier synchronisation (Akka-style enterBarrier(name)) for deterministic 'all N nodes reached step X' coordination.
- Test-side API: runMultiNode(['node-a', 'node-b', 'node-c'], async (role) => { ... }). The body runs once per role, in its own worker.
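The barrier semantics can be sketched as coordinator-side bookkeeping. This is a minimal sketch under assumptions, not the harness's implementation: the name BarrierCoordinator and its methods are hypothetical, and it assumes a single coordinator (for example the main thread) that sees one "entered" notification per role and releases all roles together. In the real harness the notification and the release would travel over worker messages rather than in-process promises.

```typescript
// Hypothetical coordinator-side barrier bookkeeping (names are assumptions).
type Release = () => void;

class BarrierCoordinator {
  private readonly roles: readonly string[];
  // barrier name -> role -> callback that releases the waiting role
  private readonly pending = new Map<string, Map<string, Release>>();

  constructor(roles: readonly string[]) {
    this.roles = roles;
  }

  /** Called when `role` reaches barrier `name`; resolves once every role has arrived. */
  enter(name: string, role: string): Promise<void> {
    if (!this.roles.includes(role)) {
      return Promise.reject(new Error(`unknown role: ${role}`));
    }
    const waiters = this.pending.get(name) ?? new Map<string, Release>();
    this.pending.set(name, waiters);
    if (waiters.has(role)) {
      return Promise.reject(new Error(`${role} entered barrier '${name}' twice`));
    }
    return new Promise<void>((resolve) => {
      waiters.set(role, resolve);
      if (waiters.size === this.roles.length) {
        // Last arrival: forget the barrier so the name can be reused, then release everyone.
        this.pending.delete(name);
        for (const release of waiters.values()) release();
      }
    });
  }

  /** Number of roles currently blocked on barrier `name`. */
  waitingOn(name: string): number {
    return this.pending.get(name)?.size ?? 0;
  }
}
```

With three roles, the first two calls to enter('cluster-up', ...) stay pending, and the third call releases all three. A production version would likely also need a timeout so that a crashed node fails the barrier instead of hanging the test.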
Components:
| File | Task |
| --- | --- |
| src/testkit/MultiNodeSpec.ts (new) | harness API: runMultiNode, MultiNodeProbe, barriers |
| src/testkit/internal/MultiNodeRunner.ts (new) | spawn workers, route messages, collect results |
| tests/unit/testkit/MultiNodeSpec.test.ts (new) | self-tests of the harness |
| tests/multi-node/*.test.ts (new) | three real multi-node tests (sharding-rebalance, pubsub-cross-node, singleton-failover) |
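The "collect results" responsibility of the runner could look roughly like the following. This is only a sketch: RoleOutcome, RunSummary and summarize are hypothetical names, and it assumes each worker reports exactly one outcome for its role. The point it illustrates is that a multi-node run should pass only if every role reported success, and a role that never reported counts as a failure.

```typescript
// Hypothetical shapes; the real MultiNodeRunner may report results differently.
type RoleOutcome =
  | { role: string; status: 'passed' }
  | { role: string; status: 'failed'; error: string };

interface RunSummary {
  ok: boolean;
  failedRoles: string[];
  missingRoles: string[];
}

function summarize(roles: readonly string[], outcomes: readonly RoleOutcome[]): RunSummary {
  const reported = new Set(outcomes.map((o) => o.role));
  const failedRoles = outcomes.filter((o) => o.status === 'failed').map((o) => o.role);
  // A worker that exited or hung before reporting should fail the run too.
  const missingRoles = roles.filter((r) => !reported.has(r));
  return {
    ok: failedRoles.length === 0 && missingRoles.length === 0,
    failedRoles,
    missingRoles,
  };
}
```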
Estimate: 4-6 days. No prerequisites. Foundation for #3 (sharding hardening) and #5 (CRDTs / replicated ES) — those depend on this landing first.
Verification — three concrete multi-node tests:
- Sharding rebalance: 3 nodes, one fails, shards rebalance, all entities reachable.
- PubSub cross-node: 3 nodes; subscribe on A, publish on B, receipt on C.
- Singleton failover: 3 nodes; singleton runs on oldest, oldest dies, failover to next-oldest.
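The singleton failover expectation can be written down as a small pure function that a test might compare the cluster's answer against. This is a sketch under assumptions: the Member shape and the joinedAt field are hypothetical, and "oldest" is modelled simply as the lowest join order among live members, which may differ from how the cluster actually tracks member age.

```typescript
// Hypothetical member record; field names are assumptions for this sketch.
interface Member {
  role: string;
  joinedAt: number; // lower means older
  alive: boolean;
}

/** Returns the role expected to host the singleton: the oldest live member. */
function expectedSingletonHost(members: readonly Member[]): string | undefined {
  const live = members.filter((m) => m.alive);
  if (live.length === 0) return undefined;
  return live.reduce((oldest, m) => (m.joinedAt < oldest.joinedAt ? m : oldest)).role;
}

const before: Member[] = [
  { role: 'node-a', joinedAt: 1, alive: true },
  { role: 'node-b', joinedAt: 2, alive: true },
  { role: 'node-c', joinedAt: 3, alive: true },
];
// Simulate the oldest node dying.
const after = before.map((m) => (m.role === 'node-a' ? { ...m, alive: false } : m));

console.log(expectedSingletonHost(before)); // node-a
console.log(expectedSingletonHost(after)); // node-b
```

In the real test, each surviving role would enter a barrier after the oldest node is stopped and then assert that the singleton is reachable on the role this function predicts.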
See the roadmap plan for full context (item 2 of 5).