
Use multiple jvms to exec tests #5

Merged
sake92 merged 2 commits into main from multiple-test-forks
Apr 17, 2026
@sake92 sake92 commented Apr 17, 2026

Multi-fork test execution

Adds an optional second dimension of parallelism to deder test — run a module's test
suite across several forked JVMs, in addition to the existing in-JVM thread pool.

Motivation

Large test suites bottleneck on single-JVM contention (library-global locks, Netty
worker groups, JIT warmup on cold classes, I/O serialization). We already fork one JVM
per test task and thread-parallelize inside it; adding a second dimension — multiple
forks per module — scales throughput on machines with spare cores/memory.

Defaults are unchanged: maxTestForks = 1 per module is equivalent to today's behavior.
Opting in takes a single Pkl field.

Design in brief

  • Topology. The orchestrator becomes a supervisor that spawns up to maxTestForks
    forked JVMs. Each fork continues to run its internal thread pool of size
    testParallelism. Effective concurrent classes per module ≈ M × testParallelism.
  • Effective M = min(maxTestForks, totalTestClasses) — no empty forks spawned.
  • Distribution. Longest-Processing-Time-first bin-packing weighted by persisted
    per-class duration history. Missing durations fall back to the median; first run with
    no history falls through to round-robin.
  • Global throttle. A new server-wide Semaphore caps concurrent forked JVMs across
    the whole server (sized by maxConcurrentTestForks, default =
    Runtime.availableProcessors()). One permit per spawned JVM, released on exit.
    Natural staggering when many modules ask for many forks at once; no deadlock because
    permits are per-JVM, not per-task.
  • JDWP guard. If jvmOptions contains a fixed -agentlib:jdwp=... port and
    maxTestForks > 1, the orchestrator refuses with a clear error (multiple forks
    can't all bind the same debug port). Users set maxTestForks = 1 when debugging.
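
The distribution rule above can be sketched roughly as follows. This is an illustration under assumptions, not the code in TestDistribution.scala: the LptSketch.distribute name, durations as Long milliseconds, and the choice of the upper median for classes without history are all hypothetical.

```scala
object LptSketch {
  /** Hypothetical helper: splits test classes across forks, longest first. */
  def distribute(
      classes: Seq[String],
      history: Map[String, Long], // className -> last known duration (ms, assumed unit)
      maxTestForks: Int
  ): Vector[Vector[String]] = {
    // Effective M = min(maxTestForks, totalTestClasses), so no fork is left empty.
    val m = math.min(maxTestForks, classes.size)
    // Durations known for the discovered classes, ascending.
    val known = classes.flatMap(history.get).sorted
    if (m <= 0) Vector.empty
    else if (known.isEmpty) {
      // No usable history: plain round-robin in discovery order.
      Vector.tabulate(m) { fork =>
        classes.zipWithIndex.collect { case (c, i) if i % m == fork => c }.toVector
      }
    } else {
      // Classes without a recorded duration are weighted with the (upper) median.
      val median = known(known.size / 2)
      val longestFirst =
        classes.map(c => (c, history.getOrElse(c, median))).sortBy { case (_, d) => -d }
      // Greedy LPT: each class goes to the fork with the smallest load so far.
      val loads = Array.fill(m)(0L)
      val bins = Array.fill(m)(Vector.empty[String])
      longestFirst.foreach { case (c, d) =>
        val i = loads.indexOf(loads.min)
        loads(i) += d
        bins(i) = bins(i) :+ c
      }
      bins.toVector
    }
  }
}
```

With durations A=100, B=60, C=50, D=10 and two forks, the sketch assigns (A, D) to one fork and (B, C) to the other, 110 ms each.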

Output & capture

Forks talk back to the orchestrator over a JSON-lines envelope protocol on stdout
(ForkStarted, SuiteStarted, SuiteCompleted, UnattributedOutput, ForkCompleted).
Envelope lines are prefixed with @@DEDER-FORK@@, so the orchestrator can tell them apart
from stray bytes written before capture is installed; such bytes are tolerated.
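
On the orchestrator side, splitting a fork's stdout into envelopes and stray output could look like the sketch below. The type and function names are hypothetical, and it assumes the JSON payload follows the prefix directly on the same line; decoding the JSON into the envelope ADT is left out.

```scala
object EnvelopeLineSketch {
  // Prefix taken from the description above.
  val Prefix = "@@DEDER-FORK@@"

  sealed trait ForkLine
  /** A protocol line; `json` still has to be decoded into an envelope. */
  final case class EnvelopeJson(json: String) extends ForkLine
  /** Anything else, e.g. output printed before the capturing stream was installed. */
  final case class StrayOutput(text: String) extends ForkLine

  def classify(line: String): ForkLine =
    if (line.startsWith(Prefix)) EnvelopeJson(line.substring(Prefix.length))
    else StrayOutput(line)
}
```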

Inside each fork, a capturing PrintStream is installed on System.out before any
logger captures it. A ThreadLocal buffer accumulates whatever the suite writes — both
framework reporter output and user printlns — and is emitted as a single
SuiteCompleted envelope on suite completion. Writes from threads with no active
suite become UnattributedOutput envelopes, newline-flushed to keep them ordered.
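
A minimal sketch of the capture mechanism, assuming a per-thread ByteArrayOutputStream and a queue standing in for UnattributedOutput envelopes. CaptureSketch, withSuite and unattributed are hypothetical names; the real ForkedTestReporter emits envelopes instead of returning strings.

```scala
import java.io.{ByteArrayOutputStream, OutputStream, PrintStream}
import java.util.concurrent.ConcurrentLinkedQueue

object CaptureSketch {
  // Per-thread buffer; null means no suite is active on this thread.
  private val current = new ThreadLocal[ByteArrayOutputStream]

  // Stand-in for UnattributedOutput envelopes: one entry per completed line.
  val unattributed = new ConcurrentLinkedQueue[String]

  private val sink = new OutputStream {
    private val pending = new ByteArrayOutputStream // current unattributed line
    override def write(b: Int): Unit = {
      val buf = current.get()
      if (buf != null) buf.write(b) // attributed to the running suite
      else pending.synchronized {
        if (b == '\n'.toInt) { // newline-flush keeps unattributed lines ordered
          unattributed.add(pending.toString("UTF-8"))
          pending.reset()
        } else pending.write(b)
      }
    }
  }

  /** The stream that would be passed to System.setOut before loggers start. */
  val capturing = new PrintStream(sink, true, "UTF-8")

  /** Runs `body` as one suite and returns its result plus everything it printed. */
  def withSuite[A](body: => A): (A, String) = {
    val buf = new ByteArrayOutputStream
    current.set(buf)
    try {
      val result = body
      capturing.flush()
      (result, buf.toString("UTF-8"))
    } finally current.remove()
  }
}
```

Note that a plain ThreadLocal is not inherited by threads a suite starts itself, so in this sketch their output would land in the unattributed queue.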

The orchestrator renders each SuiteCompleted as a single block in the terminal:

=== SuiteName ===
<captured output including ScalaTest's reporter lines and any prints>

When effectiveForks > 1, every header/line is tagged [fork-N]. With a single fork
there is no tag — the output is indistinguishable from the classic single-JVM view
except for the per-suite grouping.
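
The tagging rule can be sketched like this. The render name and the exact tag placement (a prefix followed by one space) are assumptions; the description only states that every header and line carries a [fork-N] tag when more than one fork runs.

```scala
object RenderSketch {
  /** Turns one completed suite into terminal lines, tagged only for multi-fork runs. */
  def render(suiteName: String, captured: String, forkId: Int, effectiveForks: Int): Seq[String] = {
    val tag = if (effectiveForks > 1) s"[fork-$forkId] " else ""
    val lines = s"=== $suiteName ===" +: captured.linesIterator.toSeq
    lines.map(line => tag + line)
  }
}
```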

Per-suite batching trade-off

A suite's stdout is held until the suite completes. A long-running suite won't stream
live output mid-run. This was an explicit choice (discussed in the design doc). A
size/time-based partial flush can be layered on the same envelope stream later without
changing the protocol.

PASS lines dropped

DederTestEventHandler no longer emits a PASS ✅ line per test; frameworks (ScalaTest,
munit, utest, JUnit, etc.) already print passes through their own reporters, so we were
double-logging. FAIL 🔴 / SKIP 🚫 lines and failure stack traces are kept — deder's
value-add is a uniform failure surface across frameworks.

Run & history persistence

Each invocation creates .deder/out/<moduleId>/test/run-<YYYYMMDD-HHmmss-SSS>-<uuid4>/
containing one subdirectory per fork (fork-0/, fork-1/, …). Each fork directory
holds its fork-args.json, per-fork fork-results-*.json payload, and
stdout.log / stderr.log for offline inspection. Previous runs are preserved.
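
For illustration, the directory layout above could be produced as in the sketch below. RunDirSketch and its functions are hypothetical, and whether the timestamp is local time or UTC is not stated in this PR; the caller decides here.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import java.util.UUID

object RunDirSketch {
  // Matches the <YYYYMMDD-HHmmss-SSS> part of the run directory name.
  private val stamp = DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss-SSS")

  def runDirName(now: LocalDateTime, id: UUID): String =
    s"run-${stamp.format(now)}-$id"

  def forkDir(moduleId: String, runDir: String, forkId: Int): String =
    s".deder/out/$moduleId/test/$runDir/fork-$forkId"
}
```

A caller would typically pass LocalDateTime.now() and UUID.randomUUID().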

At the module level, .deder/out/<moduleId>/test/test-history.json carries cumulative
Map[className, TestClassStats] (duration + last status + last-run epoch). It's loaded
at orchestrator start, merged with the current run's stats at the end, and atomically
rewritten via tmp-file + rename. Corrupt or missing file → empty map; never fatal.
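
The merge and tmp-file + rename steps might look like the following sketch. It is not the code in TestHistory.scala: the field names and the last-write-wins merge are assumptions, and JSON encoding is left out. On most POSIX filesystems ATOMIC_MOVE replaces an existing target; elsewhere that is implementation-specific.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, StandardCopyOption}

object HistorySketch {
  // Field names are hypothetical; the PR only lists duration, last status, last-run epoch.
  final case class TestClassStats(durationMs: Long, lastStatus: String, lastRunEpochMs: Long)

  /** Current run wins for classes it executed; older entries are kept for the rest. */
  def merge(
      old: Map[String, TestClassStats],
      current: Map[String, TestClassStats]
  ): Map[String, TestClassStats] = old ++ current

  /** Writes to a temp file in the target's directory, then renames it into place. */
  def atomicWrite(target: Path, content: String): Unit = {
    val dir = target.toAbsolutePath.getParent
    val tmp = Files.createTempFile(dir, target.getFileName.toString, ".tmp")
    try {
      Files.write(tmp, content.getBytes(StandardCharsets.UTF_8))
      Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE, StandardCopyOption.REPLACE_EXISTING)
    } finally Files.deleteIfExists(tmp) // no-op after a successful move
  }
}
```

Keeping the temp file in the same directory as the target matters, since a rename across filesystems cannot be atomic.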

No database, no schema migrations — a single JSON file per module. If dynamic
work-stealing or cross-module analytics become needs later, the on-disk format can be
migrated without touching call sites.

Cancellation

The orchestrator checks DederGlobals.cancellationTokens before acquiring each permit
(queued forks abort without launching) and calls proc.wrapped.destroyForcibly() on
running forks when cancelled. No cooperative drain — forks die. Results from suites
already flushed are preserved; partial suites are lost.
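
A rough sketch of how the cancellation check and the per-JVM permit can interact. An AtomicBoolean stands in for DederGlobals.cancellationTokens, whose actual shape is not shown in this PR, and runFork is a hypothetical name; killing already running forks via Process.destroyForcibly() is not covered here.

```scala
import java.util.concurrent.Semaphore
import java.util.concurrent.atomic.AtomicBoolean

object ForkThrottleSketch {
  /** Runs one fork under the global cap; returns None if cancelled before launch. */
  def runFork[A](permits: Semaphore, cancelled: AtomicBoolean)(spawnAndWait: () => A): Option[A] =
    if (cancelled.get()) None // queued fork aborts without taking a permit
    else {
      permits.acquire() // one permit per spawned JVM
      try {
        // Re-check: cancellation may have arrived while waiting for the permit.
        if (cancelled.get()) None else Some(spawnAndWait())
      } finally permits.release() // released when the JVM has exited
    }
}
```

Because a permit is held only for the lifetime of a single JVM and never while waiting on another fork, forks queue up behind the cap instead of deadlocking.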

Configuration

Per module (config/DederProject.pkl)

Added to JavaTestModule and ScalaTestModule:

/// Maximum number of forked JVMs to spawn for running this module's tests.
/// Effective fork count is min(maxTestForks, number of discovered test classes).
/// Default 1 (single fork). Increase for throughput on large test suites; each fork
/// pays JVM startup cost and holds its own heap, so scale with available memory.
/// Capped server-wide by the maxConcurrentTestForks server property.
maxTestForks: Int(isBetween(1, 64)) = 1

Java bindings must be regenerated via ./scripts/gen-config-bindings.sh (config/src is
gitignored). ScalaJS / Scala Native test modules do not have this field — their runners
live on separate paths and remain single-process for now.

Server-wide (.deder/server.properties)

# Optional; when unset, defaults to Runtime.availableProcessors().
maxConcurrentTestForks=<N>

Documented in docs/content/reference/server-properties.md.

Defaults summary

┌────────────────────────┬───────────────────────────────┬────────────────────────────────┐
│        Setting         │            Default            │             Effect             │
├────────────────────────┼───────────────────────────────┼────────────────────────────────┤
│ maxTestForks           │ 1                             │ Single fork — pre-PR behavior. │
├────────────────────────┼───────────────────────────────┼────────────────────────────────┤
│ testParallelism        │ 1 (existing)                  │ Unchanged.                     │
├────────────────────────┼───────────────────────────────┼────────────────────────────────┤
│ maxConcurrentTestForks │ Runtime.availableProcessors() │ Caps total forked JVMs.        │
└────────────────────────┴───────────────────────────────┴────────────────────────────────┘

Files

┌─────────────────────────────────────────────────┬───────────────────────────────────────────────────────────────────────────┐
│                      File                       │                                  Change                                   │
├─────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────┤
│ config/DederProject.pkl                         │ Add maxTestForks to JavaTestModule and ScalaTestModule.                   │
├─────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────┤
│ docs/content/reference/server-properties.md     │ Document maxConcurrentTestForks.                                          │
├─────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────┤
│ server/.../DederGlobals.scala                   │ Add testForkSemaphore with runtime-sized default + setter.                │
├─────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────┤
│ server/.../ServerMain.scala                     │ Read optional maxConcurrentTestForks; size the global semaphore.          │
├─────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────┤
│ server/.../CoreTasks.scala                      │ Destructure maxTestForks per module; pass to orchestrator.                │
├─────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────┤
│ server/.../importing/sbt/SbtImporter.scala      │ Fix constructor call for new Pkl field.                                   │
├─────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────┤
│ server/.../testing/ForkedTestProtocol.scala     │ forkId on args; new ForkedTestEnvelope ADT; ForkedTestResultsPayload.     │
├─────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────┤
│ server/.../testing/ForkedTestReporter.scala     │ Envelope emitter + capturing PrintStream with per-suite ThreadLocal.      │
│ new                                             │                                                                           │
├─────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────┤
│ server/.../testing/TestHistory.scala  new       │ Load / merge / atomic-save test-history.json.                             │
├─────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────┤
│ server/.../testing/TestDistribution.scala  new  │ LPT bin-packing over discovered tests, history-weighted.                  │
├─────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────┤
│ server/.../testing/ForkedTestMain.scala         │ Install capturing stream before logger init; emit ForkStarted /           │
│                                                 │ ForkCompleted; write ForkedTestResultsPayload.                            │
├─────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────┤
│                                                 │ Optional ForkRunnerHooks; wrap task.execute with SuiteStarted /           │
│ server/.../testing/DederTestRunner.scala        │ SuiteCompleted; event handler aggregates per-class stats; skip PASS ✅    │
│                                                 │ for successes.                                                            │
├─────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────────────┤
│                                                 │ Full rewrite as multi-fork supervisor: JDWP guard, distribution, per-fork │
│ server/.../testing/ForkedTestOrchestrator.scala │  thread pool + semaphore permit per spawn, envelope-driven rendering,     │
│                                                 │ run-<id>/fork-<N>/ layout, aggregation, atomic history persistence,       │
│                                                 │ tag-free output when a single fork.                                       │
└─────────────────────────────────────────────────┴───────────────────────────────────────────────────────────────────────────┘

Verification

- ./mill server.compile — clean
- ./mill server.test — 179 unit tests pass
- Smoke-tested end-to-end against scala-algorithms:
  - maxTestForks = 1 → no [fork-N] tags, clean per-suite === SuiteName === headers.
  - maxTestForks = 2 → 22 test classes split 11/11 across forks; [fork-0] / [fork-1]
    tags, no output cross-contamination between forks, test-history.json populated.
- Integration tests (./scripts/run-it-tests.sh) not run by the branch author — please
run in CI before merging.

Out of scope / follow-ups

- ScalaJS & Scala Native test forking (separate runners).
- Dynamic work-stealing (current distribution is static LPT).
- H2 or DB-backed coordination — not needed for v1; JSON history suffices.
- Retention / cleanup of old run-* directories — they accumulate indefinitely today.
- Per-fork port allocation / temp-dir sandboxing for tests that bind fixed ports.
- Per-fork JDWP-port rewriting (currently refused outright when maxTestForks > 1).
- Size/time-based mid-suite flush for very long suites.

@sake92 sake92 merged commit 4f30fb3 into main Apr 17, 2026
2 checks passed
@sake92 sake92 deleted the multiple-test-forks branch April 17, 2026 19:26