v0.18.0
Higher concurrency, lower latency
Default overcommit ratios retuned from a 500-VM sweep on production hardware. At the optimal config (CPU=2×, MEM=8×), exec latency is 57ms p50 / 71ms p95 with 39 VMs/s throughput at just 7% host RAM. Beyond 2× CPU, latency degrades sharply with no throughput gain.
The new make bench-optimizer target lets you find the optimal ratios for your own hardware.
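To make the ratios concrete, here is a rough sketch of how a CPU and memory overcommit ratio could translate into an admission cap. The function name, its parameters, and the example host (16 cores, 64 GiB, 1 vCPU / 256 MiB per VM) are illustrative assumptions, not the scheduler's actual accounting.

```python
def max_admitted_vms(host_cpus, host_mem_mib, vcpus_per_vm, mem_per_vm_mib,
                     cpu_ratio=2.0, mem_ratio=8.0):
    """Hypothetical: number of VMs that fit before either overcommitted budget runs out."""
    cpu_budget = host_cpus * cpu_ratio      # vCPUs the host is willing to promise
    mem_budget = host_mem_mib * mem_ratio   # guest memory the host is willing to promise
    by_cpu = int(cpu_budget // vcpus_per_vm)
    by_mem = int(mem_budget // mem_per_vm_mib)
    return min(by_cpu, by_mem)


# 16 cores * 2 = 32 vCPUs; 65536 MiB * 8 / 256 MiB = 2048 VMs by memory.
print(max_admitted_vms(16, 65536, 1, 256))  # 32, so this host is CPU-bound
```

Under these assumptions the CPU ratio is the binding constraint, which fits the note above that pushing CPU past 2× costs latency without adding throughput.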
Reliable at high VM counts
- VMs no longer crash with SIGABRT("Resource temporarily unavailable") under burst startup. Root causes were a per-UID process limit applied per-VM and low default fd/thread limits
- Concurrent asset downloads no longer corrupt cached files: a cross-process file lock serializes the full download+decompress pipeline
- QMP connection race during snapshot restore eliminated via socket activation (fd inheritance instead of filesystem polling)
- Admission queue wakeups reduced from O(N²) to O(N) under burst load, preventing unnecessary contention
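The cross-process lock around download+decompress can be sketched as follows. This is a minimal illustration using flock on a sidecar lock file, assuming a Unix host; ensure_asset, file_lock, and the fetch_compressed callback are hypothetical names, and the project's real pipeline may differ.

```python
import fcntl
import gzip
import os
import shutil
from contextlib import contextmanager


@contextmanager
def file_lock(lock_path):
    """Hold an exclusive flock on lock_path; blocks while another process holds it."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock


def ensure_asset(cache_path, fetch_compressed):
    """Hypothetical helper: fetch_compressed(dest) must write a gzip file to dest."""
    with file_lock(cache_path + ".lock"):
        if os.path.exists(cache_path):
            return cache_path  # another process finished while we waited
        tmp_gz = cache_path + ".gz.tmp"
        tmp_out = cache_path + ".tmp"
        fetch_compressed(tmp_gz)
        with gzip.open(tmp_gz, "rb") as src, open(tmp_out, "wb") as dst:
            shutil.copyfileobj(src, dst)
        os.remove(tmp_gz)
        os.replace(tmp_out, cache_path)  # atomic publish of the finished file
        return cache_path
```

Holding the lock across both steps means a second process never sees a half-downloaded or half-decompressed file; it either waits or finds the finished asset.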
Startup health checks
The scheduler now probes system limits at startup and auto-raises safe ones (fd limit, nproc) or warns about unsafe ones (overcommit mode, max_map_count, threads-max, cgroup pids.max). No more silent failures under load — operators get actionable guidance before the first VM boots.
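A probe of this kind might look roughly like the sketch below, assuming a Linux host. The function names and the min_map_count threshold are placeholders chosen for illustration; the scheduler's actual checks and thresholds are not shown in these notes.

```python
import resource


def raise_soft_limit(which):
    """Raise a soft rlimit to its hard ceiling. Returns True on success."""
    soft, hard = resource.getrlimit(which)
    if soft == hard:
        return True
    try:
        resource.setrlimit(which, (hard, hard))  # no privileges needed up to hard
        return True
    except (ValueError, OSError):
        return False


def read_sysctl(path):
    """Return the integer stored at a /proc/sys path, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def startup_check(min_map_count=262144):  # placeholder threshold, an assumption
    """Auto-raise the safe limits and return warnings for the ones left alone."""
    warnings = []
    for name, which in (("fd", resource.RLIMIT_NOFILE),
                        ("nproc", resource.RLIMIT_NPROC)):
        if not raise_soft_limit(which):
            warnings.append(f"could not raise {name} soft limit to its hard limit")
    value = read_sysctl("/proc/sys/vm/max_map_count")
    if value is not None and value < min_map_count:
        warnings.append(
            f"vm.max_map_count={value} is below {min_map_count}; raise it with sysctl"
        )
    return warnings


for message in startup_check():
    print("WARNING:", message)
```

The split mirrors the text: per-process soft limits can be raised without privileges, while kernel-wide settings are only reported so the operator decides.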
Observability
- New admission_ms timing field separates queue wait from infra setup. Previously both were lumped together in setup_ms, making it impossible to distinguish capacity contention from actual overhead
- README now includes a throughput/latency benchmark table with 5 configs for capacity planning
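The split between the two timings can be illustrated with a small sketch. It assumes admission is modeled as an asyncio.Semaphore and that setup is an arbitrary coroutine; start_vm and both arguments are hypothetical, and only the field names come from the notes above.

```python
import asyncio
import time


async def start_vm(slots, setup):
    """Hypothetical: slots is an asyncio.Semaphore capping concurrent VMs."""
    t0 = time.perf_counter()
    async with slots:               # time spent here is queue wait
        t1 = time.perf_counter()
        await setup()               # time spent here is infra setup
        t2 = time.perf_counter()
    return {
        "admission_ms": (t1 - t0) * 1000,
        "setup_ms": (t2 - t1) * 1000,
    }


async def demo():
    slots = asyncio.Semaphore(1)    # one slot, so the second VM must queue

    async def setup():
        await asyncio.sleep(0.05)   # stand-in for real setup work

    return await asyncio.gather(start_vm(slots, setup), start_vm(slots, setup))


for timing in asyncio.run(demo()):
    print(timing)
```

In this demo the second VM reports a large admission_ms but roughly the same setup_ms as the first, which is the signal for capacity contention rather than slow setup.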
CI & testing
- 4 package-install tests skipped under TCG emulation (prevented 940s timeout flakes)
- CI runners hardened with matching system limit tuning
- ci-diagnose rewritten as failure-centric Python tool (replaces bash script)
- Flaky test_peak_ram_per_vm stabilized on free-threaded Python 3.14t
Internal
- Subprocess spawn boilerplate consolidated into start_managed_process() (6 call sites across QEMU, gvproxy, QSD, qemu-img)
- Two-level async+flock locking primitive (lock_utils.py) prevents thread-pool starvation deadlock
- Boot path parallelized (overlay + cgroup run concurrently)
- Overlay pool pre-creates 100% of max slots (was 50%)
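One plausible reading of the two-level async+flock primitive is sketched below; it is not the contents of lock_utils.py. The idea assumed here: if every waiting coroutine blocks on flock in a worker thread, waiters can fill the thread pool and starve the lock holder of a thread. Taking an in-process asyncio.Lock first means at most one thread per lock path ever blocks on flock. The name two_level_lock and the module-level registry are hypothetical; the sketch needs Python 3.9+ and a Unix host.

```python
import asyncio
import fcntl
import os
from contextlib import asynccontextmanager

_local_locks = {}  # path -> asyncio.Lock (assumes a single event loop per process)


@asynccontextmanager
async def two_level_lock(path):
    # Level 1: coroutines in this process queue on the event loop, not in threads.
    local = _local_locks.setdefault(path, asyncio.Lock())
    async with local:
        fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
        try:
            # Level 2: only the level-1 winner blocks a worker thread on flock,
            # waiting for other processes.
            await asyncio.to_thread(fcntl.flock, fd, fcntl.LOCK_EX)
            yield
        finally:
            os.close(fd)  # closing the descriptor releases the flock


async def demo(path, workers=20):
    state = {"inside": 0, "max_inside": 0, "done": 0}

    async def worker():
        async with two_level_lock(path):
            state["inside"] += 1
            state["max_inside"] = max(state["max_inside"], state["inside"])
            await asyncio.sleep(0)  # yield so overlapping holders would show up
            state["inside"] -= 1
            state["done"] += 1

    await asyncio.gather(*(worker() for _ in range(workers)))
    return state
```

Running demo with more workers than pool threads should still complete, since waiters are parked on the asyncio lock rather than inside the thread pool.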