Skip to content

v0.18.0

Choose a tag to compare

@clemlesne clemlesne released this 03 Mar 16:28
· 44 commits to main since this release

Higher concurrency, lower latency

Default overcommit ratios retuned from a 500-VM sweep on production hardware. At the optimal config (CPU=2×, MEM=8×), exec latency is 57ms p50 / 71ms p95 with 39 VMs/s throughput at just 7% host RAM. Beyond 2× CPU, latency degrades sharply with no throughput gain.

New make bench-optimizer lets you find the optimal ratios for your own hardware.

Reliable at high VM counts

  • VMs no longer crash with SIGABRT ("Resource temporarily unavailable") under burst startup — root causes were a per-UID process limit applied per-VM and low default fd/thread limits
  • Concurrent asset downloads no longer corrupt cached files — a cross-process file lock serializes the full download+decompress pipeline
  • QMP connection race during snapshot restore eliminated via socket activation (fd inheritance instead of filesystem polling)
  • Admission queue wakeups reduced from O(N²) to O(N) under burst load, preventing unnecessary contention

Startup health checks

The scheduler now probes system limits at startup and auto-raises safe ones (fd limit, nproc) or warns about unsafe ones (overcommit mode, max_map_count, threads-max, cgroup pids.max). No more silent failures under load — operators get actionable guidance before the first VM boots.

Observability

  • New admission_ms timing field separates queue wait from infra setup — previously lumped together in setup_ms, making it impossible to distinguish capacity contention from actual overhead
  • README now includes a throughput/latency benchmark table with 5 configs for capacity planning

CI & testing

  • 4 package-install tests skipped under TCG emulation (prevented 940s timeout flakes)
  • CI runners hardened with matching system limit tuning
  • ci-diagnose rewritten as failure-centric Python tool (replaces bash script)
  • Flaky test_peak_ram_per_vm stabilized on free-threaded Python 3.14t

Internal

  • Subprocess spawn boilerplate consolidated into start_managed_process() (6 call sites across QEMU, gvproxy, QSD, qemu-img)
  • Two-level async+flock locking primitive (lock_utils.py) prevents thread-pool starvation deadlock
  • Boot path parallelized (overlay + cgroup run concurrently)
  • Overlay pool pre-creates 100% of max slots (was 50%)