bench(v0.4): Phase 7.5 — live BRANCH pause-window data on a clean source by WaylandYang · Pull Request #210 · deeplethe/forkd

WaylandYang · 2026-05-31T10:37:08Z

Summary

Replaces the "bench in progress" disclaimer in v0.4 docs with measured numbers from a clean Hub-pulled python-numpy source. The previous attempt used coding-agent-fork-prewarm-v1, which had 17 baked-in guest Oopses contaminating the timing — fixed by switching source.

Results (Intel i7-12700, 30 GiB RAM, Linux 6.14, ext4 on HDD, source = python-numpy 1.5 GiB)

mode	pause p50	pause p90	RT p50
live-sync	56 ms	64 ms	13 730 ms
live-async	54 ms	241 ms	69 ms
diff	202 ms	418 ms	13 461 ms
full	13 550 ms	14 268 ms	13 559 ms

Key ratios at p50:

live vs diff: 3.6× faster pause (202 / 56)
live vs full: 242× faster pause (13 550 / 56)
async RT vs sync RT: 198× faster return (13 730 / 69)

Why HDD makes the numbers more impressive, not less

Live's pause is disk-independent — the memory copy runs after resume, not during. So:

On NVMe, Diff would speed up (~50-100 ms p50), Live wouldn't change (it's CPU-bound on vmstate + WP arming). Ratio narrows but never inverts.
On slow storage, Live's gap widens.

This is the structural advantage of moving the memory copy out of the critical section.

Files

bench/live-fork-pause-window/bench-live-fork.py — runnable harness, parameterized
bench/live-fork-pause-window/bench-live-fork.csv — 40-row raw data (one per BRANCH)
bench/live-fork-pause-window/RESULTS-v0.4.md — writeup with methodology + honest caveats

Docs

README.md headline: "BRANCH a live VM in 56 ms (v0.4 live mode)"
README-zh.md headline: same in Chinese
CHANGELOG.md: Unreleased v0.4 section's "bench in progress" placeholder replaced with the table

Honest caveats called out in the writeup

Single host, one source size — numbers will move with RAM size and disk medium
One live-async p90 outlier (iter Per-child network namespace + macvtap setup #8 saw 258 ms pause) — root cause not investigated; suspects = ext4 writeback pressure or FC vmstate hiccup
Requires unprivileged_userfaultfd=1 or running as root (forkd doctor probes both)

Phase 7 complete with this PR

REST (#204) · CLI (#205) · SDKs (#206) · doctor (#207) · docs (#208) · bench (this).

Test plan

Bench ran on dev box: 40 iterations clean, no FC OOM, no guest panic
CSV + summary table both regenerable from bench-live-fork.py
Replicate on a different host (SSD or NVMe) — left as a follow-up
Replicate with a different source size (e.g. 512 MiB or 4 GiB) — follow-up

🤖 Generated with Claude Code

Replaces the "pause_ms TBD" disclaimer in v0.4 docs with measured numbers from a clean Hub-pulled `python-numpy` source (1.5 GiB, sha256-verified). The previous attempt at this measurement used `coding-agent-fork-prewarm-v1`, which had 17 baked-in guest Oopses contaminating the timing — fixed by switching source. Methodology (`bench/live-fork-pause-window/bench-live-fork.py`, based on `scripts/dev/e2e-live-branch.py` Phase 6 E2E harness): - One memfd-backed source sandbox spawned with `live_fork: true` - 10 iterations × 4 modes ({live-sync, live-async, diff, full}), interleaved so cold-cache effects average across modes - Each iteration: POST .../branch, record `pause_ms` and HTTP RT, DELETE the result snapshot to bound disk usage - Async iterations also record `poll_until_ready_ms` Results (Intel i7-12700, 30 GiB RAM, Linux 6.14, ext4 on **HDD**): | mode | pause p50 | pause p90 | RT p50 | |--------------|----------:|----------:|----------:| | live-sync | **56 ms**| 64 ms | 13 730 ms | | live-async | 54 ms | 241 ms | **69 ms** | | diff | 202 ms | 418 ms | 13 461 ms | | full | 13 550 ms | 14 268 ms | 13 559 ms | Key ratios at p50: - live vs diff: **3.6× faster pause** (202 / 56) - live vs full: **242× faster pause** (13550 / 56) - async RT vs sync RT: **198× faster return** (13730 / 69) The "on HDD" point is a feature, not a bug for the writeup: Live's pause is disk-independent (memory copy runs after resume, not during), so the Live / Diff gap *widens* on slow storage rather than shrinking. NVMe would speed up Diff but not Live, making the ratio narrower — but Live is always bounded by CPU work (vmstate dump + UFFD_WP arming), never by disk throughput. Files: - `bench/live-fork-pause-window/bench-live-fork.py` — runnable harness, parameterized on source-tag and iterations - `bench/live-fork-pause-window/bench-live-fork.csv` — 40-row raw data (one per BRANCH iteration) - `bench/live-fork-pause-window/RESULTS-v0.4.md` — writeup with methodology, host config, per-mode interpretation of what pause_ms / RT measure, and honest caveats (single host, one source size, p90 outlier on async iter #8) Docs updated: - `README.md` headline: "BRANCH a live VM in 150 ms" → "in 56 ms (v0.4 live mode)". v0.4 preview block now leads with the measured 3.6× / 200× ratios and links to RESULTS-v0.4.md. - `README-zh.md`: same headline + intro update. - `CHANGELOG.md`: Unreleased's v0.4 section's "Bench in progress" disclaimer replaced with the actual numbers table. Phase 7 (user surface for v0.4 live BRANCH) is complete with this PR: REST (#204), CLI (#205), SDKs (#206), doctor (#207), docs (#208), bench (this). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

WaylandYang merged commit 15b76ba into main May 31, 2026
2 checks passed

WaylandYang deleted the feat/v0.4-phase7.5-bench branch May 31, 2026 10:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bench(v0.4): Phase 7.5 — live BRANCH pause-window data on a clean source#210

bench(v0.4): Phase 7.5 — live BRANCH pause-window data on a clean source#210
WaylandYang merged 1 commit into
mainfrom
feat/v0.4-phase7.5-bench

WaylandYang commented May 31, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

WaylandYang commented May 31, 2026

Summary

Results (Intel i7-12700, 30 GiB RAM, Linux 6.14, ext4 on HDD, source = python-numpy 1.5 GiB)

Why HDD makes the numbers more impressive, not less

Files

Docs

Honest caveats called out in the writeup

Phase 7 complete with this PR

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant