
docs(bench): recontextualise CubeSandbox row + small-N replay numbers#42

Merged
WaylandYang merged 1 commit into main from dev
May 14, 2026

Conversation

@WaylandYang
Contributor

Follow-up to TencentCloud/CubeSandbox#235. The Cube maintainer responded with two clarifications that change how the 20.3 s / 77-of-100 N=100 number should be interpreted (without changing the measurement itself).

What the Cube team said

  1. The reflink-copy race is on a slow code path the original template inadvertently selected. CubeSandbox pre-formats a pool of writable-layer ext4 images at sizes listed in `pool_default_format_size_list` (default `["1Gi"]`). A sandbox whose `writable_layer_size` matches one of those sizes reuses a pool entry — fast path, no `mkfs.ext4` or reflink-copy per sandbox. We passed `--writable-layer-size 2Gi`, which doesn't match, so every sandbox went through the live `mkfs.ext4 + reflink-copy` path. That's where the bad-magic race lives.
  2. Cube's published <60 ms single-instance / P95 90 ms @ N=50 / <200 ms @ N=100 numbers are measured on a 96 vCPU server. Our 20 vCPU host (the dev box) is outside their tested matrix.
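The size-matching rule in point 1 can be sketched as below. `pool_default_format_size_list` and `writable_layer_size` are CubeSandbox's config names, but the function itself is purely illustrative — it is not CubeSandbox's actual code, just the decision the maintainer described:

```shell
#!/usr/bin/env bash
# Illustrative sketch: a requested writable-layer size that exactly matches
# a pool entry takes the fast path (pre-formatted image reuse); anything
# else falls through to the live mkfs.ext4 + reflink-copy slow path.
pool_default_format_size_list="1Gi"   # CubeSandbox default

select_writable_layer_path() {
  local requested="$1"
  for size in $pool_default_format_size_list; do
    if [ "$requested" = "$size" ]; then
      echo "fast-path: reuse pre-formatted ${size} pool image"
      return 0
    fi
  done
  echo "slow-path: live mkfs.ext4 + reflink-copy for ${requested}"
  return 1
}

select_writable_layer_path 1Gi   # matches pool → fast path
select_writable_layer_path 2Gi   # our benchmark config → slow path
```

This is why `--writable-layer-size 2Gi` put every sandbox on the racy path: 2Gi is not in the default pool list, so no pre-formatted image exists to reuse.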

Cube also accepted the first two improvements from our issue (configurable `cmdTimeout`, richer diagnostic info on `newExt4RawByReflinkCopy` failures) and is reviewing the third (drop per-clone `e2fsck`).

Doc changes

  • README.md — both footnotes ¹ rewritten to lead with "slow-path measurement on a host outside CubeSandbox's documented testing matrix". The 20.3 s figure itself stays in the table. The footnote now explicitly says we did not re-test the fast-path configuration.
  • bench/CUBESANDBOX.md — two new sections:
    • Upstream response (2026-05-14) with the clarifications from Cube and the status of our improvement proposals.
    • Small-N replay on the same (slow-path) configuration with N=1/5/10 numbers — done so the row's narrative isn't just "we hit a race once."
  • bench/cube-replay.sh — the script that produced the small-N numbers.
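For orientation, the replay loop has roughly this shape. This is a minimal sketch, not the actual bench/cube-replay.sh from this PR; `spawn_sandbox` is a placeholder for the real CubeSandbox invocation (with `--writable-layer-size 2Gi`), which is not reproduced here:

```shell
#!/usr/bin/env bash
# Sketch of a small-N replay: spawn N sandboxes concurrently, count
# successes, and report wall-clock and per-sandbox cost in milliseconds.
spawn_sandbox() { sleep 0.01; }   # placeholder for the real spawn command

replay() {
  local n="$1" ok=0 pids=""
  local start_ms=$(($(date +%s%N) / 1000000))
  for _ in $(seq "$n"); do
    spawn_sandbox &
    pids="$pids $!"
  done
  for pid in $pids; do
    wait "$pid" && ok=$((ok + 1))   # a failed spawn would not increment ok
  done
  local end_ms=$(($(date +%s%N) / 1000000))
  local wall=$((end_ms - start_ms))
  echo "N=$n succeeded=$ok/$n wall=${wall}ms per-sandbox=$((wall / n))ms"
}

replay 1
replay 5
replay 10
```

Concurrent spawning matters for the per-sandbox column: the fixed setup cost amortises across in-flight sandboxes, which is why per-sandbox cost shrinks as N grows.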

Small-N replay results

Same 2 GiB template (slow path), this dev box (20 vCPU / 30 GiB):

| N  | Succeeded | Wall-clock | Per-sandbox |
|----|-----------|------------|-------------|
| 1  | 1/1       | 924 ms     | 924 ms      |
| 5  | 5/5       | 2,207 ms   | 441 ms      |
| 10 | 10/10     | 2,567 ms   | 257 ms      |

100% success at every size measured — the race is specifically a slow-path-at-high-N phenomenon. The 20.3 s / 100 = 203 ms-per-sandbox figure at N=100 is consistent with this trend of per-sandbox cost shrinking as concurrency grows.

Test plan

  • README footnotes render OK on GitHub
  • bench/CUBESANDBOX.md anchor link from README footnote works
  • bench/cube-replay.sh is executable + matches the numbers reported

🤖 Generated with Claude Code

… small-N replay

After filing TencentCloud/CubeSandbox#235, the Cube maintainer
confirmed two things that recontextualise the 20.3 s / 77-of-100
N=100 number we'd published:

1. The reflink-copy race lives on a slow code path the original
   template inadvertently selected — `--writable-layer-size 2Gi`
   doesn't match the default `pool_default_format_size_list =
   ["1Gi"]`, so every sandbox went through live `mkfs.ext4 +
   reflink-copy` instead of the pool fast path. The fast path
   (writable_layer_size matches pool) doesn't run the racy code.
2. Cube's published "<60 ms single-instance / P95 90 ms @ N=50 /
   <200 ms @ N=100" numbers are measured on a 96 vCPU server. Our
   20 vCPU host is outside their tested matrix.

Neither of those changes the fact that the 20.3 s figure is what we
measured. They do change how the row should be interpreted, so:

- README.md: both footnotes ¹ rewritten to lead with "slow-path
  measurement on a host outside CubeSandbox's documented testing
  matrix". The 20.3 s number stays in the table. The footnote now
  explicitly says we did not re-test the fast-path configuration.
- bench/CUBESANDBOX.md: two new sections.
    "Upstream response (2026-05-14)" — the two clarifications from
    Tencent verbatim, plus a note that they accepted the first two
    fixes from our issue (configurable cmdTimeout, richer error
    diagnostics) and are reviewing the third.
    "Small-N replay on the same (slow-path) configuration" — we
    re-ran with the same 2 GiB template at N=1, N=5, N=10 to fit
    the 30 GiB host RAM budget. 100% success at every size; cold
    start ~924 ms, per-sandbox cost shrinks to 257 ms at N=10
    (consistent with the 20.3 s / 100 = 203 ms-per-sandbox at
    N=100). Confirms the race is a slow-path-at-high-N phenomenon,
    not a general spawn-time issue.
- bench/cube-replay.sh: the script that produced the small-N
  numbers, parking it next to the rest of bench/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@WaylandYang WaylandYang merged commit 66ff403 into main May 14, 2026
1 check passed
