feat(sdk): Phase 7.3 — mode/wait/live_fork across Python, TS, MCP#206
Merged
Wires the Phase 7.1 canonical `mode` selector plus the Phase 6.4
`wait=false` path and the Phase 6.5 `live_fork` spawn flag through all
three published SDKs, so users can opt into v0.4 live BRANCH without
falling back to the raw HTTP surface.
Surface added (consistent across the three SDKs):
- `branch_sandbox(...)` / `branchSandbox(...)`:
* `mode: "full" | "diff" | "live"` (None default) — canonical
selector, takes precedence over legacy `diff` bool. Both set →
daemon returns 400, so SDK serializes only `mode` to keep
callers mid-migration safe.
* `wait: bool = True` (only meaningful with `mode="live"`). The body
field is omitted when True (daemon default) and serialized only as
`wait: false`, so calls still work against pre-v0.4 daemons that
don't know the field.
- `spawn_sandboxes(...)` / `spawnSandboxes(...)`:
* `live_fork: bool = False` (`liveFork` camelCase in TS) — boots
the sandbox with memfd-backed RAM so a later live BRANCH from
it can register UFFD_WP. Body field only added when true.
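The serialization rules above can be sketched in Python as follows. This is a minimal illustration, not the SDK's actual code: the helper names `build_branch_body` and `build_spawn_body` and the `count` field are assumptions made for the example, and the sketch assumes the legacy `diff` field is only emitted when true.

```python
from typing import Any, Literal, Optional

BranchMode = Literal["full", "diff", "live"]


def build_branch_body(
    mode: Optional[BranchMode] = None,
    diff: bool = False,
    wait: bool = True,
) -> dict[str, Any]:
    """Hypothetical helper: build the JSON body for a BRANCH request."""
    body: dict[str, Any] = {}
    if mode is not None:
        # Canonical selector wins. The legacy flag is dropped so the daemon
        # never sees both fields (which it rejects with 400).
        body["mode"] = mode
    elif diff:
        # Legacy path, kept so v0.3.x daemons still understand the request.
        body["diff"] = True
    if not wait:
        # Only the non-default value goes on the wire.
        body["wait"] = False
    return body


def build_spawn_body(count: int, live_fork: bool = False) -> dict[str, Any]:
    """Hypothetical helper: `count` is an assumed field, for illustration."""
    body: dict[str, Any] = {"count": count}
    if live_fork:
        # Field is added only when the caller opts in.
        body["live_fork"] = True
    return body
```

With defaults, both helpers produce the same body a pre-v0.4 SDK would have sent, which is the property the opt-in serialization is meant to preserve.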
TypeScript:
- New `BranchMode` type alias re-exported from package root.
- `BranchOptions.mode` / `BranchOptions.wait` added.
- `SpawnOptions.liveFork` added (serialized as `live_fork`).
- `SnapshotInfo.status` ("writing" | "ready" | "failed") for the
`wait=false` lifecycle marker.
- 5 new vitest cases: mode=live serialization, wait=false serialized,
wait=true omitted, mode-wins-over-legacy-diff, liveFork→live_fork.
Python:
- New `BranchMode` Literal re-exported from `forkd` package root.
- `Controller.branch_sandbox` gains `mode`, `wait` kwargs.
- `Controller.spawn_sandboxes` gains `live_fork` kwarg.
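With `wait=False` the call returns while the snapshot may still be in the `"writing"` state, so a caller would typically poll until it becomes `"ready"` or `"failed"`. A minimal sketch of such a loop, assuming the caller supplies a `get_status` callable that returns one of those three status strings (the function name and parameters here are hypothetical, not part of the SDK):

```python
import time
from typing import Callable


def poll_until_ready(
    get_status: Callable[[], str],
    timeout_s: float = 60.0,
    interval_s: float = 0.1,
) -> float:
    """Poll get_status() until it returns "ready"; return elapsed milliseconds.

    Raises RuntimeError on "failed" and TimeoutError once timeout_s elapses.
    """
    start = time.monotonic()
    while True:
        status = get_status()
        if status == "ready":
            return (time.monotonic() - start) * 1000.0
        if status == "failed":
            raise RuntimeError("snapshot write failed")
        if time.monotonic() - start >= timeout_s:
            raise TimeoutError(f"snapshot still {status!r} after {timeout_s}s")
        time.sleep(interval_s)
```

Passing a callable keeps the loop independent of how the status is actually fetched, so the same helper works against a real client or a stub in tests.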
MCP server:
- `branch_sandbox` MCP tool gains `mode`, `wait` args.
- `spawn_sandboxes` MCP tool gains `live_fork` arg.
- Docstrings explain when each kwarg takes effect (so the model picks
the right one without re-reading the API doc).
Backwards compat verified:
- All existing kwargs unchanged (positional order, defaults).
- Existing callers passing `diff=True` keep working — body still
serializes `diff: true` so this SDK drives v0.3.x daemons too.
- `wait` and `live_fork` only appear in the JSON body when the caller
opts in, so the wire is unchanged against pre-v0.4 daemons.
Gates:
- TypeScript: `npm run build` clean; `npm run test` 14/14 pass
(was 9; +5 new mode/wait/liveFork tests, existing 9 unchanged).
- Python: smoke import + signature inspection — `BranchMode` exports,
`branch_sandbox` defaults unchanged for diff/measure_diff/tag.
- MCP: `server.py` parses clean.
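The Python gate (signature inspection) could look roughly like the sketch below. The `branch_sandbox` function here is a stand-in with an assumed signature and assumed default values; the real `Controller.branch_sandbox` may differ, and the expected defaults would need to be taken from the previous release.

```python
import inspect


def branch_sandbox(sandbox_id, *, diff=False, measure_diff=False, tag=None,
                   mode=None, wait=True):
    """Stand-in for Controller.branch_sandbox; signature is illustrative only."""


def defaults_of(func):
    """Map each parameter that has a default to that default value."""
    return {
        name: param.default
        for name, param in inspect.signature(func).parameters.items()
        if param.default is not inspect.Parameter.empty
    }


# Assumed pre-change defaults, for illustration only.
EXPECTED_LEGACY_DEFAULTS = {"diff": False, "measure_diff": False, "tag": None}


def legacy_defaults_unchanged(func, expected):
    """True if every legacy kwarg still exists with its expected default."""
    actual = defaults_of(func)
    return all(
        name in actual and actual[name] == value
        for name, value in expected.items()
    )
```

A check like `legacy_defaults_unchanged(branch_sandbox, EXPECTED_LEGACY_DEFAULTS)` only covers defaults; positional order would need a separate comparison of the parameter list.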
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WaylandYang added a commit that referenced this pull request on May 31, 2026:
…rce (#210)

Replaces the "pause_ms TBD" disclaimer in v0.4 docs with measured numbers from a clean Hub-pulled `python-numpy` source (1.5 GiB, sha256-verified). The previous attempt at this measurement used `coding-agent-fork-prewarm-v1`, which had 17 baked-in guest Oopses contaminating the timing; fixed by switching source.

Methodology (`bench/live-fork-pause-window/bench-live-fork.py`, based on `scripts/dev/e2e-live-branch.py` Phase 6 E2E harness):
- One memfd-backed source sandbox spawned with `live_fork: true`
- 10 iterations × 4 modes ({live-sync, live-async, diff, full}), interleaved so cold-cache effects average across modes
- Each iteration: POST .../branch, record `pause_ms` and HTTP RT, DELETE the result snapshot to bound disk usage
- Async iterations also record `poll_until_ready_ms`

Results (Intel i7-12700, 30 GiB RAM, Linux 6.14, ext4 on **HDD**):

| mode       | pause p50 | pause p90 |    RT p50 |
|------------|----------:|----------:|----------:|
| live-sync  | **56 ms** |     64 ms | 13 730 ms |
| live-async |     54 ms |    241 ms | **69 ms** |
| diff       |    202 ms |    418 ms | 13 461 ms |
| full       | 13 550 ms | 14 268 ms | 13 559 ms |

Key ratios at p50:
- live vs diff: **3.6× faster pause** (202 / 56)
- live vs full: **242× faster pause** (13550 / 56)
- async RT vs sync RT: **198× faster return** (13730 / 69)

The "on HDD" point is a feature, not a bug for the writeup: Live's pause is disk-independent (memory copy runs after resume, not during), so the Live / Diff gap *widens* on slow storage rather than shrinking. NVMe would speed up Diff but not Live, making the ratio narrower, but Live is always bounded by CPU work (vmstate dump + UFFD_WP arming), never by disk throughput.
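The key ratios can be re-derived from the p50 figures reported in the results. A small sketch of that arithmetic, using only the numbers quoted in this commit message (the variable names are chosen for the example):

```python
# p50 values in milliseconds, copied from the results reported above.
P50_PAUSE_MS = {"live-sync": 56, "live-async": 54, "diff": 202, "full": 13550}
P50_RT_MS = {"live-sync": 13730, "live-async": 69, "diff": 13461, "full": 13559}

# Pause-window ratios relative to live-sync.
live_vs_diff = P50_PAUSE_MS["diff"] / P50_PAUSE_MS["live-sync"]
live_vs_full = P50_PAUSE_MS["full"] / P50_PAUSE_MS["live-sync"]

# Round-trip ratio: synchronous live BRANCH vs. wait=false.
async_vs_sync_rt = P50_RT_MS["live-sync"] / P50_RT_MS["live-async"]

print(f"live vs diff pause: {live_vs_diff:.2f}x")
print(f"live vs full pause: {live_vs_full:.2f}x")
print(f"async vs sync RT:   {async_vs_sync_rt:.2f}x")
```

The round-trip ratio works out to just under 199, so the quoted 198× is the truncated value rather than the rounded one.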
Files:
- `bench/live-fork-pause-window/bench-live-fork.py`: runnable harness, parameterized on source-tag and iterations
- `bench/live-fork-pause-window/bench-live-fork.csv`: 40-row raw data (one per BRANCH iteration)
- `bench/live-fork-pause-window/RESULTS-v0.4.md`: writeup with methodology, host config, per-mode interpretation of what pause_ms / RT measure, and honest caveats (single host, one source size, p90 outlier on async iter #8)

Docs updated:
- `README.md` headline: "BRANCH a live VM in 150 ms" → "in 56 ms (v0.4 live mode)". v0.4 preview block now leads with the measured 3.6× / 200× ratios and links to RESULTS-v0.4.md.
- `README-zh.md`: same headline + intro update.
- `CHANGELOG.md`: Unreleased's v0.4 section's "Bench in progress" disclaimer replaced with the actual numbers table.

Phase 7 (user surface for v0.4 live BRANCH) is complete with this PR: REST (#204), CLI (#205), SDKs (#206), doctor (#207), docs (#208), bench (this).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Wires the Phase 7.1 canonical `mode` selector plus the Phase 6.4 `wait=false` path and the Phase 6.5 `live_fork` spawn flag through all three published SDKs.
`branch_sandbox` / `branchSandbox` gains:
- `mode: "full" | "diff" | "live"`: canonical selector. Wins over the legacy `diff` bool when both are set (the daemon would return 400, so the SDK serializes only `mode`).
- `wait: bool = True`: only meaningful with `mode="live"`. Omitted from the body when True so calls work against pre-v0.4 daemons.
`spawn_sandboxes` / `spawnSandboxes` gains:
- `live_fork: bool = False` (`liveFork` camelCase in TS): boots the sandbox with memfd-backed RAM. Body field only added when true.
TypeScript:
- `BranchMode` type re-exported from package root.
- `SnapshotInfo.status` (`"writing" | "ready" | "failed"`) for the `wait=false` lifecycle marker.
Python:
- `BranchMode` `Literal` re-exported from `forkd` package root.
MCP:
- `branch_sandbox` and `spawn_sandboxes` MCP tools gain the new args, with docstrings explaining when each applies, so the model picks the right one without re-reading docs/API.md.
Compat
- `diff=True` callers keep working: the SDK still serializes `diff: true`, so this build drives v0.3.x daemons.
- `wait` and `live_fork` only appear in the body when the caller opts in, so the wire is unchanged against pre-v0.4 daemons.
Test plan
- `npm run build` (TS): clean
- `npm run test` (TS): 14 / 14 pass (was 9; +5 new)
- `server.py` parses clean
- `--live-fork` source: deferred to Phase 7.5 bench harness
🤖 Generated with Claude Code