feat(vmm): v0.4 Phase 5b — MemoryBackend::MemfdShared via restore_many_with#187

Merged: WaylandYang merged 3 commits into `main` from `feat/v0.4-phase5b-memfd-restore` on May 29, 2026

@WaylandYang
Contributor

Builds on the Phase 5a helper (#186) by wiring memfd-backed restore into the existing parallel-restore path. When `ForkOpts::memory_backend == MemoryBackend::MemfdShared`, each restored child gets its own memfd populated from the snapshot's `memory.bin`, and the JSON PUT to `/snapshot/load` sets `mem_backend.shared: true` with the backend path pointing at `/proc/self/fd/<N>`.

The patched Firecracker at `deeplethe/firecracker:forkd-v0.4-mem-backend-shared` honours the new `shared` flag and mmaps with `MAP_SHARED`; forkd-controller can then arm `UFFDIO_WRITEPROTECT` on the same backing in Phase 6 to capture dirty pages asynchronously.
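For reviewers who want to picture the request, here is a minimal sketch of how one child's body could be assembled with the standard library only. The function name, the `backend_type` value, and the exact field set are assumptions for illustration; only `mem_backend.shared` and the `/proc/self/fd/<N>` path come from this PR, and the real code may well use a JSON library instead.

```rust
use std::os::fd::RawFd;

/// Build a JSON body for `PUT /snapshot/load` (hypothetical helper).
///
/// Assumptions: `backend_type` is "File" and no other fields are required.
/// `snapshot_path` must not contain characters that need JSON escaping.
fn snapshot_load_body(snapshot_path: &str, memfd: RawFd, shared: bool) -> String {
    // Each child references its own memfd through a procfs path.
    let backend_path = format!("/proc/self/fd/{memfd}");
    format!(
        r#"{{"snapshot_path":"{}","mem_backend":{{"backend_type":"File","backend_path":"{}","shared":{}}}}}"#,
        snapshot_path, backend_path, shared
    )
}
```

Mapping this function over a list of fds gives one body per child, which is the shape the per-child `Vec<String>` described in this PR needs.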

What changed in `crates/forkd-vmm/src/lib.rs`

| Surface | Before | After |
|---|---|---|
| `MemoryBackend` enum | `File` (default), `Userfault` (todo) | + `MemfdShared` |
| `Vm` struct | proc/pid/sock/console/netns/cgroup | + public `memfd: Option<MemfdRegion>` |
| `restore_many_with` pre-flight | rejects everything except `File` | allows `File` and `MemfdShared`; still bails on `Userfault` |
| Restore-path JSON body | single shared `String` cloned to every thread | per-child `Vec<String>`; each MemfdShared child stamps its own `/proc/self/fd/<N>` + `shared: true` |
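The pre-flight row can be read as a single `match`. The sketch below is illustrative only: the variant names come from this PR, but the function name, the `String` error type, and the message are assumptions rather than the crate's actual code.

```rust
/// Variants as named in this PR; derives are added for the example.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MemoryBackend {
    File,
    Userfault,
    MemfdShared,
}

/// Hypothetical pre-flight check: accept the two wired backends, reject the rest.
fn preflight(backend: MemoryBackend) -> Result<(), String> {
    match backend {
        // Both of these have a restore path after this PR.
        MemoryBackend::File | MemoryBackend::MemfdShared => Ok(()),
        // Still scaffolding, so bail before any child is spawned.
        MemoryBackend::Userfault => {
            Err("Userfault backend is not supported by restore_many_with".to_string())
        }
    }
}
```

Matching exhaustively, rather than with a wildcard arm, means a future variant forces the author to decide which side of the check it belongs on.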

A new "Phase 1.5" loop sits between spawn and restore: when the backend is `MemfdShared`, it calls `memfd::create_and_populate(self.memory, "forkd-source-mem-")` once per child and attaches the resulting region to `child.memfd`. On failure, the error chain includes the child index and the source path.
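A rough sketch of that loop, using stand-in types so it compiles on its own. `Child`, this `MemfdRegion`, and the `create` closure are hypothetical placeholders; the closure stands in for `memfd::create_and_populate`, whose real signature and error type may differ.

```rust
use std::io;
use std::path::Path;

/// Stand-in for the PR's `MemfdRegion`; the real type owns an fd.
#[derive(Debug)]
struct MemfdRegion {
    name: String,
}

/// Stand-in for a spawned child; only the field this loop touches.
#[derive(Debug, Default)]
struct Child {
    memfd: Option<MemfdRegion>,
}

/// Attach one region per child. A failure is tagged with the child index
/// and the source path before it is returned to the caller.
fn attach_memfds<F>(children: &mut [Child], source: &Path, mut create: F) -> io::Result<()>
where
    F: FnMut(&Path, &str) -> io::Result<MemfdRegion>,
{
    for (i, child) in children.iter_mut().enumerate() {
        let region = create(source, "forkd-source-mem-").map_err(|e| {
            io::Error::new(
                e.kind(),
                format!("child {i}: memfd from {}: {e}", source.display()),
            )
        })?;
        child.memfd = Some(region);
    }
    Ok(())
}
```

Regions attached before a failure stay owned by their children, so they are released whenever the caller drops those children.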

Local verification on dev box

```
$ cargo fmt --all -- --check # clean
$ cargo check --all-targets # ok (26.9 s)
$ cargo test -p forkd-vmm --lib
test result: ok. 21 passed; 0 failed
```

End-to-end against the patched FC binary is deferred to Phase 5c (real-VM smoke test on the dev box, using the just-built FC at `~/firecracker-fork/build/cargo_target/release/firecracker`). This PR is the code-side wiring; the next step is a single "happy path" smoke that boots one MemfdShared child and asserts `/proc/<fc_pid>/maps` shows the memory.bin region as `rw-s`.
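The `rw-s` assertion for that smoke test could look something like the following. This is a sketch, not the planned Phase 5c test: the function name is made up, and what the pathname column actually shows for the memfd-backed region is an assumption to confirm on the dev box.

```rust
/// Return true if some line of a `/proc/<pid>/maps` dump is mapped
/// read/write/shared (`rw-s`) and mentions `needle` in its pathname column.
fn has_shared_rw_mapping(maps: &str, needle: &str) -> bool {
    maps.lines().any(|line| {
        // Columns: address perms offset dev inode pathname
        let mut cols = line.split_whitespace();
        // `nth(1)` consumes the address column and yields the perms column.
        let perms = cols.nth(1);
        // Skip offset, dev, inode; the rest is the pathname, which may
        // contain spaces (for example a trailing " (deleted)").
        let path: Vec<&str> = cols.skip(3).collect();
        perms == Some("rw-s") && path.join(" ").contains(needle)
    })
}
```

In the real test the input would come from something like `std::fs::read_to_string(format!("/proc/{fc_pid}/maps"))`, with `needle` set to whatever name the region ends up carrying.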

Side fix in the upstream FC fork

While wiring this up I found that the original `mem_backend.shared` patch in `deeplethe/firecracker` added the field to `MemBackendConfig` but did not update six `MemBackendConfig { ... }` struct literals (1 production site + 5 tests in `api_server/request/snapshot.rs`), so the compiler rightly rejected the partial change with E0063. Fixed in `deeplethe/firecracker@ca090c9` by adding `shared: false` to each literal. Behavior is unchanged for those sites: `false` matches the existing `MAP_PRIVATE` default.
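To illustrate the shape of that fix, here is a cut-down version of the situation. The struct below is a simplified stand-in, not Firecracker's real definition, and `legacy_site` is a hypothetical name for one of the six literals.

```rust
use std::path::PathBuf;

/// Simplified stand-in for `MemBackendConfig`; the real struct has other fields.
#[derive(Debug, PartialEq)]
struct MemBackendConfig {
    backend_path: PathBuf,
    /// Field added by the patch: request a MAP_SHARED mapping when true.
    shared: bool,
}

/// A pre-existing construction site. Once `shared` exists on the struct,
/// a literal that omits it fails to compile with E0063 (missing field).
fn legacy_site() -> MemBackendConfig {
    MemBackendConfig {
        backend_path: PathBuf::from("memory.bin"),
        // Explicit `false` keeps the old MAP_PRIVATE behaviour at this site.
        shared: false,
    }
}
```

Spelling out `shared: false` at each site, instead of deriving `Default` and using `..Default::default()`, keeps the change mechanical and leaves behaviour at the existing call sites untouched.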

The corresponding update to `0001-feat-mem-backend-shared-option-for-MAP-SHARED.patch` in this repo will land in a follow-up housekeeping PR so the patch file accurately reflects the now-buildable fork. Filed mentally; not blocking 5b review.

Refs #101.

…y_with

Wires the memfd helper from #186 into the existing parallel-restore
path. When ForkOpts::memory_backend == MemoryBackend::MemfdShared,
each restored child gets its own memfd (created from the snapshot's
memory.bin via forkd_vmm::memfd::create_and_populate) and the JSON
PUT to /snapshot/load uses /proc/self/fd/<N> with mem_backend.shared:
true. The patched Firecracker at deeplethe/firecracker:forkd-v0.4-mem-
backend-shared honours the new shared flag and mmaps with MAP_SHARED;
forkd-controller can then arm UFFDIO_WRITEPROTECT on the same backing
in Phase 6 to capture dirty pages asynchronously.

Changes:
- MemoryBackend gains a MemfdShared variant. Docstring spells out
  the patched-FC dependency loudly enough that an operator running
  vanilla FC + setting MemfdShared knows they're on the wrong path.
- Vm gains a public memfd: Option<MemfdRegion> field. Held for the
  VM's lifetime; Drop closes the fd and the kernel reclaims pages.
- restore_many_with's pre-flight check now permits both File and
  MemfdShared; Userfault still bails (still v0.3 scaffolding).
- After spawn, a Phase 1.5 loop creates per-child memfds when
  MemfdShared. Failure surfaces with the child index + the source
  path in the error chain — same Drop-on-error guarantee as the
  helper module.
- The single shared JSON body is replaced with a per-child Vec<String>
  so each child's PUT can reference its own /proc/self/fd/<N> path
  without sharing state.

Not in this PR:
- Fresh-boot (non-snapshot) memfd backing in Vm::boot. v0.4's live-
  fork only needs the restore path; that's the only one wired.
- `forkd doctor` check that the runtime FC binary supports the
  shared flag (Phase 8).

Refs #101, deeplethe/firecracker#1.
@WaylandYang WaylandYang merged commit 1473c68 into main May 29, 2026
2 checks passed
@WaylandYang WaylandYang deleted the feat/v0.4-phase5b-memfd-restore branch May 29, 2026 03:58