feat(vmm): v0.4 Phase 5b - MemoryBackend::MemfdShared via restore_many_with #187
Merged
Wires the memfd helper from #186 into the existing parallel-restore path. When ForkOpts::memory_backend == MemoryBackend::MemfdShared, each restored child gets its own memfd (created from the snapshot's memory.bin via forkd_vmm::memfd::create_and_populate) and the JSON PUT to /snapshot/load uses /proc/self/fd/<N> with mem_backend.shared: true.

The patched Firecracker at deeplethe/firecracker:forkd-v0.4-mem-backend-shared honours the new shared flag and mmaps with MAP_SHARED; forkd-controller can then arm UFFDIO_WRITEPROTECT on the same backing in Phase 6 to capture dirty pages asynchronously.

Changes:
- MemoryBackend gains a MemfdShared variant. Docstring spells out the patched-FC dependency loudly enough that an operator running vanilla FC + setting MemfdShared knows they're on the wrong path.
- Vm gains a public memfd: Option<MemfdRegion> field. Held for the VM's lifetime; Drop closes the fd and the kernel reclaims pages.
- restore_many_with's pre-flight check now permits both File and MemfdShared; Userfault still bails (still v0.3 scaffolding).
- After spawn, a Phase 1.5 loop creates per-child memfds when MemfdShared. Failure surfaces with the child index + the source path in the error chain, with the same Drop-on-error guarantee as the helper module.
- The single shared JSON body is replaced with a per-child Vec<String> so each child's PUT can reference its own /proc/self/fd/<N> path without sharing state.

Not in this PR:
- Fresh-boot (non-snapshot) memfd backing in Vm::boot. v0.4's live-fork only needs the restore path; that's the only one wired.
- `forkd doctor` check that the runtime FC binary supports the shared flag (Phase 8).

Refs #101, deeplethe/firecracker#1.
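The pre-flight rule from the Changes list can be sketched as below. This is a minimal illustration, not the code in this PR: the variant names come from the commit message, but the `preflight` function name, its `Result<(), String>` return type, and the error text are assumptions made for the example.

```rust
/// Memory backends named in the commit message. The real enum may carry
/// more documentation or data; this is a simplified stand-in.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemoryBackend {
    File,
    Userfault,
    MemfdShared,
}

/// Hypothetical pre-flight check: `File` and `MemfdShared` may proceed to
/// the parallel restore, `Userfault` is rejected before any child is spawned.
pub fn preflight(backend: MemoryBackend) -> Result<(), String> {
    match backend {
        MemoryBackend::File | MemoryBackend::MemfdShared => Ok(()),
        MemoryBackend::Userfault => {
            Err("Userfault backend is not supported by the restore path".to_string())
        }
    }
}
```

Matching exhaustively on the enum means that adding a further backend later forces this check to be revisited at compile time.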
Builds on the Phase 5a helper (#186) by wiring memfd-backed restore into the existing parallel-restore path. When `ForkOpts::memory_backend == MemoryBackend::MemfdShared`, each restored child gets its own memfd populated from the snapshot's `memory.bin`, and the JSON PUT to `/snapshot/load` carries `mem_backend.shared: true` with the backing path set to `/proc/self/fd/<N>`.
The patched Firecracker at `deeplethe/firecracker:forkd-v0.4-mem-backend-shared` honours the new `shared` flag and mmaps with `MAP_SHARED`; forkd-controller can then arm `UFFDIO_WRITEPROTECT` on the same backing in Phase 6 to capture dirty pages asynchronously.
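As a rough sketch of the per-child request bodies described above, the following builds one JSON string per memfd. The function names are hypothetical, `snapshot_path` and `mem_backend.backend_path`/`backend_type` follow Firecracker's snapshot-load request, and `shared` is the field added by the patched fork. Whether the real code sends `backend_type: "File"` or sets other fields is an assumption here.

```rust
use std::os::unix::io::RawFd;

/// `/proc/self/fd/<N>` path for a file descriptor number.
pub fn proc_fd_path(fd: RawFd) -> String {
    format!("/proc/self/fd/{fd}")
}

/// Hypothetical builder for one child's `/snapshot/load` body.
/// Assumes `snapshot_path` contains no characters that need JSON escaping.
pub fn load_body(snapshot_path: &str, memfd: RawFd) -> String {
    format!(
        r#"{{"snapshot_path":"{snapshot_path}","mem_backend":{{"backend_type":"File","backend_path":"{}","shared":true}}}}"#,
        proc_fd_path(memfd)
    )
}

/// One body per child, so each PUT references its own fd path and no
/// state is shared between children.
pub fn per_child_load_bodies(snapshot_path: &str, memfds: &[RawFd]) -> Vec<String> {
    memfds
        .iter()
        .map(|&fd| load_body(snapshot_path, fd))
        .collect()
}
```

A production version would more likely serialize a typed struct than format a string by hand; the string form is used here only to keep the example dependency-free.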
What changed in `crates/forkd-vmm/src/lib.rs`
A new "Phase 1.5" loop sits between spawn and restore: when the backend is `MemfdShared`, it calls `memfd::create_and_populate(self.memory, "forkd-source-mem-")` once per child and attaches the resulting region to `child.memfd`. On failure, the error chain includes the child index and the source path.
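The shape of that loop, including the error context and the drop-on-error behaviour, might look like the sketch below. It is not the code from `lib.rs`: the real `create_and_populate` signature is not shown in this PR description, so the creator is passed in as a closure, and `MemfdSetupError` and `attach_memfds` are names invented for the example.

```rust
use std::fmt;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical error type carrying the child index and the snapshot
/// memory path alongside the underlying I/O error.
#[derive(Debug)]
pub struct MemfdSetupError {
    pub child: usize,
    pub source_path: PathBuf,
    pub cause: io::Error,
}

impl fmt::Display for MemfdSetupError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "child {}: creating memfd from {} failed: {}",
            self.child,
            self.source_path.display(),
            self.cause
        )
    }
}

impl std::error::Error for MemfdSetupError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        Some(&self.cause)
    }
}

/// Creates one region per child. `create` stands in for the real helper.
/// If any call fails, `regions` is dropped on the early return, so every
/// region created so far is released through its own `Drop`.
pub fn attach_memfds<R>(
    children: usize,
    memory: &Path,
    mut create: impl FnMut(&Path, &str) -> io::Result<R>,
) -> Result<Vec<R>, MemfdSetupError> {
    let mut regions = Vec::with_capacity(children);
    for child in 0..children {
        let region = create(memory, "forkd-source-mem-").map_err(|cause| MemfdSetupError {
            child,
            source_path: memory.to_path_buf(),
            cause,
        })?;
        regions.push(region);
    }
    Ok(regions)
}
```

Taking the creator as a closure also makes the failure path testable without a kernel that supports memfds.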
Local verification on dev box
```
$ cargo fmt --all -- --check # clean
$ cargo check --all-targets # ok (26.9 s)
$ cargo test -p forkd-vmm --lib
test result: ok. 21 passed; 0 failed
```
End-to-end against the patched FC binary is deferred to Phase 5c (real-VM smoke test on the dev box, using the just-built FC at `~/firecracker-fork/build/cargo_target/release/firecracker`). This PR is the code-side wiring; the next step is a single "happy path" smoke that boots one MemfdShared child and asserts `/proc/<fc_pid>/maps` shows the memory.bin region as `rw-s`.
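The `rw-s` assertion planned for that smoke test could be built on a small parser for `/proc/<pid>/maps` text, sketched here. Both function names are hypothetical, and the `needle` used to find the mapping is an assumption: a memfd mapping normally appears in `maps` as `/memfd:<name> (deleted)`, so the test would likely match on the memfd name rather than on `memory.bin`.

```rust
/// Returns the permission column of every mapping whose pathname contains
/// `needle`. Each maps line has the fields:
/// address perms offset dev inode [pathname]
pub fn perms_for<'a>(maps: &'a str, needle: &str) -> Vec<&'a str> {
    maps.lines()
        .filter_map(|line| {
            let mut fields = line.split_whitespace();
            let _address = fields.next()?;
            let perms = fields.next()?;
            // Skip offset, dev and inode; the remainder is the pathname,
            // which may itself contain spaces (e.g. " (deleted)").
            let path = fields.skip(3).collect::<Vec<_>>().join(" ");
            if path.contains(needle) {
                Some(perms)
            } else {
                None
            }
        })
        .collect()
}

/// True when at least one mapping matches `needle` and every match is
/// mapped read-write and shared (`rw-s`).
pub fn all_shared_rw(maps: &str, needle: &str) -> bool {
    let perms = perms_for(maps, needle);
    !perms.is_empty() && perms.iter().all(|p| *p == "rw-s")
}
```

In the smoke test the input would come from reading `/proc/<fc_pid>/maps` for the spawned Firecracker process; a `MAP_PRIVATE` mapping would show `rw-p` instead and fail the check.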
Side fix in the upstream FC fork
While wiring this up I discovered that the original `mem_backend.shared` patch in `deeplethe/firecracker` added the field to `MemBackendConfig` but did not update six `MemBackendConfig { ... }` struct literals (1 production site + 5 tests in `api_server/request/snapshot.rs`), so the compiler rightly rejected the partial change with E0063. Fixed in `deeplethe/firecracker@ca090c9` by adding `shared: false` to each literal. Behavior is unchanged for those sites, since `false` matches the existing MAP_PRIVATE default.
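For readers unfamiliar with E0063, the sketch below shows the kind of literal that had to change. The struct is a simplified stand-in rather than the fork's actual definition, and `file_backend` is a hypothetical helper; only the `shared: false` line corresponds to the fix described above.

```rust
use std::path::PathBuf;

/// Simplified stand-in for the backend type enum.
#[derive(Debug, Clone, PartialEq)]
pub enum MemBackendType {
    File,
    Uffd,
}

/// Simplified stand-in for the patched config struct.
#[derive(Debug, Clone, PartialEq)]
pub struct MemBackendConfig {
    pub backend_path: PathBuf,
    pub backend_type: MemBackendType,
    /// New field. `false` keeps the pre-existing MAP_PRIVATE behaviour.
    pub shared: bool,
}

/// A struct literal must name every field. Leaving out `shared` below is
/// what produces error E0063 (missing field in struct initializer).
pub fn file_backend(path: &str) -> MemBackendConfig {
    MemBackendConfig {
        backend_path: PathBuf::from(path),
        backend_type: MemBackendType::File,
        shared: false,
    }
}
```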
The corresponding update to `0001-feat-mem-backend-shared-option-for-MAP-SHARED.patch` in this repo will land in a follow-up housekeeping PR so the patch file accurately reflects the now-buildable fork. Filed mentally; not blocking 5b review.
Refs #101.