Tracked in ROADMAP.md M2.1. Filing this as the trackable issue so the work is visible alongside the rest of M2.
Problem
Today, modifying a parent (e.g. running pip install pandas in the parent rootfs) requires re-running the entire snapshot pipeline: rebuild the rootfs, boot the parent, warm up, pause, and dump memory.bin + vmstate.json. That's ~10 s of wall-clock time plus several GB of disk writes, repeated for every tweak. This dev-loop friction is the biggest obstacle to recipe-author productivity.
Proposal
Surface Firecracker's existing snapshot_type: "Diff" API as a snapshot chain (base → base+pandas → base+pandas+sklearn) at the controller / CLI / registry layer.
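For reference, a minimal sketch of the two API requests a diff snapshot involves. The PATCH /vm and PUT /snapshot/create endpoints and the snapshot_type, snapshot_path, and mem_file_path fields are from Firecracker's snapshot API; the helper name and the example paths are hypothetical, and how forkd reaches the API socket is not shown. Firecracker's dirty-page tracking settings, which also affect diff snapshots, are omitted here.

```python
import json


def diff_snapshot_requests(snapshot_path: str, mem_file_path: str) -> list[tuple[str, str, str]]:
    """Return (method, path, json_body) tuples: pause the VM, then create a diff snapshot.

    Hypothetical helper: it only builds the request bodies, it does not send them.
    """
    pause = ("PATCH", "/vm", json.dumps({"state": "Paused"}))
    create = (
        "PUT",
        "/snapshot/create",
        json.dumps(
            {
                "snapshot_type": "Diff",  # "Full" would dump all guest memory
                "snapshot_path": snapshot_path,  # where the VM state is written
                "mem_file_path": mem_file_path,  # where the dirty pages are written
            }
        ),
    )
    return [pause, create]


if __name__ == "__main__":
    # Example paths are placeholders, not forkd's real layout.
    for method, path, body in diff_snapshot_requests("/tmp/base+pandas.vmstate", "/tmp/base+pandas.mem"):
        print(method, path, body)
```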
Surface area
- forkd snapshot diff --from <tag> --tag <new-tag>: pause the running parent, write only dirty pages relative to <tag>'s memory.bin. Persist chain link metadata.
- forkd fork --tag <new-tag>: when restoring, walk back to the base and replay diffs in order (or use mmap'd overlay if cheaper).
- Snapshot Hub (#TBD M1.2): forkd pull <new-tag> recursively pulls parents that aren't already local.
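The chain walk, replay order, and recursive pull described above could look roughly like the sketch below. Everything in it is an assumption for illustration: the ChainLink record, the in-memory links mapping, and the page-indexed dict used to model a diff are hypothetical and are not forkd's metadata format or Firecracker's on-disk layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChainLink:
    """Hypothetical chain-link metadata: a tag and the tag it was diffed from."""
    tag: str
    parent: Optional[str]  # None marks a full (base) snapshot


def resolve_chain(tag: str, links: dict[str, ChainLink]) -> list[str]:
    """Walk parent pointers back to the base; return tags in replay order (base first)."""
    order: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = tag
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in snapshot chain at {current!r}")
        if current not in links:
            raise KeyError(f"unknown snapshot tag {current!r}")
        seen.add(current)
        order.append(current)
        current = links[current].parent
    order.reverse()
    return order


def missing_locally(tag: str, links: dict[str, ChainLink], local_tags: set[str]) -> list[str]:
    """Tags a pull would still need to fetch, base-most first."""
    return [t for t in resolve_chain(tag, links) if t not in local_tags]


def apply_diffs(base: bytes, diffs: list[dict[int, bytes]], page_size: int = 4096) -> bytes:
    """Overlay dirty pages onto a copy of the base memory image, oldest diff first."""
    memory = bytearray(base)
    for diff in diffs:
        for page_index, page in diff.items():
            if len(page) != page_size:
                raise ValueError("dirty page has the wrong size")
            start = page_index * page_size
            memory[start:start + page_size] = page  # later links overwrite earlier ones
    return bytes(memory)
```

Replaying in base-first order means a page dirtied in several links ends up with the contents from the newest link, which is the property a restore of a chain needs.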
Done criteria
- Diff snapshot of a typical pip install pandas delta is < 100 MB on disk.
- Restore time on a 3-element chain is within 10% of restoring the base.
- Round-trip forkd snapshot --tag base → modify in child → forkd snapshot diff --from base --tag base+pkg → forkd fork --tag base+pkg works and the forked child sees the modification.
- Documentation page added under docs/ explaining the chain semantics.
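The two numeric criteria can be written down as checks, sketched below. Two readings are assumed here and are not stated in the criteria themselves: "within 10%" is taken to mean the chain restore is at most 10% slower than the base restore, and MB is taken as 10^6 bytes. The function names are hypothetical.

```python
def diff_small_enough(diff_bytes: int, limit_mb: int = 100) -> bool:
    """True if the diff snapshot is under the size limit (assumes 1 MB = 10**6 bytes)."""
    return diff_bytes < limit_mb * 1_000_000


def restore_within_tolerance(base_restore_s: float, chain_restore_s: float,
                             tolerance: float = 0.10) -> bool:
    """True if restoring the chain is at most `tolerance` slower than restoring the base."""
    return chain_restore_s <= base_restore_s * (1 + tolerance)
```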
Risk
Firecracker diff-snapshot has some vmstate fields that don't chain cleanly (TSC offset, MMIO regions). Need to enumerate which fields are safe to diff vs which need full replay. Highest-risk item on the M2 roadmap — budget +1 week (4 weeks total) if the first 2 weeks hit edge cases.
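One way to start the enumeration is to compare the vmstate of two links and list which fields changed. The sketch below assumes vmstate.json can be loaded as nested dicts; the field names in the usage example are made up for illustration and are not Firecracker's actual vmstate schema.

```python
_MISSING = object()  # distinguishes an absent key from a key set to None


def changed_fields(old: dict, new: dict, prefix: str = "") -> list[str]:
    """Return dotted paths whose values differ between two nested dicts."""
    paths: list[str] = []
    for key in sorted(set(old) | set(new)):
        path = f"{prefix}.{key}" if prefix else str(key)
        a = old.get(key, _MISSING)
        b = new.get(key, _MISSING)
        if isinstance(a, dict) and isinstance(b, dict):
            paths.extend(changed_fields(a, b, path))  # recurse into sub-objects
        elif a != b:
            paths.append(path)
    return paths


if __name__ == "__main__":
    # Hypothetical field names, for illustration only.
    base = {"vcpu": {"tsc_offset": 1, "regs": {"rip": 5}}, "mmio": ["a"]}
    child = {"vcpu": {"tsc_offset": 2, "regs": {"rip": 5}}, "mmio": ["a", "b"]}
    print(changed_fields(base, child))
```

Running this across several real chain links would give a first list of fields that change between links, which can then be sorted by hand into those safe to diff and those that need full replay.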
Out of scope for this issue
- GC / cleanup of orphan diff chains (separate follow-up).
- UI for visualising chain trees.
- Branch-merge semantics (multiple diffs sharing a parent — easy; one diff with multiple parents — out of scope, not Firecracker-supported anyway).