feat(jailer): chroot + cgroup v2 + seccomp + UID/GID drop via official jailer #10
Merged
…l jailer

Wrap Firecracker spawn in the official `jailer` binary so the VMM no longer runs as the parent's UID with zero isolation.

New `src/jailer.rs` (gated on the `firecracker` feature) provides:

* `JailerConfig` with env-driven defaults (uid=123, gid=100, chroot_base=/srv/jailer, parent=microvm.slice, cgroup v2).
* `Jailer::prepare(vm_id, kernel, rootfs, extra_drives)`: idempotent chroot setup. Hardlink-or-copy kernel/rootfs/drives into `<base>/firecracker/<vm_id>/root/`, mknod `dev/kvm` (c 10 232) and `dev/net/tun` (c 10 200), chown the tree to uid:gid. Surfaces missing CAP_MKNOD with a clear error.
* `Jailer::build_command(...)`: returns a ready-to-spawn `Command` with `--id`, `--exec-file`, `--uid`, `--gid`, `--chroot-base-dir`, `--cgroup-version`, `--parent-cgroup`, optional `--numa-node`, `--new-pid-ns`, the `--` separator, and FC's `--api-sock /api.sock`. Default seccomp filter applies (no opt-out flag passed).
* `Jailer::teardown(vm_id)`: recursive chroot removal plus best-effort cgroup rmdir.

Adapter wiring is deliberately out of scope (tech lead owns that PR).

Tests:

* arg-construction tests assert exact arg order for cgroup v1/v2, NUMA, sanitised vm_id, relative-binary rejection, chroot mismatch.
* chroot-path computation, safe_vm_id sanitisation, makedev encoding.
* OS-touching prepare/teardown tests gated `#[ignore]` with run-as-root instructions in the comment.
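For readers who want to picture the argument layout, here is a minimal sketch of how a `build_command`-style helper could assemble the flags listed above. It is not the code in `src/jailer.rs`: the struct `JailerConfigSketch`, its field names, the function names, and the `/usr/bin/...` paths used in the usage note are all assumptions made for illustration, and the optional NUMA flag is left out.

```rust
use std::path::Path;
use std::process::Command;

/// Hypothetical stand-in for the PR's `JailerConfig`; field names are assumptions.
struct JailerConfigSketch {
    jailer_bin: String,
    uid: u32,
    gid: u32,
    chroot_base: String,
    cgroup_version: u8,
    parent_cgroup: String,
}

/// Builds (but does not spawn) a jailer invocation in the flag order the
/// commit message lists. Rejects a relative Firecracker path, mirroring the
/// "relative-binary rejection" case named in the tests.
fn build_command_sketch(
    cfg: &JailerConfigSketch,
    vm_id: &str,
    firecracker_bin: &Path,
) -> Result<Command, String> {
    if !firecracker_bin.is_absolute() {
        return Err(format!(
            "firecracker binary must be an absolute path: {}",
            firecracker_bin.display()
        ));
    }
    let mut cmd = Command::new(&cfg.jailer_bin);
    cmd.arg("--id").arg(vm_id)
        .arg("--exec-file").arg(firecracker_bin)
        .arg("--uid").arg(cfg.uid.to_string())
        .arg("--gid").arg(cfg.gid.to_string())
        .arg("--chroot-base-dir").arg(&cfg.chroot_base)
        .arg("--cgroup-version").arg(cfg.cgroup_version.to_string())
        .arg("--parent-cgroup").arg(&cfg.parent_cgroup)
        .arg("--new-pid-ns")
        // Everything after `--` is handed to Firecracker itself.
        .arg("--")
        .arg("--api-sock").arg("/api.sock");
    Ok(cmd)
}

/// Collects the configured arguments as strings so a test can compare the
/// exact sequence without spawning anything.
fn args_of(cmd: &Command) -> Vec<String> {
    cmd.get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect()
}
```

Because the helper only returns a `Command`, the caller stays in charge of `.spawn()` and of stdio wiring, and tests can assert on `args_of(&cmd)` without root privileges or a jailer binary on the host.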
Summary

Wrap Firecracker spawn in the official `jailer` binary. Today FC runs as the parent process's UID with no chroot, no cgroup, no seccomp, so a guest escape compromises the host. This PR adds `src/jailer.rs` (behind the `firecracker` feature) with the four primitives the adapter needs:

* `JailerConfig`: uid/gid drop (defaults 123:100), chroot base (`/srv/jailer`), cgroup v2 parent slice (`microvm.slice`), optional NUMA pinning. `from_env()` driven by `MICROVM_JAILER_*` vars.
* `Jailer::prepare(vm_id, kernel, rootfs, extra_drives)`: idempotent chroot setup. Hardlink-or-copy of kernel/rootfs/drives into `<base>/firecracker/<vm_id>/root/`, `mknod` for `/dev/kvm` (c 10 232) and `/dev/net/tun` (c 10 200), recursive chown to uid:gid. Missing `CAP_MKNOD` surfaces as `VmRuntimeError::Jailer` with a clear message.
* `Jailer::build_command(vm_id, jail, firecracker_bin)`: returns a `Command` with `--id`, `--exec-file`, `--uid`, `--gid`, `--chroot-base-dir`, `--cgroup-version <1|2>`, `--parent-cgroup`, optional `--numa-node`, `--new-pid-ns`, `--`, then `--api-sock /api.sock`. Caller owns `.spawn()` so stderr can be piped into console capture.
* `Jailer::teardown(vm_id)`: `rm -rf` the chroot tree; best-effort cgroup `rmdir`.

Default seccomp filter is in force: we never pass `--no-seccomp`.

Scope

* `src/jailer.rs` (module-level doc comment covers chroot layout + required host capabilities).
* `src/error.rs`: add `Jailer(String)` variant.
* `src/lib.rs`: pub-use the new types behind `#[cfg(feature = "firecracker")]`.
* `Cargo.toml`: add `nix` (optional, gated on `firecracker`) + `tempfile` (dev).
* Adapter wiring is out of scope (`src/adapters/firecracker.rs` untouched).

Test plan

* `cargo fmt --all -- --check` clean
* `cargo clippy --all-features --all-targets -- -D warnings` clean
* `cargo test --features firecracker --lib`: 12 passing (1 ignored, root-only)
* `cargo test --lib` (default features): 1 passing
* `cargo check --all-features` clean
* `prepare`/`mknod`/`chown` exercised on a root host: `sudo -E env "PATH=$PATH" cargo test --features firecracker -- --ignored`

Unit coverage

* `Command::get_args()` sequence for cgroup v1, cgroup v2, NUMA-on, sanitised vm_id, relative-binary rejection, chroot/vm_id mismatch.
* `safe_vm_id` sanitisation pinned to the adapter convention.
* `makedev` encoding cross-checked against glibc for KVM (10:232) and TUN (10:200).
* `teardown` exercised both with and without a pre-existing chroot.
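As a reviewer aid, the two pure computations mentioned in the coverage list (device-number encoding and chroot path layout) can be sketched roughly as below. This is an illustration rather than the PR's code: the function names `makedev_sketch` and `chroot_root_sketch` are assumptions, the bit layout follows glibc's `gnu_dev_makedev`, and the path layout follows the `<base>/firecracker/<vm_id>/root/` convention stated in the summary.

```rust
use std::path::{Path, PathBuf};

/// Encodes a (major, minor) pair into a Linux `dev_t` using the same bit
/// layout as glibc's `gnu_dev_makedev`: low 12 bits of major at bit 8,
/// low 8 bits of minor at bit 0, and the remaining high bits shifted up.
fn makedev_sketch(major: u32, minor: u32) -> u64 {
    let (major, minor) = (u64::from(major), u64::from(minor));
    ((major & 0x0000_0fff) << 8)
        | ((major & 0xffff_f000) << 32)
        | (minor & 0x0000_00ff)
        | ((minor & 0xffff_ff00) << 12)
}

/// Computes the per-VM chroot root under the configured base directory,
/// assuming the exec file is named `firecracker`.
fn chroot_root_sketch(base: &Path, vm_id: &str) -> PathBuf {
    base.join("firecracker").join(vm_id).join("root")
}
```

With this encoding, `/dev/kvm` (10:232) comes out as `0xAE8` and `/dev/net/tun` (10:200) as `0xAC8`, and `chroot_root_sketch(Path::new("/srv/jailer"), "vm-1")` yields `/srv/jailer/firecracker/vm-1/root`. Keeping both as pure functions is what lets them be unit-tested without touching the filesystem or needing root.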