
feat(jailer): chroot + cgroup v2 + seccomp + UID/GID drop via official jailer #10

Merged
tangletools merged 1 commit into main from feat/jailer
May 23, 2026
@tangletools
Contributor

Summary

Wrap Firecracker spawn in the official jailer binary. Today Firecracker (FC) runs under the parent process's UID with no chroot, no cgroup, and no seccomp, so a guest escape compromises the host. This PR adds src/jailer.rs (behind the firecracker feature) with the four primitives the adapter needs:

  • JailerConfig — uid/gid drop (defaults 123:100), chroot base (/srv/jailer), cgroup v2 parent slice (microvm.slice), optional NUMA pinning. from_env() driven by MICROVM_JAILER_* vars.
  • Jailer::prepare(vm_id, kernel, rootfs, extra_drives) — idempotent chroot setup. Hardlink-or-copy of kernel/rootfs/drives into <base>/firecracker/<vm_id>/root/, mknod for /dev/kvm (c 10 232) and /dev/net/tun (c 10 200), recursive chown to uid:gid. Missing CAP_MKNOD surfaces as VmRuntimeError::Jailer with a clear message.
  • Jailer::build_command(vm_id, jail, firecracker_bin) — returns a Command with --id, --exec-file, --uid, --gid, --chroot-base-dir, --cgroup-version <1|2>, --parent-cgroup, optional --numa-node, --new-pid-ns, --, then --api-sock /api.sock. Caller owns .spawn() so stderr can be piped into console capture.
  • Jailer::teardown(vm_id): rm -rf the chroot tree; best-effort cgroup rmdir.
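
The hardlink-or-copy step in prepare can be illustrated with a minimal sketch using only the standard library. The helper name `link_or_copy` is hypothetical and not taken from src/jailer.rs. It assumes an already-present destination should be left untouched, which is one way to make the step idempotent.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Place `src` at `dst`, preferring a hardlink and falling back to a copy.
/// Hypothetical helper for illustration; not the PR's implementation.
pub fn link_or_copy(src: &Path, dst: &Path) -> io::Result<()> {
    // Idempotence (assumption): a destination left by an earlier run is kept.
    if dst.exists() {
        return Ok(());
    }
    // Create the chroot subdirectories on demand.
    if let Some(parent) = dst.parent() {
        fs::create_dir_all(parent)?;
    }
    match fs::hard_link(src, dst) {
        Ok(()) => Ok(()),
        // Hardlinks fail across filesystems (EXDEV) or where the filesystem
        // forbids them; fall back to a full copy in either case.
        Err(_) => fs::copy(src, dst).map(|_| ()),
    }
}
```

A hardlink avoids duplicating a large rootfs image when the source and the chroot base share a filesystem. The copy fallback keeps the step working when they do not.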

Default seccomp filter is in force — we never pass --no-seccomp.
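
As a rough illustration of the argument order described above, the sketch below builds a `std::process::Command` without spawning it. The struct fields, the function signature, and the `jailer_bin` parameter are assumptions made for this example; the flag names are copied from this PR description and have not been checked against any particular jailer release.

```rust
use std::path::Path;
use std::process::Command;

/// Hypothetical subset of the fields the PR lists for `JailerConfig`.
pub struct JailerConfig {
    pub uid: u32,
    pub gid: u32,
    pub chroot_base: String,
    pub cgroup_version: u8,
    pub parent_cgroup: String,
    pub numa_node: Option<u32>,
}

/// Build the jailer invocation; the caller owns `.spawn()`.
/// Signature is illustrative and differs from the PR's `build_command`.
pub fn build_command(
    vm_id: &str,
    cfg: &JailerConfig,
    jailer_bin: &Path,
    firecracker_bin: &Path,
) -> Result<Command, String> {
    // Mirrors the "relative-binary rejection" case from the unit tests.
    if !firecracker_bin.is_absolute() {
        return Err(format!("exec file must be absolute: {}", firecracker_bin.display()));
    }
    let mut cmd = Command::new(jailer_bin);
    cmd.arg("--id").arg(vm_id)
        .arg("--exec-file").arg(firecracker_bin)
        .arg("--uid").arg(cfg.uid.to_string())
        .arg("--gid").arg(cfg.gid.to_string())
        .arg("--chroot-base-dir").arg(&cfg.chroot_base)
        .arg("--cgroup-version").arg(cfg.cgroup_version.to_string())
        .arg("--parent-cgroup").arg(&cfg.parent_cgroup);
    if let Some(node) = cfg.numa_node {
        cmd.arg("--numa-node").arg(node.to_string());
    }
    cmd.arg("--new-pid-ns");
    // Everything after `--` is passed to Firecracker itself.
    // No `--no-seccomp` is added, so the default filter stays in force.
    cmd.arg("--").arg("--api-sock").arg("/api.sock");
    Ok(cmd)
}
```

Because the function only returns the `Command`, a test can compare `Command::get_args()` against an expected sequence without launching any process.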

Scope

  • New file: src/jailer.rs (module-level doc comment covers chroot layout + required host capabilities).
  • src/error.rs: add Jailer(String) variant.
  • src/lib.rs: pub-use the new types behind #[cfg(feature = "firecracker")].
  • Cargo.toml: add nix (optional, gated on firecracker) + tempfile (dev).
  • Adapter wiring is deliberately out of scope (src/adapters/firecracker.rs untouched).
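
For illustration, here is one way the new Jailer(String) variant could carry a clear message when mknod fails for lack of CAP_MKNOD. Only the variant name comes from this PR. The enum is reduced to that single variant, and the helper `mknod_error` and its wording are hypothetical.

```rust
use std::fmt;
use std::io;

/// Reduced stand-in for the crate's error type; only `Jailer` is shown.
#[derive(Debug)]
pub enum VmRuntimeError {
    Jailer(String),
}

impl fmt::Display for VmRuntimeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            VmRuntimeError::Jailer(msg) => write!(f, "jailer: {}", msg),
        }
    }
}

impl std::error::Error for VmRuntimeError {}

/// Hypothetical helper: wrap an I/O failure from device-node creation.
/// EPERM surfaces in Rust as `ErrorKind::PermissionDenied`.
pub fn mknod_error(device: &str, err: &io::Error) -> VmRuntimeError {
    let hint = if err.kind() == io::ErrorKind::PermissionDenied {
        " (creating device nodes requires CAP_MKNOD; run as root or grant the capability)"
    } else {
        ""
    };
    VmRuntimeError::Jailer(format!("mknod {} failed: {}{}", device, err, hint))
}
```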

Test plan

  • cargo fmt --all -- --check clean
  • cargo clippy --all-features --all-targets -- -D warnings clean
  • cargo test --features firecracker --lib — 12 passing (1 ignored, root-only)
  • cargo test --lib (default features) — 1 passing
  • cargo check --all-features clean
  • OS-touching prepare/mknod/chown exercised on a root host:
    sudo -E env "PATH=$PATH" cargo test --features firecracker -- --ignored

Unit coverage

  • Arg-construction asserts the full ordered Command::get_args() sequence for cgroup v1, cgroup v2, NUMA-on, sanitised vm_id, relative-binary rejection, chroot/vm_id mismatch.
  • Chroot path computation and safe_vm_id sanitisation pinned to the adapter convention.
  • makedev encoding cross-checked against glibc for KVM (10:232) and TUN (10:200).
  • teardown exercised both with and without a pre-existing chroot.
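
The makedev cross-check can be reproduced with a small sketch. The function below follows the bit layout of glibc's `gnu_dev_makedev` on Linux. The constant names are hypothetical, and this is not necessarily how src/jailer.rs computes the value.

```rust
/// Encode a (major, minor) pair using the glibc `makedev` bit layout on Linux.
pub const fn makedev(major: u64, minor: u64) -> u64 {
    ((major & 0x0000_0fff) << 8)
        | ((major & 0xffff_f000) << 32)
        | (minor & 0x0000_00ff)
        | ((minor & 0xffff_ff00) << 12)
}

/// /dev/kvm is character device 10:232.
pub const KVM_DEV: u64 = makedev(10, 232);
/// /dev/net/tun is character device 10:200.
pub const TUN_DEV: u64 = makedev(10, 200);
```

For small numbers the encoding reduces to `(major << 8) | minor`, so 10:232 becomes 0xAE8 and 10:200 becomes 0xAC8.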

…l jailer

Wrap Firecracker spawn in the official `jailer` binary so the VMM no longer
runs as the parent's UID with zero isolation. New `src/jailer.rs` (gated on
the `firecracker` feature) provides:

* `JailerConfig` with env-driven defaults (uid=123, gid=100,
  chroot_base=/srv/jailer, parent=microvm.slice, cgroup v2).
* `Jailer::prepare(vm_id, kernel, rootfs, extra_drives)` — idempotent chroot
  setup: hardlink-or-copy kernel/rootfs/drives into
  `<base>/firecracker/<vm_id>/root/`, mknod `dev/kvm` (c 10 232) and
  `dev/net/tun` (c 10 200), chown the tree to uid:gid. Surfaces missing
  CAP_MKNOD with a clear error.
* `Jailer::build_command(...)` — returns a ready-to-spawn `Command` with
  `--id`, `--exec-file`, `--uid`, `--gid`, `--chroot-base-dir`,
  `--cgroup-version`, `--parent-cgroup`, optional `--numa-node`,
  `--new-pid-ns`, the `--` separator, and FC's `--api-sock /api.sock`.
  Default seccomp filter applies (no opt-out flag passed).
* `Jailer::teardown(vm_id)` — recursive chroot removal plus best-effort
  cgroup rmdir.

Adapter wiring is deliberately out of scope (tech lead owns that PR).

Tests:
* arg-construction tests assert exact arg order for cgroup v1/v2, NUMA,
  sanitised vm_id, relative-binary rejection, chroot mismatch.
* chroot-path computation, safe_vm_id sanitisation, makedev encoding.
* OS-touching prepare/teardown tests gated `#[ignore]` with run-as-root
  instructions in the comment.
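
A minimal sketch of the chroot-path computation and vm_id sanitisation follows. The rule used here (keep ASCII alphanumerics and hyphens, replace everything else with `-`, cap at 64 characters) is an assumption; the PR does not spell out what `safe_vm_id` does. The `exec_name` parameter is also hypothetical.

```rust
use std::path::{Path, PathBuf};

/// Assumed sanitisation rule: ASCII alphanumerics and '-' pass through,
/// any other character becomes '-', and the result is capped at 64 chars.
pub fn safe_vm_id(raw: &str) -> String {
    raw.chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '-' { c } else { '-' })
        .take(64)
        .collect()
}

/// Compute `<base>/<exec_name>/<vm_id>/root`, the layout named in the PR
/// (with `exec_name` = "firecracker").
pub fn chroot_root(base: &Path, exec_name: &str, vm_id: &str) -> PathBuf {
    base.join(exec_name).join(safe_vm_id(vm_id)).join("root")
}
```

Sanitising before joining keeps characters such as `/` and `.` out of the id, so the resulting path cannot escape the chroot base.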
@tangletools tangletools merged commit e03bdba into main May 23, 2026
4 checks passed
@tangletools tangletools deleted the feat/jailer branch May 23, 2026 21:11