Description
Summary
MultiUseSandbox::map_file_cow on Linux calls libc::mmap(MAP_PRIVATE) but never calls munmap on the success path. Each call leaks one kernel VMA (vm_area_struct), which counts against the per-process vm.max_map_count limit (default: 65530).
Physical memory is NOT leaked — all mmaps of the same file share the kernel page cache. The leak is purely a VMA slot leak (~200 bytes of kernel metadata per entry). However, a long-running service that churns sandboxes with map_file_cow will eventually hit vm.max_map_count and subsequent mmap calls will fail with ENOMEM.
The vm.max_map_count limit is enforced per process but configured through a system-wide sysctl, not an rlimit: it cannot be raised with ulimit, only via sudo sysctl -w vm.max_map_count=<value>.
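To check how close a process is to the limit, the VMA count can be read from procfs. The following is a minimal diagnostic sketch, not part of Hyperlight; the function names are made up for illustration. It relies on /proc/self/maps listing one line per VMA and on /proc/sys/vm/max_map_count exposing the sysctl value.

```rust
use std::fs;
use std::io;

/// Number of VMAs currently held by this process.
/// /proc/self/maps prints one line per VMA.
fn current_vma_count() -> io::Result<usize> {
    Ok(fs::read_to_string("/proc/self/maps")?.lines().count())
}

/// Current value of the vm.max_map_count sysctl.
fn max_map_count() -> io::Result<usize> {
    let raw = fs::read_to_string("/proc/sys/vm/max_map_count")?;
    raw.trim()
        .parse()
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

Sampling current_vma_count() before and after a batch of map_file_cow calls should show the count growing by roughly one per call if the leak is present.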
What's leaked per call
| Resource | Leaked? | Impact |
|---|---|---|
| Kernel VMA (vm_area_struct) | Yes: ~200 bytes | Counts against vm.max_map_count (default 65530) |
| Virtual address space | Yes: one page-aligned region | Negligible on 64-bit (128TB user VA space) |
| Physical RAM (page cache) | No: shared across all mmaps of the same file | Kernel reclaims under memory pressure |
| File descriptor | No: File is dropped (fd closed) after mmap; kernel holds inode ref internally | Released on munmap |
Root Cause
The raw *mut c_void pointer returned by libc::mmap is cast to usize and stored in MemoryRegion.host_region — a plain integer with no ownership or Drop semantics. Nothing in the cleanup chain calls munmap:
- HyperlightVm::unmap_region: removes the KVM/MSHV slot only, does not munmap
- HyperlightVm::Drop: only calls interrupt_handle.set_dropped()
- MultiUseSandbox: has no Drop impl
- Snapshot restore(): calls unmap_region for stale regions (same problem)
The error path in map_file_cow correctly calls munmap, confirming the success path omission is an oversight.
Proposed Fix
Introduce a Linux OwnedFileMapping struct that stores the mmap pointer and size with a Drop impl calling libc::munmap. Track these in a Vec on MultiUseSandbox and clean up in restore() (when regions are unmapped during snapshot restore) and on MultiUseSandbox::Drop (catch-all for remaining mappings).
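A rough sketch of what such a wrapper could look like follows. Only the name OwnedFileMapping comes from the proposal; the method names, the FileMappings tracker, and the read-only protection flags are assumptions for illustration. To keep the sketch dependency-free, mmap and munmap are declared by hand for 64-bit Linux; the real code would keep using libc::mmap and libc::munmap.

```rust
use std::ffi::c_void;
use std::fs::File;
use std::io;
use std::os::fd::AsRawFd;

// Hand-written declarations (64-bit Linux ABI assumed, off_t = i64).
// Real code would use the `libc` crate instead.
unsafe extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: i32, flags: i32, fd: i32, offset: i64) -> *mut c_void;
    fn munmap(addr: *mut c_void, len: usize) -> i32;
}

const PROT_READ: i32 = 0x1; // Linux value
const MAP_PRIVATE: i32 = 0x02; // Linux value

/// Owns one file mapping and unmaps it when dropped.
struct OwnedFileMapping {
    addr: *mut c_void,
    len: usize,
}

impl OwnedFileMapping {
    /// Hypothetical constructor: maps `len` bytes of `file` privately, read-only.
    fn map_private(file: &File, len: usize) -> io::Result<Self> {
        let addr = unsafe {
            mmap(std::ptr::null_mut(), len, PROT_READ, MAP_PRIVATE, file.as_raw_fd(), 0)
        };
        // MAP_FAILED is (void *)-1.
        if addr as isize == -1 {
            return Err(io::Error::last_os_error());
        }
        Ok(Self { addr, len })
    }

    /// Address in the form stored in MemoryRegion.host_region.
    fn host_addr(&self) -> usize {
        self.addr as usize
    }
}

impl Drop for OwnedFileMapping {
    fn drop(&mut self) {
        // Releases the VMA; the return value is ignored because Drop cannot fail.
        unsafe { munmap(self.addr, self.len) };
    }
}

/// Hypothetical stand-in for the Vec the sandbox would hold.
#[derive(Default)]
struct FileMappings(Vec<OwnedFileMapping>);

impl FileMappings {
    /// Takes ownership of a mapping and returns its host address.
    fn track(&mut self, mapping: OwnedFileMapping) -> usize {
        let addr = mapping.host_addr();
        self.0.push(mapping);
        addr
    }

    /// Drops (and therefore unmaps) the mapping at `host_addr`, if tracked.
    fn release(&mut self, host_addr: usize) {
        self.0.retain(|m| m.host_addr() != host_addr);
    }
}
```

In this shape, restore() would call release for each stale region, and dropping the tracker along with the sandbox unmaps whatever remains. One design point to settle: a raw pointer field makes the struct neither Send nor Sync, so storing the address as usize, or adding an explicit unsafe impl Send, may be needed depending on how MultiUseSandbox is shared across threads.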