policy_fn: extend execve argv freeze to peer processes (#27) #33
Merged
congwang-mk merged 1 commit into main on May 1, 2026
46bd3bf to 1c45cda
Signed-off-by: Cong Wang <cwang@multikernel.io>
1c45cda to 32c9a76
Summary

- Peer processes in the sandbox can share memory with the execve caller through `MAP_SHARED` mappings (memfd, SysV shm, shared file mmap) or share `mm_struct` via `clone(CLONE_VM)`, and mutate argv between the supervisor's read and the kernel's post-Continue re-read. The sibling-thread freeze (PR #29, "policy_fn: drop path strings, keep argv via sibling-thread freeze (#27)") closed only the same-TGID case.
- Replaces `freeze_siblings_for_execve` with `freeze_sandbox_for_execve`, which enumerates every TGID in `ProcessIndex` (the canonical sandbox-membership set, populated by `register_child_if_new` in `resource.rs`) and `PTRACE_SEIZE` + `PTRACE_INTERRUPT`s every TID via `/proc/<tgid>/task`. The supervisor's sequential notification dispatch (`notif.rs:999-1001`) prevents new clone/fork notifications from completing during the freeze, so the snapshot is stable without any new locking.
- Sibling threads of the caller are killed by `de_thread`, and the kernel reaps their ptrace state automatically (unchanged). Peer threads survive execve and are `PTRACE_DETACH`ed after `NOTIF_SEND` so they resume normally.

Why enumerate-and-freeze rather than a static `CLONE_VM` block

Considered three approaches:

1. Static block on `CLONE_VM & ~CLONE_THREAD` + per-execve `/proc/<pid>/maps` privacy check + sibling freeze. Provable, but bans a legal Linux clone variant globally for one syscall's TOCTOU; the BPF mask is a magic bit-pattern that obscures the filter.
2. Enumerate and freeze: the primitives (`ProcessIndex`, sibling-freeze pattern) already exist; ~80 lines on top.
3. […]

(2) was the right tradeoff: existing primitives compose, no per-execve `/proc/<pid>/maps` walk, no static restriction on a legal clone variant. The same `freeze_sandbox_for_<syscall>` shape generalizes if future syscalls need TOCTOU protection on re-read user memory.

Test plan

- `cargo test -p sandlock-core --lib`: 223 passed (includes the new `freeze_sandbox_includes_peer_process` regression test)
- `cargo test -p sandlock-core --test integration test_policy_fn`: 13 passed (covers `deny_by_argv`, which exercises the live freeze path through the supervisor)

Notes for reviewers

- `notif.rs:932-967` is the new dispatch site. Note that detach happens after `send_response`, not before: peer threads must remain frozen until the kernel completes its argv re-read.
- The `freeze_sandbox_for_execve` failure mode is unchanged from the old function: any partial-freeze error rolls back all already-frozen tasks and propagates the error, which the dispatcher converts to `EPERM` to keep the argv-safety invariant fail-closed.
- `policy_fn.rs:65-71` and the `SyscallEvent.argv` doc comment are updated to reflect the actual guarantee (sandbox-wide pause, not just sibling threads).

🤖 Generated with Claude Code
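For reviewers who want the control flow at a glance, here is a minimal sketch of the two invariants from the notes above: all-or-nothing freeze with rollback, and detach only after the response is sent. It is not the code in `notif.rs`. The `Tracer` trait, `MockTracer`, `Verdict`, `freeze_all`, and `dispatch_execve` are hypothetical names invented for illustration, and the ptrace calls are replaced by a mock that records events so the ordering can be checked without a real tracee.

```rust
use std::cell::RefCell;
use std::io;
use std::rc::Rc;

/// Hypothetical stand-in for the ptrace layer. A real implementation would
/// issue PTRACE_SEIZE + PTRACE_INTERRUPT in `seize_and_interrupt` and
/// PTRACE_DETACH in `detach`.
trait Tracer {
    fn seize_and_interrupt(&mut self, tid: u32) -> io::Result<()>;
    fn detach(&mut self, tid: u32);
}

/// Freeze every TID or none: on the first failure, detach the tasks that
/// were already frozen (newest first) and propagate the error.
fn freeze_all<T: Tracer>(tracer: &mut T, tids: &[u32]) -> io::Result<Vec<u32>> {
    let mut frozen = Vec::with_capacity(tids.len());
    for &tid in tids {
        if let Err(e) = tracer.seize_and_interrupt(tid) {
            for &done in frozen.iter().rev() {
                tracer.detach(done);
            }
            return Err(e);
        }
        frozen.push(tid);
    }
    Ok(frozen)
}

const EPERM: i32 = 1; // errno value on Linux

/// Hypothetical verdict sent back for the execve notification.
#[derive(Debug, PartialEq)]
enum Verdict {
    Continue,
    Errno(i32),
}

/// Order of operations: freeze, decide, respond, and only then detach.
/// A failed freeze is answered with EPERM (fail-closed).
fn dispatch_execve<T: Tracer>(
    tracer: &mut T,
    tids: &[u32],
    mut send_response: impl FnMut(&Verdict),
) -> Verdict {
    match freeze_all(tracer, tids) {
        Ok(frozen) => {
            // The policy callback would inspect argv here; this sketch always allows.
            let verdict = Verdict::Continue;
            send_response(&verdict);
            for tid in frozen {
                tracer.detach(tid);
            }
            verdict
        }
        Err(_) => {
            let verdict = Verdict::Errno(EPERM);
            send_response(&verdict);
            verdict
        }
    }
}

/// Mock tracer that records events and can be told to fail on one TID.
struct MockTracer {
    fail_on: Option<u32>,
    log: Rc<RefCell<Vec<String>>>,
}

impl Tracer for MockTracer {
    fn seize_and_interrupt(&mut self, tid: u32) -> io::Result<()> {
        if self.fail_on == Some(tid) {
            return Err(io::Error::new(io::ErrorKind::PermissionDenied, "seize failed"));
        }
        self.log.borrow_mut().push(format!("freeze {}", tid));
        Ok(())
    }

    fn detach(&mut self, tid: u32) {
        self.log.borrow_mut().push(format!("detach {}", tid));
    }
}
```

With a `MockTracer` that fails on the last TID, the log shows the earlier tasks detached before the `EPERM` response goes out. On the success path the response is logged before any detach, which is the ordering the first reviewer note calls out.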