Block ARM64 exec with live CLONE_VM siblings#407

Merged: ryanbreen merged 2 commits into main from fix/arm64-clonevm-exec-cr3-uaf on Jun 1, 2026

@ryanbreen
Owner

Summary

  • Add an ARM64 exec guard that detects live CLONE_VM siblings still holding the old inherited CR3 before draining/freeing the parent's old page table.
  • Return -EAGAIN for that specific stopgap rejection from the ARM64 exec syscall path.
  • Close breenix-45i with a conservative stopgap until proper POSIX thread-group exec teardown exists.

Root cause

sys_clone creates CLONE_VM siblings as separate Process objects and stores the parent's page-table root in inherited_cr3. ARM64 exec later replaces and drains the parent's old page table, but previously it neither cleared the old CR3 from siblings still holding it nor rejected the exec while such siblings were alive. If such a sibling survived parent exec, context switching to it could reinstall a freed address-space root.
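The guard that closes this gap can be pictured with a small model. The following is only a sketch under stated assumptions, not the kernel's actual code: the `Process` struct is a simplified stand-in, and the `alive` flag, the `check_exec_allowed` function, and its signature are hypothetical names chosen for illustration. Only `inherited_cr3`, CLONE_VM, and the EAGAIN result come from this PR.

```rust
/// Linux errno value for EAGAIN; the fixed-branch test run reported errno=11.
pub const EAGAIN: i32 = 11;

/// Hypothetical, simplified stand-in for the kernel's Process object.
#[derive(Debug, Clone, PartialEq)]
pub struct Process {
    pub pid: u32,
    /// Page-table root copied from the parent by a CLONE_VM clone, if any.
    pub inherited_cr3: Option<u64>,
    /// Assumed liveness flag: false once the process has exited.
    pub alive: bool,
}

/// Returns Err(EAGAIN) if any live process other than `exec_pid` still
/// references `old_root` through `inherited_cr3`; otherwise Ok(()).
/// A syscall layer would report the error to userspace as -EAGAIN.
pub fn check_exec_allowed(
    procs: &[Process],
    exec_pid: u32,
    old_root: u64,
) -> Result<(), i32> {
    let blocked = procs
        .iter()
        .any(|p| p.pid != exec_pid && p.alive && p.inherited_cr3 == Some(old_root));
    if blocked {
        Err(EAGAIN)
    } else {
        Ok(())
    }
}
```

In this model the check has to run before the old page table is drained or freed. A sibling that has already exited no longer blocks exec, which matches the test flow of releasing the child after the rejected exec.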

Reproduction note

A fresh pre-fix Parallels boot reached the direct CLONE_VM path through /bin/hello_world and completed the thread test, but the current hello_world program waits for the child and does not parent-exec afterward. I did not observe a hard UAF crash from the existing binary alone; the crash condition is established by code inspection of the stale inherited_cr3 lifetime, and triggering it would require a parent exec with a live CLONE_VM sibling.

Verification

  • cargo build --release --target aarch64-breenix.json -Z build-std=core,alloc -Z build-std-features=compiler-builtins-mem -p kernel --bin kernel-aarch64
  • warning/error grep over that build log: 0 lines
  • Fresh post-fix Parallels boot: bsshd listening, boot script completed, Bounce started and sustained FPS/heartbeat output
  • Post-fix /bin/hello_world rerun over SSH: CLONE_VM thread test completed and All std tests passed!
  • Post-fix serial marker counts: DATA_ABORT=0, UNHANDLED_EC=0, FATAL=0, PANIC=0, SCHED_RESCUE=0, SOFT_LOCKUP=0, DEFER_SNAP/TRACE BUFFER DUMP=0
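The marker counts above amount to a line scan over the serial log. A minimal sketch of such a scan follows; the function names are hypothetical, and it assumes DEFER_SNAP and TRACE BUFFER DUMP are separate markers and that a line is counted whenever it contains the marker as a substring. The tooling actually used for this PR is not shown in the source and may differ.

```rust
/// Marker strings taken from the verification list; splitting
/// "DEFER_SNAP/TRACE BUFFER DUMP" into two markers is an assumption.
pub const MARKERS: [&str; 8] = [
    "DATA_ABORT",
    "UNHANDLED_EC",
    "FATAL",
    "PANIC",
    "SCHED_RESCUE",
    "SOFT_LOCKUP",
    "DEFER_SNAP",
    "TRACE BUFFER DUMP",
];

/// For each marker, counts how many lines of `log` contain it as a substring.
pub fn count_markers(log: &str) -> Vec<(&'static str, usize)> {
    MARKERS
        .iter()
        .map(|&m| (m, log.lines().filter(|line| line.contains(m)).count()))
        .collect()
}

/// True when every marker count is zero.
pub fn is_clean(log: &str) -> bool {
    count_markers(log).iter().all(|&(_, n)| n == 0)
}
```

Because matching is by substring, a line such as `FATAL_POSTMORTEM` also counts toward `FATAL`. That errs on the side of flagging a log rather than missing a fault.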

ryanbreen and others added 2 commits June 1, 2026 08:41
Reject ARM64 exec while a live CLONE_VM sibling still holds the old inherited CR3. This prevents parent exec from draining/freeing an address space still reachable through a sibling Process until proper thread-group exec teardown exists.

Co-authored-by: Ryan Breen <ryan@ryanbreen.com>

Co-authored-by: Claude Code <noreply@anthropic.com>
Add a userspace test that keeps a CLONE_VM child alive while the parent attempts exec, so the ARM64 inherited-CR3 UAF path is exercised at runtime.

Co-authored-by: Ryan Breen <ryan@ryanbreen.com>

Co-authored-by: Claude Code <noreply@anthropic.com>
@ryanbreen
Owner Author

TURN 342 runtime proof added in 86950dcc.

Unfixed comparison (origin/main + test only, throwaway 606a380b):

  • Fresh Parallels boot bug-unfixed-2 ran /usr/local/test/bin/clonevm_exec_test.
  • SSH output reached live-child parent exec and second-stage exec, then /bin/simple_exit exited 42.
  • Serial then faulted: DATA_ABORT at turn342-artifacts/bug-unfixed-2/serial-final.log:963, PT_WALK ... L0=invalid at L965, FATAL_POSTMORTEM at L968, and TRACE BUFFER DUMP at L981.

Fixed branch (86950dcc):

  • Same test on fresh Parallels boot returned errno=11 (EAGAIN) from parent exec, reported "fixed path observed EAGAIN", released the child, and exited PASS over SSH.
  • Serial showed normal boot milestones (bsshd, boot script, Bounce) and clean test process exits.
  • Marker counts after the test: DATA_ABORT=0, UNHANDLED_EC=0, FATAL=0, PANIC=0, SCHED_RESCUE=0, SOFT_LOCKUP=0, DEFER_SNAP/TRACE BUFFER DUMP=0.
  • ARM64 userspace and kernel warning/error grep artifacts were 0 lines.

@ryanbreen ryanbreen merged commit 6584788 into main Jun 1, 2026
@ryanbreen ryanbreen deleted the fix/arm64-clonevm-exec-cr3-uaf branch June 1, 2026 14:46