Problem
vproc's inline hooks install 16-byte trampolines (LDR X17, #8; BR X17; .quad <addr>) to hook libc functions. On aarch64, a 16-byte write is not guaranteed to be single-copy atomic before ARMv8.4-a, which adds FEAT_LSE2 and makes 16-byte-aligned 128-bit LDP/STP accesses single-copy atomic.
If another thread executes the target function while the hook is being written, it may see a partially written trampoline (e.g., LDR X17, #8; BR X17 in place, but only half of the 8-byte address literal written), leading to a jump to an invalid address and a crash.
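To make the layout concrete, here is a minimal sketch of the 16 bytes involved and of one torn state a racing thread could observe. The function names `encode_trampoline` and `torn_destination` are hypothetical and not taken from src/vexec.rs; the two instruction encodings are the standard A64 encodings for LDR X17, #8 and BR X17.

```rust
/// A64 encoding of `LDR X17, #8` (load the 8-byte literal located 8 bytes ahead).
const LDR_X17_LIT8: u32 = 0x5800_0051;
/// A64 encoding of `BR X17`.
const BR_X17: u32 = 0xD61F_0220;

/// Build the 16-byte trampoline: two instructions followed by the 64-bit address.
/// A64 instructions are always stored little-endian.
pub fn encode_trampoline(dest: u64) -> [u8; 16] {
    let mut t = [0u8; 16];
    t[0..4].copy_from_slice(&LDR_X17_LIT8.to_le_bytes());
    t[4..8].copy_from_slice(&BR_X17.to_le_bytes());
    t[8..16].copy_from_slice(&dest.to_le_bytes());
    t
}

/// Model one torn state: both instructions and the low half of the literal
/// are in place, but bytes 12-15 still hold the old function body.
/// Returns the address that `LDR X17, #8` would load in that state.
pub fn torn_destination(old: &[u8; 16], dest: u64) -> u64 {
    let new = encode_trampoline(dest);
    let mut literal = [0u8; 8];
    literal[..4].copy_from_slice(&new[8..12]);
    literal[4..].copy_from_slice(&old[12..16]);
    u64::from_le_bytes(literal)
}
```

With an old body of all 0xAA bytes and a destination of 0x0000_007f_1234_5678, the torn literal mixes the new low word with the old high word, which is the kind of invalid jump target described above.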
Single-instruction hooks (B <offset>, 4 bytes) are safe — aligned 32-bit stores are architecturally guaranteed atomic on aarch64 (ARM ARM B2.2.1).
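For comparison, a single-instruction hook can be sketched as below. The helper `encode_b` is hypothetical. It uses the standard A64 encoding of the unconditional branch, whose signed 26-bit word offset limits it to targets within ±128 MiB. That range limit is presumably why a 16-byte absolute trampoline is needed for distant hook functions, which is an assumption here rather than something the code states.

```rust
/// Encode `B <to>` for an instruction located at `from`.
/// Returns None when the offset is not a multiple of 4 or is outside
/// the ±128 MiB range of the 26-bit immediate.
pub fn encode_b(from: u64, to: u64) -> Option<u32> {
    let delta = (to as i64).wrapping_sub(from as i64);
    if delta % 4 != 0 {
        return None;
    }
    const RANGE: i64 = 1 << 27; // 128 MiB
    if delta < -RANGE || delta >= RANGE {
        return None;
    }
    // imm26 holds the offset in units of 4 bytes, in two's complement.
    let imm26 = ((delta >> 2) as u32) & 0x03FF_FFFF;
    Some(0x1400_0000 | imm26)
}
```

The resulting 32-bit word can be installed with one aligned 4-byte store, which is the case the atomicity guarantee covers.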
Proposed Fix: Reverse Write
Write the trampoline in reverse order — high addresses first, low addresses last:
// Write bytes 4-15 first (BR X17 at bytes 4-7, address literal at bytes 8-15)
std::ptr::write_unaligned(target.add(4), /* ... */);
// ...
// Write bytes 0-3 last (the LDR X17, #8 instruction that triggers the jump)
std::ptr::write_unaligned(target, ldr_insn);
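This narrows the inconsistency window: no thread can execute the new LDR X17, #8 until the final 4-byte write lands, and that write is a single aligned 32-bit store. It does not close the window entirely. A thread that enters the function between the partial writes still executes the original instruction at bytes 0-3, but it then runs into the already overwritten bytes 4-15 (BR X17 with X17 not yet loaded), and a thread that was already executing inside bytes 4-15 is affected in the same way. The ordering also holds only if the earlier stores become visible before the final one, which on aarch64's weakly ordered memory model requires a barrier or a release store for the last write, followed by the usual instruction-cache maintenance.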
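A minimal sketch of the write order with explicit ordering might look as follows. The function name `write_trampoline_reverse` is hypothetical, and the sketch assumes the target is 8-byte aligned, already writable, and on a little-endian system. It only orders the stores as data; making the page writable and performing instruction-cache maintenance are not shown, and the sketch does not by itself make concurrent patching safe.

```rust
use std::sync::atomic::{AtomicU32, AtomicU64, Ordering};

const LDR_X17_LIT8: u32 = 0x5800_0051; // LDR X17, #8
const BR_X17: u32 = 0xD61F_0220; // BR X17

/// Write the 16-byte trampoline high addresses first, entry instruction last.
///
/// # Safety
/// `target` must be valid for 16 bytes of writes and 8-byte aligned
/// (an assumption of this sketch; functions are only guaranteed 4-byte aligned).
pub unsafe fn write_trampoline_reverse(target: *mut u8, dest: u64) {
    debug_assert_eq!(target as usize % 8, 0);
    unsafe {
        let literal = &*(target.add(8) as *const AtomicU64);
        let br = &*(target.add(4) as *const AtomicU32);
        let ldr = &*(target as *const AtomicU32);

        // Bytes 8-15: the address literal, written as one aligned 64-bit store.
        literal.store(dest, Ordering::Relaxed);
        // Bytes 4-7: BR X17.
        br.store(BR_X17, Ordering::Relaxed);
        // Bytes 0-3 last: the release store keeps the two earlier stores
        // from becoming visible after the entry instruction.
        ldr.store(LDR_X17_LIT8, Ordering::Release);
    }
}
```

Writing the literal as a single aligned 64-bit store also removes the half-written address case, independent of the write order.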
Alternative Approaches
- Trap-first: Write a BRK instruction to bytes 0-3 first, so that any thread entering the function traps and pauses, then write the full trampoline and resume the trapped threads
- Stop-the-world: Suspend all other threads during hook installation (used by ShadowHook)
- ARMv8.4-a check: Detect FEAT_LSE2 via the AT field of ID_AA64MMFR2_EL1 (exposed to Linux userspace as HWCAP_USCAT) and use atomic STP when available
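The FEAT_LSE2 check could be sketched with the standard library's runtime feature detection instead of reading the ID register directly. The names `has_feat_lse2`, `WriteStrategy`, and `choose_strategy` are hypothetical, and the 16-byte alignment condition reflects the requirement for an atomic 128-bit STP. Whether a single atomic store is sufficient for concurrently executed code is a separate question this sketch does not answer.

```rust
/// Runtime check for FEAT_LSE2 on aarch64.
#[cfg(target_arch = "aarch64")]
pub fn has_feat_lse2() -> bool {
    std::arch::is_aarch64_feature_detected!("lse2")
}

/// On other architectures the feature does not exist.
#[cfg(not(target_arch = "aarch64"))]
pub fn has_feat_lse2() -> bool {
    false
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WriteStrategy {
    /// One 16-byte-aligned 128-bit STP.
    AtomicStp,
    /// Fall back to the reverse-order write.
    ReverseWrite,
}

/// Pick a strategy for a trampoline at `target_addr`.
pub fn choose_strategy(lse2: bool, target_addr: usize) -> WriteStrategy {
    if lse2 && target_addr % 16 == 0 {
        WriteStrategy::AtomicStp
    } else {
        WriteStrategy::ReverseWrite
    }
}
```

A caller would use it as `choose_strategy(has_feat_lse2(), target as usize)`. Functions that are only 4- or 8-byte aligned always take the fallback path.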
Affected Code
src/vexec.rs — hook_libc_fork(), hook_libc_pipe(), hook_libc_wait() — all write 16-byte trampolines non-atomically
References
- ARM Architecture Reference Manual B2.2.1 (single-copy atomicity guarantees)
- ARMv8.4-a FEAT_LSE2 (atomic 128-bit aligned stores)
- ShadowHook — stop-the-world approach for multi-thread safety