Use After Free when accessing current frame inside suspend block #10484
Description
Zig Version
0.9.0
Steps to Reproduce
A: Real-life usecase
Write a program that uses multi-threaded evented IO inside a function launched with std.event.Loop.runDetached for every incoming connection (the linked httpd.zig example is one such program). If you open and close enough connections in quick succession to lose the race, it will segfault in std.event.loop.Loop.linuxWaitFd.
zig build-exe httpd.zig
./httpd
# in another terminal:
hey -n 1000 -c 8 http://127.0.0.1:8001
B: Attempt at isolating the bug
I wrote a program that performs operations analogous to those in std.event.Loop, but reproduces the crash every time.
Build this program in debug mode and run it.
zig build-exe frame-uaf-bug.zig
./frame-uaf-bug
Expected Behavior
A: Real-life usecase
The program doesn't crash.
Either runDetached, linuxWaitFd, or the async function implementation ensures that no use-after-free can occur in that scenario, or the documentation states that using them in this combination is incorrect (though I think this is a pretty natural way to use them).
B: Attempt at isolating the bug
There is a documented way to figure out when it's ok to free foo()'s frame, so that it doesn't get freed while the suspend block is accessing it.
I can see a few possible ways this could happen:
- the compiler doesn't generate implicit accesses to the frame in the suspend block, possibly by making it a compile error to do things in a suspend block that require access to the frame
- the await in bar() doesn't return until the suspend block in foo() has finished executing
- another primitive is provided to wait for all suspend blocks to finish
but there are probably more.
Actual Behavior
A: Real-life usecase
Sometimes (usually within the first few hundred connections handled), a thread (call it Thread 1) is inside the suspend block in linuxWaitFd(), called by some IO function (read, in my case), and has already added the @frame to the event loop via linuxAddFd() when it gets preempted or otherwise doesn't get CPU time for a long while. Meanwhile, the FD becomes ready again, and another thread (call it Thread 2) running the event loop's workerRun() resumes that frame. linuxWaitFd() returns in that thread, and the function handling the connection finishes and returns to runDetached(), which frees the frame. Then Thread 1 finally gets to run and continues executing linuxWaitFd()'s suspend block, which tries to access its frame (e.g. to save the error from linuxAddFd() or to update the need_to_delete local variable). The frame has already been freed, so this is a use-after-free and possibly a segfault.
This doesn't happen if the event loop is running single-threaded (uncommenting line 74 in the example httpd.zig), or when the frames are never freed (flipping the boolean in line 45 in the example), which is consistent with it being a race-condition use-after-free.
B: Attempt at isolating the bug
When thread "T2" (running thread()) gets to foo()'s second suspend block, it makes the @frame available to thread "T1" and then busy-waits inside the suspend block. Thread "T1" (running main()) resumes foo(), which returns to bar() (at the await on line 33). bar() then frees foo()'s frame and returns from the resume in main(), which notifies the busy-wait in T2 that it can proceed. T2 continues executing the suspend block in foo() and accesses local variables, which live in the freed frame, leading to a use-after-free and a segfault.
Note: bar() allocates foo()'s frame directly from page_allocator to ensure that the freed memory is unmapped, so the UAF always leads to a segfault.