Use After Free when accessing current frame inside suspend block #10484

@Wolf480pl

Zig Version

0.9.0

Steps to Reproduce

A: Real-life usecase

Write a program that uses multi-threaded evented IO inside a function launched with std.event.Loop.runDetached for every incoming connection, such as this program. If you open and close enough connections in quick succession (enough to lose the race), it will segfault in std.event.loop.Loop.linuxWaitFd.

zig build-exe httpd.zig
./httpd
# in another terminal:
hey -n 1000 -c 8 http://127.0.0.1:8001

B: Attempt at isolating the bug

I wrote a program that performs operations analogous to those std.event.loop does, but reproduces the crash every time.
Build it in debug mode and run it:

zig build-exe frame-uaf-bug.zig 
./frame-uaf-bug

Expected Behavior

A: Real-life usecase

The program doesn't crash.
Either runDetached, linuxWaitFd, or the async function implementation ensures that there is no use-after-free in this scenario, or the documentation states that using them in this combination is incorrect (though I think this is a pretty natural way to use them).

B: Attempt at isolating the bug

There is a documented way to figure out when it's ok to free foo()'s frame, so that it doesn't get freed while the suspend block is accessing it.

I can see a few possible ways this could happen:

  • the compiler doesn't generate implicit accesses to the frame in the suspend block, possibly by making it a compile error to do things in suspend block that require access to the frame
  • the await in bar() doesn't return until the suspend block in foo() has finished executing
  • another primitive is provided to wait for all suspend blocks to finish

but there's probably more.
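The third option can be pictured as a small counting gate that the frame's owner waits on before freeing. The sketch below is a language-neutral model in Python, not Zig and not an existing std.event API: `SuspendBlockGate`, `enter`, `leave`, and `wait_idle` are hypothetical names, and the frame is just a dictionary with a `freed` flag.

```python
import threading

class SuspendBlockGate:
    """Hypothetical primitive: counts suspend blocks still running for a frame."""
    def __init__(self):
        self._cond = threading.Condition()
        self._active = 0

    def enter(self):
        with self._cond:
            self._active += 1

    def leave(self):
        with self._cond:
            self._active -= 1
            if self._active == 0:
                self._cond.notify_all()

    def wait_idle(self):
        # Block until no suspend block is still running.
        with self._cond:
            self._cond.wait_for(lambda: self._active == 0)

def demo():
    gate = SuspendBlockGate()
    frame = {"freed": False, "need_to_delete": False}  # stand-in for the frame
    events = []
    published = threading.Event()  # frame has been handed to the event loop
    resumed = threading.Event()    # frame was resumed and its function returned

    def suspend_block():
        gate.enter()
        published.set()
        resumed.wait()             # simulate preemption after publishing the frame
        events.append("late write, freed=%s" % frame["freed"])
        frame["need_to_delete"] = True  # late access to a frame-local
        gate.leave()

    t = threading.Thread(target=suspend_block)
    t.start()
    published.wait()
    resumed.set()
    gate.wait_idle()               # the owner waits here instead of freeing at once
    frame["freed"] = True
    events.append("freed")
    t.join()
    return events

if __name__ == "__main__":
    print(demo())
```

In this model the late write always lands before the frame is marked freed, because the owner cannot pass `wait_idle()` until the suspend block has called `leave()`. Whether such a gate would belong in the compiler, in await, or in the event loop is left open here.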

Actual Behavior

A: Real-life usecase

Sometimes (usually within the first few hundred connections handled), a thread (let's call it Thread 1) is inside the suspend block in linuxWaitFd(), called by some IO function (read, in my case). After linuxAddFd() has already added the @frame to the event loop, Thread 1 gets preempted or otherwise doesn't get CPU for a long time. Meanwhile, the FD becomes ready again, and another thread (let's say Thread 2) running the event loop's workerRun() resumes that frame. linuxWaitFd() returns in Thread 2, and the function handling the connection finishes and returns to runDetached(), which frees the frame. Then Thread 1 finally gets to run and continues executing linuxWaitFd()'s suspend block, which tries to access its frame (e.g. to save the error from linuxAddFd() or to update the need_to_delete local variable). The frame has already been freed, leading to a use-after-free and possibly a segfault.

This doesn't happen if the event loop is running single-threaded (uncommenting line 74 in the example httpd.zig), or when the frames are never freed (flipping the boolean in line 45 in the example), which is consistent with it being a race-condition use-after-free.

B: Attempt at isolating the bug

When thread "T2" (running thread()) gets to foo()'s second suspend block, it makes the @frame available to thread "T1", and then busy-waits inside the suspend block. Thread "T1" (running main()) resumes foo(), which returns to bar() (at the await in line 33). bar() then frees foo()'s frame and returns from the resume in main(), which notifies the busy-wait in T2 that it can proceed. T2 continues executing the suspend block in foo(), accessing local variables that live in the now-freed frame, which leads to a use-after-free and a segfault.

Note: bar() allocates foo()'s frame directly from page_allocator to ensure that the freed memory is unmapped, so the UAF always leads to a segfault.
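The ordering described for case B can be modelled without any real memory being freed. The following is a minimal sketch in Python rather than Zig, and it is not the linked frame-uaf-bug.zig: `Frame`, `write_local`, and `run_interleaving` are hypothetical names, the busy-wait is replaced by an event, and "freeing" is only a flag so the late access can be recorded instead of crashing.

```python
import threading

class Frame:
    """Stand-in for foo()'s async frame (a model, not a Zig API)."""
    def __init__(self):
        self.freed = False
        self.local = 0
        self.use_after_free = False

    def write_local(self, value):
        # In the real program this write hits unmapped memory and segfaults;
        # the model only records that it happened.
        if self.freed:
            self.use_after_free = True
        else:
            self.local = value

def run_interleaving():
    frame = Frame()
    published = threading.Event()        # T2 has made the frame available
    resume_returned = threading.Event()  # T1's resume has returned

    def t2_suspend_block():
        published.set()          # step 1: publish the frame to T1
        resume_returned.wait()   # step 2: wait inside the suspend block
        frame.write_local(1)     # step 5: touch a frame-local

    t2 = threading.Thread(target=t2_suspend_block)
    t2.start()

    # T1, playing the role of main():
    published.wait()
    frame.freed = True           # step 3: foo() is resumed, bar()'s await returns, frame is freed
    resume_returned.set()        # step 4: tell T2 it may proceed
    t2.join()
    return frame

if __name__ == "__main__":
    f = run_interleaving()
    print("use after free:", f.use_after_free)
```

Because the events force steps 3 and 4 to happen before step 5, the model reports the late access on every run, which mirrors why the isolated program crashes every time while the real-life case only crashes when the race is lost.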

Labels: bug (Observed behavior contradicts documented or intended behavior)
