Skip to content

Disabling allocations in pre_exec (and signal handlers) #701

@yuki0iq

Description

@yuki0iq

Proposal

Problem statement

Allocating inside CommandExt::pre_exec1 is a footgun.

According to POSIX, malloc is not async-signal-safe, meaning that it's not guaranteed to work after fork in a multi-thread process. In practice, this sometimes leads to deadlocks, although some libcs take liberty in making it fine to call malloc/free after fork.

2024 revision of POSIX says that "the application shall ensure that the child process only executes async-signal-safe operations until such time as one of the exec functions is successful". This does not clearly indicate that executing them is UB. However, the previous revision says that, under the same conditions, it is undefined behavior to invoke fork if async-signal-unsafe pthread_atfork handlers are registered. Since this is just invoking async-signal-unsafe functions with a layer of indirection, I believe this wording implies executing async-signal-unsafe operations after fork is UB in general. Musl authors seemingly agree with this interpretation.

Yet it can be tricky to know if the function is allocating, or at least not immediately obvious, as allocations are hidden and every Rust function can allocate. This problem is exacerbated by lack of tools to detect allocations inside pre_exec: Miri can't analyze pre_exec hooks, clippy does not lint against using these functions and Valgrind remains silent. We can't make pre_exec safe, but we can at least help people recognise bugs in their code.

Motivating examples or use cases

During efforts to make panicking work normally after fork it was discovered that a simple catch_unwind solution does not reliably work across systems because of allocations and it was suggested to make global allocator abort after fork. Though std moved forward with the other solution to "don't unwind past fork() in child", it still could be helpful to know where these allocations were.

The pre_exec allocations footgun is mostly hit by external code, though. @purplesyringa recently found many projects that unknowingly use allocating Error::other or Error::new in pre_exec hooks. It is not clear how developers could find this bug.

  1. Even though chdir(2) is async-signal-safe, Rust's std::env::set_current_dir is not, because it has to allocate to append NUL-byte to path. This applies to all filesystem-related functions. (Note that Command::current_dir could not be used in Non async-signal-safe closure in pre_exec alacritty/alacritty#8751 because the intention was to swallow errors.)
  2. Allocating functions can be deeply nested in pre_exec, like filesystem accesses inside Cgroup::add_task in Allocations in pre_exec openanolis/cryptpilot#54, or could be abstracted away, like in logger in Fix non-async-signal-safety in pre_exec alacritty/alacritty#8756, so simple clippy lints could not help detect allocations.
  3. Project maintainers showed interest in such functionality: Avoid allocating in pre_exec closure reubeno/brush#777 (comment).

Solution sketch

The idea is to make allocations inside spawn variant pre_exec a library UB and integrate checks into ub-checks efforts. Although POSIX mentions only multi-threaded processes, std could possibly have created helper threads, and lack of helper threads should probably be considered an implementation detail by treating all programs as multi-threaded in the pre_exec case.

While malloc(3) is async-signal-unsafe, certain custom allocators may be guaranteed to work correctly after fork (e.g. bump allocators). Therefore the check must be opt-in for the allocator.

Since async-signal-safe contexts also appear in signal handlers, it would be prudent to make the API useful for that purpose as well. We don't have to expose it right away, but the developed solution should be forward-compatible.

All in all, the plan is to expose a function to query whether we are in an "async signal" context (like std::thread::panicking) and a "guard" (like DropGuard).

// Names are up to bikeshed
mod thread {
    #[thread_local]
    static ASYNC_SIGNAL_NESTING: Cell<usize> = const { Cell::new(0) };

    pub fn is_async_signal_context() -> bool {
        ASYNC_SIGNAL_NESTING.get() != 0
    }

    pub struct AsyncSignalGuard;

    impl AsyncSignalGuard {
        fn new() -> Self {
            ASYNC_SIGNAL_NESTING.set(ASYNC_SIGNAL_NESTING.get() + 1);
            Self
        }
    }

    impl Drop for AsyncSignalGuard {
        fn drop(&mut self) {
            ASYNC_SIGNAL_NESTING.set(ASYNC_SIGNAL_NESTING.get() - 1);
        }
    }
}

Allocators would use this API to make sure they were not called inside an async signal context:

debug_assert!(
    !std::thread::is_async_signal_context(),
    "This allocator does not support allocating in async-signal-safe context."
);

The allocator, or a custom wrapper, could alternatively choose to use assert! to run such checks even in release mode. The System allocator would have to use a different mechanism than debug_assert! if std is pre-built.

Users of raw fork, like Command::spawn, and signal handlers may enter the async signal context with AsyncSignalGuard:

let pid = unsafe { cvt(libc::fork())? };
if pid == 0 {
    std::panic::always_abort();
    let _guard = std::thread::AsyncSignalGuard::new();
    // remaining code here
}

These functions should be stubbed on #[cfg(not(unix))] so that allocators don't have to always use #[cfg(unix)] (e.g. because we may want to introduce such checks on other platforms).

Note that panic! inside pre_exec will not show traceback now (playground), and using this approach with allocator will produce a near-useless message "you allocate in async-signal-unsafe context" without telling where exactly that allocation happened. Though I believe one can add a breakpoint to panic and view traceback in a debugger, it has a less nicer experience than desired.

To sum up, this proposal consists of two parts:

  • A new public API, and
  • Updating spawn to enter an async signal context and the System allocator to panic when invoked in an async signal context.

Alternatives

Custom allocator published on crates.io

It is possible to write a custom allocator that would wrap System and register pthread_atfork handler to set a flag to disable allocations (this only detects fork and not _Fork; though another fork detection mechanism may not have this issue). However:

  • Users would need to actively opt in this allocator, and third-party tools lack good discoverability compared to first-party tools like Miri. To use this debug mechanism, you'd have to be aware of the possibility of allocations, but the main reason this problem is wide-spread is lack of this knowledge.
  • It does not open opportunity to mark other, non-allocating std functions as async-signal-unsafe, limiting the applicability.
  • There is a many-to-many correspondence between async signal contexts (fork, manual syscalls, manually set up signal handlers) and async-signal-unsafe functions (allocators, libc wrappers, unwinders). For the checks to work, both have to use the same registry to enter async signal contexts and assert safety, so there would have to be exactly one such crate in the ecosystem, with no possibility of switching without losing functionality. That crate might as well be std.

Replace "async signal context" with "no-alloc context"

Focusing on "no-alloc context" will make it impossible to track non-allocating async-signal-unsafe functions, e.g. functions using global mutexes or non-reentrant functions. However, "no-alloc context" is a platform-independent idea and might be useful outside of cfg(unix). It is not clear which approach is better.

Just wait for effects

Suggested in bootc-dev/containers-image-proxy-rs#109 (comment)

Effects in Rust are not coming soon, whereas the proposed solution is simple enough for an incremental improvement.

Clippy lint against certain functions

Suggested in jamesmcm/vopono#340 (comment)

Clippy can't see through functions, and a lot of APIs unknowingly allocate. The lint would have a lot of false negatives and therefore be near-useless. A more complicated analysis would just be an ad-hoc implementation of effects.

Links and related work

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

  • We think this problem seems worth solving, and the standard library might be the right place to solve it.
  • We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

  • We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
  • We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.

Footnotes

  1. Command can be materialized on unix in two ways -- Command::spawn and CommandExt::exec. All pre_exec madness about async-signal-safety applies only to spawn variant.

Metadata

Metadata

Assignees

No one assigned

    Labels

    T-libs-apiapi-change-proposalA proposal to add or alter unstable APIs in the standard libraries

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions