Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move fork from origin to c_scape #84

Merged
merged 13 commits into from
Dec 21, 2021
Merged

Move fork from origin to c_scape #84

merged 13 commits into from
Dec 21, 2021

Conversation

sunfishcode
Copy link
Owner

After a fork, the child only has one thread, and it looks like all the
other threads are suspended. This may leave behind locks in locked
states and cause deadlocks, however deadlocks are not considered unsafe.
Consequently, fork doesn't need to be unsafe.

This also changes the on_fork callback functions to be
Box<Fn() + Send> so that it's safe to call them, and adds code to
c-scape and non-mustang targets to wrap them as needed.

@bjorn3
Copy link
Contributor

bjorn3 commented Dec 4, 2021

Making fork safe would make calling memmap2::MmapMut::map_mut() on a file created using create_memfd and closed after mapping (to avoid the usual UB of mmap) unsound. map_mut() uses MAP_SHARED, which would be shared between the original and forked process, yet it allows you to get a mutable reference on both sides of the fork. In addition using something like sqlite on both sides of the fork will likely lead to data corruption, potentially causing sqlite to invoke UB, thus making any sqlite wrapper unsound. I am pretty sure there are many other cases of code that can only run on one side of the fork without causing UB.

@sunfishcode
Copy link
Owner Author

Making fork safe would make calling memmap2::MmapMut::map_mut() on a file created using create_memfd and closed after mapping (to avoid the usual UB of mmap) unsound. map_mut() uses MAP_SHARED, which would be shared between the original and forked process, yet it allows you to get a mutable reference on both sides of the fork.

Should it be the responsibility of fork callers to avoid this hazard, or map_mut callers? One could plausibly make arguments either way. My habit is to think of mmap as so frought with unsafety that I default to putting responsibility on map_mut. But I can see how it could be harder for map_mut callers to prevent this hazard than for fork callers. I'll need to think about this more.

In addition using something like sqlite on both sides of the fork will likely lead to data corruption, potentially causing sqlite to invoke UB, thus making any sqlite wrapper unsound. I am pretty sure there are many other cases of code that can only run on one side of the fork without causing UB.

Can you say more about what the hazard is here? If the sqlite bindings expose a safe API, we should be able to make the same assumptions we do about Rust code. Or does it use MAP_SHARED or something else which is implicitly shared by a fork?

@bjorn3
Copy link
Contributor

bjorn3 commented Dec 4, 2021

If the sqlite bindings expose a safe API, we should be able to make the same assumptions we do about Rust code. Or does it use MAP_SHARED or something else which is implicitly shared by a fork?

Normally sqlite locks the database file. A fork can happen after the database was locked, in which case we now got two competing sqlite instances fighting for the same database. The resulting corruption itself could lead to UB inside the sqlite code, or a TOCTOU could happen between checking that the file contents are sane and actually using it. (for example because the compiler assumes that the memory doesn't change behind the back of the current thread due to data races being UB) Sqlite supports using mmap, though as an optional feature. The same argument could be made for any database that uses mmap as they have to use MAP_SHARED to make their changes visible in the database file in the first place.

sunfishcode added a commit that referenced this pull request Dec 4, 2021
Mark `fork` unsafe for now, per discussion in [#84], and update the
comments to describe the hazards.

[#84]: #84
@sunfishcode sunfishcode changed the title Mark origin::program::fork as safe. Make origin::program::fork less unsafe Dec 4, 2021
Copy link
Contributor

@bjorn3 bjorn3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new doc comment LGTM. Didn't closely review the rest of the PR.

@sunfishcode
Copy link
Owner Author

I think in theory there are ways these safety concerns could be pushed to other places. But ultimately, fork is a really low-level and special API, so it's not worth pushing this too much. I've now marked it as unsafe and described the conditions. Thanks for bringing these up!

@sunfishcode
Copy link
Owner Author

Thinking about this more, I've now written bytecodealliance/rustix#137 to organize my thoughts, and point to a new overall direction here.

@sunfishcode sunfishcode changed the title Make origin::program::fork less unsafe Move fork from origin to c_scape Dec 7, 2021
@sunfishcode
Copy link
Owner Author

32-bit arm is failing because parking_lot's Mutex uses std::time::Instant::now to implement fairness, which takes a lock on 32-bit arm to ensure monotonicity because it doesn't have 64-bit atomics and can't guarantee that CLOCK_MONOTONIC is actually monotonic, so it then deadlocks on the bucket lock.

@sunfishcode sunfishcode force-pushed the sunfishcode/safe-fork branch 2 times, most recently from 0c4235c to 0a85137 Compare December 21, 2021 18:44
@sunfishcode
Copy link
Owner Author

I think the thing to do for now is to temporarily disable 32-bit arm support. I expect the available options with parking_lot will evolve over time, and we can re-evaluate 32-bit arm support when we have other options.

Move `fork` from origin to c-scape, and convert it back to using
`unsafe extern "C" fn()` hooks instead of safe boxed closures. See
bytecodealliance/rustix#137 for discussion.

Add an at_thread_exit_raw to origin, which allows thread destructors to
be registered without allocating, which is important since Rust will
register TLS destructors while threads are being constructed, before
parking_lot is prepared for the global allocator to take locks.

Update to the latest rustix, and the change from a safe execve to an
unsafe raw execveat. This similarly allows it to avoid allocating, which
is important as it's primarily called in the child of `fork`.

And switch to a branch of parking_lot with changes to allow parking_lot locks
to be used from within global allocator implementations.

These changes allow us to move the `#[global_allocator]` back to the mustang
crate, and restore the wee_alloc allocator option. More generally,
origin and c-scape should in theory now work with any global allocator
implementation.
This avoids the problem that `execveat` isn't always supported.
For better global-alloc compatibility, allocate main-thread data with
mmap_anonymous instead of with alloc.
32-bit arm is failing because parking_lot's `Mutex` uses
`std::time::Instant::now` to implement fairness, which takes a lock on 32-bit
arm to ensure monotonicity because it doesn't have 64-bit atomics and can't
guarantee that `CLOCK_MONOTONIC` is actually monotonic, so it then deadlocks
on the bucket lock.
Use `#[cfg_attr(..., path = "...")]` to conditionally define modules,
which simplifies the lib.rs file.

And clean up several comments.
has-elf-tls was renamed to has-thread-local. Define both for now, so
that we work with versions of Rust before and after the rename.
To support both old and new Rust compilers, we need both has-elf-tls and
has-thread-local in the target json files for now. Disable the stderr
comparison checks which get caught on the warning that one of them is
unsupported.
@sunfishcode sunfishcode merged commit ff88ee7 into main Dec 21, 2021
@sunfishcode sunfishcode deleted the sunfishcode/safe-fork branch December 21, 2021 21:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants