consult dlerror() only if a dl*() call fails #74469

Closed
wants to merge 1 commit into from


jclulow
Contributor

@jclulow jclulow commented Jul 18, 2020

The string returned from dlerror() is purely diagnostic and should not
itself be used to determine whether a previous call to dlopen() or
dlsym() has failed. Those functions are documented with specific return
values that signal failure; i.e., returning NULL.

If we assume a non-NULL return from dlerror() means the prior dlsym()
call failed, we are vulnerable to a race with another thread outside of
Rust control concurrently inducing dynamic linking operations. This
manifests on illumos systems with an intermittent spurious failure from
rustc:

error: ld.so.1: rustc: fatal: _ex_unwind: can't find symbol

The illumos libc checks for the existence of an "_ex_unwind" symbol via
dlsym() under some conditions when a thread exits, as part of an old
contract with a particular C++ standard library. If another thread
exits at the same time that rustc is attempting to load a plugin, we can
hit this race and report an error that does not belong to us.
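The rule described above, for dlopen(), could look roughly like the sketch below: the NULL return decides whether the call failed, and dlerror() is read only afterwards to obtain the diagnostic text. This is an illustration and not the code from this PR. The name `open_library` is hypothetical, the functions are declared by hand so that no external crate is needed, and the value of `RTLD_LAZY` is an assumption that should be checked against the platform's `<dlfcn.h>`.

```rust
use std::ffi::{c_char, c_int, c_void, CStr, CString};

// Hand-written declarations of the POSIX dynamic-linking functions, so the
// sketch does not depend on an external crate.
unsafe extern "C" {
    fn dlopen(filename: *const c_char, flags: c_int) -> *mut c_void;
    fn dlerror() -> *mut c_char;
}

// Assumption: RTLD_LAZY is 1 on the target platform; verify in <dlfcn.h>.
const RTLD_LAZY: c_int = 1;

/// Hypothetical helper: opens the library at `path`.
///
/// Failure is detected from the NULL return of dlopen() alone. dlerror() is
/// consulted only after that, and only for its message.
fn open_library(path: &str) -> Result<*mut c_void, String> {
    let c_path = CString::new(path).map_err(|e| e.to_string())?;

    let handle = unsafe { dlopen(c_path.as_ptr(), RTLD_LAZY) };
    if !handle.is_null() {
        // Success: whatever dlerror() might hold now is not our concern.
        return Ok(handle);
    }

    // Failure: fetch the diagnostic, tolerating the case where none is set.
    let msg = unsafe { dlerror() };
    if msg.is_null() {
        return Err("dlopen failed, but dlerror() reported nothing".to_owned());
    }
    Err(unsafe { CStr::from_ptr(msg) }.to_string_lossy().into_owned())
}
```

With this shape, a stale or foreign error string can at worst produce a misleading message for a call that really did fail; it cannot turn a successful call into a reported failure.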

@rust-highfive
Collaborator

r? @ecstatic-morse

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 18, 2020
Comment on lines +106 to +107
let s = CStr::from_ptr(last_error).to_bytes();
Err(str::from_utf8(s).unwrap().to_owned())
Contributor

Suggested change
- let s = CStr::from_ptr(last_error).to_bytes();
- Err(str::from_utf8(s).unwrap().to_owned())
+ let s = CStr::from_ptr(last_error).to_str().unwrap();
+ Err(s.to_owned())

// dlerror reports the most recent failure that occurred during a
// dynamic linking operation and then clears that error; we call
// once in advance of our operation in an attempt to discard any
// stale prior error report that may exist:
let _old_error = libc::dlerror();
Member

Is this call still needed? Surely any prior error will be replaced if there's a new error.

Contributor

As @ollie27 said, there's no need to do this anymore if we don't use the return value of dlerror to determine whether an error occurred.

Comment on lines +99 to 101
if ptr::null() != result {
Ok(result)
} else {
Contributor

Since the else block now has a condition inside, could you switch to an early return for the happy path?
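One possible shape for the early return requested here is sketched below, with the semantics stated in a doc comment on the function. It is an illustration and not the helper from the PR. The name `non_null_or_error` and the `last_error` closure are hypothetical stand-ins (the closure plays the role of reading dlerror()), which keeps the sketch free of FFI.

```rust
/// Hypothetical helper illustrating the early-return shape.
///
/// Runs `op`. If it returns a non-null pointer, that pointer is returned as
/// `Ok` and `last_error` is never called. If it returns null, the result is
/// `Err` carrying the text from `last_error`, or a fallback message when no
/// diagnostic is available.
fn non_null_or_error<T>(
    op: impl FnOnce() -> *mut T,
    last_error: impl FnOnce() -> Option<String>,
) -> Result<*mut T, String> {
    let result = op();

    // Happy path first, so the error handling below is not nested in `else`.
    if !result.is_null() {
        return Ok(result);
    }

    match last_error() {
        Some(msg) => Err(msg),
        None => Err("operation failed without a diagnostic".to_owned()),
    }
}
```

Passing the error source as a closure is a choice made only for this sketch, so the control flow can be exercised without calling into the dynamic linker.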

Comment on lines +89 to +91
// We should only check dlerror() in the event that the operation
// fails, which we determine by checking for a NULL return. This
// covers at least dlopen() and dlsym().
Contributor

Can you document these semantics at the function level? Specifically, if f returns a null pointer, this function returns Err with the string in dlerror.

Also, just to be sure, do all the functions we pass to this helper return NULL and only NULL to indicate an error? There's no (void *) 1 weirdness or something?

Contributor

For dlsym at least, the current approach is explicitly recommended on linux and seems to be necessary on illumos as well, since NULL can indicate either a "symbol not found" error or a found symbol with the value NULL. We should be checking the return value of dlopen, but we will need to find a different workaround here.
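For reference, the dlsym() protocol referred to in this comment can be sketched as follows: clear any pending error, call dlsym(), then ask dlerror() whether that call failed, because a NULL result alone is ambiguous. This is a minimal illustration and not rustc's code. `lookup_in_main_program` is a hypothetical name, the functions are declared by hand, and the value of `RTLD_LAZY` is an assumption to verify against `<dlfcn.h>`.

```rust
use std::ffi::{c_char, c_int, c_void, CStr, CString};
use std::ptr;

// Hand-written declarations of the POSIX dynamic-linking functions.
unsafe extern "C" {
    fn dlopen(filename: *const c_char, flags: c_int) -> *mut c_void;
    fn dlsym(handle: *mut c_void, symbol: *const c_char) -> *mut c_void;
    fn dlerror() -> *mut c_char;
}

// Assumption: RTLD_LAZY is 1 on the target platform; verify in <dlfcn.h>.
const RTLD_LAZY: c_int = 1;

/// Hypothetical helper: looks up `name` among the symbols visible from the
/// main program, using the clear / call / check sequence for dlsym().
fn lookup_in_main_program(name: &str) -> Result<*mut c_void, String> {
    let c_name = CString::new(name).map_err(|e| e.to_string())?;
    unsafe {
        // A NULL filename yields a handle for the main program.
        let handle = dlopen(ptr::null(), RTLD_LAZY);
        if handle.is_null() {
            return Err("could not obtain a handle for the main program".to_owned());
        }

        // 1. Discard any stale error left by an earlier operation.
        let _ = dlerror();
        // 2. Perform the lookup; NULL may be a legitimate symbol value.
        let symbol = dlsym(handle, c_name.as_ptr());
        // 3. Only a non-NULL dlerror() marks the lookup as failed.
        let err = dlerror();
        if err.is_null() {
            Ok(symbol)
        } else {
            Err(CStr::from_ptr(err).to_string_lossy().into_owned())
        }
    }
}
```

Note that step 3 is exactly where the race reported in this PR can appear: if the error state is shared across threads, another thread's failed lookup between steps 2 and 3 is indistinguishable from our own. The sketch shows the protocol, not a fix for that race.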

@ecstatic-morse
Contributor

ecstatic-morse commented Jul 19, 2020

r=me with nits addressed. I don't think you need to change the existing CStr conversion, although to_string_lossy would be more appropriate here.

The current approach is specifically mandated for dlsym, so we need to keep using it. See above.
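To illustrate the CStr remark above: `to_str()` fails on bytes that are not valid UTF-8, so unwrapping it would panic on such a message, whereas `to_string_lossy()` substitutes U+FFFD for the invalid bytes and always yields text. The two helper names below are hypothetical and exist only for the comparison.

```rust
use std::ffi::CStr;

/// Strict conversion: fails if the message is not valid UTF-8, so calling
/// `unwrap()` on the result would panic for such input.
fn message_strict(raw: &CStr) -> Result<String, std::str::Utf8Error> {
    raw.to_str().map(|s| s.to_owned())
}

/// Lossy conversion: invalid bytes become U+FFFD, so a diagnostic string
/// from the dynamic linker can always be reported.
fn message_lossy(raw: &CStr) -> String {
    raw.to_string_lossy().into_owned()
}
```

For a purely diagnostic string, losing a few unreadable bytes is usually preferable to aborting while trying to report an error.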

@Muirrum Muirrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 6, 2020
@Dylan-DPC-zz

@jclulow closing this due to inactivity. When you have the time, you can submit a new PR that addresses the above concerns. Thanks for taking the time to contribute.

@Dylan-DPC-zz Dylan-DPC-zz added S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 7, 2020
ecstatic-morse added a commit to ecstatic-morse/rust that referenced this pull request Aug 22, 2020
This works around behavior observed on illumos in rust-lang#74469, in which
foreign code (libc according to the OP) was racing with rustc to check
`dlerror`.
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 26, 2020
Refactor dynamic library error checking on *nix

The old code was checking `dlerror` more often than necessary, since (unlike `dlsym`) checking the return value of [`dlopen`](https://www.man7.org/linux/man-pages/man3/dlopen.3.html) is enough to indicate whether an error occurred. In the first commit, I've refactored the code to minimize the number of system calls needed. It should be strictly better than the old version.

The second commit is an optional addendum which fixes the issue observed on illumos in rust-lang#74469, a PR I reviewed that was ultimately closed due to inactivity. I'm not sure how hard we try to work around platform-specific bugs like this, and I believe that, due to the way that `dlerror` is specified in the POSIX standard, libc implementations that want to run on conforming systems cannot call `dlsym` in multi-threaded programs.
Labels
S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs.
7 participants