New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Require static native libraries when linking static executables #55566

Open
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
10 participants
@smaeul
Contributor

smaeul commented Nov 1, 2018

gcc/ld will create a dynamically-linked executable without warning, even when passed -static, when asked to link to a .so. Avoid this confusing and unintended behavior by always using the static version of libraries when trying to link static executables.

This is a patch originally wrote as part of #40113. I was asked to hold off on it then. Now that I see others running into the same issue (#54243), and things have settled down a bit, I'm submitting it again.

This patch only affects musl and msvc (targets that set crt_static_respected). I don't know enough about msvc to know whether "static" binaries are any different than non-"static" ones in whether or not they can load DLLs. If they can load DLLs, then I'll need to add a !is_like_msvc check.

Note that this patch corresponds to one Alpine has been shipping since 1.16.0.

@rust-highfive

This comment has been minimized.

Collaborator

rust-highfive commented Nov 1, 2018

r? @petrochenkov

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive

This comment has been minimized.

Collaborator

rust-highfive commented Nov 1, 2018

The job x86_64-gnu-llvm-5.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:2f35e68a:start=1541037063465106335,finish=1541037065498413215,duration=2033306880
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#Pull-Requests-and-Security-Restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export IMAGE=x86_64-gnu-llvm-5.0
---
travis_time:end:0053d52c:start=1541037075356955537,finish=1541037075370511011,duration=13555474
travis_fold:end:before_script.2
travis_time:start:153857fc
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)
Thu Nov  1 01:51:15 UTC 2018
Thu, 01 Nov 2018 01:51:15 GMT

The command "date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)
" exited with 0.
travis_time:start:05189ae4

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@smaeul smaeul force-pushed the smaeul:static-linkage branch from 04ad185 to 2c9bd43 Nov 1, 2018

Require static native libraries when linking static executables
gcc/ld will create a dynamically-linked executable without warning, even
when passed `-static`, when asked to link to a `.so`. Avoid this
confusing and unintended behavior by always using the static version of
libraries when trying to link static executables.

Fixes #54243
@petrochenkov

This comment has been minimized.

Contributor

petrochenkov commented Nov 1, 2018

@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Nov 5, 2018

Thanks for the PR! I think though that the naming here is a bit of a red herring, "NativeUnknown" is better to be named as "NativeDylib", as it comes up for kind = "dylib" as well.

With that in mind I'm not sure that this is quite the right patch, but rather it seems like we should emit a compiler error if a dynamically linked library is requested for a statically linked executable. That would itself be a breaking change I think, though, because the linker typically sorts things out where "ask for a dynamic library" is actually "just pass the right -l flag and most of the time it's static anyway".

Can you describe a bit more what's being fixed here? Is this largely just fixing a papercut? Or fixing a fundamental issue?

@smaeul

This comment has been minimized.

Contributor

smaeul commented Nov 6, 2018

Thanks for the PR! I think though that the naming here is a bit of a red herring, "NativeUnknown" is better to be named as "NativeDylib", as it comes up for kind = "dylib" as well.

If I write

#[link(name = "foo")]
extern {}

this will generate a NativeUnknown dependency. It's telling the compiler "link the most appropriate libfoo you can find". Are you telling me that writing

#[link(name = "foo", kind = "dylib")]
extern {}

will also generate a NativeUnknown dependency? If so, this is extremely counterintuitive behavior, because the second snippet is telling the compiler something completely different: "link libfoo, but only if you can find it as a shared library". Maybe I need to wire up a new NativeDylib choice?

With that in mind I'm not sure that this is quite the right patch, but rather it seems like we should emit a compiler error if a dynamically linked library is requested for a statically linked executable.

I wholeheartedly agree. Passing kind = "dylib" along with +crt-static should be an immediate compiler error. (Again, with the caveat that nothing I say applies to MSVC, as I have no idea how that works.)

That would itself be a breaking change I think, though, because the linker typically sorts things out where "ask for a dynamic library" is actually "just pass the right -l flag and most of the time it's static anyway".

Sorry, but this is just false. Everywhere but macOS (where you can't create static binaries anyway), GccLinker explicitly prevents the linker from choosing what kind of library to use, by passing it -Bdynamic or -Bstatic (called from here).

Moreover, the patch as written is not a breaking change (again, ignoring MSVC). As explained below, rustc today produces entirely nonfunctional binaries in the scenario this patch affects.

Can you describe a bit more what's being fixed here? Is this largely just fixing a papercut? Or fixing a fundamental issue?

Today, without this patch, it is impossible to link a native library into a static binary with either #[link(name = "foo")] extern {} or cargo:rustc-link-lib=foo. Instead, a completely broken binary is produced, and absolutely no warning or error is emitted at the time.


A recent example of this being a problem is #54243. This broke the entire testsuite in #55163 until they globally added RUSTFLAGS="-C target-feature=-crt-static" (search for "failed to exec"). There are plenty of other reports around the web. Try here, here, here, here, here, here, here, and also #39998. This was also an issue while developing #40113 (e.g. here). Quoting what I originally said there:

  1. The current problem is that, when base.dynamic_linking = true, base.crt_static_default = true is not enough to enforce that binaries are statically linked by default.

    • It does not enforce that all linked libraries must also be statically linked, so when gcc sees dynamic libraries, it creates a dynamic executable (i.e. an ELF with DT_NEEDED entries).
    • But because you told it you were creating a static executable, gcc not pass -dynamic-linker to ld to add the correct PT_INTERP header to the ELF, and therefore ld uses its hardcoded default for the architecture, e.g. /lib/ld64.so.1 on x86_64.
  2. Therefore, to maintain the status quo from 3 when setting base.dynamic_linking = true, we must first fix +crt-static, which is turned on by base.crt_static_default = true (regardless of any setting in config.toml), to actually do what it says. It should convert -lc -lgcc, etc. from linking dynamically to linking statically.

    • For GccLinker, this means changing them from being preceded by -Bdynamic to being preceded by -Bstatic, which is exactly the change I made to NativeLibraryKind::NativeUnknown.

To put this in to terms more relevant to today, if I compile an executable crate with target-features=+crt-static and cargo:rustc-link-lib=foo, here's what happens:

  1. rustc builds a command line for gcc.
    • Because of target-features=+crt-static, -static is added to gcc's command line.
    • A NativeUnknown dependency is created on foo. That dependency is translated to -Wl,-Bdynamic -lfoo on the gcc command line in GccLinker::link_dylib().
  2. rustc calls gcc, which interprets its arguments and builds a command line for ld.
    • Because -static was passed, gcc thinks a static binary is being generated. Static binaries don't have a dynamic linker, by definition. So gcc does not pass -dynamic-linker to ld. It does forward -static to ld.
    • gcc sees -Wl,-Bdynamic. It's an argument for gcc's linker backend, not the compiler driver, so gcc doesn't interpret it. It only strips off the -Wl and passes the rest to ld.
    • -lfoo is also forwarded to the linker.
  3. gcc runs ld. ld's command line looks something like the following (along with a bunch of irrelevant arguments such as library paths and startup files):
    ld -static /tmp/$EXECUTABLE_CRATE.o -Bdynamic -lfoo -Bstatic $PATH_TO_LIBSTD -lc -lunwind
    
    • ld sees -static. Okay, we're making a static binary.
    • ld sees -Bdynamic. Wait a minute. You can't link dynamic libraries into a static binary. But you told me to link libfoo dynamically. So I guess we're making a dynamic binary now! (In other words ld completely ignores -static at this point).
    • ld sees -lfoo. If libfoo.so exists in a library search path, it gets linked; ld doesn't even look for libfoo.a if it finds a libfoo.so.
    • ld sees -Bstatic and goes back to looking for static libraries. It sees libstd, libc, and libunwind and links them all statically into the binary.
    • ld doesn't see -dynamic-linker, so since dynamic binaries must have one, it uses the hardcoded default, which on x86_64 is /lib/ld64.so.1 (this is leftover from the libc5 days, so it's very wrong even for glibc).
  4. The build succeeds without any errors or diagnostic messages.
  5. You try to run the resulting binary, and it fails to execute with "No such file or directory", because of course /lib/ld64.so.1 doesn't exist on your system. Even if the dynamic linker path was correct, the binary would segfault because it has two copies of libc's global data (locks, symbol and thread lists, etc.): one statically linked in, used by the main program; and one dynamically linked, used by the native shared libraries.

That's probably way more detail than you need, but I hope it's clear now that, yes, this is an issue; and yes, people are bitten by it on a regular basis.

There are three possible workarounds or "solutions" I'm aware of, in order of worst to best:

  1. Stop passing -Bdynamic and -Bstatic hints to the linker. This allows the linker to choose the library type, but it also means kind="dylib" and kind="static-nobundle" (and no kind at all) do the exact same thing, and the developer loses control over which native dependencies are needed at runtime.
    • Note that keeping the hints everywhere except NativeUnknown is not possible. Once you pass -Bstatic (e.g. for NativeStatic), everything is static unless you later pass -Bdynamic. And -Bdynamic is what makes everything explode.
  2. Prohbit NativeUnknown libraries from ever being linked into static executables. This forces every crate that wants to use native libraries to use kind="static" or kind="static-nobundle" (possibly with #cfg[target_feature("crt-static")]), or they will be unusable for static linking.
  3. Interpret NativeUnknown as dynamic if the binary is dynamic, or static if the binary is static, like the patch in this PR. This is akin to how ld interprets a -l options with no preceding -B option. It works, it's intuitive, and it doesn't require patching hundreds of crates. As suggested above, we could add a NativeDynamic (or whatever you want to name it) option for kind="dylib" that always links dynamically, and produces an error on sess.crt_static() == true.
@smaeul

This comment has been minimized.

Contributor

smaeul commented Nov 6, 2018

@alexcrichton I read back through your conversation with @aidanhs in #40113 (which I probably should have done earlier), and I agree that I'll need to add NativeDylib or NativeDefault, however it's named, to distinguish kind="dylib" and no kind, before this would be mergeable.

"No kind always means dylib, even when impossible" is counterintuitive to me, but if that's the documented behavior, then this is a behavior change and not just a bugfix. So I should treat it more carefully as such. However, I'd like to reiterate what @aidanhs said that this is "a strictly non-breaking change" for musl.

@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Nov 9, 2018

Unfortunately I think we're basically in a bind with omitting kind, it's always been documented that not specifying anything implies dylib as a kind. In that sense there's no way to add a NativeDylib type because it's already there, and NativeUnknown is still just a bad name.

I personally feel like this continues to pile on evidence that attempting to have the same target for both dynamically and statically linked musl is just the wrong strategy. There's already a few flags that need to be passed and configuration has to be just right across the entire stack to get things working. Misconfiguration leads to everyone feeling like the have to hack around everything when the fact of the matter is that the original musl target was simply never added with the intention of supporting both static and dynamic linking.

While I think you're right that this technically isn't a breaking change it doesn't really make sense to land in the model that we have today which is that you specify a library as either dynamic/static to link later. An alternative may be to just ignore linker library hints entirely on musl perhaps?

@smaeul

This comment has been minimized.

Contributor

smaeul commented Nov 9, 2018

To be fair, nothing about this issue is specific to musl. In fact, if you look at the patches downstream in various distributions, they remove all of the special flags and hacks that make musl targets complicated. We're just seeing this issue on musl because it's one of the very few libc implementations that fully supports both linkages. If you were able to create fully static glibc-linked binaries, you'd run into the exact same issue. So what I really don't want to do is make the musl targets even more special.

Unfortunately I think we're basically in a bind with omitting kind, it's always been documented that not specifying anything implies dylib as a kind. In that sense there's no way to add a NativeDylib type because it's already there, and NativeUnknown is still just a bad name.

Is it not possible to change the documentation? It seems like this is the only hangup. Or is there another issue with the semantics of "Implies dylib when compiling a dylib or normal binary, or static [or possibly static-nobundle] when compiling a static binary"? Remember that crt-static is a no-op everywhere but musl and MSVC already, so this has no functional impact on other targets. It does not require people to go around adding kind = "dylib" to any existing crates. Once again, this only affects people whose use cases are currently broken. I'd think the community would appreciate the new feature of "the compiler automatically picks the right library to link".

Actually, the documentation recommends to "inspect the resulting binary to ensure that it's linked as you would expect after the compiler succeeds." So yes, the behavior with respect to NativeUnknown is documented, but it's also documented that sometimes the compiler does the wrong thing.

I personally feel like this continues to pile on evidence that attempting to have the same target for both dynamically and statically linked musl is just the wrong strategy. There's already a few flags that need to be passed and configuration has to be just right across the entire stack to get things working. Misconfiguration leads to everyone feeling like the have to hack around everything when the fact of the matter is that the original musl target was simply never added with the intention of supporting both static and dynamic linking.

What would you even name the new target? Would you create a new target for static glibc? Then what is the point of crt-static?

While I think you're right that this technically isn't a breaking change it doesn't really make sense to land in the model that we have today which is that you specify a library as either dynamic/static to link later. An alternative may be to just ignore linker library hints entirely on musl perhaps?

Can we not fix the model?

@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Nov 13, 2018

Also just to clarify, everything here is entirely my own personal opinion. Others who also help out in maintaining the compiler may feel radically different than I. Mine certainly isn't the only opinion that matters here!

These sorts of linkage questions are notoriously hard to change after the fact (which I feel like we've largely learned over time). Changes like this which originate from one platform tend to have far-reaching and unintended consequences on other platforms, often taking shape of situations that are uncovering other bugs or simply weren't thought of originally. I agree that this naively looks like it only would help to fix and/or change already broken code, but I in no way would say that with conviction.

The crt-static flag was primarily added for MSVC. IIRC after-the-fact it was attempted to repurpose the same flag for musl and whether or not libc is dynamically or statically linked. The behavior of statically linking libc on musl is quite different than what it means to have a static CRT on MSVC. That, plus the fact that everything has attempted to get shoehorned into one target, means that things are complicated and not always working today.

This is why I am personally a fan of adding new target, something like x86_64-alpine-linux-musl or something like that. Adding a new target means we can't regress other targets, and it's a chance to clean out old configuration eventually and make sure that code is updated to specifically handle just this one target, not other targets.

This PR, if interpreted literally, is wrong for MSVC. A static CRT doesn't mean you can't link to a dynamic library. This won't regress MSVC, however, because the implementation of link_staticlib and link_dylib looks exactly the same. If that were to change one day, however, it would produce bugs. This is one example of how a one-target-specific-problem which attempts to fix it for all targets can have unintended consequences.

I'd be curious to hear the thoughts of others on this topic, but historically there haven't been too many too interested in this topic. I'm personally trying to keep projects from regressing while ensuring we can still add new features and fix bugs in rustc, and I'm specifically worried about how a patch like this might accidentally affect someone's usage of the existing musl target or MSVC.

@corbinu

This comment has been minimized.

corbinu commented Nov 13, 2018

@alexcrichton @smaeul Super happy to see all the things happening here! Just my two cents but I think having a target like x86_64-alpine-linux-musl or x86_64-linux-staticmusl would be a very cool idea.

I feel like a being able to easily build single static linux binary is a relevant target that people would like being able to just specify and go. I also think it has great implications for Rust's use with Serverless and Docker since that kind of binary can be inserted in a scratch docker container and not even need Alpine.

@awilfox

This comment has been minimized.

awilfox commented Nov 14, 2018

If you are going to add an x86_64-alpine-linux-musl target, please also add the following for Adélie, which is a completely separate distribution from Alpine and has wildly different opinions on how things should be done:

  • aarch64-foxkit-linux-musl
  • armv5-foxkit-linux-musleabi
  • armv6-foxkit-linux-muslgnueabihf
  • mips-foxkit-linux-musl
  • powerpc-foxkit-linux-musl
  • powerpc64-foxkit-linux-musl
  • i486-foxkit-linux-musl
  • i586-foxkit-linux-musl
  • x86_64-foxkit-linux-musl

Please also note that Alpine is only shipping Rust on x86_64, while we are already shipping it on all of our tier1 platforms (ppc32, ppc64, i586, x86_64) and will eventually be bringing it to tier2 platforms as possible (armv5/6, arm64, mips, i486). In fact, bringing Rust to our arm64 port was why I filed #54999.

Unlike Alpine we are focused on desktop usage and we feel Rust is very important to the future of the Linux desktop (Firefox, librsvg, and other Gnome projects being a small sample of what is to come).

I'm not sure what should be done for Void, since they have no vendor triplet that I'm aware of and they ship both glibc and musl versions of their distributions.

@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Nov 14, 2018

@awilfox sorry that was just a strawman of what the target might look like! I didn't mean to single out Alpine. If a new target were added it'd likely more appropriately be named something like x86_64-dynamic-linux-musl or something generic like that. We don't have target per glibc-based distribution (aka ubuntu/debian/fedora/etc), we've just got one x86_64-unknown-linux-gnu that covers everything. Similarly all musl-based distributions should be able to use the same target

@mati865

This comment has been minimized.

Contributor

mati865 commented Nov 20, 2018

It'd be a bit confusing:

Dynamic Static
x86_64-unknown-linux-gnu x86_64-static-linux-gnu (if it's ever made)
x86_64-dynamic-linux-musl x86_64-unknown-linux-musl

It'd more clear if targets were named like $arch-{dynamic,static}-linux-{gnu,musl}. For the backward compatibility unknown target would point to dynamic or static depending on the libc.

@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Nov 20, 2018

@mati865 it'll likely be inconsistent, yes, but that's sort of a fact of life we'll have to live with because it's the way it is right now.

@malbarbo

This comment has been minimized.

Contributor

malbarbo commented Nov 20, 2018

How about naming the static version?

Dynamic Static
x86_64-unknown-linux-gnu x86_64-static-linux-gnu (if it's ever made)
x86_64-unknown-linux-musl x86_64-static-linux-musl
@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Nov 20, 2018

@malbarbo that would unfortunately be a breaking change to the existing x86_64-unknown-linux-musl target, which we're not prepared to do

@TimNN

This comment has been minimized.

Contributor

TimNN commented Nov 27, 2018

Ping from triage @smaeul / @alexcrichton: What is the status of this PR? Is it blocked on something (RFC / FCP)?

@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Nov 27, 2018

I personally still feel that this is too high risk to land, and that the best way forward here is to add a dedicated target for dynamically linking musl, something like x86_64-dynamic-linux-musl or x86_64-unknown-linux-dynmusl (or something like that)

@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Nov 27, 2018

in that it's not blocked on FCP/RFC/etc, just a consensus about the direction to take

@smaeul

This comment has been minimized.

Contributor

smaeul commented Nov 29, 2018

So in that case do you suggest changing the current targets back to dynamic_linking = false? I think that should be sufficient to fix linkage issues. Would crt-static then be forced true or false for the static musl target (or ignored)? I'm thinking of its uses in #[cfg] in the wild moreso than how rustc uses it.

As for the dynamic musl target, would it support crt-static at all? Or would you prefer to make crt-static a MSVC-only thing?

I'm trying to clarify the semantics you're looking for, so maybe I can submit an updated patch. I agree it does reduce the differences between targets, so the only differentiating factor is dynamic_linking.

From a Linux distribution maintainer's perspective, it would be great if the dynamic musl target supported both static and dynamic linkage via crt-static. Otherwise, we'd have to ship two targets, with identical copies of all of the libraries, or do hacky things with symlinks to pretend to have both targets installed.

@alexcrichton

This comment has been minimized.

Member

alexcrichton commented Nov 30, 2018

Hm I'm not certain on those questions. If a new target is added and it supports both then the current one effectively becomes deprecated which would be pretty unfortunate. If a new one is added it doesn't necessarily mean we can tweak the existing one, though, lest we risk breaking changes to current usage. Thinking more about it, it may not be the best way forward?

I'm trying to think of how we can solve the problem here, and one thing I wanted to clarify was what motivated this patch? In the abstract it makes sense that this is problematic (asking for a dynamic library on a static musl target produces a weird binary). Is there like one or two concrete use cases we can fix up though? It seems that all crates.io crates should "already be correct" and libstd may be the culprit here, but I'd want to confirm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment