MUSL target doesn't link on nightly #34978

Closed
alexcrichton opened this Issue Jul 22, 2016 · 28 comments

@alexcrichton
Member

This was a recent regression:

$ cargo new foo --bin
$ cd foo
$ rustup run nightly-2016-07-19 cargo build --target x86_64-unknown-linux-musl
   Compiling foo v0.1.0 (file:///home/alex/code/foo)
$ rustup run nightly-2016-07-20 cargo build --target x86_64-unknown-linux-musl
   Compiling foo v0.1.0 (file:///home/alex/code/foo)
error: linking with `cc` failed: exit code: 1
  |
  = note: "cc" "-Wl,--as-needed" "-Wl,-z,noexecstack" "-nostdlib" "-static" "-Wl,--eh-frame-hdr" "-Wl,-(" "-m64" "/home/alex/.multirust/toolchains/nightly-2016-07-20-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/crt1.o" "/home/alex/.multirust/toolchains/nightly-2016-07-20-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/crti.o" "-L" "/home/alex/.multirust/toolchains/nightly-2016-07-20-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib" "/home/alex/code/foo/target/x86_64-unknown-linux-musl/debug/foo.0.o" "-o" "/home/alex/code/foo/target/x86_64-unknown-linux-musl/debug/foo" "-Wl,--gc-sections" "-nodefaultlibs" "-L" "/home/alex/code/foo/target/x86_64-unknown-linux-musl/debug" "-L" "/home/alex/code/foo/target/x86_64-unknown-linux-musl/debug/deps" "-L" "/home/alex/.multirust/toolchains/nightly-2016-07-20-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib" "-Wl,-Bstatic" "-Wl,-Bdynamic" "/home/alex/.multirust/toolchains/nightly-2016-07-20-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/libstd-c8005792.rlib" "/home/alex/.multirust/toolchains/nightly-2016-07-20-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/libpanic_unwind-c8005792.rlib" "/home/alex/.multirust/toolchains/nightly-2016-07-20-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/libunwind-c8005792.rlib" "/home/alex/.multirust/toolchains/nightly-2016-07-20-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/librand-c8005792.rlib" "/home/alex/.multirust/toolchains/nightly-2016-07-20-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/libcollections-c8005792.rlib" "/home/alex/.multirust/toolchains/nightly-2016-07-20-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/librustc_unicode-c8005792.rlib" "/home/alex/.multirust/toolchains/nightly-2016-07-20-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/liballoc-c8005792.rlib" 
"/home/alex/.multirust/toolchains/nightly-2016-07-20-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/liballoc_jemalloc-c8005792.rlib" "/home/alex/.multirust/toolchains/nightly-2016-07-20-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/liblibc-c8005792.rlib" "/home/alex/.multirust/toolchains/nightly-2016-07-20-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/libcore-c8005792.rlib" "-l" "compiler-rt" "/home/alex/.multirust/toolchains/nightly-2016-07-20-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/crtn.o" "-Wl,-)"
  = note: /usr/bin/ld: /home/alex/.multirust/toolchains/nightly-2016-07-20-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/liballoc_jemalloc-c8005792.rlib(jemalloc.pic.o): unrecognized relocation (0x2a) in section `.text.malloc_conf_init'
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status


error: aborting due to previous error

error: Could not compile `foo`.

To learn more, run the command again with --verbose.

The diff between the nightlies looks innocuous, so it looks like it was some change in deployment or images.

@alexcrichton
Member

This is perhaps related to rust-lang/rust-buildbot@417ddbf, specifically the update to the linux-cross build.

@alexcrichton
Member

Er no, the linux container, not the linux-cross container

@alexcrichton
Member

Ok, looks like this is due to the update to Ubuntu 16.04. I've got a suite of docker containers which are built each night which is how I discovered this regression. I'm personally running Ubuntu 14.04 on my computer (in the description), and the relevant Dockerfile is using Ubuntu 15.10 which also regressed. If I change that to Ubuntu 16.04, however, it works.

So... I wonder what changed!

@eefriedman
Contributor

0x2a is R_X86_64_REX_GOTPCRELX; newer versions of binutils generate it, but it isn't supported by older versions of binutils.

I'm guessing the cause is that you upgraded binutils on your nightly builder.
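For reference, the relocation number in the linker error can be mapped to a name with a small helper. This is only an illustrative sketch: `reloc_name` is a hypothetical function, not part of any tool, and it lists just the GOT-relative relocation types from the x86-64 psABI that matter for this issue.

```shell
# Map an x86-64 ELF relocation type number to its psABI name.
# Only the GOT-relative types relevant to this issue are listed;
# reloc_name is a made-up helper name for illustration.
reloc_name() {
    # Accept decimal or 0x-prefixed hex, e.g. "42" or "0x2a".
    n=$(( $1 ))
    case "$n" in
        9)  echo "R_X86_64_GOTPCREL" ;;
        41) echo "R_X86_64_GOTPCRELX" ;;
        42) echo "R_X86_64_REX_GOTPCRELX" ;;
        *)  echo "unknown ($1)" ;;
    esac
}
```

Calling `reloc_name 0x2a` gives the name of the relocation that the older linker rejects.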

@alexcrichton
Member

Interesting! Could you clarify what you mean by newer versions of binutils generating it? The error here I believe is just using an object generated by what is presumably a newer gcc, not necessarily something linked with a newer linker. In that sense is there a flag we can pass to gcc to disable this behavior?

@eefriedman
Contributor

I mean the assembler ("as") is generating it.

It looks like there's a flag -mrelax-relocations=no to turn it off... you might be able to pass -Xassembler -mrelax-relocations=no or something like that to gcc. Or you could build with clang, where the integrated assembler doesn't do this by default.
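Since not every assembler knows this flag, a build script could probe for it before adding it to CFLAGS. The following is a rough sketch under that assumption; `assembler_accepts_relax_flag` and `relax_cflags` are hypothetical helper names, and the probe simply assembles empty input and looks at the exit status.

```shell
# Succeeds if the given assembler (default: as) accepts
# -mrelax-relocations=no. Hypothetical helper for illustration.
assembler_accepts_relax_flag() {
    assembler="${1:-as}"
    out=$(mktemp) || return 1
    # Assemble empty input from stdin; only the exit status matters.
    "$assembler" -mrelax-relocations=no -o "$out" </dev/null >/dev/null 2>&1
    status=$?
    rm -f "$out"
    return "$status"
}

# Print the extra compiler flag only when the assembler supports it.
relax_cflags() {
    if assembler_accepts_relax_flag "${1:-as}"; then
        echo "-Wa,-mrelax-relocations=no"
    fi
}
```

A build could then use something like `CFLAGS="$CFLAGS $(relax_cflags)"`, leaving CFLAGS unchanged on assemblers that reject the option.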

@alexcrichton
Member

Hm I unfortunately can't seem to find any documentation pertaining to that flag, do you have any links on hand with the information here? Also, thanks for the investigation!

@eefriedman
Contributor

https://www.sourceware.org/ml/binutils-cvs/2016-02/msg00038.html . Taking another look, I'm not sure it's actually part of any released binutils version. :(

@alexcrichton
Member

Weird, surely I'm not using that old of a distribution, Ubuntu 15.10 is less than a year old? There's got to be some way we can change how we compile these objects, because the standard glibc target works ok, just musl is broken?

@eefriedman
Contributor

Compiling them with clang should work, I think.

It looks like you'll only end up with the relocation in question if code computes the address of a global using a GOT lookup, so it's possible the glibc target is just getting lucky.

@eefriedman
Contributor

Err, actually, hang on; why are we building code targeting musl with -fPIC in the first place?

@alexcrichton
Member

I thought -fPIC was required to build position-independent executables at the end?

@eefriedman
Contributor

You could build with -fPIE instead. But nevermind, it looks like that doesn't actually help here.

@alexcrichton
Member

The tools team got a chance to talk about this during triage yesterday, and our conclusion was that this wasn't a super serious regression but we'd love to fix this! We're not entirely sure how to do so ourselves, so help would be greatly appreciated!

@jwilm
jwilm commented Jul 26, 2016

Is there a workaround for this other than sticking to an older nightly?

@alexcrichton
Member

@jwilm for me at least, updating to an Ubuntu 16.04 docker container also fixed it, and I think the underlying issue, according to @eefriedman, is that a newer binutils is required to link. If either of those can be arranged I think that'll get things working again.

@m4b
Contributor
m4b commented Jul 29, 2016

Chiming in here: nightly broke my build as well, but only on what look like Ubuntu machines, with the relocation (0x2a) as stated above:

linking: dryad.so.1 with c8005792
ld -pie --gc-sections -Bsymbolic --dynamic-list=etc/dynamic-list.txt -I/tmp/dryad.so.1 -soname dryad.so.1 -L/home/ubuntu/.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib  -nostdlib -e _start -o dryad.so.1 target/x86_64-unknown-linux-musl/debug/libdryad.rlib /home/ubuntu/.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/libstd-c8005792.rlib /home/ubuntu/.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/libcore-c8005792.rlib /home/ubuntu/.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/librand-c8005792.rlib /home/ubuntu/.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/liballoc-c8005792.rlib /home/ubuntu/.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/libcollections-c8005792.rlib /home/ubuntu/.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/librustc_unicode-c8005792.rlib /home/ubuntu/.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/liballoc_system-c8005792.rlib /home/ubuntu/.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/libpanic_unwind-c8005792.rlib /home/ubuntu/.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/libunwind-c8005792.rlib /home/ubuntu/.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/libcompiler-rt.a /home/ubuntu/.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/liblibc-c8005792.rlib target/x86_64-unknown-linux-musl/debug/deps/libcolorify-70dd9c63eb32ff00.rlib target/x86_64-unknown-linux-musl/debug/deps/libcrossbeam-95e5cb210400af41.rlib target/x86_64-unknown-linux-musl/debug/deps/libgoblin-03cf7c16204371f2.rlib 
target/x86_64-unknown-linux-musl/debug/deps/libsyscall-bc1c5f989107898d.rlib
ld: /home/ubuntu/.multirust/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/libunwind-c8005792.rlib(UnwindLevel1.c.o): unrecognized relocation (0x2a) in section `.text.assert_rtn'
ld: final link failed: Bad value

So I haven't tested with clang; unfortunately my build will be broken until the travis vm is updated (and hence gets a new linker that understands the reloc), I guess?

@alexcrichton
Member

@m4b the only fix I know of currently is to update to Ubuntu 16.04 (perhaps in a docker image). Other than that though I don't know of a fix :(.

If the fix is known though we'd be more than happy to apply it!

@shepmaster
Contributor

I have also been bitten. 😄 There appear to be some similar reports out there from Rust packaged on Debian:

In both cases, it appears that the solution was updating binutils. It would be interesting to know why this affects only musl, as the above reports appeared to be normal Linux Rust versions.

@shepmaster
Contributor

This may also negatively affect people who use Travis CI to test nightly builds or to produce easy-to-run redistributable binaries. Since they currently only support up to 14.04, I believe that you'd have to use their Docker building and abandon the faster container-based infrastructure.

@alexcrichton
Member

@shepmaster those two debian bugs look unrelated as I think they're related to how debian compiles the binaries. Although if we can figure out how to ask something to not emit these kinds of relocations I'd imagine it'd fix them as well!

That is, the bug here I'm worried about is specifically related to the musl target, there should be no regression in the glibc target we're shipping because that container hasn't changed in ages.

@shepmaster
Contributor

> two debian bugs look unrelated

They definitely aren't this problem, but similar issues around mismatched binutils. I'd suppose you are right because Debian probably compiles their own version of Rust against their own binutils.

> specifically related to the musl target,

Oh, I agree. I was pointing out that I could see a project that wanted to make easily redistributable Linux binaries using Travis CI and building against musl running into this problem.

@m4b
Contributor
m4b commented Aug 1, 2016

I think we can fix this.

So, using binutils version 2.26, I recompiled musl with -fPIC and added -Wa,-mrelax-relocations=no to the CFLAGS in musl's config.mak. stderr, which was previously compiled with the problematic relocation R_X86_64_GOTPCRELX, now gets the backwards compatible (but less optimized) relocation R_X86_64_GOTPCREL (notice no X):

objdump -r lib/libc.a | grep stderr
0000000000000007 R_X86_64_GOTPCREL  stderr-0x0000000000000004
0000000000000003 R_X86_64_GOTPCREL  stderr-0x0000000000000004
0000000000000003 R_X86_64_GOTPCREL  stderr-0x0000000000000004
0000000000000003 R_X86_64_GOTPCREL  stderr-0x0000000000000004
000000000000002c R_X86_64_GOTPCREL  stderr-0x0000000000000004
0000000000000014 R_X86_64_GOTPCREL  stderr-0x0000000000000004
0000000000000012 R_X86_64_GOTPCREL  stderr-0x0000000000000004
0000000000000010 R_X86_64_GOTPCREL  stderr-0x0000000000000004
stderr.lo:     file format elf64-x86-64
RELOCATION RECORDS FOR [.data.rel.local.__stderr_used]:
RELOCATION RECORDS FOR [.data.rel.ro.local.stderr]:
0000000000000586 R_X86_64_64       stderr
000000000000059b R_X86_64_64       __stderr_used
00000000000000a8 R_X86_64_GOTPCREL  __stderr_used-0x0000000000000004

similarly for the rest of the symbols.
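To check a whole archive rather than one symbol, the `objdump -r` output can be summarized per relocation type. This is a small sketch, assuming the two-column record layout shown above; `reloc_histogram` is a hypothetical helper name.

```shell
# Read `objdump -r` output on stdin and print "<count> <type>" for
# each x86-64 relocation type seen. Header lines such as
# "RELOCATION RECORDS FOR [...]" are ignored because their second
# field does not start with R_X86_64_.
reloc_histogram() {
    awk '$2 ~ /^R_X86_64_/ { count[$2]++ }
         END { for (t in count) print count[t], t }' | sort -k2
}
```

For example, `objdump -r lib/libc.a | reloc_histogram` should list no GOTPCRELX entries if the flag took effect.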

So this flag is documented in the as man page as of binutils 2.26.0.20160501, and it's present and working in my assembler (on Arch Linux). Therefore, to build musl with this flag we'll need an up-to-date assembler, with as at least at this version:

as -version
GNU assembler (GNU Binutils) 2.26.0.20160501

I assume the containers have a newer version since the assembler is producing the relocation, but it does look like the flag was added after the fact. On my system at least, with the above assembler version, it works as expected.
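A build script could compare the installed assembler version against the one quoted above. This is only a sketch: the helper names are hypothetical, 2.26.0.20160501 is simply the version reported in this comment rather than a confirmed minimum, and the comparison relies on GNU `sort -V`.

```shell
# Print the version field from the first line of `as --version`
# output read on stdin (the last whitespace-separated field).
assembler_version() {
    head -n 1 | awk '{ print $NF }'
}

# Succeeds if version $1 is at least version $2.
# Relies on GNU sort's -V (version sort) option.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

Usage would look like `version_at_least "$(as --version | assembler_version)" 2.26.0.20160501`.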

Consequently, I think we can turn off this optimization for now, until it's appropriate to turn it back on. (We will want to turn it back on, since it turns GOT lookups into either immediates or RIP-relative offsets, which will likely be faster.)

@m4b
Contributor
m4b commented Aug 1, 2016 edited

Oh yeah, you'll need to add this flag to the libunwind compilation step too, I think, since I'm seeing similar optimizations (and unrecognized relocation failures) there :/

EDIT: if cmake respects the CFLAGS env variable, the pull request below, or something similar, should probably work for that stage.

However, I'm also seeing the relocation in libstd, and liballoc_jemalloc.rlib, so similar flags will need to be used there as well when compiling those artifacts?

And just so it's here, I'm seeing the bad relocation in the following:

  • libunwind
  • libstd
  • liblibc
  • liballoc_jemalloc
  • crt1.o

which I checked using something like this in the musl nightly dir:

for i in *; do echo "$i"; objdump -r "$i" | grep "RELX"; done

@m4b
Contributor
m4b commented Aug 4, 2016

So if i'm understanding correctly, my rustc is:

rustc 1.12.0-nightly (a005b6785 2016-08-02)

which means my rustc's source had the build changes I added here #35178

However, when I inspect the musl build artifacts I'm still seeing the *RELX relocations in libstd and liballoc_jemalloc (and libunwind.a but latest pr in rust-buildbot should fix that).

I take this to mean the changes in PR 35178 had no effect? Or am I misunderstanding something?

@sanxiyn
Member
sanxiyn commented Aug 4, 2016

@m4b That nightly does not include #35178 (9cf1897). This can be confirmed by running git log 9cf1897..a005b6785 and git log a005b6785..9cf1897.
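The same question ("is this commit part of that nightly?") can also be asked with `git merge-base --is-ancestor`, which reports the answer through its exit status. A minimal sketch, with `commit_included_in` as a hypothetical wrapper name:

```shell
# Succeeds if commit $1 is an ancestor of commit $2, i.e. $2 already
# contains $1. The optional $3 is the repository path (default: the
# current directory).
commit_included_in() {
    git -C "${3:-.}" merge-base --is-ancestor "$1" "$2"
}
```

In a checkout of rust-lang/rust this would be run as `commit_included_in 9cf1897 a005b6785`.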

@m4b
Contributor
m4b commented Aug 10, 2016

This seems to be resolved now with the last PR. I suggest we close unless others are still experiencing this?

@alexcrichton
Member

This indeed works for me as well, thanks @m4b!

@japaric japaric added a commit to japaric/rust that referenced this issue Aug 10, 2016
@japaric japaric add -mrelax-relocations=no to i686-musl and i586-gnu
I've been experiencing #34978 with these two targets. This applies the
hack in #35178 to these targets as well.
ae58a87
@eddyb eddyb added a commit to eddyb/rust that referenced this issue Aug 14, 2016
@eddyb eddyb Rollup merge of #35577 - japaric:relax, r=alexcrichton
add -mrelax-relocations=no to i686-musl and i586-gnu

I've been experiencing #34978 with these two targets. This applies the
hack in #35178 to these targets as well.

r? @alexcrichton
4d6a553
@eddyb eddyb added a commit to eddyb/rust that referenced this issue Aug 14, 2016
@eddyb eddyb Rollup merge of #35577 - japaric:relax, r=alexcrichton
add -mrelax-relocations=no to i686-musl and i586-gnu

I've been experiencing #34978 with these two targets. This applies the
hack in #35178 to these targets as well.

r? @alexcrichton
da2328b
@emk emk added a commit to emk/rust-musl-builder that referenced this issue Sep 12, 2016
@emk emk Switch from Debian to Ubuntu 16.04 to work around Rust musl link bug
See rust-lang/rust#34978

Allowing broken musl cross compilers through to stable really does break
some Rust-based projects.
c111667