Regression on nightly since LLVM 8 upgrade: `thread` sanitizer doesn't compile anymore #53945

Open
PaulGrandperrin opened this Issue Sep 4, 2018 · 15 comments

PaulGrandperrin commented Sep 4, 2018

Hi, the fuzzer I maintain is failing to build on the latest nightlies.

The interesting part of the error log seems to be:

note: /usr/bin/ld: __sancov_guards has both ordered [`__sancov_guards' in /home/travis/build/rust-fuzz/honggfuzz-rs/example/hfuzz_target/x86_64-unknown-linux-gnu/release/deps/example-934b6185f6a63e31.example.ajs5tmgw-cgu.0.rcgu.o] and unordered [`__sancov_guards' in /home/travis/build/rust-fuzz/honggfuzz-rs/example/hfuzz_target/x86_64-unknown-linux-gnu/release/deps/example-934b6185f6a63e31.example.ajs5tmgw-cgu.0.rcgu.o] sections
          /usr/bin/ld: final link failed: Bad value
          collect2: error: ld returned 1 exit status

You can find the full log here:
https://travis-ci.org/rust-fuzz/honggfuzz-rs/jobs/424079778

I bisected on my computer to find the exact Rust version that fails, and it seems to be related to the LLVM 8 upgrade.

This version works well:

# rustup default nightly-2018-09-01
# rustc -vV
rustc 1.30.0-nightly (aaa170beb 2018-08-31)
binary: rustc
commit-hash: aaa170bebe31d03e2eea14e8cb06dc2e8891216b
commit-date: 2018-08-31
host: x86_64-unknown-linux-gnu
release: 1.30.0-nightly
LLVM version: 7.0

This version doesn't:

# rustup default nightly-2018-09-02
# rustc -vV
rustc 1.30.0-nightly (28bcffead 2018-09-01)
binary: rustc
commit-hash: 28bcffead74d5e17c6cb1f7de432e37f93a6b50c
commit-date: 2018-09-01
host: x86_64-unknown-linux-gnu
release: 1.30.0-nightly
LLVM version: 8.0

guidovranken commented Sep 4, 2018

I have the same issue.

PaulGrandperrin (Author) commented Sep 4, 2018

PaulGrandperrin changed the title from "Regression on nightly: /usr/bin/ld: __sancov_guards has both ordered [...] and unordered [...] sections" to "Regression on nightly since LLVM 8 upgrade: /usr/bin/ld: __sancov_guards has both ordered [...] and unordered [...] sections" Sep 4, 2018

PaulGrandperrin added a commit to rust-fuzz/honggfuzz-rs that referenced this issue Sep 6, 2018

test.sh: temporarily stop testing leak sanitizer on Linux
It's broken since the upgrade to LLVM 8, see: rust-lang/rust#53945

PaulGrandperrin added a commit to rust-fuzz/honggfuzz-rs that referenced this issue Sep 6, 2018

test.sh: temporarily stop testing thread sanitizer on Linux
It's broken since the upgrade to LLVM 8, see: rust-lang/rust#53945

PaulGrandperrin (Author) commented Sep 6, 2018

I made a little progress narrowing down the root cause of the issue.
It's only triggered when using the thread sanitizer.
How to reproduce:

cd /tmp
git clone https://github.com/rust-fuzz/honggfuzz-rs.git
cd honggfuzz-rs/example/
RUSTFLAGS="-Z sanitizer=thread" ./test.sh

If you use the address or leak sanitizer, or no sanitizer at all, there are no issues.

eddyb (Member) commented Sep 6, 2018

(The regression is nightly-to-nightly and recent, the label must've been an accident)

PaulGrandperrin changed the title from "Regression on nightly since LLVM 8 upgrade: /usr/bin/ld: __sancov_guards has both ordered [...] and unordered [...] sections" to "Regression on nightly since LLVM 8 upgrade: `thread` sanitizer doesn't compile anymore" Sep 9, 2018

Shnatsel commented Sep 13, 2018

I am facing the same issue with address sanitizer on rustc 1.30.0-nightly (f2302daef 2018-09-12) when building with libfuzzer (cargo-fuzz)

memoryruins added the A-LLVM label Sep 15, 2018

Aaron1011 (Contributor) commented Sep 23, 2018

I'd like to work on this.

Aaron1011 (Contributor) commented Sep 23, 2018

Some initial findings:

That error is emitted by ld when (I think) a section links to both unordered and ordered sections. Ordered sections are defined by the presence of the SHF_LINK_ORDER ELF section header flag, which is described here

LLVM emits this flag in TargetLoweringObjectFileImpl.cpp here and here, in response to LLVMContext::MD_associated being set.

From what I can see, LLVMContext::MD_associated is unconditionally set by SanitizerCoverage when writing to the __sancov_guards section.

I'll need to investigate further to determine how this flag is getting left off.

Aaron1011 (Contributor) commented Sep 29, 2018

I've determined that passing -C opt-level=0 causes the compilation to succeed, while passing -C opt-level=1 causes it to fail.

I suspect that this issue is caused by an interaction between LLVM's Dead Global Elimination pass (which doesn't run with opt-level=0) and the sanitizer. My guess is that LLVM ends up deleting an unused function referenced by MD_associated. This would leave FunctionGuardArray with a dangling reference to its function, causing getAssociatedSymbol to return null.

In this case, LLVM would no longer add the SHF_LINK_ORDER flag to the ELF section, resulting in a linker error due to the missing flag.

However, this is all still somewhat speculative. I'm going to try to come up with a minimal reproduction, which can hopefully be induced to fail/succeed by toggling the Dead Global Elimination pass.

Aaron1011 (Contributor) commented Sep 30, 2018

TL;DR: As a temporary workaround, pass -C opt-level=0. This issue is caused by an LLVM bug, so it will need to be fixed upstream.

I've now determined that this is definitely an LLVM bug. I've created a minimal reproduction, which only uses Clang and other LLVM tools, here: https://github.com/Aaron1011/llvm_arg_elim

The issue occurs due to the behavior of LLVM's DeadArgumentEliminationPass (not Dead Global Elimination, as I had previously thought). When DeadArgumentEliminationPass removes arguments/return values from a function, it actually creates an entirely new function, and updates all references to the previous function. However, it fails to update any MD_associated metadata entries targeting the old function.

As I described in my previous comment, this results in LLVM leaving off the SHF_LINK_ORDER flag when generating the ELF section header. Since there are still other __sancov_guards sections with the flag present (from functions that DeadArgumentEliminationPass didn't modify), ld will error when it sees the mismatched flags.

I'll be filing a bug with LLVM once I'm given an account on their bugtracker. For now, you can work around this issue by passing -C opt-level=0 to rustc. This will disable running LLVM optimizations, including DeadArgumentEliminationPass. Unfortunately, there doesn't seem to be a way to disable that particular pass, other than by disabling all optimizations.
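
As a rough sketch of how that workaround might be applied from a shell: the helper below (`with_opt_level_zero` is a hypothetical name, not part of rustc or Cargo) appends `-C opt-level=0` to an existing flag string. Putting the flag last assumes it takes effect over any earlier optimization setting, which is worth checking against your own build.

```shell
# Hypothetical helper: append "-C opt-level=0" to an existing RUSTFLAGS
# string, keeping whatever flags (for example the sanitizer) are already set.
with_opt_level_zero() {
    # ${1:+$1 } expands to "<flags> " only when the first argument is non-empty.
    printf '%s\n' "${1:+$1 }-C opt-level=0"
}

# Usage, reusing the reproduction steps from earlier in this thread:
#   RUSTFLAGS="$(with_opt_level_zero "-Z sanitizer=thread")" ./test.sh
```

Since this disables all LLVM optimizations, fuzzing throughput will likely drop, so it is best treated as a stopgap.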

Aaron1011 (Contributor) commented Oct 3, 2018

Using the gold linker with -Clink-arg=-fuse-ld=gold seems to avoid this problem entirely.
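
For anyone wanting to try this, a minimal sketch follows. The flag is the one quoted above; the `gold_link_flags` helper and its optional argument are hypothetical, and the check assumes gold is installed under the usual binutils name `ld.gold`.

```shell
# Hypothetical helper: print the extra RUSTFLAGS needed to link with gold,
# or fail if the linker binary cannot be found on PATH.
# The optional argument overrides the binary name that is looked up
# (assumed default: ld.gold).
gold_link_flags() {
    linker="${1:-ld.gold}"
    if command -v "$linker" >/dev/null 2>&1; then
        printf '%s\n' "-Clink-arg=-fuse-ld=gold"
    else
        echo "$linker not found on PATH" >&2
        return 1
    fi
}

# Usage, combined with the thread sanitizer from the reproduction steps:
#   RUSTFLAGS="-Z sanitizer=thread $(gold_link_flags)" ./test.sh
```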

When using the default (BFD) linker, the 'has both ordered and unordered' error appears to be triggered by two separate bugs:

  1. The LLVM DeadArgumentEliminationPass bug, which I'm still planning to upstream a fix for.
  2. The Dead Global Elimination interaction that I mentioned here. I'm not sure if this is actually an LLVM bug - the existence of an MD_associated global shouldn't prevent a function from being deleted, but there's no good way to delete the __sancov_gen global entirely. Since gold doesn't complain about SHF_LINK_ORDER being used inconsistently, I'm not sure if this is a real issue or not.

Aaron1011 (Contributor) commented Oct 11, 2018

I've managed to come up with a full fix locally. I'll be submitting my changes to LLVM tomorrow, and will post the Phabricator link(s) here once I do so.

The cause of the issue:

  1. Several LLVM passes (ArgumentPromotion, DeadArgumentElimination, Inliner, GlobalDCE, GlobalOpt, Internalize, and possibly others) mishandle COMDATs and/or MD_associated metadata - either through improper deletion, or through failure to update it properly.
  2. This mishandling can result in two kinds of malformed __sancov_guards sections:
    1. The associated function is stripped from the object, but the __sancov_gen_ symbol associated with it is still emitted in a __sancov_guards section. Since the associated function section does not exist in the object, the __sancov_guards section will have nothing to link to. This is due to LLVM failing to take COMDATs into account in several places when deleting dead code/objects.
    2. The associated function still exists in the object, but the __sancov_guards section is not linked to it. This is due to several LLVM passes accidentally removing the MD_associated metadata from the __sancov_gen_ global object.

In both of these cases, the BFD linker will see proper, 'ordered' __sancov_guards sections (sh_link is set and the SHF_LINK_ORDER flag is set) in addition to an improper, unordered __sancov_guards section (which LLVM failed to link to its associated function).
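
One way to check an object file for this mismatch is to look at the flags that `readelf` reports for each `__sancov_guards` section; `L` in the flags column stands for link order (SHF_LINK_ORDER). The helper below is only a sketch: `sancov_guard_mix` is a hypothetical name, and the field positions assume the column layout of `readelf --sections --wide` from GNU binutils.

```shell
# Hypothetical helper: read `readelf --sections --wide` output on stdin and
# report whether the __sancov_guards sections are "ordered", "unordered",
# "mixed" (the case the BFD linker rejects), or absent ("none").
sancov_guard_mix() {
    awk '
        # Count fields from the end of the line, because the leading "[Nr]"
        # column splits into one or two fields depending on the index width.
        # Layout assumed: Name Type Address Off Size ES Flg Lk Inf Al
        NF >= 10 && $(NF - 9) == "__sancov_guards" {
            if (index($(NF - 3), "L") > 0) ordered++; else unordered++
        }
        END {
            if (ordered && unordered) print "mixed"
            else if (ordered)         print "ordered"
            else if (unordered)       print "unordered"
            else                      print "none"
        }
    '
}

# Usage (the object path is a placeholder):
#   readelf --sections --wide path/to/object.o | sancov_guard_mix
```

Matching the section name exactly keeps relocation sections such as `.rela__sancov_guards` out of the count, since those never carry the link-order flag.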

sfackler (Member) commented Oct 13, 2018

Nice! Might be worth pushing them to our llvm fork so we can pick them up more quickly?

Aaron1011 (Contributor) commented Oct 13, 2018

I think it might be best to wait until they're (hopefully) all accepted by LLVM. Getting them into the rust LLVM fork is going to require cherry-picking some additional commits, and it's possible that the LLVM team might want some changes before my patches are merged.

SingingTree added a commit to SingingTree/afl.rs that referenced this issue Oct 22, 2018

Work around linking issues from rust-fuzz#141, rust-lang/rust#53945
rust-fuzz#141 + rust-lang/rust#53945 track issues with linkage
which regressed when rust updated to llvm 8. This commit adds a work
around for such issues for cargo-afl. This helps with the ergonomics of
cargo-afl, particularly for those less familiar with the project and the
above issues.

These changes can be safely removed once patches are landed in llvm and
rust updates to use the patched version.

frewsxcv added a commit to rust-fuzz/afl.rs that referenced this issue Oct 23, 2018

Work around linking issues from #141, rust-lang/rust#53945 (#144)
#141 + rust-lang/rust#53945 track issues with linkage
which regressed when rust updated to llvm 8. This commit adds a work
around for such issues for cargo-afl. This helps with the ergonomics of
cargo-afl, particularly for those less familiar with the project and the
above issues.

These changes can be safely removed once patches are landed in llvm and
rust updates to use the patched version.

brson (Contributor) commented Dec 28, 2018

Thanks for all your work on this @Aaron1011. Any news?

brson added a commit to brson/tikv that referenced this issue Dec 31, 2018

fuzz: make fuzzers work with nightly
Recent Rust compilers have bugs that appear when fuzzing
optimized binaries:

rust-lang/rust#53945

This patch works around the issue by TODO

Signed-off-by: Brian Anderson <andersrb@gmail.com>

brson added a commit to brson/tikv that referenced this issue Dec 31, 2018

fuzz: make fuzzers work with nightly
Recent Rust compilers have bugs that appear when fuzzing
optimized binaries:

rust-lang/rust#53945

This patch works around the issue by adding the "-C codegen-units=1 -C
incremental=fuzz-incremental" arguments to `RUSTFLAGS`.

Why this works I don't actually know. This workaround isn't mentioned
in the linked issue, and afaik the "incremental" flag is simply
changing the directory of the incremental cache, not turning it on or
off.

Signed-off-by: Brian Anderson <andersrb@gmail.com>

garyttierney added a commit to garyttierney/secsp that referenced this issue Jan 3, 2019

Force fuzzing with 1 codegen unit
This is a workaround for an LLVM 8 codegen issue. See
rust-lang/rust#53945 for more information.
