Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

7 GWP-ASan lit test failures in linux self build after commit b22910daab95 #46824

Open
dyung opened this issue Sep 10, 2020 · 11 comments
Open

7 GWP-ASan lit test failures in linux self build after commit b22910daab95 #46824

dyung opened this issue Sep 10, 2020 · 11 comments
Labels
bugzilla

Comments

@dyung
Copy link
Collaborator

dyung commented Sep 10, 2020

Bugzilla Link 47480
Version trunk
OS All
CC @chandlerc,@jdoerfert,@LebedevRI,@RKSimon,@morehouse,@hctim,@nikic,@nunoplopes,@rotateright,@vitalybuka,@vlad902

Extended Description

As part of our testing internally, we do a self-host build of the compiler on linux and run the lit tests to make sure that everything works. Recently this self-host build started failing 7 GWP-ASan tests in the stage 1 build of our self-host build. I bisected the test failure to recent commit b22910d.

The 7 tests that are failing are:

  1. GWP-ASan-x86_64 :: double_delete.cpp
  2. GWP-ASan-x86_64 :: double_deletea.cpp
  3. GWP-ASan-x86_64 :: double_free.cpp
  4. GWP-ASan-x86_64 :: invalid_free_left.cpp
  5. GWP-ASan-x86_64 :: invalid_free_right.cpp
  6. GWP-ASan-x86_64 :: realloc.cpp
  7. GwpAsan-Unittest :: ./GwpAsan-x86_64- Test/BacktraceGuardedPoolAllocator.DoubleFree

I was able to reproduce this locally on my linux machine running Ubuntu 20.04 using both Gcc 9.3.0 and clang 10.0.0 to build the initial compiler.

Steps to reproduce the failure:

  1. Build clang using the system compiler (gcc 9.3 or clang 10 in my case)
  2. Use the clang built in step 1 to rebuild clang
  3. Using the version of clang built in step 2, run the lit tests

The cmake command I used to build was the following:
cmake -G Ninja -DLLVM_INCLUDE_EXAMPLES=OFF -DLLVM_VERSION_SUFFIX= -DLLVM_BUILD_RUNTIME=ON -DLLVM_TOOL_LLD_BUILD=OFF "-DLLVM_LIT_ARGS=--verbose -j150" -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_OPTIMIZED_TABLEGEN=ON -DLLVM_TARGETS_TO_BUILD=all "-DLLVM_ENABLE_PROJECTS=clang;compiler-rt" -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++

The Unittest that failed produced the following output:
FAIL: GwpAsan-Unittest :: ./GwpAsan-x86_64-Test/BacktraceGuardedPoolAllocator.DoubleFree (32710 of 69710)
******************** TEST 'GwpAsan-Unittest :: ./GwpAsan-x86_64-Test/BacktraceGuardedPoolAllocator.DoubleFree' FAILED ********************
Note: Google Test filter = BacktraceGuardedPoolAllocator.DoubleFree
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from BacktraceGuardedPoolAllocator
[ RUN ] BacktraceGuardedPoolAllocator.DoubleFree
/home/dyung/src/upstream/llvm_clean_git/compiler-rt/lib/gwp_asan/tests/backtrace.cpp:45: Failure
Death test: DeallocateMemory2(GPA, Ptr)
Result: failed to die.
Error msg:
[ DEATH ]
[ FAILED ] BacktraceGuardedPoolAllocator.DoubleFree (62 ms)
[----------] 1 test from BacktraceGuardedPoolAllocator (62 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (69 ms total)
[ PASSED ] 0 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] BacktraceGuardedPoolAllocator.DoubleFree

1 FAILED TEST

While the regular lit tests that failed all seemed to not produce any output:
FileCheck error: '' is empty.
FileCheck command line: FileCheck /home/dyung/src/upstream/llvm_clean_git/compiler-rt/test/gwp_asan/double_delete.cpp

@LebedevRI
Copy link
Member

LebedevRI commented Sep 10, 2020

See ongoing disscussion in https://reviews.llvm.org/D87149
I believe TLDR is that unless these miscompiles are for volatile stores,
it is GWP-ASan that needs to be fixed.

@rotateright
Copy link
Contributor

rotateright commented Sep 10, 2020

Warning: I don't know anything about GWP-ASan, and I'm not experienced in the UB subtleties exposed by my patch...
but a first grep of that dir shows the likely problematic function:

void GuardedPoolAllocator::trapOnAddress(uintptr_t Address, Error E) {
State.FailureType = E;
State.FailureAddress = Address;

// Raise a SEGV by touching first guard page.
volatile char *p = reinterpret_cast<char *>(State.GuardedPagePool);
*p = 0;
__builtin_unreachable();
}

IIUC, we are probably going to change LLVM to allow this usage. But would it be safer/better to use __builtin_trap() or __builtin_debugtrap() in this function?

@rotateright
Copy link
Contributor

rotateright commented Sep 10, 2020

cc'ing people that I hopefully correctly cross-referenced from:
https://reviews.llvm.org/D60593

@nunoplopes
Copy link
Member

nunoplopes commented Sep 10, 2020

Warning: I don't know anything about GWP-ASan, and I'm not experienced in
the UB subtleties exposed by my patch...
but a first grep of that dir shows the likely problematic function:

void GuardedPoolAllocator::trapOnAddress(uintptr_t Address, Error E) {
State.FailureType = E;
State.FailureAddress = Address;

// Raise a SEGV by touching first guard page.
volatile char *p = reinterpret_cast<char *>(State.GuardedPagePool);
*p = 0;
__builtin_unreachable();
}

IIUC, we are probably going to change LLVM to allow this usage. But would it
be safer/better to use __builtin_trap() or __builtin_debugtrap() in this
function?

Nice catch, thanks!
Using __builtin_trap() would be better indeed. unreachable is quite nasty.

@dyung
Copy link
Collaborator Author

dyung commented Sep 10, 2020

Warning: I don't know anything about GWP-ASan, and I'm not experienced in
the UB subtleties exposed by my patch...
but a first grep of that dir shows the likely problematic function:

void GuardedPoolAllocator::trapOnAddress(uintptr_t Address, Error E) {
State.FailureType = E;
State.FailureAddress = Address;

// Raise a SEGV by touching first guard page.
volatile char *p = reinterpret_cast<char *>(State.GuardedPagePool);
*p = 0;
__builtin_unreachable();
}

IIUC, we are probably going to change LLVM to allow this usage. But would it
be safer/better to use __builtin_trap() or __builtin_debugtrap() in this
function?

Nice catch, thanks!
Using __builtin_trap() would be better indeed. unreachable is quite nasty.

I can confirm that changing the __builtin_unreachable() to __builtin_trap() does seem to fix the problem!

@jdoerfert
Copy link
Member

jdoerfert commented Sep 10, 2020

While the "problem" will be address either way, I would generally think __builtin_trap() is the better way to cause a trap.

@rotateright
Copy link
Contributor

rotateright commented Sep 10, 2020

Note that we have at least temporarily restored the failing tests to passing:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/29534

...with this patch:
4e413e1

But it sounds like changing the GWP-ASan source would be safer.

@hctim
Copy link
Collaborator

hctim commented Sep 10, 2020

The volatile store in GWP-ASan is (by designed) supposed to crash the program with a segv at a specific address.

The __builtin_unreachable() is there to guide the compiler that it can tail-call optimise this function.

It seems to me that the root cause is that the __builtin_unreachable() shouldn't be allowed to dead-code eliminate things with side effects (like the volatile store)...

@jdoerfert
Copy link
Member

jdoerfert commented Sep 10, 2020

The volatile store in GWP-ASan is (by designed) supposed to crash the
program with a segv at a specific address.

The __builtin_unreachable() is there to guide the compiler that it can
tail-call optimise this function.

It seems to me that the root cause is that the __builtin_unreachable()
shouldn't be allowed to dead-code eliminate things with side effects (like
the volatile store)...

As we discuss in the thread https://reviews.llvm.org/D87149, the LLVM LangRef needs to be updated to provide the above guarantees. That said, generic "side effects", in the LLVM-IR sense, will not be preserved if they are inevitable followed by an unreachable. That is, a regular store to a global for example is a side effect but it won't be preserved. volatile accesses are different and you will probably get what you expect here, we'll update this once the discussion is settled.

@LebedevRI
Copy link
Member

LebedevRI commented Jul 8, 2021

The volatile store in GWP-ASan is (by designed) supposed to crash the
program with a segv at a specific address.

The __builtin_unreachable() is there to guide the compiler that it can
tail-call optimise this function.

It seems to me that the root cause is that the __builtin_unreachable()
shouldn't be allowed to dead-code eliminate things with side effects (like
the volatile store)...

As we discuss in the thread https://reviews.llvm.org/D87149, the LLVM
LangRef needs to be updated to provide the above guarantees. That said,
generic "side effects", in the LLVM-IR sense, will not be preserved if
they are inevitable followed by an unreachable. That is, a regular store to
a global for example is a side effect but it won't be preserved. volatile
accesses are different and you will probably get what you expect here, we'll
update this once the discussion is settled.

FYI looks like GWP-ASan still hasn't been fixed,
nor did anybody submit an RFC to change LangRef.
Chances are, these tests will break again tomorrow,
once the temporary patch expires as part of https://reviews.llvm.org/D105338

@hctim
Copy link
Collaborator

hctim commented Jul 8, 2021

Sent https://reviews.llvm.org/D105654 to add a workaround in GWP-ASan, but __builtin_unreachable() optimising away code with side effects still needs to be fixed.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla
Projects
None yet
Development

No branches or pull requests

6 participants