Make finding the corresponding BasicBlock more robust. #3

vext01 · 2021-06-14T15:30:32Z

In a previous PR we modified the .llvm_bb_addr_map section, adding the
orignial basic block index. This commit improves that code in two ways:

Catch the case where MachineBasicBlock::getBasicBlock() returns a
null pointer. To quote the documentation:

Note that this may be NULL if this instance does not correspond
directly to an LLVM basic block.

We use the block index u64::MAX to encode this condition (note that
although the field in the section is uleb128, LLVM only allows block
indices that fit in a u64).

An upcoming change to yk will check for this condition at runtime.

Currently if a block isn't found, then the resulting block index is
incorrect: The last index used in the loop is erroneously used.This
commit causes an error at compile time instead.

In a previous PR we modified the .llvm_bb_addr_map section, adding the original basic block index. This commit improves that code in two ways: 1) Catch the case where MachineBasicBlock::getBasicBlock() returns a null pointer. To quote the documentation: > Note that this may be NULL if this instance does not correspond > directly to an LLVM basic block. We use the block index u64::MAX to encode this condition (note that although the field in the section is uleb128, LLVM only allows block indices that fit in a u64). An upcoming change to yk will check for this condition at runtime. 2) Currently if a block isn't found, then the resulting block index is incorrect: The last index used in the loop is erroneously used.This commit causes an error at compile time instead.

ltratt · 2021-06-14T16:23:49Z

bors r+

bors · 2021-06-14T16:42:50Z

Build succeeded:

buildbot/yk-buildbot-build-script-hw

ptersilie · 2021-06-14T22:21:33Z

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp

+      for (auto It = F.begin(); It != F.end(); It++) {
+        const BasicBlock *BB = &*It;
+        if (BB == FindBB) {
+            found = true;


Surprised this wasn't caught by clang-format. Will need to look into that as it should flag the indentation.

I think it's sometimes "best effort", although I agree, this one's a bit surprising!

Are we running clang-format on the compiler itself? I didn't think so.

Ah yes, you're right. Though I'm surprised they don't have a formatting test in one of their many checks.

We experienced some deadlocks when we used multiple threads for logging using `scan-builds` intercept-build tool when we used multiple threads by e.g. logging `make -j16` ``` (gdb) bt #0 0x00007f2bb3aff110 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0 #1 0x00007f2bb3af70a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0 #2 0x00007f2bb3d152e4 in ?? () #3 0x00007ffcc5f0cc80 in ?? () #4 0x00007f2bb3d2bf5b in ?? () from /lib64/ld-linux-x86-64.so.2 #5 0x00007f2bb3b5da27 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 #6 0x00007f2bb3b5dbe0 in exit () from /lib/x86_64-linux-gnu/libc.so.6 #7 0x00007f2bb3d144ee in ?? () #8 0x746e692f706d742f in ?? () #9 0x692d747065637265 in ?? () #10 0x2f653631326b3034 in ?? () #11 0x646d632e35353532 in ?? () #12 0x0000000000000000 in ?? () ``` I think the gcc's exit call caused the injected `libear.so` to be unloaded by the `ld`, which in turn called the `void on_unload() __attribute__((destructor))`. That tried to acquire an already locked mutex which was left locked in the `bear_report_call()` call, that probably encountered some error and returned early when it forgot to unlock the mutex. All of these are speculation since from the backtrace I could not verify if frames 2 and 3 are in fact corresponding to the `libear.so` module. But I think it's a fairly safe bet. So, hereby I'm releasing the held mutex on *all paths*, even if some failure happens. PS: I would use lock_guards, but it's C. Reviewed-by: NoQ Differential Revision: https://reviews.llvm.org/D118439 (cherry picked from commit d919d02)

When building/testing ASan inside the GCC tree on Solaris while using GNU `ld` instead of Solaris `ld`, a large number of tests SEGVs on both sparc and x86 like this: Thread 2 received signal SIGSEGV, Segmentation fault. [Switching to Thread 1 (LWP 1)] 0xfe014cfc in __sanitizer::atomic_load<__sanitizer::atomic_uintptr_t> (a=0xfc602a58, mo=__sanitizer::memory_order_acquire) at sanitizer_common/sanitizer_atomic_clang_x86.h:46 46 v = a->val_dont_use; 1: x/i $pc => 0xfe014cfc <_ZN11__sanitizer11atomic_loadINS_16atomic_uintptr_tEEENT_4TypeEPVKS2_NS_12memory_orderE+62>: mov (%eax),%eax (gdb) bt #0 0xfe014cfc in __sanitizer::atomic_load<__sanitizer::atomic_uintptr_t> (a=0xfc602a58, mo=__sanitizer::memory_order_acquire) at sanitizer_common/sanitizer_atomic_clang_x86.h:46 #1 0xfe0bd1d7 in __sanitizer::DTLS_NextBlock (cur=0xfc602a58) at sanitizer_common/sanitizer_tls_get_addr.cpp:53 #2 0xfe0bd319 in __sanitizer::DTLS_Find (id=1) at sanitizer_common/sanitizer_tls_get_addr.cpp:77 #3 0xfe0bd466 in __sanitizer::DTLS_on_tls_get_addr (arg_void=0xfeffd068, res=0xfe602a18, static_tls_begin=0, static_tls_end=0) at sanitizer_common/sanitizer_tls_get_addr.cpp:116 #4 0xfe063f81 in __interceptor___tls_get_addr (arg=0xfeffd068) at sanitizer_common/sanitizer_common_interceptors.inc:5501 #5 0xfe0a3054 in __sanitizer::CollectStaticTlsBlocks (info=0xfeffd108, size=40, data=0xfeffd16c) at sanitizer_common/sanitizer_linux_libcdep.cpp:366 #6 0xfe6ba9fa in dl_iterate_phdr () from /usr/lib/ld.so.1 #7 0xfe0a3132 in __sanitizer::GetStaticTlsBoundary (addr=0xfe608020, size=0xfeffd244, align=0xfeffd1b0) at sanitizer_common/sanitizer_linux_libcdep.cpp:382 #8 0xfe0a33f7 in __sanitizer::GetTls (addr=0xfe608020, size=0xfeffd244) at sanitizer_common/sanitizer_linux_libcdep.cpp:482 #9 0xfe0a34b1 in __sanitizer::GetThreadStackAndTls (main=true, stk_addr=0xfe608010, stk_size=0xfeffd240, tls_addr=0xfe608020, tls_size=0xfeffd244) at sanitizer_common/sanitizer_linux_libcdep.cpp:565 The address being accessed is unmapped. However, even when the tests `PASS` with Solaris `ld`, `ASAN_OPTIONS=verbosity=2` shows ==6582==__tls_get_addr: Can't guess glibc version Given that that the code is stricly `glibc`-specific according to `sanitizer_tls_get_addr.h`, there seems little point in using the interceptor on non-`glibc` targets. That's what this patch does. Tested on `i386-pc-solaris2.11` and `sparc-sun-solaris2.11` inside the GCC tree. Differential Revision: https://reviews.llvm.org/D141385

vext01 assigned ltratt and ptersilie Jun 14, 2021

vext01 mentioned this pull request Jun 14, 2021

Catch the case where a traced MachineBasicblock has no corresponding BasicBlock. ykjit/yk#356

Merged

bors bot merged commit dc88419 into ykjit:main Jun 14, 2021

ptersilie reviewed Jun 14, 2021

View reviewed changes

vext01 deleted the more-robust-bb-idxs branch June 15, 2021 07:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make finding the corresponding BasicBlock more robust. #3

Make finding the corresponding BasicBlock more robust. #3

vext01 commented Jun 14, 2021

ltratt commented Jun 14, 2021

bors bot commented Jun 14, 2021

ptersilie Jun 14, 2021

ltratt Jun 14, 2021

vext01 Jun 15, 2021

ptersilie Jun 15, 2021

Make finding the corresponding BasicBlock more robust. #3

Make finding the corresponding BasicBlock more robust. #3

Conversation

vext01 commented Jun 14, 2021

ltratt commented Jun 14, 2021

bors bot commented Jun 14, 2021

ptersilie Jun 14, 2021

Choose a reason for hiding this comment

ltratt Jun 14, 2021

Choose a reason for hiding this comment

vext01 Jun 15, 2021

Choose a reason for hiding this comment

ptersilie Jun 15, 2021

Choose a reason for hiding this comment