Use Component in OpenBSD::getCompilerRT to find libraries #1

blackgnezdo · 2021-09-02T13:55:17Z

OpenBSD with this change would use this name for ASan:
/usr/lib/clang/11.1.0/lib/libclang_rt.asan.a

Already submitted to OpenBSD repository.

OpenBSD with this change would use this name for ASan: /usr/lib/clang/11.1.0/lib/libclang_rt.asan.a Already submitted to OpenBSD repository.

blackgnezdo · 2021-09-02T13:55:40Z

It's in OpenBSD repo, but just in case you also want it here...

mordak · 2021-09-02T13:59:48Z

Thanks! I got this earlier today via my usual sync where I pull in all our llvm patches and apply them to all of the updated release branches and main. :-)

476fd2b

@yrnkrn

This patch re-introduces the fix in the commit llvm@66b0cebf7f736 by @yrnkrn > In DwarfEHPrepare, after all passes are run, RewindFunction may be a dangling > > pointer to a dead function. To make sure it's valid, doFinalization nullptrs > RewindFunction just like the constructor and so it will be found on next run. > > llvm-svn: 217737 It seems that the fix was not migrated to `DwarfEHPrepareLegacyPass`. This patch also updates `llvm/test/CodeGen/X86/dwarf-eh-prepare.ll` to include `-run-twice` to exercise the cleanup. Without this patch `llvm-lit -v llvm/test/CodeGen/X86/dwarf-eh-prepare.ll` fails with ``` -- Testing: 1 tests, 1 workers -- FAIL: LLVM :: CodeGen/X86/dwarf-eh-prepare.ll (1 of 1) ******************** TEST 'LLVM :: CodeGen/X86/dwarf-eh-prepare.ll' FAILED ******************** Script: -- : 'RUN: at line 1'; /home/arakaki/build/llvm-project/main/bin/opt -mtriple=x86_64-linux-gnu -dwarfehprepare -simplifycfg-require-and-preserve-domtree=1 -run-twice < /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll -S | /home/arakaki/build/llvm-project/main/bin/FileCheck /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll -- Exit Code: 2 Command Output (stderr): -- Referencing function in another module! call void @_Unwind_Resume(i8* %ehptr) #1 ; ModuleID = '<stdin>' void (i8*)* @_Unwind_Resume ; ModuleID = '<stdin>' in function simple_cleanup_catch LLVM ERROR: Broken function found, compilation aborted! PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace. Stack dump: 0. Program arguments: /home/arakaki/build/llvm-project/main/bin/opt -mtriple=x86_64-linux-gnu -dwarfehprepare -simplifycfg-require-and-preserve-domtree=1 -run-twice -S 1. Running pass 'Function Pass Manager' on module '<stdin>'. 2. Running pass 'Module Verifier' on function '@simple_cleanup_catch' #0 0x000056121b570a2c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Unix/Signals.inc:569:0 #1 0x000056121b56eb64 llvm::sys::RunSignalHandlers() /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Signals.cpp:97:0 #2 0x000056121b56f28e SignalHandler(int) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Unix/Signals.inc:397:0 #3 0x00007fc7e9b22980 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12980) llvm#4 0x00007fc7e87d3fb7 raise /build/glibc-S7xCS9/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0 llvm#5 0x00007fc7e87d5921 abort /build/glibc-S7xCS9/glibc-2.27/stdlib/abort.c:81:0 llvm#6 0x000056121b4e1386 llvm::raw_svector_ostream::raw_svector_ostream(llvm::SmallVectorImpl<char>&) /home/arakaki/repos/watch/llvm-project/llvm/include/llvm/Support/raw_ostream.h:674:0 llvm#7 0x000056121b4e1386 llvm::report_fatal_error(llvm::Twine const&, bool) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/ErrorHandling.cpp:114:0 llvm#8 0x000056121b4e1528 (/home/arakaki/build/llvm-project/main/bin/opt+0x29e3528) llvm#9 0x000056121adfd03f llvm::raw_ostream::operator<<(llvm::StringRef) /home/arakaki/repos/watch/llvm-project/llvm/include/llvm/Support/raw_ostream.h:218:0 FileCheck error: '<stdin>' is empty. FileCheck command line: /home/arakaki/build/llvm-project/main/bin/FileCheck /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll -- ******************** ******************** Failed Tests (1): LLVM :: CodeGen/X86/dwarf-eh-prepare.ll Testing Time: 0.22s Failed: 1 ``` Reviewed By: loladiro Differential Revision: https://reviews.llvm.org/D110979

Script for automatic 'opt' pipeline reduction for when using the new pass-manager (NPM). Based around the '-print-pipeline-passes' option. The reduction algorithm consists of several phases (steps). Step #0: Verify that input fails with the given pipeline and make note of the error code. Step #1: Split pipeline in two starting from front and move forward as long as first pipeline exits normally and the second pipeline fails with the expected error code. Move on to step #2 with the IR from the split point and the pipeline from the second invocation. Step #2: Remove passes from end of the pipeline as long as the pipeline fails with the expected error code. Step #3: Make several sweeps over the remaining pipeline trying to remove one pass at a time. Repeat sweeps until unable to remove any more passes. Usage example: ./utils/reduce_pipeline.py --opt-binary=./build-all-Debug/bin/opt --input=input.ll --output=output.ll --passes=PIPELINE [EXTRA-OPT-ARGS ...] Differential Revision: https://reviews.llvm.org/D110908

Although THREADLOCAL variables are supported on Darwin they cannot be used very early on during process init (before dyld has set it up). Unfortunately the checked lock is used before dyld has setup TLS leading to an abort call (`_tlv_boostrap()` is never supposed to be called at runtime). To avoid this problem `SANITIZER_CHECK_DEADLOCKS` is now disabled on Darwin platforms. This fixes running TSan tests (an possibly other Sanitizers) when `COMPILER_RT_DEBUG=ON`. For reference the crashing backtrace looks like this: ``` * thread #1, stop reason = signal SIGABRT * frame #0: 0x00000002044da0ae dyld`__abort_with_payload + 10 frame #1: 0x00000002044f01af dyld`abort_with_payload_wrapper_internal + 80 frame #2: 0x00000002044f01e1 dyld`abort_with_payload + 9 frame #3: 0x000000010c989060 dyld_sim`abort_with_payload + 26 frame llvm#4: 0x000000010c94908b dyld_sim`dyld4::halt(char const*) + 375 frame llvm#5: 0x000000010c988f5c dyld_sim`abort + 16 frame llvm#6: 0x000000010c96104f dyld_sim`dyld4::APIs::_tlv_bootstrap() + 9 frame llvm#7: 0x000000010cd8d6d2 libclang_rt.tsan_iossim_dynamic.dylib`__sanitizer::CheckedMutex::LockImpl(this=<unavailable>, pc=<unavailable>) at sanitizer_mutex.cpp:218:58 [opt] frame llvm#8: 0x000000010cd8a0f7 libclang_rt.tsan_iossim_dynamic.dylib`__sanitizer::Mutex::Lock() [inlined] __sanitizer::CheckedMutex::Lock(this=0x000000010d733c90) at sanitizer_mutex.h:124:5 [opt] frame llvm#9: 0x000000010cd8a0ee libclang_rt.tsan_iossim_dynamic.dylib`__sanitizer::Mutex::Lock(this=0x000000010d733c90) at sanitizer_mutex.h:162:19 [opt] frame llvm#10: 0x000000010cd8a0bf libclang_rt.tsan_iossim_dynamic.dylib`__sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock(this=0x000000030c7479a8, mu=<unavailable>) at sanitizer_mutex.h:364:10 [opt] frame llvm#11: 0x000000010cd89819 libclang_rt.tsan_iossim_dynamic.dylib`__sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock(this=0x000000030c7479a8, mu=<unavailable>) at sanitizer_mutex.h:363:67 [opt] frame llvm#12: 0x000000010cd8985b libclang_rt.tsan_iossim_dynamic.dylib`__sanitizer::LibIgnore::OnLibraryLoaded(this=0x000000010d72f480, name=0x0000000000000000) at sanitizer_libignore.cpp:39:8 [opt] frame llvm#13: 0x000000010cda7aaa libclang_rt.tsan_iossim_dynamic.dylib`__tsan::InitializeLibIgnore() at tsan_interceptors_posix.cpp:219:16 [opt] frame llvm#14: 0x000000010cdce0bb libclang_rt.tsan_iossim_dynamic.dylib`__tsan::Initialize(thr=0x0000000110141400) at tsan_rtl.cpp:403:3 [opt] frame llvm#15: 0x000000010cda7b8e libclang_rt.tsan_iossim_dynamic.dylib`__tsan::ScopedInterceptor::ScopedInterceptor(__tsan::ThreadState*, char const*, unsigned long) [inlined] __tsan::LazyInitialize(thr=0x0000000110141400) at tsan_rtl.h:665:5 [opt] frame llvm#16: 0x000000010cda7b86 libclang_rt.tsan_iossim_dynamic.dylib`__tsan::ScopedInterceptor::ScopedInterceptor(this=0x000000030c747af8, thr=0x0000000110141400, fname=<unavailable>, pc=4568918787) at tsan_interceptors_posix.cpp:247:3 [opt] frame llvm#17: 0x000000010cda7bb9 libclang_rt.tsan_iossim_dynamic.dylib`__tsan::ScopedInterceptor::ScopedInterceptor(this=0x000000030c747af8, thr=<unavailable>, fname=<unavailable>, pc=<unavailable>) at tsan_interceptors_posix.cpp:246:59 [opt] frame llvm#18: 0x000000010cdb72b7 libclang_rt.tsan_iossim_dynamic.dylib`::wrap_strlcpy(dst="\xd2", src="0xd1d398d1bb0a007b", size=20) at sanitizer_common_interceptors.inc:7386:3 [opt] frame llvm#19: 0x0000000110542b03 libsystem_c.dylib`__guard_setup + 140 frame llvm#20: 0x00000001104f8ab4 libsystem_c.dylib`_libc_initializer + 65 ... ``` rdar://83723445 Differential Revision: https://reviews.llvm.org/D111243

When inserting a scalable subvector into a scalable vector through the stack, the index to store to needs to be scaled by vscale. Before this patch, that didn't yet happen, so it would generate the wrong offset, thus storing a subvector to the incorrect address and overwriting the wrong lanes. For some insert: nxv8f16 insert_subvector(nxv8f16 %vec, nxv2f16 %subvec, i64 2) The offset was not scaled by vscale: orr x8, x8, #0x4 st1h { z0.h }, p0, [sp] st1h { z1.d }, p1, [x8] ld1h { z0.h }, p0/z, [sp] And is changed to: mov x8, sp st1h { z0.h }, p0, [sp] st1h { z1.d }, p1, [x8, #1, mul vl] ld1h { z0.h }, p0/z, [sp] Differential Revision: https://reviews.llvm.org/D111633

PPC64 bot failed with the following error. The buildbot output is not particularly useful, but looking at other similar tests, it seems that there is something broken in free stacks on PPC64. Use the same hack as other tests use to expect an additional stray frame. /home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/compiler-rt/test/tsan/free_race3.c:28:11: error: CHECK: expected string not found in input // CHECK: Previous write of size 4 at {{.*}} by thread T1{{.*}}: ^ <stdin>:13:9: note: scanning from here #1 main /home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/compiler-rt/test/tsan/free_race3.c:17:3 (free_race3.c.tmp+0x1012fab8) ^ <stdin>:17:2: note: possible intended match here ThreadSanitizer: reported 1 warnings ^ Input file: <stdin> Check file: /home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/compiler-rt/test/tsan/free_race3.c -dump-input=help explains the following input dump. Input was: <<<<<< . . . 8: Previous write of size 4 at 0x7ffff4d01ab0 by thread T1: 9: #0 Thread /home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/compiler-rt/test/tsan/free_race3.c:8:10 (free_race3.c.tmp+0x1012f9dc) 10: 11: Thread T1 (tid=3222898, finished) created by main thread at: 12: #0 pthread_create /home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1001:3 (free_race3.c.tmp+0x100b9040) 13: #1 main /home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/compiler-rt/test/tsan/free_race3.c:17:3 (free_race3.c.tmp+0x1012fab8) check:28'0 X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found 14: check:28'0 ~ 15: SUMMARY: ThreadSanitizer: data race /home/buildbots/ppc64le-clang-lnt-test/clang-ppc64le-lnt/llvm/compiler-rt/test/tsan/free_race3.c:19:3 in main check:28'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 16: ================== check:28'0 ~~~~~~~~~~~~~~~~~~~ 17: ThreadSanitizer: reported 1 warnings check:28'0 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ check:28'1 ? possible intended match >>>>>> Reviewed By: melver Differential Revision: https://reviews.llvm.org/D112444

@yrnkrn

This patch re-introduces the fix in the commit llvm@66b0cebf7f736 by @yrnkrn > In DwarfEHPrepare, after all passes are run, RewindFunction may be a dangling > > pointer to a dead function. To make sure it's valid, doFinalization nullptrs > RewindFunction just like the constructor and so it will be found on next run. > > llvm-svn: 217737 It seems that the fix was not migrated to `DwarfEHPrepareLegacyPass`. This patch also updates `llvm/test/CodeGen/X86/dwarf-eh-prepare.ll` to include `-run-twice` to exercise the cleanup. Without this patch `llvm-lit -v llvm/test/CodeGen/X86/dwarf-eh-prepare.ll` fails with ``` -- Testing: 1 tests, 1 workers -- FAIL: LLVM :: CodeGen/X86/dwarf-eh-prepare.ll (1 of 1) ******************** TEST 'LLVM :: CodeGen/X86/dwarf-eh-prepare.ll' FAILED ******************** Script: -- : 'RUN: at line 1'; /home/arakaki/build/llvm-project/main/bin/opt -mtriple=x86_64-linux-gnu -dwarfehprepare -simplifycfg-require-and-preserve-domtree=1 -run-twice < /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll -S | /home/arakaki/build/llvm-project/main/bin/FileCheck /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll -- Exit Code: 2 Command Output (stderr): -- Referencing function in another module! call void @_Unwind_Resume(i8* %ehptr) #1 ; ModuleID = '<stdin>' void (i8*)* @_Unwind_Resume ; ModuleID = '<stdin>' in function simple_cleanup_catch LLVM ERROR: Broken function found, compilation aborted! PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace. Stack dump: 0. Program arguments: /home/arakaki/build/llvm-project/main/bin/opt -mtriple=x86_64-linux-gnu -dwarfehprepare -simplifycfg-require-and-preserve-domtree=1 -run-twice -S 1. Running pass 'Function Pass Manager' on module '<stdin>'. 2. Running pass 'Module Verifier' on function '@simple_cleanup_catch' #0 0x000056121b570a2c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Unix/Signals.inc:569:0 #1 0x000056121b56eb64 llvm::sys::RunSignalHandlers() /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Signals.cpp:97:0 #2 0x000056121b56f28e SignalHandler(int) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Unix/Signals.inc:397:0 #3 0x00007fc7e9b22980 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12980) llvm#4 0x00007fc7e87d3fb7 raise /build/glibc-S7xCS9/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0 llvm#5 0x00007fc7e87d5921 abort /build/glibc-S7xCS9/glibc-2.27/stdlib/abort.c:81:0 llvm#6 0x000056121b4e1386 llvm::raw_svector_ostream::raw_svector_ostream(llvm::SmallVectorImpl<char>&) /home/arakaki/repos/watch/llvm-project/llvm/include/llvm/Support/raw_ostream.h:674:0 llvm#7 0x000056121b4e1386 llvm::report_fatal_error(llvm::Twine const&, bool) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/ErrorHandling.cpp:114:0 llvm#8 0x000056121b4e1528 (/home/arakaki/build/llvm-project/main/bin/opt+0x29e3528) llvm#9 0x000056121adfd03f llvm::raw_ostream::operator<<(llvm::StringRef) /home/arakaki/repos/watch/llvm-project/llvm/include/llvm/Support/raw_ostream.h:218:0 FileCheck error: '<stdin>' is empty. FileCheck command line: /home/arakaki/build/llvm-project/main/bin/FileCheck /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll -- ******************** ******************** Failed Tests (1): LLVM :: CodeGen/X86/dwarf-eh-prepare.ll Testing Time: 0.22s Failed: 1 ``` Reviewed By: loladiro Differential Revision: https://reviews.llvm.org/D110979 (cherry picked from commit e8806d7)

Fixes a CHECK-failure caused by glibc's pthread_getattr_np implementation calling realloc. Essentially, Thread::GenerateRandomTag gets called during Thread::Init and before Thread::InitRandomState: HWAddressSanitizer: CHECK failed: hwasan_thread.cpp:134 "((random_buffer_)) != (0)" (0x0, 0x0) (tid=314) #0 0x55845475a662 in __hwasan::CheckUnwind() #1 0x558454778797 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) #2 0x558454766461 in __hwasan::Thread::GenerateRandomTag(unsigned long) #3 0x55845475c58b in __hwasan::HwasanAllocate(__sanitizer::StackTrace*, unsigned long, unsigned long, bool) llvm#4 0x55845475c80a in __hwasan::hwasan_realloc(void*, unsigned long, __sanitizer::StackTrace*) llvm#5 0x5584547608aa in realloc llvm#6 0x7f6f3a3d8c2c in pthread_getattr_np llvm#7 0x5584547790dc in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) llvm#8 0x558454779651 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) llvm#9 0x558454761bca in __hwasan::Thread::InitStackAndTls(__hwasan::Thread::InitState const*) llvm#10 0x558454761e5c in __hwasan::HwasanThreadList::CreateCurrentThread(__hwasan::Thread::InitState const*) llvm#11 0x55845476184f in __hwasan_thread_enter llvm#12 0x558454760def in HwasanThreadStartFunc(void*) llvm#13 0x7f6f3a3d6fa2 in start_thread llvm#14 0x7f6f3a15b4ce in __clone Also reverts 7a3fb71, as it's now unneeded. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D113045

…turn to external addr part) Before we have an issue with artificial LBR whose source is a return, recalling that "an internal code(A) can return to external address, then from the external address call a new internal code(B), making an artificial branch that looks like a return from A to B can confuse the unwinder". We just ignore the LBRs after this artificial LBR which can miss some samples. This change aims at fixing this by correctly unwinding them instead of ignoring them. List some typical scenarios covered by this change. 1) multiple sequential call back happen in external address, e.g. ``` [ext, call, foo] [foo, return, ext] [ext, call, bar] ``` Unwinder should avoid having foo return from bar. Wrong call stack is like [foo, bar] 2) the call stack before and after external call should be correctly unwinded. ``` {call stack1} {call stack2} [foo, call, ext] [ext, call, bar] [bar, return, ext] [ext, return, foo ] ``` call stack 1 should be the same to call stack2. Both shouldn't be truncated 3) call stack should be truncated after call into external code since we can't do inlining with external code. ``` [foo, call, ext] [ext, call, bar] [bar, call, baz] [baz, return, bar ] [bar, return, ext] ``` the call stack of code in baz should not include foo. ### Implementation: We leverage artificial frame to fix #2 and #3: when we got a return artificial LBR, push an extra artificial frame to the stack. when we pop frame, check if the parent is an artificial frame to pop(fix #2). Therefore, call/ return artificial LBR is just the same as regular LBR which can keep the call stack. While recording context on the trie, artificial frame is used as a tag indicating that we should truncate the call stack(fix #3). To differentiate #1 and #2, we leverage `getCallAddrFromFrameAddr`. Normally the target of the return should be the next inst of a call inst and `getCallAddrFromFrameAddr` will return the address of call inst. Otherwise, getCallAddrFromFrameAddr will return to 0 which is the case of #1. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D115550

@s

…ce characters in lookup names when parsing the ctu index file This error was found when analyzing MySQL with CTU enabled. When there are space characters in the lookup name, the current delimiter searching strategy will make the file path wrongly parsed. And when two lookup names have the same prefix before their first space characters, a 'multiple definitions' error will be wrongly reported. e.g. The lookup names for the two lambda exprs in the test case are `c:@s@G@F@G#@sa@F@operator int (*)(char)#1` and `c:@s@G@F@G#@sa@F@operator bool (*)(char)#1` respectively. And their prefixes are both `c:@s@G@F@G#@sa@F@operator` when using the first space character as the delimiter. Solving the problem by adding a length for the lookup name, making the index items in the format of `USR-Length:USR File-Path`. Reviewed By: steakhal Differential Revision: https://reviews.llvm.org/D102669

…he parser" This reverts commit b0e8667. ASAN/UBSAN bot is broken with this trace: [ RUN ] FlatAffineConstraintsTest.FindSampleTest llvm-project/mlir/include/mlir/Support/MathExtras.h:27:15: runtime error: signed integer overflow: 1229996100002 * 809999700000 cannot be represented in type 'long' #0 0x7f63ace960e4 in mlir::ceilDiv(long, long) llvm-project/mlir/include/mlir/Support/MathExtras.h:27:15 #1 0x7f63ace8587e in ceil llvm-project/mlir/include/mlir/Analysis/Presburger/Fraction.h:57:42 #2 0x7f63ace8587e in operator* llvm-project/llvm/include/llvm/ADT/STLExtras.h:347:42 #3 0x7f63ace8587e in uninitialized_copy<llvm::mapped_iterator<mlir::Fraction *, long (*)(mlir::Fraction), long>, long *> include/c++/v1/__memory/uninitialized_algorithms.h:36:62 llvm#4 0x7f63ace8587e in uninitialized_copy<llvm::mapped_iterator<mlir::Fraction *, long (*)(mlir::Fraction), long>, long *> llvm-project/llvm/include/llvm/ADT/SmallVector.h:490:5 llvm#5 0x7f63ace8587e in append<llvm::mapped_iterator<mlir::Fraction *, long (*)(mlir::Fraction), long>, void> llvm-project/llvm/include/llvm/ADT/SmallVector.h:662:5 llvm#6 0x7f63ace8587e in SmallVector<llvm::mapped_iterator<mlir::Fraction *, long (*)(mlir::Fraction), long> > llvm-project/llvm/include/llvm/ADT/SmallVector.h:1204:11 llvm#7 0x7f63ace8587e in mlir::FlatAffineConstraints::findIntegerSample() const llvm-project/mlir/lib/Analysis/AffineStructures.cpp:1171:27 llvm#8 0x7f63ae95a84d in mlir::checkSample(bool, mlir::FlatAffineConstraints const&, mlir::TestFunction) llvm-project/mlir/unittests/Analysis/AffineStructuresTest.cpp:37:23 llvm#9 0x7f63ae957545 in mlir::FlatAffineConstraintsTest_FindSampleTest_Test::TestBody() llvm-project/mlir/unittests/Analysis/AffineStructuresTest.cpp:222:3

…se of OpenMP task construct Currently variables appearing inside shared clause of OpenMP task construct are not visible inside lldb debugger. After the current patch, lldb is able to show the variable ``` * thread #1, name = 'a.out', stop reason = breakpoint 1.1 frame #0: 0x0000000000400934 a.out`.omp_task_entry. [inlined] .omp_outlined.(.global_tid.=0, .part_id.=0x000000000071f0d0, .privates.=0x000000000071f0e8, .copy_fn.=(a.out`.omp_task_privates_map. at testshared.cxx:8), .task_t.=0x000000000071f0c0, __context=0x000000000071f0f0) at testshared.cxx:10:34 7 else { 8 #pragma omp task shared(svar) firstprivate(n) 9 { -> 10 printf("Task svar = %d\n", svar); 11 printf("Task n = %d\n", n); 12 svar = fib(n - 1); 13 } (lldb) p svar (int) $0 = 9 ``` Reviewed By: djtodoro Differential Revision: https://reviews.llvm.org/D115510

The Support directory was removed from the unittests cmake when the directory was removed in 204c3b5. Subsequent commits added the directory back but seem to have missed adding it back to the cmake. This patch also removes MLIRSupportIndentedStream from the list of linked libraries to avoid an ODR violation (it's already part of MLIRSupport which is also being linked here). Otherwise ASAN complains: ``` ================================================================= ==102592==ERROR: AddressSanitizer: odr-violation (0x7fbdf214eee0): [1] size=120 'vtable for mlir::raw_indented_ostream' /home/arjun/llvm-project/mlir/lib/Support/IndentedOstream.cpp [2] size=120 'vtable for mlir::raw_indented_ostream' /home/arjun/llvm-project/mlir/lib/Support/IndentedOstream.cpp These globals were registered at these points: [1]: #0 0x28a71d in __asan_register_globals (/home/arjun/llvm-project/build/tools/mlir/unittests/Support/MLIRSupportTests+0x28a71d) #1 0x7fbdf214a61b in asan.module_ctor (/home/arjun/llvm-project/build/lib/libMLIRSupportIndentedOstream.so.14git+0x661b) [2]: #0 0x28a71d in __asan_register_globals (/home/arjun/llvm-project/build/tools/mlir/unittests/Support/MLIRSupportTests+0x28a71d) #1 0x7fbdf2061c4b in asan.module_ctor (/home/arjun/llvm-project/build/lib/libMLIRSupport.so.14git+0x11bc4b) ==102592==HINT: if you don't care about these errors you may set ASAN_OPTIONS=detect_odr_violation=0 SUMMARY AddressSanitizer: odr-violation: global 'vtable for mlir::raw_indented_ostream' at /home/arjun/llvm-project/mlir/lib/Support/IndentedOstream.cpp ==102592==ABORTING ``` Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D116027

The Support directory was removed from the unittests cmake when the directory was removed in 204c3b5. Subsequent commits added the directory back but seem to have missed adding it back to the cmake. This patch also removes MLIRSupportIndentedStream from the list of linked libraries to avoid an ODR violation (it's already part of MLIRSupport which is also being linked here). Otherwise ASAN complains: ``` ================================================================= ==102592==ERROR: AddressSanitizer: odr-violation (0x7fbdf214eee0): [1] size=120 'vtable for mlir::raw_indented_ostream' /home/arjun/llvm-project/mlir/lib/Support/IndentedOstream.cpp [2] size=120 'vtable for mlir::raw_indented_ostream' /home/arjun/llvm-project/mlir/lib/Support/IndentedOstream.cpp These globals were registered at these points: [1]: #0 0x28a71d in __asan_register_globals (/home/arjun/llvm-project/build/tools/mlir/unittests/Support/MLIRSupportTests+0x28a71d) #1 0x7fbdf214a61b in asan.module_ctor (/home/arjun/llvm-project/build/lib/libMLIRSupportIndentedOstream.so.14git+0x661b) [2]: #0 0x28a71d in __asan_register_globals (/home/arjun/llvm-project/build/tools/mlir/unittests/Support/MLIRSupportTests+0x28a71d) #1 0x7fbdf2061c4b in asan.module_ctor (/home/arjun/llvm-project/build/lib/libMLIRSupport.so.14git+0x11bc4b) ==102592==HINT: if you don't care about these errors you may set ASAN_OPTIONS=detect_odr_violation=0 SUMMARY AddressSanitizer: odr-violation: global 'vtable for mlir::raw_indented_ostream' at /home/arjun/llvm-project/mlir/lib/Support/IndentedOstream.cpp ==102592==ABORTING ``` This patch also fixes a build issue with `DebugAction::classof` under Windows. This commit re-lands this patch, which was previously reverted in 2132906 due to a buildbot failure that turned out to be because of a flaky test. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D116027

Segmentation fault in ompt_tsan_dependences function due to an unchecked NULL pointer dereference is as follows: ``` ThreadSanitizer:DEADLYSIGNAL ==140865==ERROR: ThreadSanitizer: SEGV on unknown address 0x000000000050 (pc 0x7f217c2d3652 bp 0x7ffe8cfc7e00 sp 0x7ffe8cfc7d90 T140865) ==140865==The signal is caused by a READ memory access. ==140865==Hint: address points to the zero page. /usr/bin/addr2line: DWARF error: could not find variable specification at offset 1012a /usr/bin/addr2line: DWARF error: could not find variable specification at offset 133b5 /usr/bin/addr2line: DWARF error: could not find variable specification at offset 1371a /usr/bin/addr2line: DWARF error: could not find variable specification at offset 13a58 #0 ompt_tsan_dependences(ompt_data_t*, ompt_dependence_t const*, int) /ptmp/bhararit/llvm-project/openmp/tools/archer/ompt-tsan.cpp:1004 (libarcher.so+0x15652) #1 __kmpc_doacross_post /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_csupport.cpp:4280 (libomp.so+0x74d98) #2 .omp_outlined. for_ordered_01.c:? (for_ordered_01.exe+0x5186cb) #3 __kmp_invoke_microtask /ptmp/bhararit/llvm-project/openmp/runtime/src/z_Linux_asm.S:1166 (libomp.so+0x14e592) llvm#4 __kmp_invoke_task_func /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_runtime.cpp:7556 (libomp.so+0x909ad) llvm#5 __kmp_fork_call /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_runtime.cpp:2284 (libomp.so+0x8461a) llvm#6 __kmpc_fork_call /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_csupport.cpp:308 (libomp.so+0x6db55) llvm#7 main ??:? (for_ordered_01.exe+0x51828f) llvm#8 __libc_start_main ??:? (libc.so.6+0x24349) llvm#9 _start /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:120 (for_ordered_01.exe+0x4214e9) ThreadSanitizer can not provide additional info. SUMMARY: ThreadSanitizer: SEGV /ptmp/bhararit/llvm-project/openmp/tools/archer/ompt-tsan.cpp:1004 in ompt_tsan_dependences(ompt_data_t*, ompt_dependence_t const*, int) ==140865==ABORTING ``` To reproduce the error, use the following openmp code snippet: ``` /* initialise testMatrixInt Matrix, cols, r and c */ #pragma omp parallel private(r,c) shared(testMatrixInt) { #pragma omp for ordered(2) for (r=1; r < rows; r++) { for (c=1; c < cols; c++) { #pragma omp ordered depend(sink:r-1, c+1) depend(sink:r-1,c-1) testMatrixInt[r][c] = (testMatrixInt[r-1][c] + testMatrixInt[r-1][c-1]) % cols ; #pragma omp ordered depend (source) } } } ``` Compilation: ``` clang -g -stdlib=libc++ -fsanitize=thread -fopenmp -larcher test_case.c ``` It seems like the changes introduced by the commit https://reviews.llvm.org/D114005 causes this particular SEGV while using Archer. Reviewed By: protze.joachim Differential Revision: https://reviews.llvm.org/D115328

This reverts commit ea75be3 and 1eb5b6e. That commit caused crashes with compilation e.g. like this (not fixed by the follow-up commit): $ cat sqrt.c float a; b() { sqrt(a); } $ clang -target x86_64-linux-gnu -c -O2 sqrt.c Attributes 'readnone and writeonly' are incompatible! %sqrtf = tail call float @sqrtf(float %0) #1 in function b fatal error: error in backend: Broken function found, compilation aborted!

We experienced some deadlocks when we used multiple threads for logging using `scan-builds` intercept-build tool when we used multiple threads by e.g. logging `make -j16` ``` (gdb) bt #0 0x00007f2bb3aff110 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0 #1 0x00007f2bb3af70a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0 #2 0x00007f2bb3d152e4 in ?? () #3 0x00007ffcc5f0cc80 in ?? () llvm#4 0x00007f2bb3d2bf5b in ?? () from /lib64/ld-linux-x86-64.so.2 llvm#5 0x00007f2bb3b5da27 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 llvm#6 0x00007f2bb3b5dbe0 in exit () from /lib/x86_64-linux-gnu/libc.so.6 llvm#7 0x00007f2bb3d144ee in ?? () llvm#8 0x746e692f706d742f in ?? () llvm#9 0x692d747065637265 in ?? () llvm#10 0x2f653631326b3034 in ?? () llvm#11 0x646d632e35353532 in ?? () llvm#12 0x0000000000000000 in ?? () ``` I think the gcc's exit call caused the injected `libear.so` to be unloaded by the `ld`, which in turn called the `void on_unload() __attribute__((destructor))`. That tried to acquire an already locked mutex which was left locked in the `bear_report_call()` call, that probably encountered some error and returned early when it forgot to unlock the mutex. All of these are speculation since from the backtrace I could not verify if frames 2 and 3 are in fact corresponding to the `libear.so` module. But I think it's a fairly safe bet. So, hereby I'm releasing the held mutex on *all paths*, even if some failure happens. PS: I would use lock_guards, but it's C. Reviewed-by: NoQ Differential Revision: https://reviews.llvm.org/D118439

We experienced some deadlocks when we used multiple threads for logging using `scan-builds` intercept-build tool when we used multiple threads by e.g. logging `make -j16` ``` (gdb) bt #0 0x00007f2bb3aff110 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0 #1 0x00007f2bb3af70a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0 #2 0x00007f2bb3d152e4 in ?? () #3 0x00007ffcc5f0cc80 in ?? () llvm#4 0x00007f2bb3d2bf5b in ?? () from /lib64/ld-linux-x86-64.so.2 llvm#5 0x00007f2bb3b5da27 in ?? () from /lib/x86_64-linux-gnu/libc.so.6 llvm#6 0x00007f2bb3b5dbe0 in exit () from /lib/x86_64-linux-gnu/libc.so.6 llvm#7 0x00007f2bb3d144ee in ?? () llvm#8 0x746e692f706d742f in ?? () llvm#9 0x692d747065637265 in ?? () llvm#10 0x2f653631326b3034 in ?? () llvm#11 0x646d632e35353532 in ?? () llvm#12 0x0000000000000000 in ?? () ``` I think the gcc's exit call caused the injected `libear.so` to be unloaded by the `ld`, which in turn called the `void on_unload() __attribute__((destructor))`. That tried to acquire an already locked mutex which was left locked in the `bear_report_call()` call, that probably encountered some error and returned early when it forgot to unlock the mutex. All of these are speculation since from the backtrace I could not verify if frames 2 and 3 are in fact corresponding to the `libear.so` module. But I think it's a fairly safe bet. So, hereby I'm releasing the held mutex on *all paths*, even if some failure happens. PS: I would use lock_guards, but it's C. Reviewed-by: NoQ Differential Revision: https://reviews.llvm.org/D118439 (cherry picked from commit d919d02)

A LUI instruction with flag RISCVII::MO_HI is usually used in conjunction with ADDI, and jointly complete address computation. To bind the cost evaluation of address computation, the LUI should not be regarded as a cheap move separately, which is consistent with ADDI. In this test case, it improves the unroll-loop code that the rematerialization of array's base address miss MachineCSE with Heuristics #1 at isProfitableToCSE. Reviewed By: asb, frasercrmck Differential Revision: https://reviews.llvm.org/D118216

This patch fixes a data race in IOHandlerProcessSTDIO. The race is happens between the main thread and the event handling thread. The main thread is running the IOHandler (IOHandlerProcessSTDIO::Run()) when an event comes in that makes us pop the process IO handler which involves cancelling the IOHandler (IOHandlerProcessSTDIO::Cancel). The latter calls SetIsDone(true) which modifies m_is_done. At the same time, we have the main thread reading the variable through GetIsDone(). This patch avoids the race by using a mutex to synchronize the two threads. On the event thread, in IOHandlerProcessSTDIO ::Cancel method, we obtain the lock before changing the value of m_is_done. On the main thread, in IOHandlerProcessSTDIO::Run(), we obtain the lock before reading the value of m_is_done. Additionally, we delay calling SetIsDone until after the loop exists, to avoid a potential race between the two writes. Write of size 1 at 0x00010b66bb68 by thread T7 (mutexes: write M2862, write M718324145051843688): #0 lldb_private::IOHandler::SetIsDone(bool) IOHandler.h:90 (liblldb.15.0.0git.dylib:arm64+0x971d84) #1 IOHandlerProcessSTDIO::Cancel() Process.cpp:4382 (liblldb.15.0.0git.dylib:arm64+0x5ddfec) #2 lldb_private::Debugger::PopIOHandler(std::__1::shared_ptr<lldb_private::IOHandler> const&) Debugger.cpp:1156 (liblldb.15.0.0git.dylib:arm64+0x3cb2a8) #3 lldb_private::Debugger::RemoveIOHandler(std::__1::shared_ptr<lldb_private::IOHandler> const&) Debugger.cpp:1063 (liblldb.15.0.0git.dylib:arm64+0x3cbd2c) llvm#4 lldb_private::Process::PopProcessIOHandler() Process.cpp:4487 (liblldb.15.0.0git.dylib:arm64+0x5c583c) llvm#5 lldb_private::Debugger::HandleProcessEvent(std::__1::shared_ptr<lldb_private::Event> const&) Debugger.cpp:1549 (liblldb.15.0.0git.dylib:arm64+0x3ceabc) llvm#6 lldb_private::Debugger::DefaultEventHandler() Debugger.cpp:1622 (liblldb.15.0.0git.dylib:arm64+0x3cf2c0) llvm#7 std::__1::__function::__func<lldb_private::Debugger::StartEventHandlerThread()::$_2, std::__1::allocator<lldb_private::Debugger::StartEventHandlerThread()::$_2>, void* ()>::operator()() function.h:352 (liblldb.15.0.0git.dylib:arm64+0x3d1bd8) llvm#8 lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(void*) HostNativeThreadBase.cpp:62 (liblldb.15.0.0git.dylib:arm64+0x4c71ac) llvm#9 lldb_private::HostThreadMacOSX::ThreadCreateTrampoline(void*) HostThreadMacOSX.mm:18 (liblldb.15.0.0git.dylib:arm64+0x29ef544) Previous read of size 1 at 0x00010b66bb68 by main thread: #0 lldb_private::IOHandler::GetIsDone() IOHandler.h:92 (liblldb.15.0.0git.dylib:arm64+0x971db8) #1 IOHandlerProcessSTDIO::Run() Process.cpp:4339 (liblldb.15.0.0git.dylib:arm64+0x5ddc7c) #2 lldb_private::Debugger::RunIOHandlers() Debugger.cpp:982 (liblldb.15.0.0git.dylib:arm64+0x3cb48c) #3 lldb_private::CommandInterpreter::RunCommandInterpreter(lldb_private::CommandInterpreterRunOptions&) CommandInterpreter.cpp:3298 (liblldb.15.0.0git.dylib:arm64+0x506478) llvm#4 lldb::SBDebugger::RunCommandInterpreter(bool, bool) SBDebugger.cpp:1166 (liblldb.15.0.0git.dylib:arm64+0x53604) llvm#5 Driver::MainLoop() Driver.cpp:634 (lldb:arm64+0x100006294) llvm#6 main Driver.cpp:853 (lldb:arm64+0x100007344) Differential revision: https://reviews.llvm.org/D120762

This adds the jump slot mapping for RISCV. This enables lldb to attach to a remote debug server. Although this doesn't enable debugging RISCV targets, it is sufficient to attach, which is a slight improvement. Tested with DebugServer2: ~~~ (lldb) gdb-remote localhost:1234 (lldb) Process 71438 stopped * thread #1, name = 'reduced', stop reason = signal SIGTRAP frame #0: 0x0000003ff7fe1b20 error: Process 71438 is currently being debugged, kill the process before connecting. (lldb) register read general: x0 = 0x0000003ff7fe1b20 x1 = 0x0000002ae00d3a50 x2 = 0x0000003ffffff3e0 x3 = 0x0000002ae01566e0 x4 = 0x0000003fe567c7b0 x5 = 0x0000000000001000 x6 = 0x0000002ae00604ec x7 = 0x00000000000003ff x8 = 0x0000003fffc22db0 x9 = 0x0000000000000000 x10 = 0x0000000000000000 x11 = 0x0000002ae603b1c0 x12 = 0x0000002ae6039350 x13 = 0x0000000000000000 x14 = 0x0000002ae6039350 x15 = 0x0000002ae6039350 x16 = 0x73642f74756f3d5f x17 = 0x00000000000000dd x18 = 0x0000002ae6038f08 x19 = 0x0000002ae603b1c0 x20 = 0x0000002b0f3d3f40 x21 = 0x0000003ff0b212d0 x22 = 0x0000002b0f3a2740 x23 = 0x0000002b0f3de3a0 x24 = 0x0000002b0f3d3f40 x25 = 0x0000002ad6929850 x26 = 0x0000000000000000 x27 = 0x0000002ad69297c0 x28 = 0x0000003fe578b364 x29 = 0x000000000000002f x30 = 0x0000000000000000 x31 = 0x0000002ae602401a pc = 0x0000003ff7fe1b20 ft0 = 0 ft1 = 0 ft2 = 0 ft3 = 0 ft4 = 0 ft5 = 0 ft6 = 0 ft7 = 0 fs0 = 0 fs1 = 0 fa0 = 0 fa1 = 0 fa2 = 0 fa3 = 0 fa4 = 0 fa5 = 0 fa6 = 0 fa7 = 9.10304232197721e-313 fs2 = 0 fs3 = 1.35805727667792e-312 fs4 = 1.35589259164679e-312 fs5 = 1.35805727659887e-312 fs6 = 9.10304232355822e-313 fs7 = 0 fs8 = 9.10304233027751e-313 fs9 = 0 fs10 = 9.10304232948701e-313 fs11 = 1.35588724164707e-312 ft8 = 0 ft9 = 9.1372158616833e-313 ft10 = 9.13720376537528e-313 ft11 = 1.356808717416e-312 3 registers were unavailable. (lldb) disassemble error: Failed to disassemble memory at 0x3ff7fe1b2 ~~~

Add support to inspect the ELF headers for RISCV targets to determine if RVC or RVE are enabled and the floating point support to enable. As per the RISCV specification, d implies f, q implies d implies f, which gives us the cascading effect that is used to enable the features when setting up the disassembler. With this change, it is now possible to attach the debugger to a remote process and be able to disassemble the instruction stream. ~~~ $ bin/lldb tmp/reduced (lldb) target create "reduced" Current executable set to '/tmp/reduced' (riscv64). (lldb) gdb-remote localhost:1234 (lldb) Process 5737 stopped * thread #1, name = 'reduced', stop reason = signal SIGTRAP frame #0: 0x0000003ff7fe1b20 -> 0x3ff7fe1b20: mv a0, sp 0x3ff7fe1b22: jal 1936 0x3ff7fe1b26: mv s0, a0 0x3ff7fe1b28: auipc a0, 27 ~~~

I'm adding two new classes that can be used to measure the duration of long tasks as process and thread level, e.g. decoding, fetching data from lldb-server, etc. In this first patch, I'm using it to measure the time it takes to decode each thread, which is printed out with the `dump info` command. In a later patch I'll start adding process-level tasks and I might move these classes to the upper Trace level, instead of having them in the intel-pt plugin. I might need to do that anyway in the future when we have to measure HTR. For now, I want to keep the impact of this change minimal. With it, I was able to generate the following info of a very big trace: ``` (lldb) thread trace dump info Trace technology: intel-pt thread #1: tid = 616081 Total number of instructions: 9729366 Memory usage: Raw trace size: 1024 KiB Total approximate memory usage (excluding raw trace): 123517.34 KiB Average memory usage per instruction (excluding raw trace): 13.00 bytes Timing: Decoding instructions: 1.62s Errors: Number of TSC decoding errors: 0 ``` As seen above, it took 1.62 seconds to decode 9.7M instructions. This is great news, as we don't need to do any optimization work in this area. Differential Revision: https://reviews.llvm.org/D123357

… perf conversion in the client - Add logging for when the live state of the process is refreshed - Move error handling of the live state refreshing to Trace from TraceIntelPT. This allows refreshing to fail either at the plug-in level or at the base class level. The error is cached and it can be gotten every time RefreshLiveProcessState is invoked. - Allow DoRefreshLiveProcessState to handle plugin-specific parameters. - Add some encapsulation to prevent TraceIntelPT from accessing variables belonging to Trace. Test done via logging: ``` (lldb) b main Breakpoint 1: where = a.out`main + 20 at main.cpp:27:20, address = 0x00000000004023d9 (lldb) r Process 2359706 launched: '/home/wallace/a.out' (x86_64) Process 2359706 stopped * thread #1, name = 'a.out', stop reason = breakpoint 1.1 frame #0: 0x00000000004023d9 a.out`main at main.cpp:27:20 24 }; 25 26 int main() { -> 27 std::vector<int> vvv; 28 for (int i = 0; i < 100000; i++) 29 vvv.push_back(i); 30 (lldb) process trace start (lldb) log enable lldb target -F(lldb) n Process 2359706 stopped * thread #1, name = 'a.out', stop reason = step over frame #0: 0x00000000004023e8 a.out`main at main.cpp:28:12 25 26 int main() { 27 std::vector<int> vvv; -> 28 for (int i = 0; i < 100000; i++) 29 vvv.push_back(i); 30 31 std::deque<int> dq1 = {1, 2, 3}; (lldb) thread trace dump instructions -c 2 -t Trace.cpp:RefreshLiveProcessState Trace::RefreshLiveProcessState invoked TraceIntelPT.cpp:DoRefreshLiveProcessState TraceIntelPT found tsc conversion information thread #1: tid = 2359706 a.out`std::vector<int, std::allocator<int>>::vector() + 26 at stl_vector.h:395:19 54: [tsc=unavailable] 0x0000000000403a7c retq ``` See the logging lines at the end of the dump. They indicate that refreshing happened and that perf conversion information was found. Differential Revision: https://reviews.llvm.org/D125943

@0

…X86 following the psABI""" This reverts commit e1c5afa. This introduces crashes in the JAX backend on CPU. A reproducer in LLVM is below. Let me know if you have trouble reproducing this. ; ModuleID = '__compute_module' source_filename = "__compute_module" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-grtev4-linux-gnu" @0 = private unnamed_addr constant [4 x i8] c"\00\00\00?" @1 = private unnamed_addr constant [4 x i8] c"\1C}\908" @2 = private unnamed_addr constant [4 x i8] c"?\00\\4" @3 = private unnamed_addr constant [4 x i8] c"%ci1" @4 = private unnamed_addr constant [4 x i8] zeroinitializer @5 = private unnamed_addr constant [4 x i8] c"\00\00\00\C0" @6 = private unnamed_addr constant [4 x i8] c"\00\00\00B" @7 = private unnamed_addr constant [4 x i8] c"\94\B4\C22" @8 = private unnamed_addr constant [4 x i8] c"^\09B6" @9 = private unnamed_addr constant [4 x i8] c"\15\F3M?" @10 = private unnamed_addr constant [4 x i8] c"e\CC\\;" @11 = private unnamed_addr constant [4 x i8] c"d\BD/>" @12 = private unnamed_addr constant [4 x i8] c"V\F4I=" @13 = private unnamed_addr constant [4 x i8] c"\10\CB,<" @14 = private unnamed_addr constant [4 x i8] c"\AC\E3\D6:" @15 = private unnamed_addr constant [4 x i8] c"\DC\A8E9" @16 = private unnamed_addr constant [4 x i8] c"\C6\FA\897" @17 = private unnamed_addr constant [4 x i8] c"%\F9\955" @18 = private unnamed_addr constant [4 x i8] c"\B5\DB\813" @19 = private unnamed_addr constant [4 x i8] c"\B4W_\B2" @20 = private unnamed_addr constant [4 x i8] c"\1Cc\8F\B4" @21 = private unnamed_addr constant [4 x i8] c"~3\94\B6" @22 = private unnamed_addr constant [4 x i8] c"3Yq\B8" @23 = private unnamed_addr constant [4 x i8] c"\E9\17\17\BA" @24 = private unnamed_addr constant [4 x i8] c"\F1\B2\8D\BB" @25 = private unnamed_addr constant [4 x i8] c"\F8t\C2\BC" @26 = private unnamed_addr constant [4 x i8] c"\82[\C2\BD" @27 = private unnamed_addr constant [4 x i8] c"uB-?" @28 = private unnamed_addr constant [4 x i8] c"^\FF\9B\BE" @29 = private unnamed_addr constant [4 x i8] c"\00\00\00A" ; Function Attrs: uwtable define void @main.158(ptr %retval, ptr noalias %run_options, ptr noalias %params, ptr noalias %buffer_table, ptr noalias %status, ptr noalias %prof_counters) #0 { entry: %fusion.invar_address.dim.1 = alloca i64, align 8 %fusion.invar_address.dim.0 = alloca i64, align 8 %0 = getelementptr inbounds ptr, ptr %buffer_table, i64 1 %Arg_0.1 = load ptr, ptr %0, align 8, !invariant.load !0, !dereferenceable !1, !align !2 %1 = getelementptr inbounds ptr, ptr %buffer_table, i64 0 %fusion = load ptr, ptr %1, align 8, !invariant.load !0, !dereferenceable !1, !align !2 store i64 0, ptr %fusion.invar_address.dim.0, align 8 br label %fusion.loop_header.dim.0 return: ; preds = %fusion.loop_exit.dim.0 ret void fusion.loop_header.dim.0: ; preds = %fusion.loop_exit.dim.1, %entry %fusion.indvar.dim.0 = load i64, ptr %fusion.invar_address.dim.0, align 8 %2 = icmp uge i64 %fusion.indvar.dim.0, 3 br i1 %2, label %fusion.loop_exit.dim.0, label %fusion.loop_body.dim.0 fusion.loop_body.dim.0: ; preds = %fusion.loop_header.dim.0 store i64 0, ptr %fusion.invar_address.dim.1, align 8 br label %fusion.loop_header.dim.1 fusion.loop_header.dim.1: ; preds = %fusion.loop_body.dim.1, %fusion.loop_body.dim.0 %fusion.indvar.dim.1 = load i64, ptr %fusion.invar_address.dim.1, align 8 %3 = icmp uge i64 %fusion.indvar.dim.1, 1 br i1 %3, label %fusion.loop_exit.dim.1, label %fusion.loop_body.dim.1 fusion.loop_body.dim.1: ; preds = %fusion.loop_header.dim.1 %4 = getelementptr inbounds [3 x [1 x half]], ptr %Arg_0.1, i64 0, i64 %fusion.indvar.dim.0, i64 0 %5 = load half, ptr %4, align 2, !invariant.load !0, !noalias !3 %6 = fpext half %5 to float %7 = call float @llvm.fabs.f32(float %6) %constant.121 = load float, ptr @29, align 4 %compare.2 = fcmp ole float %7, %constant.121 %8 = zext i1 %compare.2 to i8 %constant.120 = load float, ptr @0, align 4 %multiply.95 = fmul float %7, %constant.120 %constant.119 = load float, ptr @5, align 4 %add.82 = fadd float %multiply.95, %constant.119 %constant.118 = load float, ptr @4, align 4 %multiply.94 = fmul float %add.82, %constant.118 %constant.117 = load float, ptr @19, align 4 %add.81 = fadd float %multiply.94, %constant.117 %multiply.92 = fmul float %add.82, %add.81 %constant.116 = load float, ptr @18, align 4 %add.79 = fadd float %multiply.92, %constant.116 %multiply.91 = fmul float %add.82, %add.79 %subtract.87 = fsub float %multiply.91, %add.81 %constant.115 = load float, ptr @20, align 4 %add.78 = fadd float %subtract.87, %constant.115 %multiply.89 = fmul float %add.82, %add.78 %subtract.86 = fsub float %multiply.89, %add.79 %constant.114 = load float, ptr @17, align 4 %add.76 = fadd float %subtract.86, %constant.114 %multiply.88 = fmul float %add.82, %add.76 %subtract.84 = fsub float %multiply.88, %add.78 %constant.113 = load float, ptr @21, align 4 %add.75 = fadd float %subtract.84, %constant.113 %multiply.86 = fmul float %add.82, %add.75 %subtract.83 = fsub float %multiply.86, %add.76 %constant.112 = load float, ptr @16, align 4 %add.73 = fadd float %subtract.83, %constant.112 %multiply.85 = fmul float %add.82, %add.73 %subtract.81 = fsub float %multiply.85, %add.75 %constant.111 = load float, ptr @22, align 4 %add.72 = fadd float %subtract.81, %constant.111 %multiply.83 = fmul float %add.82, %add.72 %subtract.80 = fsub float %multiply.83, %add.73 %constant.110 = load float, ptr @15, align 4 %add.70 = fadd float %subtract.80, %constant.110 %multiply.82 = fmul float %add.82, %add.70 %subtract.78 = fsub float %multiply.82, %add.72 %constant.109 = load float, ptr @23, align 4 %add.69 = fadd float %subtract.78, %constant.109 %multiply.80 = fmul float %add.82, %add.69 %subtract.77 = fsub float %multiply.80, %add.70 %constant.108 = load float, ptr @14, align 4 %add.68 = fadd float %subtract.77, %constant.108 %multiply.79 = fmul float %add.82, %add.68 %subtract.75 = fsub float %multiply.79, %add.69 %constant.107 = load float, ptr @24, align 4 %add.67 = fadd float %subtract.75, %constant.107 %multiply.77 = fmul float %add.82, %add.67 %subtract.74 = fsub float %multiply.77, %add.68 %constant.106 = load float, ptr @13, align 4 %add.66 = fadd float %subtract.74, %constant.106 %multiply.76 = fmul float %add.82, %add.66 %subtract.72 = fsub float %multiply.76, %add.67 %constant.105 = load float, ptr @25, align 4 %add.65 = fadd float %subtract.72, %constant.105 %multiply.74 = fmul float %add.82, %add.65 %subtract.71 = fsub float %multiply.74, %add.66 %constant.104 = load float, ptr @12, align 4 %add.64 = fadd float %subtract.71, %constant.104 %multiply.73 = fmul float %add.82, %add.64 %subtract.69 = fsub float %multiply.73, %add.65 %constant.103 = load float, ptr @26, align 4 %add.63 = fadd float %subtract.69, %constant.103 %multiply.71 = fmul float %add.82, %add.63 %subtract.67 = fsub float %multiply.71, %add.64 %constant.102 = load float, ptr @11, align 4 %add.62 = fadd float %subtract.67, %constant.102 %multiply.70 = fmul float %add.82, %add.62 %subtract.66 = fsub float %multiply.70, %add.63 %constant.101 = load float, ptr @28, align 4 %add.61 = fadd float %subtract.66, %constant.101 %multiply.68 = fmul float %add.82, %add.61 %subtract.65 = fsub float %multiply.68, %add.62 %constant.100 = load float, ptr @27, align 4 %add.60 = fadd float %subtract.65, %constant.100 %subtract.64 = fsub float %add.60, %add.62 %multiply.66 = fmul float %subtract.64, %constant.120 %constant.99 = load float, ptr @6, align 4 %divide.4 = fdiv float %constant.99, %7 %add.59 = fadd float %divide.4, %constant.119 %multiply.65 = fmul float %add.59, %constant.118 %constant.98 = load float, ptr @3, align 4 %add.58 = fadd float %multiply.65, %constant.98 %multiply.64 = fmul float %add.59, %add.58 %constant.97 = load float, ptr @7, align 4 %add.57 = fadd float %multiply.64, %constant.97 %multiply.63 = fmul float %add.59, %add.57 %subtract.63 = fsub float %multiply.63, %add.58 %constant.96 = load float, ptr @2, align 4 %add.56 = fadd float %subtract.63, %constant.96 %multiply.62 = fmul float %add.59, %add.56 %subtract.62 = fsub float %multiply.62, %add.57 %constant.95 = load float, ptr @8, align 4 %add.55 = fadd float %subtract.62, %constant.95 %multiply.61 = fmul float %add.59, %add.55 %subtract.61 = fsub float %multiply.61, %add.56 %constant.94 = load float, ptr @1, align 4 %add.54 = fadd float %subtract.61, %constant.94 %multiply.60 = fmul float %add.59, %add.54 %subtract.60 = fsub float %multiply.60, %add.55 %constant.93 = load float, ptr @10, align 4 %add.53 = fadd float %subtract.60, %constant.93 %multiply.59 = fmul float %add.59, %add.53 %subtract.59 = fsub float %multiply.59, %add.54 %constant.92 = load float, ptr @9, align 4 %add.52 = fadd float %subtract.59, %constant.92 %subtract.58 = fsub float %add.52, %add.54 %multiply.58 = fmul float %subtract.58, %constant.120 %9 = call float @llvm.sqrt.f32(float %7) %10 = fdiv float 1.000000e+00, %9 %multiply.57 = fmul float %multiply.58, %10 %11 = trunc i8 %8 to i1 %12 = select i1 %11, float %multiply.66, float %multiply.57 %13 = fptrunc float %12 to half %14 = getelementptr inbounds [3 x [1 x half]], ptr %fusion, i64 0, i64 %fusion.indvar.dim.0, i64 0 store half %13, ptr %14, align 2, !alias.scope !3 %invar.inc1 = add nuw nsw i64 %fusion.indvar.dim.1, 1 store i64 %invar.inc1, ptr %fusion.invar_address.dim.1, align 8 br label %fusion.loop_header.dim.1 fusion.loop_exit.dim.1: ; preds = %fusion.loop_header.dim.1 %invar.inc = add nuw nsw i64 %fusion.indvar.dim.0, 1 store i64 %invar.inc, ptr %fusion.invar_address.dim.0, align 8 br label %fusion.loop_header.dim.0 fusion.loop_exit.dim.0: ; preds = %fusion.loop_header.dim.0 br label %return } ; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn declare float @llvm.fabs.f32(float %0) #1 ; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn declare float @llvm.sqrt.f32(float %0) #1 attributes #0 = { uwtable "denormal-fp-math"="preserve-sign" "no-frame-pointer-elim"="false" } attributes #1 = { nocallback nofree nosync nounwind readnone speculatable willreturn } !0 = !{} !1 = !{i64 6} !2 = !{i64 8} !3 = !{!4} !4 = !{!"buffer: {index:0, offset:0, size:6}", !5} !5 = !{!"XLA global AA domain"}

…h decoding - Add the logic that parses all cpu context switch traces and produces blocks of continuous executions, which will be later used to assign intel pt subtraces to threads and to identify gaps. This logic can also identify if the context switch trace is malformed. - The continuous executions blocks are able to indicate when there were some contention issues when producing the context switch trace. See the inline comments for more information. - Update the 'dump info' command to show information and stats related to the multicore decoding flow, including timing about context switch decoding. - Add the logic to conver nanoseconds to TSCs. - Fix a bug when returning the context switches. Now they data returned makes sense and even empty traces can be returned from lldb-server. - Finish the necessary bits for loading and saving a multi-core trace bundle from disk. - Change some size_t to uint64_t for compatibility with 32 bit systems. Tested by saving a trace session of a program that sleeps 100 times, it was able to produce the following 'dump info' text: ``` (lldb) trace load /tmp/trace3/trace.json (lldb) thread trace dump info Trace technology: intel-pt thread #1: tid = 4192415 Total number of instructions: 1 Memory usage: Total approximate memory usage (excluding raw trace): 2.51 KiB Average memory usage per instruction (excluding raw trace): 2573.00 bytes Timing for this thread: Timing for global tasks: Context switch trace decoding: 0.00s Events: Number of instructions with events: 0 Number of individual events: 0 Multi-core decoding: Total number of continuous executions found: 2499 Number of continuous executions for this thread: 102 Errors: Number of TSC decoding errors: 0 ``` Differential Revision: https://reviews.llvm.org/D126267

I encountered an issue where `p &variable` was finding an incorrect address for 32-bit PIC ELF files loaded into a running process. The problem was that the R_386_32 ELF relocations were not being applied to the DWARF section, so all variables in that file were reporting as being at the start of their respective section. There is an assert that catches this on debug builds, but silently ignores the issue on non-debug builds. In this changeset, I added handling for the R_386_32 relocation type to ObjectFileELF, and a supporting function to ELFRelocation to differentiate between DT_REL & DT_RELA in ObjectFileELF::ApplyRelocations(). Demonstration of issue: ``` [dmlary@host work]$ cat rel.c volatile char padding[32] = "make sure var isnt at .data+0"; volatile char var[] = "test"; [dmlary@host work]$ gcc -c rel.c -FPIC -fpic -g -m32 [dmlary@host work]$ lldb ./exec (lldb) target create "./exec" Current executable set to '/home/dmlary/src/work/exec' (i386). (lldb) process launch --stop-at-entry Process 21278 stopped * thread #1, name = 'exec', stop reason = signal SIGSTOP frame #0: 0xf7fdb150 ld-2.17.so`_start ld-2.17.so`_start: -> 0xf7fdb150 <+0>: movl %esp, %eax 0xf7fdb152 <+2>: calll 0xf7fdb990 ; _dl_start ld-2.17.so`_dl_start_user: 0xf7fdb157 <+0>: movl %eax, %edi 0xf7fdb159 <+2>: calll 0xf7fdb140 Process 21278 launched: '/home/dmlary/src/work/exec' (i386) (lldb) image add ./rel.o (lldb) image load --file rel.o .text 0x40000000 .data 0x50000000 section '.text' loaded at 0x40000000 section '.data' loaded at 0x50000000 (lldb) image dump symtab rel.o Symtab, file = rel.o, num_symbols = 13: Debug symbol |Synthetic symbol ||Externally Visible ||| Index UserID DSX Type File Address/Value Load Address Size Flags Name ------- ------ --- --------------- ------------------ ------------------ ------------------ ---------- ---------------------------------- [ 0] 1 SourceFile 0x0000000000000000 0x0000000000000000 0x00000004 rel.c [ 1] 2 Invalid 0x0000000000000000 0x0000000000000020 0x00000003 [ 2] 3 Invalid 0x0000000000000000 0x50000000 0x0000000000000020 0x00000003 [ 3] 4 Invalid 0x0000000000000025 0x0000000000000000 0x00000003 [ 4] 5 Invalid 0x0000000000000000 0x0000000000000020 0x00000003 [ 5] 6 Invalid 0x0000000000000000 0x0000000000000020 0x00000003 [ 6] 7 Invalid 0x0000000000000000 0x0000000000000020 0x00000003 [ 7] 8 Invalid 0x0000000000000000 0x0000000000000020 0x00000003 [ 8] 9 Invalid 0x0000000000000000 0x0000000000000020 0x00000003 [ 9] 10 Invalid 0x0000000000000000 0x0000000000000020 0x00000003 [ 10] 11 Invalid 0x0000000000000000 0x0000000000000020 0x00000003 [ 11] 12 X Data 0x0000000000000000 0x50000000 0x0000000000000020 0x00000011 padding [ 12] 13 X Data 0x0000000000000020 0x50000020 0x0000000000000005 0x00000011 var (lldb) p &var (volatile char (*)[5]) $1 = 0x50000000 ``` Reviewed By: labath Differential Revision: https://reviews.llvm.org/D132954

In RegisterInfos_loongarch64.h, r22 is defined twice. Having an extra array member causes problems reading and writing registers defined after r22. So, for r22, keep the alias fp, delete the s9 alias. The PC register is incorrectly accessed when the step command is executed. The step command behavior is incorrect. This test reflects this problem: ``` loongson@linux:~$ cat test.c #include <stdio.h> int func(int a) { return a + 1; } int main(int argc, char const *argv[]) { func(10); return 0; } loongson@linux:~$ clang -g test.c -o test ``` Without this patch: ``` loongson@linux:~$ llvm-project/llvm/build/bin/lldb test (lldb) target create "test" Current executable set to '/home/loongson/test' (loongarch64). (lldb) b main Breakpoint 1: where = test`main + 40 at test.c:8:3, address = 0x0000000120000668 (lldb) r Process 278049 launched: '/home/loongson/test' (loongarch64) Process 278049 stopped * thread #1, name = 'test', stop reason = breakpoint 1.1 frame #0: 0x0000000120000668 test`main(argc=1, argv=0x00007fffffff72a8) at test.c:8:3 5 } 6 7 int main(int argc, char const *argv[]) { -> 8 func(10); 9 return 0; 10 } 11 (lldb) s Process 278049 stopped * thread #1, name = 'test', stop reason = step in frame #0: 0x0000000120000670 test`main(argc=1, argv=0x00007fffffff72a8) at test.c:9:3 6 7 int main(int argc, char const *argv[]) { 8 func(10); -> 9 return 0; 10 } ``` With this patch: ``` loongson@linux:~$ llvm-project/llvm/build/bin/lldb test (lldb) target create "test" Current executable set to '/home/loongson/test' (loongarch64). (lldb) b main Breakpoint 1: where = test`main + 40 at test.c:8:3, address = 0x0000000120000668 (lldb) r Process 278632 launched: '/home/loongson/test' (loongarch64) Process 278632 stopped * thread #1, name = 'test', stop reason = breakpoint 1.1 frame #0: 0x0000000120000668 test`main(argc=1, argv=0x00007fffffff72a8) at test.c:8:3 5 } 6 7 int main(int argc, char const *argv[]) { -> 8 func(10); 9 return 0; 10 } 11 (lldb) s Process 278632 stopped * thread #1, name = 'test', stop reason = step in frame #0: 0x0000000120000624 test`func(a=10) at test.c:4:10 1 #include <stdio.h> 2 3 int func(int a) { -> 4 return a + 1; 5 } ``` Reviewed By: SixWeining, DavidSpickett Differential Revision: https://reviews.llvm.org/D140615

…callback The `TypeSystemMap::m_mutex` guards against concurrent modifications of members of `TypeSystemMap`. In particular, `m_map`. `TypeSystemMap::ForEach` iterates through the entire `m_map` calling a user-specified callback for each entry. This is all done while `m_mutex` is locked. However, there's nothing that guarantees that the callback itself won't call back into `TypeSystemMap` APIs on the same thread. This lead to double-locking `m_mutex`, which is undefined behaviour. We've seen this cause a deadlock in the swift plugin with following backtrace: ``` int main() { std::unique_ptr<int> up = std::make_unique<int>(5); volatile int val = *up; return val; } clang++ -std=c++2a -g -O1 main.cpp ./bin/lldb -o “br se -p return” -o run -o “v *up” -o “expr *up” -b ``` ``` frame llvm#4: std::lock_guard<std::mutex>::lock_guard frame llvm#5: lldb_private::TypeSystemMap::GetTypeSystemForLanguage <<<< Lock #2 frame llvm#6: lldb_private::TypeSystemMap::GetTypeSystemForLanguage frame llvm#7: lldb_private::Target::GetScratchTypeSystemForLanguage ... frame llvm#26: lldb_private::SwiftASTContext::LoadLibraryUsingPaths frame llvm#27: lldb_private::SwiftASTContext::LoadModule frame llvm#30: swift::ModuleDecl::collectLinkLibraries frame llvm#31: lldb_private::SwiftASTContext::LoadModule frame llvm#34: lldb_private::SwiftASTContext::GetCompileUnitImportsImpl frame llvm#35: lldb_private::SwiftASTContext::PerformCompileUnitImports frame llvm#36: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetSwiftASTContext frame llvm#37: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetPersistentExpressionState frame llvm#38: lldb_private::Target::GetPersistentSymbol frame llvm#41: lldb_private::TypeSystemMap::ForEach <<<< Lock #1 frame llvm#42: lldb_private::Target::GetPersistentSymbol frame llvm#43: lldb_private::IRExecutionUnit::FindInUserDefinedSymbols frame llvm#44: lldb_private::IRExecutionUnit::FindSymbol frame llvm#45: lldb_private::IRExecutionUnit::MemoryManager::GetSymbolAddressAndPresence frame llvm#46: lldb_private::IRExecutionUnit::MemoryManager::findSymbol frame llvm#47: non-virtual thunk to lldb_private::IRExecutionUnit::MemoryManager::findSymbol frame llvm#48: llvm::LinkingSymbolResolver::findSymbol frame llvm#49: llvm::LegacyJITSymbolResolver::lookup frame llvm#50: llvm::RuntimeDyldImpl::resolveExternalSymbols frame llvm#51: llvm::RuntimeDyldImpl::resolveRelocations frame llvm#52: llvm::MCJIT::finalizeLoadedModules frame llvm#53: llvm::MCJIT::finalizeObject frame llvm#54: lldb_private::IRExecutionUnit::ReportAllocations frame llvm#55: lldb_private::IRExecutionUnit::GetRunnableInfo frame llvm#56: lldb_private::ClangExpressionParser::PrepareForExecution frame llvm#57: lldb_private::ClangUserExpression::TryParse frame llvm#58: lldb_private::ClangUserExpression::Parse ``` Our solution is to simply iterate over a local copy of `m_map`. **Testing** * Confirmed on manual reproducer (would reproduce 100% of the time before the patch) Differential Revision: https://reviews.llvm.org/D149949

…est unittest Need to finalize the DIBuilder to avoid leak sanitizer errors like this: Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x55c99ea1761d in operator new(unsigned long) #1 0x55c9a518ae49 in operator new #2 0x55c9a518ae49 in llvm::MDTuple::getImpl(...) #3 0x55c9a4f1b1ec in getTemporary llvm#4 0x55c9a4f1b1ec in llvm::DIBuilder::createFunction(...)

@foo

The motivation for this change is a workload generated by the XLA compiler targeting nvidia GPUs. This kernel has a few hundred i8 loads and stores. Merging is critical for performance. The current LSV doesn't merge these well because it only considers instructions within a block of 64 loads+stores. This limit is necessary to contain the O(n^2) behavior of the pass. I'm hesitant to increase the limit, because this pass is already one of the slowest parts of compiling an XLA program. So we rewrite basically the whole thing to use a new algorithm. Before, we compared every load/store to every other to see if they're consecutive. The insight (from tra@) is that this is redundant. If we know the offset from PtrA to PtrB, then we don't need to compare PtrC to both of them in order to tell whether C may be adjacent to A or B. So that's what we do. When scanning a basic block, we maintain a list of chains, where we know the offset from every element in the chain to the first element in the chain. Each instruction gets compared only to the leaders of all the chains. In the worst case, this is still O(n^2), because all chains might be of length 1. To prevent compile time blowup, we only consider the 64 most recently used chains. Thus we do no more comparisons than before, but we have the potential to make much longer chains. This rewrite affects many tests. The changes to tests fall into two categories. 1. The old code had what appears to be a bug when deciding whether a misaligned vectorized load is fast. Suppose TTI reports that load <i32 x 4> align 4 has relative speed 1, and suppose that load i32 align 4 has relative speed 32. The intent of the code seems to be that we prefer the scalar load, because it's faster. But the old code would choose the vectorized load. accessIsMisaligned would set RelativeSpeed to 0 for the scalar load (and not even call into TTI to get the relative speed), because the scalar load is aligned. After this patch, we will prefer the scalar load if it's faster. 2. This patch changes the logic for how we vectorize. Usually this results in vectorizing more. Explanation of changes to tests: - AMDGPU/adjust-alloca-alignment.ll: #1 - AMDGPU/flat_atomic.ll: #2, we vectorize more. - AMDGPU/int_sideeffect.ll: #2, there are two possible locations for the call to @foo, and the pass is brittle to this. Before, we'd vectorize in case 1 and not case 2. Now we vectorize in case 2 and not case 1. So we just move the call. - AMDGPU/adjust-alloca-alignment.ll: #2, we vectorize more - AMDGPU/insertion-point.ll: #2 we vectorize more - AMDGPU/merge-stores-private.ll: #1 (undoes changes from git rev 86f9117, which appear to have hit the bug from #1) - AMDGPU/multiple_tails.ll: #1 - AMDGPU/vect-ptr-ptr-size-mismatch.ll: Fix alignment (I think related to #1 above). - AMDGPU CodeGen: I have difficulty commenting on these changes, but many of them look like #2, we vectorize more. - NVPTX/4x2xhalf.ll: Fix alignment (I think related to #1 above). - NVPTX/vectorize_i8.ll: We don't generate <3 x i8> vectors on NVPTX because they're not legal (and eventually get split) - X86/correct-order.ll: #2, we vectorize more, probably because of changes to the chain-splitting logic. - X86/subchain-interleaved.ll: #2, we vectorize more - X86/vector-scalar.ll: #2, we can now vectorize scalar float + <1 x float> - X86/vectorize-i8-nested-add-inseltpoison.ll: Deleted the nuw test because it was nonsensical. It was doing `add nuw %v0, -1`, but this is equivalent to `add nuw %v0, 0xffff'ffff`, which is equivalent to asserting that %v0 == 0. - X86/vectorize-i8-nested-add.ll: Same as nested-add-inseltpoison.ll Differential Revision: https://reviews.llvm.org/D149893

Use Component in OpenBSD::getCompilerRT to find libraries

6a059d3

OpenBSD with this change would use this name for ASan: /usr/lib/clang/11.1.0/lib/libclang_rt.asan.a Already submitted to OpenBSD repository.

mordak closed this Sep 2, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Use Component in OpenBSD::getCompilerRT to find libraries #1

Use Component in OpenBSD::getCompilerRT to find libraries #1

Uh oh!

blackgnezdo commented Sep 2, 2021

Uh oh!

blackgnezdo commented Sep 2, 2021

Uh oh!

mordak commented Sep 2, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Use Component in OpenBSD::getCompilerRT to find libraries #1

Use Component in OpenBSD::getCompilerRT to find libraries #1

Uh oh!

Conversation

blackgnezdo commented Sep 2, 2021

Uh oh!

blackgnezdo commented Sep 2, 2021

Uh oh!

mordak commented Sep 2, 2021

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants