Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proper support for primitive types #5

Closed
bcardosolopes opened this issue Sep 21, 2022 · 9 comments
Closed

Proper support for primitive types #5

bcardosolopes opened this issue Sep 21, 2022 · 9 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@bcardosolopes
Copy link
Member

Right now we use MLIR's ints, floats, etc. Just like we have cir.bool and cir.ptr we should have cir.int ... like type and track more language level info (signedness, etc)

@bcardosolopes bcardosolopes added enhancement New feature or request good first issue Good for newcomers labels Sep 21, 2022
@felipealmeida
Copy link

Hello @bcardosolopes ,

I started hacking ClangIR to test adding bitfields support for ClangIR. I work with specific proprietary instructions that can be very useful with bitfields, however, premature lowering of bitfields accesses to arithmetic/logical instructions makes optimizations very hard to the backend.

However, I looked in other MLIR projects to see how bitfields have been handled and I did not find anything.

Does anyone in the project already have a possible path on how to solve this for ClangIR?

Kind regards,

@bcardosolopes
Copy link
Member Author

@felipealmeida great to hear you played a bit with bitfields.

Does anyone in the project already have a possible path on how to solve this for ClangIR?

Nothing set in stone from our discussions but we surely don't wanna lower them in terms of logical operations right away, but instead proxy it out (and hide the details) over an operation cir.get_bitfield or similar. We can iterate on a design (or a PR) if you already have ideas on how to implement that.

@bcardosolopes
Copy link
Member Author

@felipealmeida Created #13 to track bitfields!

@bcardosolopes
Copy link
Member Author

It might also be worth checking out how polygeist does some of that lowering, could provide interesting insights.

@xlauko
Copy link
Contributor

xlauko commented Dec 5, 2022

@bcardosolopes as I am now lowering vast to cir, I might add something similar to vast high-level types to cir:

https://github.com/trailofbits/vast/blob/master/include/vast/Dialect/HighLevel/HighLevelTypes.td

Do you want to retain type qualifiers in cir? Also a tricky part is to define constants with high-level types, they do not cooperate nicely with attributes, and our lowering of their attribute types is a bit hackish at the moment :(

@bcardosolopes
Copy link
Member Author

@bcardosolopes as I am now lowering vast to cir

This is great to hear!

I might add something similar to vast high-level types to cir:

https://github.com/trailofbits/vast/blob/master/include/vast/Dialect/HighLevel/HighLevelTypes.td

Do you want to retain type qualifiers in cir?

This information will be good to have, for sure. The way I was planning to get type qualifiers would be through ASTContext queries on types (which should do the right thing on top of its CanQuals). Type information would come from attached AST nodes, of course that requires AST-capable-CIR to be used. Do you have plans to build an ASTContext from VAST? This could also be necessary in AST-free-CIR, if generating, for instance, LLVM load/store's out of it.

Something like HighLevelTypes.td in CIR would be helpful, specially cause we also want to add something like your CharType, ShortType, etc (purpose of this github issue). How about adding these first and qualifiers as a second step?

Regarding qualifiers, since we don't usually add things without use cases, I'm curious about some pieces:

  • Are you planning to (1) write a CIR pass that uses some of this in the shorter term? or (2) something that can concretely provide helpful insight info to codegen? A simple pass that exercises some of this potential would certainly be nice (e.g. produce statistics between number of volatile loads / normals loads, or volatile load/store LLVM codegen pieces).
  • How much problematic it'd be for you to drop it when lowering to CIR for now until a concrete use?

Also a tricky part is to define constants with high-level types, they do not cooperate nicely with attributes, and our lowering of their attribute types is a bit hackish at the moment :(

Oh, what kind of problems on high-level-types+attributes you've been running into?

@xlauko
Copy link
Contributor

xlauko commented Dec 6, 2022

How much problematic it'd be for you to drop it when lowering to CIR for now until a concrete use?

I don't mind dropping the qualifiers for now. My main goal at the moment is to see whether VAST -> CIR -> LLVM is a viable path. As we discussed, if there is some common part I am happy to add it to CIR. And this seemed to me as one of them :)

Do you have plans to build an ASTContext from VAST? This could also be necessary in AST-free-CIR, if generating, for instance, LLVM load/store's out of it.

I would need to investigate whether we can reconstruct ASTContext, I don't have answer for that now.

Oh, what kind of problems on high-level-types+attributes you've been running into?

Maybe with newer MLIR version and TypedAttrInterface it is resolved. But the problem we had was with attributes that has vast high-level types, e.g., constants, these can be arbitrarily nested in some non-typed attribute.
And when we tried to lower them from vast types to std types it was not clear if whether and where the attributes contains types that need to be converted.

@bcardosolopes
Copy link
Member Author

I don't mind dropping the qualifiers for now. My main goal at the moment is to see whether VAST -> CIR -> LLVM is a viable path.

Sweet, @lanza just started to work on CIR -> LLVM, so I guess we are all going towards the same goal, and adding qualifiers is a big step on that too!

As we discussed, if there is some common part I am happy to add it to CIR. And this seemed to me as one of them :)

Perfect.

Maybe with newer MLIR version and TypedAttrInterface it is resolved. But the problem we had was with attributes that has vast high-level types, e.g., constants, these can be arbitrarily nested in some non-typed attribute. And when we tried to lower them from vast types to std types it was not clear if whether and where the attributes contains types that need to be converted.

Yea, I remember he had to change a lot because of a change past few months ago that decoupled types and attributes, but I'm not sure how much it addresses the problem you are describing :(

lanza pushed a commit that referenced this issue Dec 17, 2022
For the following program,
  $ cat t.c
  struct t {
   int (__attribute__((btf_type_tag("rcu"))) *f)();
   int a;
  };
  int foo(struct t *arg) {
    return arg->a;
  }
Compiling with 'clang -g -O2 -S t.c' will cause a failure like below:
  clang: /home/yhs/work/llvm-project/clang/lib/Sema/SemaType.cpp:6391: void {anonymous}::DeclaratorLocFiller::VisitParenTypeLoc(clang::ParenTypeLoc):
         Assertion `Chunk.Kind == DeclaratorChunk::Paren' failed.
  PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
  Stack dump:
  ......
  #5 0x00007f89e4280ea5 abort (/lib64/libc.so.6+0x21ea5)
  #6 0x00007f89e4280d79 _nl_load_domain.cold.0 (/lib64/libc.so.6+0x21d79)
  #7 0x00007f89e42a6456 (/lib64/libc.so.6+0x47456)
  #8 0x00000000045c2596 GetTypeSourceInfoForDeclarator((anonymous namespace)::TypeProcessingState&, clang::QualType, clang::TypeSourceInfo*) SemaType.cpp:0:0
  #9 0x00000000045ccfa5 GetFullTypeForDeclarator((anonymous namespace)::TypeProcessingState&, clang::QualType, clang::TypeSourceInfo*) SemaType.cpp:0:0
  ......

The reason of the failure is due to the mismatch of TypeLoc and D.getTypeObject().Kind. For example,
the TypeLoc is
  BTFTagAttributedType 0x88614e0 'int  btf_type_tag(rcu)()' sugar
  |-ParenType 0x8861480 'int ()' sugar
  | `-FunctionNoProtoType 0x8861450 'int ()' cdecl
  |   `-BuiltinType 0x87fd500 'int'
while corresponding D.getTypeObject().Kind points to DeclaratorChunk::Paren, and
this will cause later assertion.

To fix the issue, similar to AttributedTypeLoc, let us skip BTFTagAttributedTypeLoc in
GetTypeSourceInfoForDeclarator().

Differential Revision: https://reviews.llvm.org/D136807
lanza pushed a commit that referenced this issue Dec 17, 2022
The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

Add initial revision of assignment tracking analysis pass
---------------------------------------------------------
This patch squashes five individually reviewed patches into one:

    #1 https://reviews.llvm.org/D136320
    #2 https://reviews.llvm.org/D136321
    #3 https://reviews.llvm.org/D136325
    #4 https://reviews.llvm.org/D136331
    #5 https://reviews.llvm.org/D136335

Patch #1 introduces 2 new files: AssignmentTrackingAnalysis.h and .cpp. The
two subsequent patches modify those files only. Patch #4 plumbs the analysis
into SelectionDAG, and patch #5 is a collection of tests for the analysis as
a whole.

The analysis was broken up into smaller chunks for review purposes but for the
most part the tests were written using the whole analysis. It would be possible
to break up the tests for patches #1 through #3 for the purpose of landing the
patches seperately. However, most them would require an update for each
patch. In addition, patch #4 - which connects the analysis to SelectionDAG - is
required by all of the tests.

If there is build-bot trouble, we might try a different landing sequence.

Analysis problem and goal
-------------------------

Variables values can be stored in memory, or available as SSA values, or both.
Using the Assignment Tracking metadata, it's not possible to determine a
variable location just by looking at a debug intrinsic in
isolation. Instructions without any metadata can change the location of a
variable. The meaning of dbg.assign intrinsics changes depending on whether
there are linked instructions, and where they are relative to those
instructions. So we need to analyse the IR and convert the embedded information
into a form that SelectionDAG can consume to produce debug variable locations
in MIR.

The solution is a dataflow analysis which, aiming to maximise the memory
location coverage for variables, outputs a mapping of instruction positions to
variable location definitions.

API usage
---------

The analysis is named `AssignmentTrackingAnalysis`. It is added as a required
pass for SelectionDAGISel when assignment tracking is enabled.

The results of the analysis are exposed via `getResults` using the returned
`const FunctionVarLocs *`'s const methods:

    const VarLocInfo *single_locs_begin() const;
    const VarLocInfo *single_locs_end() const;
    const VarLocInfo *locs_begin(const Instruction *Before) const;
    const VarLocInfo *locs_end(const Instruction *Before) const;
    void print(raw_ostream &OS, const Function &Fn) const;

Debug intrinsics can be ignored after running the analysis. Instead, variable
location definitions that occur between an instruction `Inst` and its
predecessor (or block start) can be found by looping over the range:

    locs_begin(Inst), locs_end(Inst)

Similarly, variables with a memory location that is valid for their lifetime
can be iterated over using the range:

    single_locs_begin(), single_locs_end()

Further detail
--------------

For an explanation of the dataflow implementation and the integration with
SelectionDAG, please see the reviews linked at the top of this commit message.

Reviewed By: jmorse
lanza pushed a commit that referenced this issue May 3, 2023
…ak ordering

`std::sort` requires a comparison operator that obides by strict weak
ordering. `operator<=` on pointer does not and leads to undefined
behaviour. Specifically, when we grow the `scratch_type_systems` vector
slightly larger (and thus take `std::sort` down a slightly different
codepath), we segfault. This happened while working on a patch that
would in fact grow this vector. In such a case ASAN reports:

```
$ ./bin/lldb ./lldb-test-build.noindex/lang/cpp/complete-type-check/TestCppIsTypeComplete.test_builtin_types/a.out -o "script -- lldb.target.FindFirstType(\"void\")"
(lldb) script -- lldb.target.FindFirstType("void")
=================================================================
==59975==ERROR: AddressSanitizer: container-overflow on address 0x000108f6b510 at pc 0x000280177b4c bp 0x00016b7d7430 sp 0x00016b7d7428
READ of size 8 at 0x000108f6b510 thread T0
    #0 0x280177b48 in std::__1::shared_ptr<lldb_private::TypeSystem>::shared_ptr[abi:v15006](std::__1::shared_ptr<lldb_private::TypeSystem> const&)+0xb4 (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0x177b48)
(BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #1 0x280dcc008 in void std::__1::__introsort<std::__1::_ClassicAlgPolicy, lldb_private::Target::GetScratchTypeSystems(bool)::$_3&, std::__1::shared_ptr<lldb_private::TypeSystem>*>(std::__1::shared_ptr<lldb_private::TypeSystem>*, std::__1::shared_
ptr<lldb_private::TypeSystem>*, lldb_private::Target::GetScratchTypeSystems(bool)::$_3&, std::__1::iterator_traits<std::__1::shared_ptr<lldb_private::TypeSystem>*>::difference_type)+0x1050 (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblld
b.17.0.0git.dylib:arm64+0xdcc008) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #2 0x280d88788 in lldb_private::Target::GetScratchTypeSystems(bool)+0x5a4 (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0xd88788) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #3 0x28021f0b4 in lldb::SBTarget::FindFirstType(char const*)+0x624 (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0x21f0b4) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #4 0x2804e9590 in _wrap_SBTarget_FindFirstType(_object*, _object*)+0x26c (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0x4e9590) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #5 0x1062d3ad4 in cfunction_call+0x5c (/opt/homebrew/Cellar/python@3.11/3.11.1/Frameworks/Python.framework/Versions/3.11/Python:arm64+0xcfad4) (BuildId: c9efc4bbb1943f9a9b7cc4e91fce477732000000200000000100000000000d00)

<--- snipped --->

0x000108f6b510 is located 400 bytes inside of 512-byte region [0x000108f6b380,0x000108f6b580)
allocated by thread T0 here:
    #0 0x105209414 in wrap__Znwm+0x74 (/Applications/Xcode2.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/14.0.3/lib/darwin/libclang_rt.asan_osx_dynamic.dylib:arm64e+0x51414) (BuildId: 0a44828ceb64337bbfff60b22cd838f0320000
00200000000100000000000b00)
    #1 0x280dca3b4 in std::__1::__split_buffer<std::__1::shared_ptr<lldb_private::TypeSystem>, std::__1::allocator<std::__1::shared_ptr<lldb_private::TypeSystem>>&>::__split_buffer(unsigned long, unsigned long, std::__1::allocator<std::__1::shared_pt
r<lldb_private::TypeSystem>>&)+0x11c (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0xdca3b4) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #2 0x280dc978c in void std::__1::vector<std::__1::shared_ptr<lldb_private::TypeSystem>, std::__1::allocator<std::__1::shared_ptr<lldb_private::TypeSystem>>>::__push_back_slow_path<std::__1::shared_ptr<lldb_private::TypeSystem> const&>(std::__1::s
hared_ptr<lldb_private::TypeSystem> const&)+0x13c (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0xdc978c) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #3 0x280d88dec in std::__1::vector<std::__1::shared_ptr<lldb_private::TypeSystem>, std::__1::allocator<std::__1::shared_ptr<lldb_private::TypeSystem>>>::push_back[abi:v15006](std::__1::shared_ptr<lldb_private::TypeSystem> const&)+0x80 (/Users/mic
haelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0xd88dec) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #4 0x280d8857c in lldb_private::Target::GetScratchTypeSystems(bool)+0x398 (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0xd8857c) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #5 0x28021f0b4 in lldb::SBTarget::FindFirstType(char const*)+0x624 (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0x21f0b4) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #6 0x2804e9590 in _wrap_SBTarget_FindFirstType(_object*, _object*)+0x26c (/Users/michaelbuch/Git/lldb-build-main-no-modules/lib/liblldb.17.0.0git.dylib:arm64+0x4e9590) (BuildId: ea963d2c0d47354fb647f5c5f32b76d932000000200000000100000000000d00)
    #7 0x1062d3ad4 in cfunction_call+0x5c (/opt/homebrew/Cellar/python@3.11/3.11.1/Frameworks/Python.framework/Versions/3.11/Python:arm64+0xcfad4) (BuildId: c9efc4bbb1943f9a9b7cc4e91fce477732000000200000000100000000000d00)
    #8 0x10627fff0 in _PyObject_MakeTpCall+0x7c (/opt/homebrew/Cellar/python@3.11/3.11.1/Frameworks/Python.framework/Versions/3.11/Python:arm64+0x7bff0) (BuildId: c9efc4bbb1943f9a9b7cc4e91fce477732000000200000000100000000000d00)
    #9 0x106378a98 in _PyEval_EvalFrameDefault+0xbcf8 (/opt/homebrew/Cellar/python@3.11/3.11.1/Frameworks/Python.framework/Versions/3.11/Python:arm64+0x174a98) (BuildId: c9efc4bbb1943f9a9b7cc4e91fce477732000000200000000100000000000d00)
```

Differential Revision: https://reviews.llvm.org/D142709
lanza pushed a commit that referenced this issue May 3, 2023
This change prevents rare deadlocks observed for specific macOS/iOS GUI
applications which issue many `dlopen()` calls from multiple different
threads at startup and where TSan finds and reports a race during
startup.  Providing a reliable test for this has been deemed infeasible.

Although I've only observed this deadlock on Apple platforms,
conceptually the cause is not confined to Apple code so the fix lives in
platform-independent code.

Deadlock scenario:
```
Thread 2                    | Thread 4
ReportRace()                |
Lock internal TSan mutexes  |
  &ctx->slot_mtx            |
                            | dlopen() interceptor
                            | OnLibraryLoaded()
                            | MemoryMappingLayout::DumpListOfModules()
                            | calls dyld API, which takes internal lock
                            | lock() interceptor
                            | TSan tries to take internal mutexes again
                            |   &ctx->slot_mtx
call into symbolizer        |
MemoryMappingLayout::DumpListOfModules()
calls dyld API, which hangs on trying to take lock
```
Resulting in:
* Thread 2 has internal TSan mutex, blocked on dyld lock
* Thread 4 has dyld lock, blocked on internal TSan mutex

The fix prevents this situation by not intercepting any of the calls
originating from `MemoryMappingLayout::DumpListOfModules()`.

Stack traces for deadlock between ReportRace() and dlopen() interceptor:
```
thread #2, queue = 'com.apple.root.default-qos'
  frame #0: libsystem_kernel.dylib
  frame #1: libclang_rt.tsan_osx_dynamic.dylib`::wrap_os_unfair_lock_lock_with_options(lock=<unavailable>, options=<unavailable>) at tsan_interceptors_mac.cpp:306:3
  frame #2: dyld`dyld4::RuntimeLocks::withLoadersReadLock(this=0x000000016f21b1e0, work=0x00000001814523c0) block_pointer) at DyldRuntimeState.cpp:227:28
  frame #3: dyld`dyld4::APIs::_dyld_get_image_header(this=0x0000000101012a20, imageIndex=614) at DyldAPIs.cpp:240:11
  frame #4: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::CurrentImageHeader(this=<unavailable>) at sanitizer_procmaps_mac.cpp:391:35
  frame #5: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::Next(this=0x000000016f2a2800, segment=0x000000016f2a2738) at sanitizer_procmaps_mac.cpp:397:51
  frame #6: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::DumpListOfModules(this=0x000000016f2a2800, modules=0x00000001011000a0) at sanitizer_procmaps_mac.cpp:460:10
  frame #7: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::ListOfModules::init(this=0x00000001011000a0) at sanitizer_mac.cpp:610:18
  frame #8: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Symbolizer::FindModuleForAddress(unsigned long) [inlined] __sanitizer::Symbolizer::RefreshModules(this=0x0000000101100078) at sanitizer_symbolizer_libcdep.cpp:185:12
  frame #9: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Symbolizer::FindModuleForAddress(this=0x0000000101100078, address=6465454512) at sanitizer_symbolizer_libcdep.cpp:204:5
  frame #10: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Symbolizer::SymbolizePC(this=0x0000000101100078, addr=6465454512) at sanitizer_symbolizer_libcdep.cpp:88:15
  frame #11: libclang_rt.tsan_osx_dynamic.dylib`__tsan::SymbolizeCode(addr=6465454512) at tsan_symbolize.cpp:106:35
  frame #12: libclang_rt.tsan_osx_dynamic.dylib`__tsan::SymbolizeStack(trace=StackTrace @ 0x0000600002d66d00) at tsan_rtl_report.cpp:112:28
  frame #13: libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedReportBase::AddMemoryAccess(this=0x000000016f2a2a90, addr=4381057136, external_tag=<unavailable>, s=<unavailable>, tid=<unavailable>, stack=<unavailable>, mset=0x00000001012fc310) at tsan_rtl_report.cpp:190:16
  frame #14: libclang_rt.tsan_osx_dynamic.dylib`__tsan::ReportRace(thr=0x00000001012fc000, shadow_mem=0x000008020a4340e0, cur=<unavailable>, old=<unavailable>, typ0=1) at tsan_rtl_report.cpp:795:9
  frame #15: libclang_rt.tsan_osx_dynamic.dylib`__tsan::DoReportRace(thr=0x00000001012fc000, shadow_mem=0x000008020a4340e0, cur=Shadow @ x22, old=Shadow @ 0x0000600002d6b4f0, typ=1) at tsan_rtl_access.cpp:166:3
  frame #16: libclang_rt.tsan_osx_dynamic.dylib`::__tsan_read8(void *) at tsan_rtl_access.cpp:220:5
  frame #17: libclang_rt.tsan_osx_dynamic.dylib`::__tsan_read8(void *) [inlined] __tsan::MemoryAccess(thr=0x00000001012fc000, pc=<unavailable>, addr=<unavailable>, size=8, typ=1) at tsan_rtl_access.cpp:442:3
  frame #18: libclang_rt.tsan_osx_dynamic.dylib`::__tsan_read8(addr=<unavailable>) at tsan_interface.inc:34:3
  <call into TSan from from instrumented code>

thread #4, queue = 'com.apple.dock.fullscreen'
  frame #0:  libsystem_kernel.dylib
  frame #1:  libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::FutexWait(p=<unavailable>, cmp=<unavailable>) at sanitizer_mac.cpp:540:3
  frame #2:  libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Semaphore::Wait(this=<unavailable>) at sanitizer_mutex.cpp:35:7
  frame #3:  libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Mutex::Lock(this=0x0000000102992a80) at sanitizer_mutex.h:196:18
  frame #4:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() [inlined] __sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock(this=<unavailable>, mu=0x0000000102992a80) at sanitizer_mutex.h:383:10
  frame #5:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() [inlined] __sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock(this=<unavailable>, mu=0x0000000102992a80) at sanitizer_mutex.h:382:77
  frame #6:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() at tsan_rtl.h:708:10
  frame #7:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() [inlined] __tsan::TryTraceFunc(thr=0x000000010f084000, pc=0) at tsan_rtl.h:751:7
  frame #8:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() [inlined] __tsan::FuncExit(thr=0x000000010f084000) at tsan_rtl.h:798:7
  frame #9:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor(this=0x000000016f3ba280) at tsan_interceptors_posix.cpp:300:5
  frame #10: libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor(this=<unavailable>) at tsan_interceptors_posix.cpp:293:41
  frame #11: libclang_rt.tsan_osx_dynamic.dylib`::wrap_os_unfair_lock_lock_with_options(lock=0x000000016f21b1e8, options=OS_UNFAIR_LOCK_NONE) at tsan_interceptors_mac.cpp:310:1
  frame #12: dyld`dyld4::RuntimeLocks::withLoadersReadLock(this=0x000000016f21b1e0, work=0x00000001814525d4) block_pointer) at DyldRuntimeState.cpp:227:28
  frame #13: dyld`dyld4::APIs::_dyld_get_image_vmaddr_slide(this=0x0000000101012a20, imageIndex=412) at DyldAPIs.cpp:273:11
  frame #14: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::Next(__sanitizer::MemoryMappedSegment*) at sanitizer_procmaps_mac.cpp:286:17
  frame #15: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::Next(this=0x000000016f3ba560, segment=0x000000016f3ba498) at sanitizer_procmaps_mac.cpp:432:15
  frame #16: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::DumpListOfModules(this=0x000000016f3ba560, modules=0x000000016f3ba618) at sanitizer_procmaps_mac.cpp:460:10
  frame #17: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::ListOfModules::init(this=0x000000016f3ba618) at sanitizer_mac.cpp:610:18
  frame #18: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::LibIgnore::OnLibraryLoaded(this=0x0000000101f3aa40, name="<some library>") at sanitizer_libignore.cpp:54:11
  frame #19: libclang_rt.tsan_osx_dynamic.dylib`::wrap_dlopen(filename="<some library>", flag=<unavailable>) at sanitizer_common_interceptors.inc:6466:3
  <library code>
```

rdar://106766395

Differential Revision: https://reviews.llvm.org/D146593
@bcardosolopes
Copy link
Member Author

The main parts of this are done, other issues will track further improvements.

lanza pushed a commit that referenced this issue Jul 6, 2023
Running this on Amazon Ubuntu the final backtrace is:
```
(lldb) thread backtrace
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
  * frame #0: 0x0000aaaaaaaa07d0 a.out`func_c at main.c:10:3
    frame #1: 0x0000aaaaaaaa07c4 a.out`func_b at main.c:14:3
    frame #2: 0x0000aaaaaaaa07b4 a.out`func_a at main.c:18:3
    frame #3: 0x0000aaaaaaaa07a4 a.out`main(argc=<unavailable>, argv=<unavailable>) at main.c:22:3
    frame #4: 0x0000fffff7b373fc libc.so.6`___lldb_unnamed_symbol2962 + 108
    frame #5: 0x0000fffff7b374cc libc.so.6`__libc_start_main + 152
    frame #6: 0x0000aaaaaaaa06b0 a.out`_start + 48
```
This causes the test to fail because of the extra ___lldb_unnamed_symbol2962 frame
(an inlined function?).

To fix this, strictly check all the frames in main.c then for the rest
just check we find __libc_start_main and _start in that order regardless
of other frames in between.

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D154204
lanza pushed a commit that referenced this issue Dec 20, 2023
… on (#74207)

lld string tail merging interacts badly with ASAN on Windows, as is
reported in llvm/llvm-project#62078.
A similar error was found when building LLVM with
`-DLLVM_USE_SANITIZER=Address`:
```console
[2/2] Building GenVT.inc...
FAILED: include/llvm/CodeGen/GenVT.inc C:/Dev/llvm-project/Build_asan/include/llvm/CodeGen/GenVT.inc
cmd.exe /C "cd /D C:\Dev\llvm-project\Build_asan && C:\Dev\llvm-project\Build_asan\bin\llvm-min-tblgen.exe -gen-vt -I C:/Dev/llvm-project/llvm/include/llvm/CodeGen -IC:/Dev/llvm-project/Build_asan/include -IC:/Dev/llvm-project/llvm/include C:/Dev/llvm-project/llvm/include/llvm/CodeGen/ValueTypes.td --write-if-changed -o include/llvm/CodeGen/GenVT.inc -d include/llvm/CodeGen/GenVT.inc.d"       
=================================================================
==31944==ERROR: AddressSanitizer: global-buffer-overflow on address 0x7ff6cff80d20 at pc 0x7ff6cfcc7378 bp 0x00e8bcb8e990 sp 0x00e8bcb8e9d8
READ of size 1 at 0x7ff6cff80d20 thread T0
    #0 0x7ff6cfcc7377 in strlen (C:\Dev\llvm-project\Build_asan\bin\llvm-min-tblgen.exe+0x1400a7377)
    #1 0x7ff6cfde50c2 in operator delete(void *, unsigned __int64) (C:\Dev\llvm-project\Build_asan\bin\llvm-min-tblgen.exe+0x1401c50c2)
    #2 0x7ff6cfdd75ef in operator delete(void *, unsigned __int64) (C:\Dev\llvm-project\Build_asan\bin\llvm-min-tblgen.exe+0x1401b75ef)
    #3 0x7ff6cfde59f9 in operator delete(void *, unsigned __int64) (C:\Dev\llvm-project\Build_asan\bin\llvm-min-tblgen.exe+0x1401c59f9)
    #4 0x7ff6cff03f6c in operator delete(void *, unsigned __int64) (C:\Dev\llvm-project\Build_asan\bin\llvm-min-tblgen.exe+0x1402e3f6c)
    #5 0x7ff6cfefbcbc in operator delete(void *, unsigned __int64) (C:\Dev\llvm-project\Build_asan\bin\llvm-min-tblgen.exe+0x1402dbcbc)
    #6 0x7ffb7f247343  (C:\WINDOWS\System32\KERNEL32.DLL+0x180017343)
    #7 0x7ffb800826b0  (C:\WINDOWS\SYSTEM32\ntdll.dll+0x1800526b0)

0x7ff6cff80d20 is located 31 bytes after global variable '"#error \"ArgKind is not defined\"\n"...' defined in 'C:\Dev\llvm-project\llvm\utils\TableGen\IntrinsicEmitter.cpp' (0x7ff6cff80ce0) of size 33
  '"#error \"ArgKind is not defined\"\n"...' is ascii string '#error "ArgKind is not defined"
'
0x7ff6cff80d20 is located 0 bytes inside of global variable '""' defined in 'C:\Dev\llvm-project\llvm\utils\TableGen\IntrinsicEmitter.cpp' (0x7ff6cff80d20) of size 1
  '""' is ascii string ''
SUMMARY: AddressSanitizer: global-buffer-overflow (C:\Dev\llvm-project\Build_asan\bin\llvm-min-tblgen.exe+0x1400a7377) in strlen
Shadow bytes around the buggy address:
  0x7ff6cff80a80: 01 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 01 f9 f9 f9
  0x7ff6cff80b00: f9 f9 f9 f9 00 00 00 00 00 00 00 00 01 f9 f9 f9
  0x7ff6cff80b80: f9 f9 f9 f9 00 00 00 00 01 f9 f9 f9 f9 f9 f9 f9
  0x7ff6cff80c00: 00 00 00 00 01 f9 f9 f9 f9 f9 f9 f9 00 00 00 00
  0x7ff6cff80c80: 00 00 00 00 01 f9 f9 f9 f9 f9 f9 f9 00 00 00 00
=>0x7ff6cff80d00: 01 f9 f9 f9[f9]f9 f9 f9 00 00 00 00 00 00 00 00
  0x7ff6cff80d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7ff6cff80e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7ff6cff80e80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7ff6cff80f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7ff6cff80f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==31944==ABORTING
```
This is reproducible with the 17.0.3 release:
```console
$ clang-cl --version
clang version 17.0.3
Target: x86_64-pc-windows-msvc
Thread model: posix
InstalledDir: C:\Program Files\LLVM\bin
$ cmake -S llvm -B Build -G Ninja -DLLVM_USE_SANITIZER=Address -DCMAKE_C_COMPILER=clang-cl -DCMAKE_CXX_COMPILER=clang-cl -DCMAKE_MSVC_RUNTIME_LIBRARY=MultiThreaded -DCMAKE_BUILD_TYPE=Release
$ cd Build
$ ninja all
```
lanza pushed a commit that referenced this issue Dec 20, 2023
## Motivation
Since we don't need the metadata sections at runtime, we can somehow
offload them from memory at runtime. Initially, I explored [debug info
correlation](https://discourse.llvm.org/t/instrprofiling-lightweight-instrumentation/59113),
which is used for PGO with value profiling disabled. However, it
currently only works with DWARF and it's be hard to add such artificial
debug info for every function in to CodeView which is used on Windows.
So, offloading profile metadata sections at runtime seems to be a
platform independent option.

## Design
The idea is to use new section names for profile name and data sections
and mark them as metadata sections. Under this mode, the new sections
are non-SHF_ALLOC in ELF. So, they are not loaded into memory at runtime
and can be stripped away as a post-linking step. After the process
exits, the generated raw profiles will contains only headers + counters.
llvm-profdata can be used correlate raw profiles with the unstripped
binary to generate indexed profile.

## Data
For chromium base_unittests with code coverage on linux, the binary size
overhead due to instrumentation reduced from 64M to 38.8M (39.4%) and
the raw profile files size reduce from 128M to 68M (46.9%)
```
$ bloaty out/cov/base_unittests.stripped -- out/no-cov/base_unittests.stripped
    FILE SIZE        VM SIZE
 --------------  --------------
  +121% +30.4Mi  +121% +30.4Mi    .text
  [NEW] +14.6Mi  [NEW] +14.6Mi    __llvm_prf_data
  [NEW] +10.6Mi  [NEW] +10.6Mi    __llvm_prf_names
  [NEW] +5.86Mi  [NEW] +5.86Mi    __llvm_prf_cnts
   +95% +1.75Mi   +95% +1.75Mi    .eh_frame
  +108%  +400Ki  +108%  +400Ki    .eh_frame_hdr
  +9.5%  +211Ki  +9.5%  +211Ki    .rela.dyn
  +9.2% +95.0Ki  +9.2% +95.0Ki    .data.rel.ro
  +5.0% +87.3Ki  +5.0% +87.3Ki    .rodata
  [ = ]       0   +13% +47.0Ki    .bss
   +40% +1.78Ki   +40% +1.78Ki    .got
   +12% +1.49Ki   +12% +1.49Ki    .gcc_except_table
  [ = ]       0   +65% +1.23Ki    .relro_padding
   +62% +1.20Ki  [ = ]       0    [Unmapped]
   +13%    +448   +19%    +448    .init_array
  +8.8%    +192  [ = ]       0    [ELF Section Headers]
  +0.0%    +136  +0.0%     +80    [7 Others]
  +0.1%     +96  +0.1%     +96    .dynsym
  +1.2%     +96  +1.2%     +96    .rela.plt
  +1.5%     +80  +1.2%     +64    .plt
  [ = ]       0 -99.2% -3.68Ki    [LOAD #5 [RW]]
  +195% +64.0Mi  +194% +64.0Mi    TOTAL
$ bloaty out/cov-cor/base_unittests.stripped -- out/no-cov/base_unittests.stripped
    FILE SIZE        VM SIZE
 --------------  --------------
  +121% +30.4Mi  +121% +30.4Mi    .text
  [NEW] +5.86Mi  [NEW] +5.86Mi    __llvm_prf_cnts
   +95% +1.75Mi   +95% +1.75Mi    .eh_frame
  +108%  +400Ki  +108%  +400Ki    .eh_frame_hdr
  +9.5%  +211Ki  +9.5%  +211Ki    .rela.dyn
  +9.2% +95.0Ki  +9.2% +95.0Ki    .data.rel.ro
  +5.0% +87.3Ki  +5.0% +87.3Ki    .rodata
  [ = ]       0   +13% +47.0Ki    .bss
   +40% +1.78Ki   +40% +1.78Ki    .got
   +12% +1.49Ki   +12% +1.49Ki    .gcc_except_table
   +13%    +448   +19%    +448    .init_array
  +0.1%     +96  +0.1%     +96    .dynsym
  +1.2%     +96  +1.2%     +96    .rela.plt
  +1.2%     +64  +1.2%     +64    .plt
  +2.9%     +64  [ = ]       0    [ELF Section Headers]
  +0.0%     +40  +0.0%     +40    .data
  +1.2%     +32  +1.2%     +32    .got.plt
  +0.0%     +24  +0.0%      +8    [5 Others]
  [ = ]       0 -22.9%    -872    [LOAD #5 [RW]]
 -74.5% -1.44Ki  [ = ]       0    [Unmapped]
  [ = ]       0 -76.5% -1.45Ki    .relro_padding
  +118% +38.8Mi  +117% +38.8Mi    TOTAL
```

A few things to note:
1. llvm-profdata doesn't support filter raw profiles by binary id yet,
so when a raw profile doesn't belongs to the binary being digested by
llvm-profdata, merging will fail. Once this is implemented,
llvm-profdata should be able to only merge raw profiles with the same
binary id as the binary and discard the rest (with mismatched/missing
binary id). The workflow I have in mind is to have scripts invoke
llvm-profdata to get all binary ids for all raw profiles, and
selectively choose the raw pnrofiles with matching binary id and the
binary to llvm-profdata for merging.
2. Note: In COFF, currently they are still loaded into memory but not
used. I didn't do it in this patch because I noticed that `.lcovmap` and
`.lcovfunc` are loaded into memory. A separate patch will address it.
3. This should works with PGO when value profiling is disabled as debug
info correlation currently doing, though I haven't tested this yet.
lanza pushed a commit that referenced this issue Dec 20, 2023
Internal builds of the unittests with msan flagged mempcpy_test.

    ==6862==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x55e34d7d734a in length
llvm-project/libc/src/__support/CPP/string_view.h:41:11
#1 0x55e34d7d734a in string_view
llvm-project/libc/src/__support/CPP/string_view.h:71:24
#2 0x55e34d7d734a in
__llvm_libc_9999_0_0_git::testing::Test::testStrEq(char const*, char
const*, char const*, char const*,
__llvm_libc_9999_0_0_git::testing::internal::Location)
llvm-project/libc/test/UnitTest/LibcTest.cpp:284:13
#3 0x55e34d7d4e09 in LlvmLibcMempcpyTest_Simple::Run()
llvm-project/libc/test/src/string/mempcpy_test.cpp:20:3
#4 0x55e34d7d6dff in
__llvm_libc_9999_0_0_git::testing::Test::runTests(char const*)
llvm-project/libc/test/UnitTest/LibcTest.cpp:133:8
#5 0x55e34d7d86e0 in main
llvm-project/libc/test/UnitTest/LibcTestMain.cpp:21:10

SUMMARY: MemorySanitizer: use-of-uninitialized-value
llvm-project/libc/src/__support/CPP/string_view.h:41:11 in length

What's going on here is that mempcpy_test.cpp's Simple test is using
ASSERT_STREQ with a partially initialized char array. ASSERT_STREQ calls
Test::testStrEq which constructs a cpp:string_view. That constructor
calls the
private method cpp::string_view::length. When built with msan, the loop
is
transformed into multi-byte access, which then fails upon access.

I took a look at libc++'s __constexpr_strlen which just calls
__builtin_strlen(). Replacing the implementation of
cpp::string_view::length
with a call to __builtin_strlen() may still result in out of bounds
access when
the test is built with msan.

It's not safe to use ASSERT_STREQ with a partially initialized array.
Initialize the whole array so that the test passes.
lanza pushed a commit that referenced this issue Dec 20, 2023
We'd like a way to select the current thread by its thread ID (rather
than its internal LLDB thread index).

This PR adds a `-t` option (`--thread_id` long option) that tells the
`thread select` command to interpret the `<thread-index>` argument as a
thread ID.

Here's an example of it working:
```
michristensen@devbig356 llvm/llvm-project (thread-select-tid) » ../Debug/bin/lldb ~/scratch/cpp/threading/a.out
(lldb) target create "/home/michristensen/scratch/cpp/threading/a.out"
Current executable set to '/home/michristensen/scratch/cpp/threading/a.out' (x86_64).
(lldb) b 18
Breakpoint 1: where = a.out`main + 80 at main.cpp:18:12, address = 0x0000000000000850
(lldb) run
Process 215715 launched: '/home/michristensen/scratch/cpp/threading/a.out' (x86_64)
This is a thread, i=1
This is a thread, i=2
This is a thread, i=3
This is a thread, i=4
This is a thread, i=5
Process 215715 stopped
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x0000555555400850 a.out`main at main.cpp:18:12
   15     for (int i = 0; i < 5; i++) {
   16       pthread_create(&thread_ids[i], NULL, foo, NULL);
   17     }
-> 18     for (int i = 0; i < 5; i++) {
   19       pthread_join(thread_ids[i], NULL);
   20     }
   21     return 0;
(lldb) thread select 2
* thread #2, name = 'a.out'
    frame #0: 0x00007ffff68f9918 libc.so.6`__nanosleep + 72
libc.so.6`__nanosleep:
->  0x7ffff68f9918 <+72>: cmpq   $-0x1000, %rax ; imm = 0xF000
    0x7ffff68f991e <+78>: ja     0x7ffff68f9952 ; <+130>
    0x7ffff68f9920 <+80>: movl   %edx, %edi
    0x7ffff68f9922 <+82>: movl   %eax, 0xc(%rsp)
(lldb) thread info
thread #2: tid = 216047, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out'

(lldb) thread list
Process 215715 stopped
  thread #1: tid = 215715, 0x0000555555400850 a.out`main at main.cpp:18:12, name = 'a.out', stop reason = breakpoint 1.1
* thread #2: tid = 216047, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out'
  thread #3: tid = 216048, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out'
  thread #4: tid = 216049, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out'
  thread #5: tid = 216050, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out'
  thread #6: tid = 216051, 0x00007ffff68f9918 libc.so.6`__nanosleep + 72, name = 'a.out'
(lldb) thread select 215715
error: invalid thread #215715.
(lldb) thread select -t 215715
* thread #1, name = 'a.out', stop reason = breakpoint 1.1
    frame #0: 0x0000555555400850 a.out`main at main.cpp:18:12
   15     for (int i = 0; i < 5; i++) {
   16       pthread_create(&thread_ids[i], NULL, foo, NULL);
   17     }
-> 18     for (int i = 0; i < 5; i++) {
   19       pthread_join(thread_ids[i], NULL);
   20     }
   21     return 0;
(lldb) thread select -t 216051
* thread #6, name = 'a.out'
    frame #0: 0x00007ffff68f9918 libc.so.6`__nanosleep + 72
libc.so.6`__nanosleep:
->  0x7ffff68f9918 <+72>: cmpq   $-0x1000, %rax ; imm = 0xF000
    0x7ffff68f991e <+78>: ja     0x7ffff68f9952 ; <+130>
    0x7ffff68f9920 <+80>: movl   %edx, %edi
    0x7ffff68f9922 <+82>: movl   %eax, 0xc(%rsp)
(lldb) thread select 3
* thread #3, name = 'a.out'
    frame #0: 0x00007ffff68f9918 libc.so.6`__nanosleep + 72
libc.so.6`__nanosleep:
->  0x7ffff68f9918 <+72>: cmpq   $-0x1000, %rax ; imm = 0xF000
    0x7ffff68f991e <+78>: ja     0x7ffff68f9952 ; <+130>
    0x7ffff68f9920 <+80>: movl   %edx, %edi
    0x7ffff68f9922 <+82>: movl   %eax, 0xc(%rsp)
(lldb) thread select -t 216048
* thread #3, name = 'a.out'
    frame #0: 0x00007ffff68f9918 libc.so.6`__nanosleep + 72
libc.so.6`__nanosleep:
->  0x7ffff68f9918 <+72>: cmpq   $-0x1000, %rax ; imm = 0xF000
    0x7ffff68f991e <+78>: ja     0x7ffff68f9952 ; <+130>
    0x7ffff68f9920 <+80>: movl   %edx, %edi
    0x7ffff68f9922 <+82>: movl   %eax, 0xc(%rsp)
(lldb) thread select --thread_id 216048
* thread #3, name = 'a.out'
    frame #0: 0x00007ffff68f9918 libc.so.6`__nanosleep + 72
libc.so.6`__nanosleep:
->  0x7ffff68f9918 <+72>: cmpq   $-0x1000, %rax ; imm = 0xF000
    0x7ffff68f991e <+78>: ja     0x7ffff68f9952 ; <+130>
    0x7ffff68f9920 <+80>: movl   %edx, %edi
    0x7ffff68f9922 <+82>: movl   %eax, 0xc(%rsp)
(lldb) help thread select
Change the currently selected thread.

Syntax: thread select <cmd-options> <thread-index>

Command Options Usage:
  thread select [-t] <thread-index>

       -t ( --thread_id )
            Provide a thread ID instead of a thread index.

     This command takes options and free-form arguments.  If your arguments
     resemble option specifiers (i.e., they start with a - or --), you must use
     ' -- ' between the end of the command options and the beginning of the
     arguments.
(lldb) c
Process 215715 resuming
Process 215715 exited with status = 0 (0x00000000)
```
bcardosolopes pushed a commit that referenced this issue Feb 21, 2024
This PR adds a dedicated `cir.float` type for representing
floating-point types. There are several issues linked to this PR: #5,
#78, and #90.
lanza pushed a commit that referenced this issue Mar 23, 2024
This PR adds a dedicated `cir.float` type for representing
floating-point types. There are several issues linked to this PR: #5,
#78, and #90.
eZWALT pushed a commit to eZWALT/clangir that referenced this issue Mar 24, 2024
This PR adds a dedicated `cir.float` type for representing
floating-point types. There are several issues linked to this PR: llvm#5,
llvm#78, and llvm#90.
lanza pushed a commit that referenced this issue Apr 29, 2024
This PR adds a dedicated `cir.float` type for representing
floating-point types. There are several issues linked to this PR: #5,
#78, and #90.
lanza pushed a commit that referenced this issue Apr 29, 2024
This PR adds a dedicated `cir.float` type for representing
floating-point types. There are several issues linked to this PR: #5,
#78, and #90.
eZWALT pushed a commit to eZWALT/clangir that referenced this issue Apr 29, 2024
This PR adds a dedicated `cir.float` type for representing
floating-point types. There are several issues linked to this PR: llvm#5,
llvm#78, and llvm#90.
lanza pushed a commit that referenced this issue Apr 29, 2024
This PR adds a dedicated `cir.float` type for representing
floating-point types. There are several issues linked to this PR: #5,
#78, and #90.
lanza pushed a commit that referenced this issue Oct 2, 2024
Compilers and language runtimes often use helper functions that are
fundamentally uninteresting when debugging anything but the
compiler/runtime itself. This patch introduces a user-extensible
mechanism that allows for these frames to be hidden from backtraces and
automatically skipped over when navigating the stack with `up` and
`down`.

This does not affect the numbering of frames, so `f <N>` will still
provide access to the hidden frames. The `bt` output will also print a
hint that frames have been hidden.

My primary motivation for this feature is to hide thunks in the Swift
programming language, but I'm including an example recognizer for
`std::function::operator()` that I wished for myself many times while
debugging LLDB.

rdar://126629381


Example output. (Yes, my proof-of-concept recognizer could hide even
more frames if we had a method that returned the function name without
the return type or I used something that isn't based off regex, but it's
really only meant as an example).

before:
```
(lldb) thread backtrace --filtered=false
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
    frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
    frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
    frame #3: 0x0000000100003968 a.out`std::__1::__function::__alloc_func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()[abi:se200000](this=0x000000016fdff280, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:171:12
    frame #4: 0x00000001000026bc a.out`std::__1::__function::__func<int (*)(int, int), std::__1::allocator<int (*)(int, int)>, int (int, int)>::operator()(this=0x000000016fdff278, __arg=0x000000016fdff224, __arg=0x000000016fdff220) at function.h:313:10
    frame #5: 0x0000000100003c38 a.out`std::__1::__function::__value_func<int (int, int)>::operator()[abi:se200000](this=0x000000016fdff278, __args=0x000000016fdff224, __args=0x000000016fdff220) const at function.h:430:12
    frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
    frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
    frame #8: 0x0000000183cdf154 dyld`start + 2476
(lldb) 
```

after

```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100001f04 a.out`foo(x=1, y=1) at main.cpp:4:10
    frame #1: 0x0000000100003a00 a.out`decltype(std::declval<int (*&)(int, int)>()(std::declval<int>(), std::declval<int>())) std::__1::__invoke[abi:se200000]<int (*&)(int, int), int, int>(__f=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:149:25
    frame #2: 0x000000010000399c a.out`int std::__1::__invoke_void_return_wrapper<int, false>::__call[abi:se200000]<int (*&)(int, int), int, int>(__args=0x000000016fdff280, __args=0x000000016fdff224, __args=0x000000016fdff220) at invoke.h:216:12
    frame #6: 0x0000000100002038 a.out`std::__1::function<int (int, int)>::operator()(this= Function = foo(int, int) , __arg=1, __arg=1) const at function.h:989:10
    frame #7: 0x0000000100001f64 a.out`main(argc=1, argv=0x000000016fdff4f8) at main.cpp:9:10
    frame #8: 0x0000000183cdf154 dyld`start + 2476
Note: Some frames were hidden by frame recognizers
```
bruteforceboy pushed a commit to bruteforceboy/clangir that referenced this issue Oct 2, 2024
This PR adds a dedicated `cir.float` type for representing
floating-point types. There are several issues linked to this PR: llvm#5,
llvm#78, and llvm#90.
Hugobros3 pushed a commit to shady-gang/clangir that referenced this issue Oct 2, 2024
This PR adds a dedicated `cir.float` type for representing
floating-point types. There are several issues linked to this PR: llvm#5,
llvm#78, and llvm#90.
smeenai pushed a commit that referenced this issue Oct 10, 2024
…ext is not fully initialized (#110481)

As this comment around target initialization implies:
```
  // This can be NULL if we don't know anything about the architecture or if
  // the target for an architecture isn't enabled in the llvm/clang that we
  // built
```

There are cases where we might fail to call `InitBuiltinTypes` when
creating the backing `ASTContext` for a `TypeSystemClang`. If that
happens, the builtins `QualType`s, e.g., `VoidPtrTy`/`IntTy`/etc., are
not initialized and dereferencing them as we do in
`GetBuiltinTypeForEncodingAndBitSize` (and other places) will lead to
nullptr-dereferences. Example backtrace:
```
(lldb) run
Assertion failed: (!isNull() && "Cannot retrieve a NULL type pointer"), function getCommonPtr, file Type.h, line 958.
Process 2680 stopped
* thread #15, name = '<lldb.process.internal-state(pid=2712)>', stop reason = hit program assert
    frame #4: 0x000000010cdf3cdc liblldb.20.0.0git.dylib`DWARFASTParserClang::ExtractIntFromFormValue(lldb_private::CompilerType const&, lldb_private::plugin::dwarf::DWARFFormValue const&) const (.cold.1) + 
liblldb.20.0.0git.dylib`DWARFASTParserClang::ParseObjCMethod(lldb_private::ObjCLanguage::MethodName const&, lldb_private::plugin::dwarf::DWARFDIE const&, lldb_private::CompilerType, ParsedDWARFTypeAttributes
, bool) (.cold.1):
->  0x10cdf3cdc <+0>:  stp    x29, x30, [sp, #-0x10]!
    0x10cdf3ce0 <+4>:  mov    x29, sp
    0x10cdf3ce4 <+8>:  adrp   x0, 545
    0x10cdf3ce8 <+12>: add    x0, x0, #0xa25 ; "ParseObjCMethod"
Target 0: (lldb) stopped.
(lldb) bt
* thread #15, name = '<lldb.process.internal-state(pid=2712)>', stop reason = hit program assert
    frame #0: 0x0000000180d08600 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x0000000180d40f50 libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x0000000180c4d908 libsystem_c.dylib`abort + 128
    frame #3: 0x0000000180c4cc1c libsystem_c.dylib`__assert_rtn + 284
  * frame #4: 0x000000010cdf3cdc liblldb.20.0.0git.dylib`DWARFASTParserClang::ExtractIntFromFormValue(lldb_private::CompilerType const&, lldb_private::plugin::dwarf::DWARFFormValue const&) const (.cold.1) + 
    frame #5: 0x0000000109d30acc liblldb.20.0.0git.dylib`lldb_private::TypeSystemClang::GetBuiltinTypeForEncodingAndBitSize(lldb::Encoding, unsigned long) + 1188
    frame #6: 0x0000000109aaaed4 liblldb.20.0.0git.dylib`DynamicLoaderMacOS::NotifyBreakpointHit(void*, lldb_private::StoppointCallbackContext*, unsigned long long, unsigned long long) + 384
```

This patch adds a one-time user-visible warning for when we fail to
initialize the AST to indicate that initialization went wrong for the
given target. Additionally, we add checks for whether one of the
`ASTContext` `QualType`s is invalid before dereferencing any builtin
types.

The warning would look as follows:
```
(lldb) target create "a.out"
Current executable set to 'a.out' (arm64).
(lldb) b main
warning: Failed to initialize builtin ASTContext types for target 'some-unknown-triple'. Printing variables may behave unexpectedly.
Breakpoint 1: where = a.out`main + 8 at stepping.cpp:5:14, address = 0x0000000100003f90
```

rdar://134869779
keryell pushed a commit to keryell/clangir that referenced this issue Oct 19, 2024
This PR adds a dedicated `cir.float` type for representing
floating-point types. There are several issues linked to this PR: llvm#5,
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

3 participants